Medidata Solutions
Interpretable Data Augmentation Framework for Clinical Trials
Pages: 10
Time to read: 38 mins
Language: English
This research article presents an interpretable data augmentation framework for improving generative modeling of synthetic clinical trial data. The framework targets the small sample sizes typical of clinical trial datasets, which can limit the quality of generative models. The authors apply it to three clinical trial datasets and evaluate how dataset size, choice of generative algorithm, and augmentation scale affect the quality of the synthetic data, measured by fidelity, utility, and privacy. They also discuss why high-quality synthetic data matters in settings where patient privacy and proprietary information are concerns. The framework has two main phases: augmenting the clinical trial data, then generating synthetic data from the augmented datasets. The results indicate that the framework can significantly improve the quality of synthetic data produced by generative algorithms.
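To make the two-phase structure concrete, here is a minimal Python sketch. It is not the authors' method: the summary does not say how the augmentation or the generative models work, so both phases use simple stand-ins (bootstrap resampling with Gaussian jitter for augmentation, independent per-column Gaussians for generation). The function names, the `scale` and `noise_frac` parameters, and the toy rows are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def augment(rows, scale, noise_frac=0.05, seed=0):
    """Phase 1 (stand-in, assumed): enlarge a small dataset.

    Returns the original rows followed by (scale - 1) * len(rows)
    bootstrap-resampled copies with small Gaussian jitter per column.
    """
    rng = random.Random(seed)
    n_cols = len(rows[0])
    # Per-column standard deviation sets the jitter magnitude.
    sds = [statistics.pstdev([r[j] for r in rows]) for j in range(n_cols)]
    extra = []
    for _ in range((scale - 1) * len(rows)):
        base = rng.choice(rows)
        extra.append(
            [base[j] + rng.gauss(0.0, noise_frac * sds[j]) for j in range(n_cols)]
        )
    return [list(r) for r in rows] + extra


def generate(train_rows, n_samples, seed=1):
    """Phase 2 (stand-in, assumed): fit a trivial generative model and sample.

    Each column is modeled as an independent Gaussian; a real pipeline
    would use a proper tabular generative algorithm here.
    """
    rng = random.Random(seed)
    n_cols = len(train_rows[0])
    mus = [statistics.fmean([r[j] for r in train_rows]) for j in range(n_cols)]
    sds = [statistics.pstdev([r[j] for r in train_rows]) for j in range(n_cols)]
    return [
        [rng.gauss(mus[j], sds[j]) for j in range(n_cols)]
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    # Made-up toy rows (two numeric columns); not real trial data.
    toy_rows = [[61.0, 27.4], [54.0, 31.1], [70.0, 24.9], [48.0, 29.3]]
    augmented = augment(toy_rows, scale=3)       # phase 1
    synthetic = generate(augmented, n_samples=5)  # phase 2
    print(len(augmented), len(synthetic))
```

In this sketch, `scale` plays the role of the augmentation scale mentioned above: the generative step is trained on `scale` times as many rows as the original dataset. Fidelity, utility, and privacy would each need their own evaluation on the resulting synthetic rows; none of those metrics is implemented here.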