data.demo_data
- pytrial.data.demo_data.load_synthetic_ehr_sequence(input_dir=None, n_sample=None)
Load synthetic EHR patient sequence data, which was generated by PromptEHR (https://arxiv.org/pdf/2211.01761.pdf).
- Parameters
input_dir (str) – The folder that stores the demo data. If None, we will download the demo data and save it to ‘./demo_data/synthetic_ehr’. Make sure to remove this folder if it is empty.
n_sample (int) – The number of samples we want to load. If None, all data will be loaded.
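The note about empty folders can be handled before calling the loader. The following is a minimal sketch, assuming the loader skips the download when the target folder already exists (the documentation above does not state this; it is a guess at why an empty folder is a problem). `remove_if_empty` and `load_synthetic_ehr` are hypothetical helper names, not part of pytrial.

```python
import os

DEFAULT_DIR = "./demo_data/synthetic_ehr"  # default location named in the docs above


def remove_if_empty(path):
    """Remove `path` if it is an existing empty directory; return True if removed."""
    if os.path.isdir(path) and not os.listdir(path):
        os.rmdir(path)
        return True
    return False


def load_synthetic_ehr(n_sample=None):
    """Hypothetical wrapper: clear an empty default folder, then load the demo data."""
    remove_if_empty(DEFAULT_DIR)
    # Imported lazily so remove_if_empty stays usable without pytrial installed.
    from pytrial.data.demo_data import load_synthetic_ehr_sequence

    return load_synthetic_ehr_sequence(n_sample=n_sample)
```

Calling `load_synthetic_ehr(n_sample=100)` would then request 100 samples; the structure of the returned object is not described here, so inspect it before relying on specific fields.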
- pytrial.data.demo_data.load_trial_patient_sequence(input_dir=None)
Load synthetic sequential trial patient records.
- Parameters
input_dir (str) – The folder that stores the demo data. If None, we will download the demo data and save it to ‘./demo_data/demo_patient_sequence/trial’. Make sure to remove this folder if it is empty.
- pytrial.data.demo_data.load_trial_patient_tabular(input_dir=None)
Load synthetic tabular trial patient records.
- Parameters
input_dir (str) – The folder that stores the demo data. If None, we will download the demo data and save it to ‘./demo_data/demo_trial_patient_data’. Make sure to remove this folder if it is empty.
- pytrial.data.demo_data.load_trial_outcome_data(input_dir=None, phase='I', split='train')
Load trial outcome prediction (TOP) benchmark data.
- Parameters
input_dir (str) – The folder that stores the demo data. If None, we will download the demo data and save it to ‘./demo_data/demo_trial_data’. Make sure to remove this folder if it is empty.
phase ({'I', 'II', 'III'}) – The phase of the trial data: one of ‘I’, ‘II’, or ‘III’.
split ({'train', 'test', 'valid'}) – The split of the trial data: one of ‘train’, ‘test’, or ‘valid’.
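Since `phase` and `split` each take three documented values, the full benchmark spans nine combinations. The sketch below enumerates them and passes each to the loader; `outcome_data_requests` and `load_all_outcome_data` are hypothetical helpers introduced for illustration, and the return type of `load_trial_outcome_data` is not assumed.

```python
from itertools import product

PHASES = ("I", "II", "III")          # documented values for `phase`
SPLITS = ("train", "test", "valid")  # documented values for `split`


def outcome_data_requests(input_dir=None):
    """Build keyword arguments for every documented phase/split combination."""
    return [
        {"input_dir": input_dir, "phase": phase, "split": split}
        for phase, split in product(PHASES, SPLITS)
    ]


def load_all_outcome_data(input_dir=None):
    """Hypothetical helper: load every combination, keyed by (phase, split)."""
    # Imported lazily so outcome_data_requests works without pytrial installed.
    from pytrial.data.demo_data import load_trial_outcome_data

    return {
        (kwargs["phase"], kwargs["split"]): load_trial_outcome_data(**kwargs)
        for kwargs in outcome_data_requests(input_dir)
    }
```

Keying the results by `(phase, split)` keeps each subset addressable, for example `data[("II", "test")]`.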
- pytrial.data.demo_data.load_trial_document_data(input_dir=None, n_sample=None, source='preprocessed', date='20221001')
Load trial document data obtained from ClinicalTrials.gov.
- Parameters
input_dir (str) – The folder that stores the demo data. If None, we will download the demo data and save it to ‘./demo_data/demo_trial_document’. Make sure to remove this folder if it is empty.
n_sample (int) – The number of samples we want to load. If None, all data will be loaded.
source ({'clinicaltrials.gov', 'preprocessed'}) – The source of the data. If ‘clinicaltrials.gov’, we will download the raw data from that website and process it. If ‘preprocessed’, we will load the preprocessed data.
date (str) – The date of the clinicaltrials.gov copy. Only valid when source='clinicaltrials.gov'.
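Because `date` only applies to the raw ClinicalTrials.gov source, it can help to assemble the arguments in one place. This is a rough sketch: `document_data_kwargs` is a hypothetical helper, and rejecting a `date` passed alongside `source='preprocessed'` is a design choice made here, not documented behavior of the library (which may simply ignore it).

```python
SOURCES = ("clinicaltrials.gov", "preprocessed")  # documented values for `source`


def document_data_kwargs(source="preprocessed", date=None, n_sample=None, input_dir=None):
    """Build keyword arguments for load_trial_document_data (hypothetical helper)."""
    if source not in SOURCES:
        raise ValueError(f"source must be one of {SOURCES}, got {source!r}")
    kwargs = {"input_dir": input_dir, "n_sample": n_sample, "source": source}
    if date is not None:
        # Assumption: treat a date without the raw source as a caller mistake.
        if source != "clinicaltrials.gov":
            raise ValueError("date only applies when source='clinicaltrials.gov'")
        kwargs["date"] = date
    return kwargs


def load_documents(**options):
    """Hypothetical wrapper that validates options, then calls the loader."""
    # Imported lazily so document_data_kwargs works without pytrial installed.
    from pytrial.data.demo_data import load_trial_document_data

    return load_trial_document_data(**document_data_kwargs(**options))
```

When `date` is omitted, the loader falls back to its own default ('20221001' in the signature above).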
- pytrial.data.demo_data.load_mimic_ehr_sequence(input_dir=None, n_sample=None)
Load MIMIC-III EHR patient sequence data. Access to the data must be obtained via https://physionet.org/content/mimiciii/1.4/.
- Parameters
input_dir (str) – The folder that stores the demo data. If None, we will look for the demo data in ‘./demo_data/demo_patient_sequence/ehr’.
n_sample (int) – The number of samples we want to load. If None, all data will be loaded.
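Unlike the other loaders, this one only looks for data that is already on disk, so a quick check before calling it can give a clearer error. The sketch below assumes nothing about the expected file names inside the folder (the documentation above does not list them) and only verifies that the folder exists and is non-empty; `check_mimic_dir` and `load_mimic` are hypothetical helpers.

```python
import os

MIMIC_DEFAULT_DIR = "./demo_data/demo_patient_sequence/ehr"  # default named in the docs above


def check_mimic_dir(input_dir=None):
    """Return the folder to load from, or raise if it is missing or empty."""
    path = input_dir if input_dir is not None else MIMIC_DEFAULT_DIR
    if not os.path.isdir(path) or not os.listdir(path):
        raise FileNotFoundError(
            f"No data found in {path!r}. Obtain access via "
            "https://physionet.org/content/mimiciii/1.4/ and place the data there."
        )
    return path


def load_mimic(input_dir=None, n_sample=None):
    """Hypothetical wrapper: verify the folder first, then call the loader."""
    path = check_mimic_dir(input_dir)
    # Imported lazily so check_mimic_dir works without pytrial installed.
    from pytrial.data.demo_data import load_mimic_ehr_sequence

    return load_mimic_ehr_sequence(input_dir=path, n_sample=n_sample)
```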