indiv_outcome.sequence.StageNet

class pytrial.tasks.indiv_outcome.sequence.stagenet.StageNet(vocab_size, orders, mode, output_dim=None, max_visit=None, hidden_size=384, conv_size=10, levels=3, dropconnect=0.3, dropout=0.3, dropres=0.3, learning_rate=0.0001, weight_decay=0.0001, batch_size=64, epochs=10, num_worker=0, device='cuda:0', experiment_id='test')

Implements StageNet for predictive modeling on longitudinal patient records [1].

Parameters
  • vocab_size (list[int]) – A list of vocabulary size for different types of events, e.g., for diagnosis, procedure, medication.

  • orders (list[str]) – A list giving the order in which the input event types are treated. Must have the same length as vocab_size.

  • mode (str) – Prediction target, one of ['binary', 'multiclass', 'multilabel', 'regression'].

  • output_dim (int) – For binary classification, output_dim=1; for multiclass/multilabel classification, output_dim=n_class; for regression, output_dim=1.

  • max_visit (int) – The maximum number of visits for input event codes.

  • hidden_size (int, optional (default = 384)) – The number of features of the hidden state h.

  • conv_size (int, optional (default = 10)) – The number of convolution kernels.

  • levels (int, optional (default = 3)) – The number of levels for the master gate.

  • dropconnect (float, optional (default = 0.3)) – The dropout rate for the input of the convolutional layer.

  • dropout (float, optional (default = 0.3)) – The dropout rate for the output of the RNN layer.

  • dropres (float, optional (default = 0.3)) – The dropout rate for the residual connection.

  • learning_rate (float) – Learning rate for gradient-based optimization. torch.optim.Adam is used by default.

  • weight_decay (float) – Regularization strength for l2 norm; must be a positive float. Smaller values specify weaker regularization.

  • batch_size (int) – Batch size when doing SGD optimization.

  • epochs (int) – Maximum number of training epochs.

  • num_worker (int) – Number of workers used for data loading during training.

  • device (str) – The device on which the model runs, e.g., 'cuda:0' or 'cpu'.
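The constraints listed above (orders matching vocab_size in length, and output_dim following from mode) can be checked before the model is built. The following is a minimal sketch under stated assumptions: the helper stagenet_kwargs, its n_class argument, the vocabulary sizes, and the event-type names are hypothetical and not part of pytrial. Only the parameter names come from the signature above.

```python
def stagenet_kwargs(vocab_size, orders, mode, n_class=None, **overrides):
    """Hypothetical helper: assemble StageNet keyword arguments and
    check the constraints stated in the parameter list."""
    modes = ("binary", "multiclass", "multilabel", "regression")
    if mode not in modes:
        raise ValueError(f"mode must be one of {modes}, got {mode!r}")
    if len(vocab_size) != len(orders):
        raise ValueError("vocab_size and orders must have the same length")

    # output_dim rule from the parameter list: 1 for binary and regression,
    # n_class for multiclass and multilabel.
    if mode in ("binary", "regression"):
        output_dim = 1
    else:
        if n_class is None:
            raise ValueError("n_class is required for multiclass/multilabel")
        output_dim = n_class

    kwargs = dict(
        vocab_size=list(vocab_size),
        orders=list(orders),
        mode=mode,
        output_dim=output_dim,
    )
    kwargs.update(overrides)  # e.g. max_visit, hidden_size, device
    return kwargs


kwargs = stagenet_kwargs(
    vocab_size=[100, 50, 80],          # assumed vocabulary sizes
    orders=["diag", "prod", "med"],    # assumed event-type names
    mode="binary",
    max_visit=20,
    device="cpu",
)

# With pytrial installed, the arguments would be passed on like this:
# from pytrial.tasks.indiv_outcome.sequence.stagenet import StageNet
# model = StageNet(**kwargs)
```

Keeping the checks in a separate helper means a configuration mistake is reported before any model weights are allocated.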

Notes

[1] Gao, J., Xiao, C., Wang, Y., Tang, W., Glass, L. M., & Sun, J. (2020, April). StageNet: Stage-aware neural networks for health risk prediction. In Proceedings of The Web Conference 2020 (pp. 530-540).

fit(train_data, valid_data)

Train model with sequential patient records.

Parameters
  • train_data (SequencePatientBase) – A SequencePatientBase containing patient records, where ‘v’ holds the visit sequences of the different event types and ‘y’ holds the labels.

  • valid_data (SequencePatientBase) – A SequencePatientBase containing patient records used for early stopping.
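To make the ‘v’ and ‘y’ fields more concrete, here is a rough sketch of toy records in plain Python. It assumes a nesting of patients, then visits, then event types, then integer codes starting at 0; that layout, the event-type names, and the helper summarize are assumptions for illustration. How the records are wrapped in a SequencePatientBase is not covered on this page and is left out.

```python
# Assumed layout: patients -> visits -> event types -> integer codes.
orders = ["diag", "med"]  # assumed event-type names
visits = [
    [[[1, 2], [0]], [[3], [1, 4]]],  # patient 0: two visits
    [[[0], [2]]],                    # patient 1: one visit
]
labels = [1, 0]  # one binary label per patient


def summarize(visits, orders):
    """Hypothetical helper: derive vocab_size and max_visit from toy records."""
    n_types = len(orders)
    vocab_size = [0] * n_types
    max_visit = 0
    for patient in visits:
        max_visit = max(max_visit, len(patient))
        for visit in patient:
            if len(visit) != n_types:
                raise ValueError("each visit needs one code list per event type")
            for i, codes in enumerate(visit):
                if codes:
                    # Codes are assumed 0-indexed, so size is max code + 1.
                    vocab_size[i] = max(vocab_size[i], max(codes) + 1)
    return vocab_size, max_visit


vocab_size, max_visit = summarize(visits, orders)
records = {"v": visits, "y": labels}  # the two fields named in the fit() docs
```

Values derived this way line up with the vocab_size, orders, and max_visit constructor arguments, which keeps the model configuration consistent with the data.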

load_model(checkpoint)

Load a pretrained model from disk.

Parameters
  • checkpoint (str) – The input directory that stores the trained PyTorch model and configuration.

predict(test_data)

Predict patient outcomes using longitudinal trial patient sequences.

Parameters
  • test_data (SequencePatient) – A SequencePatient containing patient records, where ‘v’ holds the visit sequences of the different event types.

save_model(output_dir)

Save the trained model to disk.

Parameters
  • output_dir (str) – The output directory that stores the trained PyTorch model and configuration.
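The four methods are typically used in the order fit, save_model, load_model, predict. The sketch below shows that call order using a stand-in class, so it runs without pytrial or a GPU. StubModel, train_save_reload, the state.txt file, and the placeholder predictions are all hypothetical. The actual checkpoint contents and the return type of predict are not documented on this page.

```python
import os
import tempfile


class StubModel:
    """Hypothetical stand-in exposing the same method names as StageNet."""

    def __init__(self):
        self.state = None

    def fit(self, train_data, valid_data):
        self.state = "trained"

    def save_model(self, output_dir):
        os.makedirs(output_dir, exist_ok=True)
        with open(os.path.join(output_dir, "state.txt"), "w") as f:
            f.write(str(self.state))

    def load_model(self, checkpoint):
        with open(os.path.join(checkpoint, "state.txt")) as f:
            self.state = f.read()

    def predict(self, test_data):
        # Placeholder output: one score per patient.
        return [0.5 for _ in test_data["v"]]


def train_save_reload(model, fresh_model, train_data, valid_data, test_data, output_dir):
    """Train, write a checkpoint, restore it into a fresh model, then predict."""
    model.fit(train_data, valid_data)
    model.save_model(output_dir)
    fresh_model.load_model(output_dir)
    return fresh_model.predict(test_data)


data = {"v": [[[[1]]], [[[2]]]], "y": [0, 1]}  # toy records, assumed layout
with tempfile.TemporaryDirectory() as tmp:
    reloaded = StubModel()
    preds = train_save_reload(
        StubModel(), reloaded, data, data, data, os.path.join(tmp, "ckpt")
    )
```

Because train_save_reload relies only on the method names, a StageNet instance built with the same constructor arguments could presumably be passed in place of the stub, given real SequencePatientBase data.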