indiv_outcome.tabular.LogisticRegression

class pytrial.tasks.indiv_outcome.tabular.logistic_regression.LogisticRegression(weight_decay=1, dual=False, epochs=100, experiment_id='test')

Bases: pytrial.tasks.indiv_outcome.tabular.base.TabularIndivBase

Implements a logistic regression model for tabular individual outcome prediction in clinical trials. Currently, only binary classification is supported.

Parameters

weight_decay (float) – Regularization strength for the L2 penalty; must be a positive float. Smaller values specify weaker regularization.
dual (bool) – Dual or primal formulation. Dual formulation is only implemented for l2 penalty with liblinear solver. Prefer dual=False when n_samples > n_features.
epochs (int) – Maximum number of iterations taken for the solvers to converge.
experiment_id (str, optional (default='test')) – The name of the current experiment. It determines the name of the saved model checkpoint.
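To illustrate how weight_decay and epochs interact, here is a minimal pure-Python sketch of L2-regularized logistic regression trained by gradient descent. It is not PyTrial's implementation: the function names, the learning rate lr, and the exact scaling of the penalty are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logreg(X, y, weight_decay=1.0, epochs=100, lr=0.1):
    """Minimize mean log-loss + (weight_decay / (2 * n)) * ||w||^2.

    Hypothetical helper for illustration; the penalty scaling is an assumption.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):  # epochs caps the number of iterations
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The L2 term pulls each weight toward zero; the bias is not penalized.
        w = [wj - lr * (gj + weight_decay * wj) / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy, centered, linearly separable data (made up for this sketch).
X = [[-2.5], [-1.5], [-0.5], [0.5], [1.5], [2.5]]
y = [0, 0, 0, 1, 1, 1]

w_weak, b_weak = fit_logreg(X, y, weight_decay=0.01)
w_strong, b_strong = fit_logreg(X, y, weight_decay=10.0)
```

With the larger weight_decay the learned weight stays closer to zero, which matches the description above: smaller values mean weaker regularization.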

fit(train_data, valid_data=None)

Train the logistic regression model to predict patient outcomes from tabular input data.

Parameters

train_data (dict) –

{

'x': TabularPatientBase or pd.DataFrame,

'y': pd.Series or np.ndarray

}

'x' contains all patient features;
'y' contains the label for each row.
valid_data (Ignored.) – Not used; present here for API consistency by convention.
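As a rough illustration of the expected input, the dict described above could be assembled as follows. The column names and values are hypothetical, and the commented-out lines assume PyTrial is installed and importable under the module path given in the class signature.

```python
import numpy as np
import pandas as pd

# Hypothetical toy features; column names are illustrative only.
x = pd.DataFrame(
    {
        "age": [54, 61, 47, 70],
        "biomarker": [1.2, 0.7, 1.9, 0.4],
    }
)
# Binary labels, one per row of x.
y = np.array([1, 0, 1, 0])

train_data = {"x": x, "y": y}

# Features and labels must be row-aligned.
if len(train_data["x"]) != len(train_data["y"]):
    raise ValueError("x and y must have the same number of rows")

# With PyTrial installed, the dict would then be passed to fit:
# from pytrial.tasks.indiv_outcome.tabular.logistic_regression import LogisticRegression
# model = LogisticRegression(weight_decay=1, epochs=100)
# model.fit(train_data)
```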

load_model(checkpoint=None)

Load the learned logistic regression model from the disk.

Parameters

checkpoint (str or None) –

If a directory, the single .model checkpoint file in it will be loaded;
If a filepath, the model will be loaded from this file;
If None, the model will be loaded from self.checkout_dir.
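The three cases above can be sketched as a small path-resolution routine. resolve_checkpoint is a hypothetical helper written for this illustration, not part of PyTrial; default_dir stands in for self.checkout_dir, and the error raised when a directory does not contain exactly one .model file is an assumption.

```python
import os
import tempfile


def resolve_checkpoint(checkpoint, default_dir):
    """Return the checkpoint file path for the three documented cases."""
    if checkpoint is None:
        # Fall back to the default directory (stand-in for self.checkout_dir).
        checkpoint = default_dir
    if os.path.isdir(checkpoint):
        # A directory is expected to hold a single .model file.
        candidates = sorted(
            name for name in os.listdir(checkpoint) if name.endswith(".model")
        )
        if len(candidates) != 1:
            raise FileNotFoundError(
                f"expected exactly one .model file in {checkpoint!r}, "
                f"found {len(candidates)}"
            )
        return os.path.join(checkpoint, candidates[0])
    # Otherwise treat the argument as the checkpoint file itself.
    return checkpoint


# Demonstration with a temporary directory and an empty placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lr.model")
    open(path, "wb").close()
    from_dir = resolve_checkpoint(tmp, default_dir=tmp)
    from_file = resolve_checkpoint(path, default_dir=tmp)
    from_none = resolve_checkpoint(None, default_dir=tmp)
```

All three calls resolve to the same file, which is the behavior the parameter description implies.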

predict(test_data)

Predict outcome probabilities with the learned model. The results are saved to self.result_dir.

Parameters

test_data (dict) –

{

'x': TabularPatientBase or pd.DataFrame,

'y': pd.Series or np.ndarray

}

'x' contains all patient features;
'y' contains the label for each row. It is ignored by the prediction function.

Returns

ypred – The predicted probability for each patient.

For binary classification, return shape (n, );
For multiclass classification, return shape (n, n_class).

Return type

np.ndarray
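Since predict returns probabilities rather than class labels, a caller typically thresholds them. The sketch below uses a made-up ypred array of shape (n, ) and a cutoff of 0.5; both the values and the cutoff are assumptions for illustration, not something PyTrial prescribes.

```python
import numpy as np

# Hypothetical output of predict() for four patients, shape (n, ).
ypred = np.array([0.12, 0.81, 0.50, 0.33])

# Convert probabilities to binary labels with an assumed cutoff of 0.5.
threshold = 0.5
labels = (ypred >= threshold).astype(int)
```

A different cutoff may be more appropriate when the two outcome classes are imbalanced or when the two kinds of error carry different costs.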

save_model(output_dir=None)

Save the learned logistic regression model to the disk.

Parameters

output_dir (str or None) – The directory in which to save the learned model. If None, the model is saved to self.checkout_dir.