ml.rf

Functions for the random forest classifier.

Functions

evaluate(rf_cls, data_split, mimos)

Evaluate the random forest classifiers and return confusion matrices for both feature types (distance and charge).
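As an illustration of what a confusion matrix holds, the sketch below counts (true, predicted) label pairs using only the standard library. It is not the module's implementation. The helper name `confusion_matrix_counts`, the class labels, and the predictions are all hypothetical. Rows are taken to be true classes and columns predicted classes, which is an assumption about the layout.

```python
def confusion_matrix_counts(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


# Hypothetical class labels standing in for the entries of `mimos`.
labels = ["mimo_a", "mimo_b", "mimo_c"]
y_true = ["mimo_a", "mimo_a", "mimo_b", "mimo_c", "mimo_c", "mimo_c"]
y_pred = ["mimo_a", "mimo_b", "mimo_b", "mimo_c", "mimo_a", "mimo_c"]

cm = confusion_matrix_counts(y_true, y_pred, labels)
for label, row in zip(labels, cm):
    print(label, row)
```

Diagonal entries count correct predictions. Off-diagonal entries show which classes are confused with one another.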

format_plots()

General plotting parameters for the Kulik Lab.

hyperparam_opt(data_split_type, include_esp, ...)

load_data(mimos, include_esp, data_loc)

Load data from CSV files for each mimo in the given list.
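A minimal sketch of per-mimo CSV loading is shown below, using only the standard library. The file layout is not documented here, so the `<mimo>.csv` naming, the column names, and the helper `load_tables` are hypothetical. The real function also takes an `include_esp` flag, which this sketch ignores.

```python
import csv
import tempfile
from pathlib import Path


def load_tables(mimos, data_loc):
    """Read one CSV per mimo into a dict of row lists (hypothetical `<mimo>.csv` layout)."""
    tables = {}
    for mimo in mimos:
        with open(Path(data_loc) / f"{mimo}.csv", newline="") as handle:
            tables[mimo] = list(csv.DictReader(handle))
    return tables


# Write two tiny placeholder files so the sketch runs end to end.
with tempfile.TemporaryDirectory() as data_loc:
    for mimo in ["mimo_a", "mimo_b"]:
        with open(Path(data_loc) / f"{mimo}.csv", "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["frame", "value"])
            writer.writerow([0, 1.5])
            writer.writerow([1, 2.5])
    tables = load_tables(["mimo_a", "mimo_b"], data_loc)

print({mimo: len(rows) for mimo, rows in tables.items()})
```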

plot_confusion_matrices(cms, mimos)

Plot confusion matrices for distance and charge features.

plot_data(df_charge, df_dist, mimos)

Plot the average charge and distance data for the given mimos.

plot_gini_importance(rf_cls, df_dist, df_charge)

Plot Gini importance bar plots for the top 20 features for each feature type.
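The selection of the top 20 features can be sketched as a sort over (name, importance) pairs. With scikit-learn, the importances would typically come from a fitted forest's `feature_importances_` attribute. Here they are synthetic placeholder values, and the helper `top_features` is hypothetical.

```python
def top_features(names, importances, k=20):
    """Return the k (name, importance) pairs with the largest importance."""
    pairs = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return pairs[:k]


# Synthetic importances for 30 placeholder features, normalised to sum to 1
# in the same way Gini importances are.
names = [f"feature_{i}" for i in range(30)]
raw = [(i * 7) % 30 + 1 for i in range(30)]
total = sum(raw)
importances = [value / total for value in raw]

top = top_features(names, importances, k=20)
for name, importance in top[:3]:
    print(f"{name}: {importance:.4f}")
```

The resulting pairs are already in descending order, which is the order a ranked bar plot would use.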

plot_roc_curve(y_true, y_pred_proba, mimos, ...)

Plot the ROC curve for the test data of the charge and distance features.
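For reference, the points of a ROC curve come from sweeping a decision threshold over the predicted probabilities. The stdlib sketch below does this for a single binary (one-vs-rest) problem. How the module handles multiple mimos is not stated here, so that part is left out. The helpers `roc_points` and `auc` and the data are hypothetical.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by sweeping the threshold from high to low."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only once all samples tied at this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )


# 1 marks the positive class; scores are predicted probabilities for it.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_points(y_true, scores)
print(round(auc(fpr, tpr), 3))
```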

preprocess_data(df_charge, df_dist, mimos, ...)

Preprocess the data and split it into training and test sets according to the given test fraction.
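One way to split by a test fraction is a seeded random shuffle followed by a cut, sketched below with the standard library. This is an assumption for illustration. The module may split differently, for example by trajectory or by time, and the helper `split_by_fraction` is hypothetical.

```python
import random


def split_by_fraction(rows, test_frac, seed=0):
    """Shuffle a copy of `rows` and hold out `test_frac` of them for testing."""
    if not 0.0 < test_frac < 1.0:
        raise ValueError("test_frac must be strictly between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one sample in the test set.
    n_test = max(1, round(len(shuffled) * test_frac))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))
train, test = split_by_fraction(rows, test_frac=0.2, seed=42)
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, which keeps evaluation numbers comparable between runs.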

rf_analysis(data_split_type, include_esp, ...)

train_random_forest(feature, data_split, ...)

Train random forest classifiers for the distance and charge features.

train_random_forest_with_optimization(...)