import shap
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
import eli5
from eli5.sklearn import PermutationImportance
Reading in the data
# Read the hospital-readmissions training set into a DataFrame.
# Each row is one patient encounter; the target column is `readmitted`.
data = pd.read_csv('../input/hospital-readmissions/train.csv')
Training our model
# Separate the target from the predictors.
y = data.readmitted
# Use every column except the target as a feature.
base_features = [c for c in data.columns if c != "readmitted"]
X = data[base_features]

# Hold out a validation split (default 25%) with a fixed seed for reproducibility.
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)

# Fit a small random forest; 30 trees keeps training fast for this exploration.
my_model = RandomForestClassifier(n_estimators=30, random_state=1).fit(train_X, train_y)
Performing some permutation importance analysis first
# Compute permutation importance on the *validation* split so the scores
# reflect generalization, not training-set memorization.
perm1 = PermutationImportance(my_model, random_state=1).fit(val_X, val_y)
# Render the importance table with human-readable feature names.
eli5.show_weights(perm1, feature_names=val_X.columns.tolist())
Weight | Feature |
---|---|
0.0451 ± 0.0068 | number_inpatient |
0.0087 ± 0.0046 | number_emergency |
0.0062 ± 0.0053 | number_outpatient |
0.0033 ± 0.0016 | payer_code_MC |
0.0020 ± 0.0016 | diag_3_401 |
0.0016 ± 0.0031 | medical_specialty_Emergency/Trauma |
0.0014 ± 0.0024 | A1Cresult_None |
0.0014 ± 0.0021 | medical_specialty_Family/GeneralPractice |
0.0013 ± 0.0010 | diag_2_427 |
0.0013 ± 0.0011 | diag_2_276 |
0.0011 ± 0.0022 | age_[50-60) |
0.0010 ± 0.0022 | age_[80-90) |
0.0007 ± 0.0006 | repaglinide_No |
0.0006 ± 0.0010 | diag_1_428 |
0.0006 ± 0.0022 | payer_code_SP |
0.0005 ± 0.0030 | insulin_No |
0.0004 ± 0.0028 | diabetesMed_Yes |
0.0004 ± 0.0021 | diag_3_250 |
0.0003 ± 0.0018 | diag_2_250 |
0.0003 ± 0.0015 | glipizide_No |
… 44 more … |
# Explain a single prediction: take the first validation row.
data_for_prediction = val_X.iloc[0, :]

# Creating an object that can calculate shap values.
# TreeExplainer is the fast, exact explainer for tree ensembles.
explainer = shap.TreeExplainer(my_model)
shap_values = explainer.shap_values(data_for_prediction)

# Load the JS visualization machinery, then draw the force plot.
# Index [0] selects the first class's expected value / SHAP values
# (for a binary classifier, shap returns one array per class).
shap.initjs()
shap.force_plot(explainer.expected_value[0], shap_values[0], data_for_prediction)
Looking at both the permutation importance table and the Shapley (SHAP) values, it would be safe to conclude that "number_inpatient" is a really important feature