When a Scikit-learn estimator is instantiated, any hyperparameter that is not specified takes its default value. Relying on the default
values can lead to non-reproducible results across different versions of the library, because defaults can change between releases.
Furthermore, the default values might not be the best choice for the specific problem at hand and can lead to suboptimal performance.
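As a minimal sketch of the difference, the snippet below builds one estimator that relies on defaults and one that states its hyperparameters explicitly. The chosen values (`min_samples_leaf=5`, `max_features="sqrt"`) are illustrative assumptions, not recommendations; suitable values depend on the dataset.

```python
from sklearn.ensemble import RandomForestClassifier

# Relies on the library defaults for min_samples_leaf and max_features.
implicit = RandomForestClassifier()

# States the hyperparameters explicitly; the values are illustrative only.
explicit = RandomForestClassifier(min_samples_leaf=5, max_features="sqrt")

# get_params() exposes the values actually in effect, defaults included.
for name in ("min_samples_leaf", "max_features"):
    print(name, implicit.get_params()[name], explicit.get_params()[name])
```

Printing the effective parameters this way makes it easy to see which values come from the installed version of the library rather than from the code.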
The estimators and hyperparameters considered by this rule are:
| Estimator | Hyperparameters |
|---|---|
| AdaBoostClassifier | learning_rate |
| AdaBoostRegressor | learning_rate |
| GradientBoostingClassifier | learning_rate |
| GradientBoostingRegressor | learning_rate |
| HistGradientBoostingClassifier | learning_rate |
| HistGradientBoostingRegressor | learning_rate |
| RandomForestClassifier | min_samples_leaf, max_features |
| RandomForestRegressor | min_samples_leaf, max_features |
| ElasticNet | alpha, l1_ratio |
| NearestNeighbors | n_neighbors |
| KNeighborsClassifier | n_neighbors |
| KNeighborsRegressor | n_neighbors |
| NuSVC | nu, kernel, gamma |
| NuSVR | C, kernel, gamma |
| SVC | C, kernel, gamma |
| SVR | C, kernel, gamma |
| DecisionTreeClassifier | ccp_alpha |
| DecisionTreeRegressor | ccp_alpha |
| MLPClassifier | hidden_layer_sizes |
| MLPRegressor | hidden_layer_sizes |
| PolynomialFeatures | degree, interaction_only |
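To illustrate how a check of this kind could work, here is a rough sketch built on Python's standard `ast` module. It is not the rule's actual implementation: the `REQUIRED` mapping (a subset of the table above) and the helper `find_missing_hyperparameters` are hypothetical names introduced for this example, and the sketch only inspects keyword arguments.

```python
import ast

# Hypothetical mapping covering a subset of the table above; extend as needed.
REQUIRED = {
    "RandomForestClassifier": {"min_samples_leaf", "max_features"},
    "SVC": {"C", "kernel", "gamma"},
    "KNeighborsClassifier": {"n_neighbors"},
}


def find_missing_hyperparameters(source):
    """Return sorted (line, estimator, missing names) tuples for flagged calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both `SVC(...)` and `svm.SVC(...)`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name not in REQUIRED:
            continue
        # A `**kwargs` expansion has arg None; its contents are unknown, so skip.
        if any(kw.arg is None for kw in node.keywords):
            continue
        # Limitation of this sketch: positional arguments are not considered.
        given = {kw.arg for kw in node.keywords}
        missing = REQUIRED[name] - given
        if missing:
            findings.append((node.lineno, name, sorted(missing)))
    return sorted(findings)


sample = (
    "clf = SVC(C=1.0)\n"
    "rf = ensemble.RandomForestClassifier(min_samples_leaf=2, max_features='sqrt')\n"
    "knn = KNeighborsClassifier()\n"
)
print(find_missing_hyperparameters(sample))
# [(1, 'SVC', ['gamma', 'kernel']), (3, 'KNeighborsClassifier', ['n_neighbors'])]
```

Matching on the constructor name alone keeps the sketch short, but it cannot tell a Scikit-learn estimator from an unrelated class with the same name; a real analyzer would resolve imports first.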