Operating System (python -c 'import platform;print(platform.platform())'): Darwin-19.6.0-x86_64-i386-64bit
Description
Documentation has this claim:
For example if you want to search for the best
$ abz search -i /path/to/your/datasets/folder name_of_your_dataset
This will evaluate the default pipeline without performing additional tuning iteration on it.
This seems misleading: running the search with no additional arguments did not stop after evaluating the default pipeline. It kept tuning for well over 1,000 iterations (about an hour of wall-clock time) until I killed it.
What I Did
$ time abz search 196_autoMpg
Using TensorFlow backend.
20201015192335979857 - Processing Datasets: ['196_autoMpg']
###############################
#### Searching 196_autoMpg ####
###############################
[15:23:37] WARNING: src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
<repeated 8000 times>
^C
###############################
#### Executing 196_autoMpg ####
###############################
[16:23:50] WARNING: src/objective/regression_obj.cu:152: reg:linear is now deprecated in favor of reg:squarederror.
Executing best pipeline ABPipeline({
"primitives": [
"mlprimitives.custom.feature_extraction.CategoricalEncoder",
"sklearn.impute.SimpleImputer",
"sklearn.preprocessing.RobustScaler",
"xgboost.XGBRegressor"
],
"init_params": {},
"input_names": {},
"output_names": {},
"hyperparameters": {
"mlprimitives.custom.feature_extraction.CategoricalEncoder#1": {
"keep": false,
"copy": true,
"features": "auto",
"max_unique_ratio": 0,
"max_labels": 25
},
"sklearn.impute.SimpleImputer#1": {
"missing_values": NaN,
"fill_value": null,
"verbose": false,
"copy": true,
"strategy": "median"
},
"sklearn.preprocessing.RobustScaler#1": {
"quantile_range": [
25.0,
75.0
],
"copy": true,
"with_centering": true,
"with_scaling": true
},
"xgboost.XGBRegressor#1": {
"n_jobs": -1,
"n_estimators": 617,
"max_depth": 9,
"learning_rate": 0.03240539972838852,
"gamma": 0.27690923264683187,
"min_child_weight": 5
}
},
"tunable_hyperparameters": {
"mlprimitives.custom.feature_extraction.CategoricalEncoder#1": {
"max_labels": {
"type": "int",
"default": 0,
"range": [
0,
100
]
}
},
"sklearn.impute.SimpleImputer#1": {
"strategy": {
"type": "str",
"default": "mean",
"values": [
"mean",
"median",
"most_frequent",
"constant"
]
}
},
"sklearn.preprocessing.RobustScaler#1": {
"with_centering": {
"description": "If True, center the data before scaling. This will cause transform to raise an exception when attempted on sparse matrices, because centering them entails building a dense matrix which in common use cases is likely to be too large to fit in memory",
"type": "bool",
"default": true
},
"with_scaling": {
"description": "If True, scale the data to interquartile range",
"type": "bool",
"default": true
}
},
"xgboost.XGBRegressor#1": {
"n_estimators": {
"type": "int",
"default": 100,
"range": [
10,
1000
]
},
"max_depth": {
"type": "int",
"default": 3,
"range": [
3,
10
]
},
"learning_rate": {
"type": "float",
"default": 0.1,
"range": [
0,
1
]
},
"gamma": {
"type": "float",
"default": 0.1,
"range": [
0,
1
]
},
"min_child_weight": {
"type": "int",
"default": 1,
"range": [
1,
10
]
}
}
},
"outputs": {
"default": [
{
"name": "y",
"type": "array",
"variable": "xgboost.XGBRegressor#1.y"
}
]
},
"id": "e168ec26-31f0-4e78-a3a7-3ef18bf432c8",
"name": "single_table/regression/default",
"template": null,
"loader": {
"data_modality": "single_table",
"task_type": "regression"
},
"score": 8.4004691556447,
"rank": 8.400469155645126,
"metric": "meanSquaredError"
})
#############################
#### Scoring 196_autoMpg ####
#############################
Score: 7.041906911649814
predictions targets
count 100.000000 100.000000
mean 23.589642 23.478000
std 7.581228 7.573446
min 10.351545 10.000000
25% 17.002141 17.375000
50% 24.067155 23.250000
75% 29.522121 28.000000
max 38.241291 44.000000
pipeline score rank cv_score metric data_modality task_type task_subtype elapsed iterations load_time trivial_time cv_time error step
dataset
196_autoMpg e168ec26-31f0-4e78-a3a7-3ef18bf432c8 7.041907 8.400469 8.400469 meanSquaredError single_table regression univariate 3613.11274 1693.0 0.059046 1.091654 3307.688052 None None
real 60m17.985s
user 61m12.325s
sys 50m16.661s
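To put the numbers from the results row above in perspective, here is a quick back-of-the-envelope sketch. It uses only figures copied from that output and assumes the `elapsed` and `cv_time` columns are in seconds (the `elapsed` value is consistent with the roughly 60 minutes reported by `time`). The variable names are my own, not part of the tool.

```python
# Figures copied from the results row printed above.
# Assumption: "elapsed" and "cv_time" are reported in seconds.
elapsed_seconds = 3613.11274   # "elapsed" column
iterations = 1693              # "iterations" column (printed as 1693.0)
cv_time_seconds = 3307.688052  # "cv_time" column

# Average wall-clock cost of one tuning iteration.
seconds_per_iteration = elapsed_seconds / iterations

# Fraction of the run accounted for by the cv_time column.
cv_share = cv_time_seconds / elapsed_seconds

print(f"{seconds_per_iteration:.2f} s per iteration")
print(f"{cv_share:.1%} of elapsed time reported as cv_time")
```

That works out to roughly two seconds per iteration, with over 90% of the elapsed time reported under `cv_time`, which is not what I would expect if only the default pipeline were evaluated once.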