A visual guide to gradient boosted trees

Introduction
My previous posts looked at the bog-standard decision tree and the wonder of the random forest. Now, to complete the triplet, I'll be taking a closer look at gradient boosted trees!
There are a bunch of gradient boosted tree libraries, including XGBoost, CatBoost, and LightGBM. However, this time I'll be using scikit-learn's. Why? Simply because, compared to the others, I found it the easiest to visualize. In practice I'd usually use one of the other libraries over scikit-learn's; however, this project is about visual learning, not pure performance.
Fundamentally, a GBT is an ensemble of trees that only works as a whole. While a single decision tree (including one pulled out of a random forest) can make a decent prediction on its own, a single tree taken from a GBT is almost certain not to give you anything useful.
Other than that, as always, no theory, no maths – just plots and hyperparameters. As before, I'll be using the California housing dataset via scikit-learn (CC-BY), the same general process as described in my previous posts, the code is at and all images below are created by me (apart from the GIF, which is from Tenor).
The basic gradient boosted tree
Starting with a basic GBT: gb = GradientBoostingRegressor(random_state=42). As with the other tree types, the default settings for min_samples_split, min_samples_leaf and max_leaf_nodes are 2, 1 and None respectively. Interestingly, the default max_depth is 3, not None as it is for decision trees / random forests. Notable hyperparameters, which I'll look at later, include learning_rate (how steep the gradient is, default 0.1) and n_estimators (the same as for random forests – the number of trees).
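As a runnable sketch of those defaults – note this uses a make_regression stand-in dataset rather than the post's housing data, so it needs no download:

```python
# Minimal sketch: the default GradientBoostingRegressor, fitted on a
# synthetic stand-in dataset instead of the California housing data.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

gb = GradientBoostingRegressor(random_state=42)
gb.fit(X_train, y_train)

# The defaults discussed above:
print(gb.max_depth, gb.learning_rate, gb.n_estimators)  # 3 0.1 100
print(f"R^2 on test data: {gb.score(X_test, y_test):.3f}")
```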
Fitting took 2.2s, predicting took 0.005s, and the results:
| Metric | Default |
|---|---|
| MAE | 0.369 |
| Median AE | 0.216 |
| MSE | 0.289 |
| RMSE | 0.538 |
| R² | 0.779 |
So, quicker than the default random forest, but worse performance across the board. For my chosen block, it predicted 0.803 (actual 0.894).
Plots
It's why you're here, right?
The tree
As before, we can plot a single tree. This is the first one, accessed with gb.estimators_[0, 0]:
I explained all of this in a previous post, so I won't do so again here. One thing I will draw to your attention though: notice how many of the values are negative! Three of the stumps even have entirely negative value ranges, which we know the true values can never be. This is why a GBT only works as a combined ensemble, not as standalone trees like a random forest.
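You can check this yourself. A minimal sketch, again on a make_regression stand-in: pull out the first tree and look at its node values – with the default squared-error loss each tree fits residuals, so values sit on both sides of zero.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import export_text

X, y = make_regression(n_samples=2000, n_features=8, noise=10.0, random_state=42)
gb = GradientBoostingRegressor(random_state=42).fit(X, y)

# estimators_ is a 2-D array of fitted trees: shape (n_estimators, 1) for regression
first_tree = gb.estimators_[0, 0]

# Text dump of the depth-3 tree; sklearn.tree.plot_tree(first_tree) draws the same thing
print(export_text(first_tree))

# Each tree predicts corrections to the running prediction, so its node values
# are centred on zero and many are negative, whatever the target range
node_values = first_tree.tree_.value.ravel()
print("most negative node value:", node_values.min())
```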
Predictions and errors
My favourite way to visualize GBTs is a prediction vs iteration plot, using gb.staged_predict. For my chosen block:

Remember the default model has 100 estimators? Well, here they are. The first prediction was way off – 2! But each iteration it learned (remember learning_rate?), and crept closer to the true value. Of course, it was trained on the training data, not this particular data point, so the final value was off (0.803, so roughly 10% out), but you can see the process nicely.
In this case, it reached a steady state after about 20 iterations. Later we'll see how to stop fitting at that point, avoiding wasted time and money.
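A minimal sketch of the staged_predict idea, on stand-in data (make_regression rather than the housing set), tracking one row through all 100 stages:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=2000, n_features=8, noise=10.0, random_state=42)
gb = GradientBoostingRegressor(random_state=42).fit(X, y)

# staged_predict yields the ensemble's prediction after each boosting iteration;
# indexing [0] tracks one chosen row through every stage
row = X[:1]
staged = np.array([pred[0] for pred in gb.staged_predict(row)])

print("after iteration 1:  ", staged[0])
print("after iteration 100:", staged[-1])  # final stage matches gb.predict exactly
print("true value:         ", y[0])
```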
Similarly, the error (i.e. prediction minus true value) can be plotted. Of course, this gives the same plot, just with different y-axis values:

Let's take this one step further! The test data has over 5000 blocks to predict; we can plot every single one of them, at every iteration!

I love this one.

They all start at around 2, but fan out across the range. We know the true values range from 0.15 to 5, with a mean of 2.1 (check my first post), so this spread of predictions (from ~0.3 to 5.5) is to be expected.
And we can plot the errors:

At first glance, this looks surprising – we'd expect them to start at, say, ±2, then converge on 0. The problem is that, with over 5000 lines on this plot, there are plenty of outliers, which stand out the most. Maybe there's a better way to visualize these? How about …

The median error is 0.05 – which is pretty good! The IQR is under 0.5, also respectable. So, while there are some terrible predictions, the majority are reasonable.
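The summary stats behind a plot like that are cheap to compute. A minimal sketch, on stand-in data rather than the housing set:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
gb = GradientBoostingRegressor(random_state=42).fit(X_train, y_train)

# Signed errors on the test set: prediction minus true value
errors = gb.predict(X_test) - y_test

# Median and interquartile range summarize the error distribution
q25, median, q75 = np.percentile(errors, [25, 50, 75])
print(f"median error: {median:.3f}")
print(f"IQR: {q75 - q25:.3f}")
```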
Hyperparameter tuning
Decision tree hyperparameters
Same as before, let's compare how the hyperparameters examined in the original decision tree post apply to GBTs, with the default learning_rate = 0.1 and n_estimators = 100. This time the min_samples_leaf, min_samples_split and max_leaf_nodes ones also have max_depth = 10, to make for a better comparison with the previous posts.
| Metric | max_depth = None | max_depth = 10 | min_samples_leaf = 10 | min_samples_split = 10 | max_leaf_nodes = 100 |
|---|---|---|---|---|---|
| Fit time (s) | 10.889 | 7.009 | 7.101 | 7.015 | 6.167 |
| Predict time (s) | 0.089 | 0.019 | 0.015 | 0.018 | 0.013 |
| MAE | 0.454 | 0.304 | 0.301 | 0.302 | 0.301 |
| Median AE | 0.253 | 0.1777 | 0.1774 | 0.1774 | 0.175 |
| MSE | 0.496 | 0.222 | 0.212 | 0.217 | 0.210 |
| RMSE | 0.704 | 0.471 | 0.46 | 0.466 | 0.458 |
| R² | 0.621 | 0.830 | 0.838 | 0.834 | 0.840 |
| Chosen block prediction | 0.885 | 0.906 | 0.962 | 0.918 | 0.923 |
| Chosen block error | 0.009 | 0.012 | 0.068 | 0.024 | 0.029 |
Unlike decision trees and random forests, the deepest tree performs the worst! And it took the longest to fit. However, increasing the depth from 3 (the default) to 10 did improve the scores. The other restrictions led to further improvements – showing that all of the hyperparameters can play a part.
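Comparison tables like the one above come from a simple loop over settings. A minimal sketch, on stand-in data, with a few of the settings from the table:

```python
import time
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A few of the hyperparameter combinations from the table above
settings = [
    {"max_depth": None},
    {"max_depth": 10},
    {"max_depth": 10, "min_samples_leaf": 10},
]

results = {}
for params in settings:
    start = time.perf_counter()
    gb = GradientBoostingRegressor(random_state=42, **params).fit(X_train, y_train)
    fit_time = time.perf_counter() - start
    rmse = np.sqrt(mean_squared_error(y_test, gb.predict(X_test)))
    results[str(params)] = (fit_time, rmse)
    print(f"{params}: fit {fit_time:.2f}s, RMSE {rmse:.3f}")
```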
learning_rate
GBTs work by refining the prediction after each iteration based on the error. The higher the correction (the gradient, a.k.a. the learning rate), the more the prediction changes between iterations.
There's a clear trade-off with the learning rate. Comparing learning rates of 0.01 (slow), 0.1 (the default), and 0.5 (fast), over 100 iterations:

Fast learning rates can get to the right value quicker, but are more likely to overshoot the true value (think of oversteering a car), which can lead to oscillations. Slow learning rates may never reach the right value (think … not turning the steering wheel enough and driving straight into a tree). As for the stats:
| Metric | Default | Fast | Slow |
|---|---|---|---|
| Fit time (s) | 2.159 | 2.288 | 2.166 |
| Predict time (s) | 0.005 | 0.004 | 0.015 |
| MAE | 0.370 | 0.338 | 0.629 |
| Median AE | 0.216 | 0.197 | 0.427 |
| MSE | 0.289 | 0.247 | 0.661 |
| RMSE | 0.538 | 0.497 | 0.813 |
| R² | 0.779 | 0.811 | 0.495 |
| Chosen block prediction | 0.803 | 0.949 | 1.44 |
| Chosen block error | 0.091 | 0.055 | 0.546 |
Clearly, the slow learning model was terrible. For this block, fast was slightly better than the default. However, we can see from the plot that, at least for the chosen block, it was the final ~90 iterations that made the fast model more accurate than the default one – if we'd stopped at 40 iterations, the default model would have been far better. Fun to see!
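The comparison itself is easy to reproduce. A minimal sketch, on stand-in data, fitting the three learning rates and scoring them:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# The three learning rates compared in the post
scores = {}
for name, lr in [("slow", 0.01), ("default", 0.1), ("fast", 0.5)]:
    gb = GradientBoostingRegressor(learning_rate=lr, random_state=42)
    gb.fit(X_train, y_train)
    scores[name] = gb.score(X_test, y_test)
    print(f"{name} (lr={lr}): R^2 = {scores[name]:.3f}")
```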
n_estimators
As mentioned above, the number of estimators interacts with the learning rate. In general, more estimators are better, as it gives more chances to measure and correct the error – although this comes at the cost of extra time.
As seen above, a high number of estimators matters most with a low learning rate, to ensure the right value is reached. Increasing the number of estimators to 500:

Sure enough, the slow-learning GBT got to the true value. In fact, they all ended up much closer. The stats confirm it:
| Metric | Default | Fast | Slow |
|---|---|---|---|
| Fit time (s) | 12.254 | 12.489 | 11.918 |
| Predict time (s) | 0.018 | 0.014 | 0.022 |
| MAE | 0.323 | 0.319 | 0.410 |
| Median AE | 0.187 | 0.185 | 0.248 |
| MSE | 0.232 | 0.228 | 0.338 |
| RMSE | 0.482 | 0.477 | 0.581 |
| R² | 0.823 | 0.826 | 0.742 |
| Chosen block prediction | 0.841 | 0.921 | 0.858 |
| Chosen block error | 0.053 | 0.027 | 0.036 |
Clearly, increasing the number of estimators five-fold greatly increased the time to fit (in this case roughly six-fold, but that could be a one-off). However, we still haven't beaten the restricted-tree scores from above – I suspect we'd need a hyperparameter search to see if we can beat them. Also, for the chosen block, as can be seen in the plot, after ~300 models none of the models really improved. If this holds across the whole dataset, then the final 200 iterations weren't needed. I mentioned at the start how to avoid wasting time without improvement; now is the time to look at that.
n_iter_no_change, validation_fraction
It's possible to keep adding estimators that don't improve the final result, yet still take time to run. This is where early stopping comes in.
There are three relevant hyperparameters. First, n_iter_no_change: how many iterations there need to be with "no change" before stopping early. tol[erance]: how big the change in the validation scores has to be to count as "no change". And validation_fraction: how much of the training data should be set aside as a validation set to produce those validation scores (note this is separate from the test data).
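A minimal sketch of those three settings in action, on stand-in data, using the same aggressive values as below (n_iter_no_change=5, validation_fraction=0.1, tol=0.005):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=2000, n_features=8, noise=10.0, random_state=42)

gb = GradientBoostingRegressor(
    n_estimators=500,
    n_iter_no_change=5,       # stop after 5 iterations with "no change"
    validation_fraction=0.1,  # hold out 10% of training data for the check
    tol=0.005,                # improvements smaller than this count as no change
    random_state=42,
)
gb.fit(X, y)

# n_estimators_ is the number of iterations actually fitted before stopping
print(f"stopped after {gb.n_estimators_} of {gb.n_estimators} iterations")
```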
Comparing the 500-estimator GBT with one with aggressive early stopping – n_iter_no_change=5, validation_fraction=0.1, tol=0.005 – the latter stopped after 61 iterations (and consequently took only ~5-6% of the time to fit):

As expected, the results were worse:
| Metric | Default | Early stopping |
|---|---|---|
| Fit time (s) | 24.843 | 1.304 |
| Predict time (s) | 0.042 | 0.003 |
| MAE | 0.313 | 0.396 |
| Median AE | 0.181 | 0.236 |
| MSE | 0.222 | 0.321 |
| RMSE | 0.471 | 0.566 |
| R² | 0.830 | 0.755 |
| Chosen block prediction | 0.837 | 0.805 |
| Chosen block error | 0.057 | 0.089 |
But as always, the question you have to ask is: is it worth investing 20x the time to improve R² by 10%, or to reduce the errors by 20%?
Bayes search
You were probably expecting this. The search spaces:
search_spaces = {
'learning_rate': (0.01, 0.5),
'max_depth': (1, 100),
'max_features': (0.1, 1.0, 'uniform'),
'max_leaf_nodes': (2, 20000),
'min_samples_leaf': (1, 100),
'min_samples_split': (2, 100),
'n_estimators': (50, 1000),
}
Most of these are the same as in my previous post; the only extra hyperparameter is learning_rate.
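The post doesn't say which library ran the Bayes search, so as a runnable stand-in here's the same idea sketched with scikit-learn's RandomizedSearchCV (random rather than Bayesian sampling), with the ranges mirroring the spaces above but shrunk so it finishes quickly on stand-in data:

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=800, n_features=8, noise=10.0, random_state=42)

# Ranges mirror the search spaces above, shrunk for a quick sketch
param_distributions = {
    "learning_rate": uniform(0.01, 0.49),   # samples from [0.01, 0.5]
    "max_depth": randint(1, 20),
    "max_features": uniform(0.1, 0.9),      # samples from [0.1, 1.0]
    "min_samples_leaf": randint(1, 100),
    "min_samples_split": randint(2, 100),
    "n_estimators": randint(50, 200),
}

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=42),
    param_distributions,
    n_iter=5,   # tiny budget for illustration; the real search ran far longer
    cv=2,
    random_state=42,
)
search.fit(X, y)
print(search.best_params_)
print(f"best CV R^2: {search.best_score_:.3f}")
```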
This took by far the longest time yet, at 96 minutes (~50% more than the random forest!). The best hyperparameters:
best_parameters = OrderedDict({
'learning_rate': 0.04345459461297153,
'max_depth': 13,
'max_features': 0.4993693929975871,
'max_leaf_nodes': 20000,
'min_samples_leaf': 1,
'min_samples_split': 83,
'n_estimators': 325,
})
max_features, max_leaf_nodes and min_samples_leaf are all very similar to the random forest. n_estimators is too, and it matches what the chosen block told us above – ~700 of the iterations weren't needed. However, compared to the random forest, the trees are only about a third as deep, and min_samples_split is far higher than anything we've seen so far. The learning_rate value isn't too surprising given what we saw above.
And the cross-validated scores:
| Metric | Mean | Std |
|---|---|---|
| MAE | -0.289 | 0.005 |
| Median AE | -0.161 | 0.004 |
| MSE | -0.200 | 0.008 |
| RMSE | -0.448 | 0.009 |
| R² | 0.849 | 0.006 |
Of all the models so far, this is the best – the lowest errors, the highest R², and low variance!
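As an aside, the negative error values in the table are a scikit-learn convention: its scorers are "higher is better", so error metrics come out negated. A minimal sketch of collecting such cross-validated scores with cross_validate, on stand-in data:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_validate

X, y = make_regression(n_samples=1000, n_features=8, noise=10.0, random_state=42)

cv = cross_validate(
    GradientBoostingRegressor(random_state=42),
    X, y, cv=3,
    scoring=["neg_mean_absolute_error", "r2"],
)

# Error scores are negated (higher = better); R^2 is reported as-is
print("MAE (negated):", cv["test_neg_mean_absolute_error"].mean())
print("R^2:", cv["test_r2"].mean())
```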
Finally, our old friend, the box plots:

Conclusion
And so we come to the end of my mini-series on the three most common types of tree-based models.
My hope is that, having seen the various ways of visualizing trees, you now (a) understand better how the different models work, beyond just looking at the scores, and (b) can apply these techniques to your own models. It can also help with stakeholder management – humans prefer pretty pictures to tables of numbers, so showing them a tree diagram can help them understand why something is(n't) possible.
Based on this data, and these models, the gradient boosted one was slightly better than the random forest, and both were far better than a single decision tree. However, this may simply be because the GBT had 50% more time to search for better hyperparameters (it's generally more expensive to fit – after all, it was the same search space and the same number of iterations). It's also worth knowing that GBTs have a higher tendency to overfit than random forests. And while the single decision tree performed the worst, it was by far the fastest – and in some use cases, that matters more. Furthermore, as mentioned, there are other libraries, each with their own pros and cons – for example around categorical data, which some GBT libraries require to be preprocessed first (e.g. …). Or, if you're feeling really brave, how about feeding the different tree types into an ensemble for even better performance …
Until next time, anyway!



