Machine Learning

A visual guide to gradient-boosted trees

Introduction

My previous posts took a visual look at the bog-standard decision tree and the random forest. Now, to complete the triplet, I'll explore gradient-boosted trees (GBTs)!

There are a bunch of gradient-boosted tree libraries, including XGBoost, CatBoost, and LightGBM. This time, however, I'll use scikit-learn's. Why? Simply because, compared to the others, it made the visualisation easier. In practice, I tend to use the other libraries over scikit-learn's; this project, though, is about visual learning, not raw performance.

Fundamentally, a GBT is an ensemble of trees that only work in combination. While a single decision tree (including one pulled out of a random forest) can make a decent prediction on its own, taking a single tree from a GBT is almost guaranteed to give you nothing useful.

Beyond this, as always: no theory, no maths, just plots and hyperparameters. As before, I'll be using the California housing dataset via scikit-learn (CC-BY), following the same general process described in my previous posts; the code is at …, and all images below are created by me (apart from the GIF, which is from Tenor).

A basic gradient-boosted tree

Starting with a basic GBT: gb = GradientBoostingRegressor(random_state=42). As with the other tree types, the default settings of min_samples_split, min_samples_leaf, and max_leaf_nodes are 2, 1, and None respectively. Interestingly, the default max_depth is 3, not None as with decision trees and random forests. Notable hyperparameters, which I'll look at later, include learning_rate (the size of the gradient step; default 0.1), and n_estimators (as with the random forest, the number of trees).
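As a hedged sketch of that setup, here's the baseline model fitted end to end. I'm using a small synthetic dataset via make_regression in place of the housing data, so it runs anywhere:

```python
# Minimal sketch of the baseline model; synthetic data stands in for the
# California housing set used in the post.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=8, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

gb = GradientBoostingRegressor(random_state=42)
gb.fit(X_train, y_train)

# Note the defaults: max_depth=3 (not None), learning_rate=0.1, n_estimators=100
print(gb.max_depth, gb.learning_rate, gb.n_estimators)  # 3 0.1 100
print(round(gb.score(X_test, y_test), 3))  # R² on the held-out split
```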

Fitting took 2.2s, predicting took 0.005s, and the results:

Metric   Default
MAE      0.369
MedAE    0.216
MSE      0.289
RMSE     0.538
R²       0.779

So, slightly quicker than the default random forest, but much worse performance. For my chosen block, it predicted 0.803 (the true value being 0.894).


That's why you're here, right?

The tree

As before, we can plot a single tree. This is the first one, accessed via gb.estimators_[0, 0]:

I explained all this in my previous posts, so I won't do it again here. One thing I will draw to your attention though: notice how small the values are! Some leaves even have negative values, which we know the house values can't be. This is why a GBT only works as a combined ensemble, and not as individual standalone trees like a random forest.
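A sketch of pulling that first tree out (synthetic data again; the post plots it graphically, while export_text is used here so the output is plain text):

```python
# Each boosting stage's tree is stored in gb.estimators_, a 2-D array of
# shape (n_estimators, 1) for single-output regression.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import export_text

X, y = make_regression(n_samples=500, n_features=4, random_state=42)
gb = GradientBoostingRegressor(random_state=42).fit(X, y)

first_tree = gb.estimators_[0, 0]  # a plain DecisionTreeRegressor
print(type(first_tree).__name__)   # DecisionTreeRegressor
print(export_text(first_tree, max_depth=1))

# Its leaf values are corrections to the running prediction (scaled
# residuals), not target values, so they can be small or negative.
```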

Predictions and errors

My favourite way to visualise GBTs is a prediction vs iteration plot, using gb.staged_predict. For my chosen block:

Remember how the default model has 100 estimators? Well, here they are. The first prediction was way off: 2! But each iteration it learned (remember learning_rate?), and moved closer to the true value. Of course, it was trained on the training data, not this specific data point, so the final value was off (0.803, so roughly 10% out), but you can see the process nicely.

In this case, it reached a stable state after about 20 iterations. Later we'll see how to stop the fitting at that point, to avoid wasting time and money.
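That per-iteration view can be sketched with staged_predict (synthetic data; `row` is a hypothetical stand-in for the chosen block):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=500, n_features=4, noise=5, random_state=42)
gb = GradientBoostingRegressor(random_state=42).fit(X, y)

row = X[:1]  # stand-in for the chosen block
# staged_predict is a generator: the ensemble's prediction after each iteration
staged = np.array([p[0] for p in gb.staged_predict(row)])

print(len(staged))  # 100 — one entry per estimator
# The final staged value is exactly the normal prediction
print(np.isclose(staged[-1], gb.predict(row)[0]))  # True
# plt.plot(staged) would reproduce the prediction-vs-iteration curve
```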

Similarly, the error (that is, the prediction minus the true value) can be plotted. Naturally, this gives the same plot, just with different y-axis values:

Let's take this one step further! The test data has over 5,000 blocks to predict; we can loop over each one, and predict them all, at every iteration!

I love this one.

They all start around 2, but fan out across the plot. We know the true values range from 0.15 to 5, with a mean of 2.1 (check my first post), so this spread of predictions (from roughly 0.3 to 5.5) is expected.

And we can plot the errors:

At first glance, this looks surprising; we'd expect them to start at, say, ±2, then converge on 0. The problem is that, with over 5,000 lines on this plot, there's a lot of overplotting, which makes the outliers stand out. Maybe there's a better way to visualise these? How about …

The median error is 0.05, which is pretty good! The IQR is under 0.5, also respectable. So, while there are some terrible predictions, the majority are decent.
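Those summary numbers can be computed like this (a sketch on synthetic data; `errors` would come from whichever fitted model you're inspecting):

```python
# Median error and IQR of the test-set errors, as summarised above.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=8, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
gb = GradientBoostingRegressor(random_state=42).fit(X_train, y_train)

errors = gb.predict(X_test) - y_test   # signed errors
median_error = np.median(errors)
q1, q3 = np.percentile(errors, [25, 75])
iqr = q3 - q1                          # spread of the middle 50%

print(round(median_error, 3), round(iqr, 3))
```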

Hyperparameter tuning

Decision tree hyperparameters

As before, let's compare how the hyperparameters examined in the original decision tree post apply to GBTs, with the default learning_rate=0.1 and n_estimators=100. This covers min_samples_leaf, min_samples_split, and max_leaf_nodes; I've also included max_depth=10, to make for a nice comparison with the previous posts.

Model              max_depth=None  max_depth=10  min_samples_leaf=10  min_samples_split=10  max_leaf_nodes=100
Fit time (s)       10.889          7.009         7.101                7.015                 6.167
Predict time (s)   0.089           0.019         0.015                0.018                 0.013
MAE                0.454           0.304         0.301                0.302                 0.301
MedAE              0.253           0.1777        0.1774               0.1774                0.175
MSE                0.496           0.222         0.212                0.217                 0.210
RMSE               0.704           0.471         0.46                 0.466                 0.458
R²                 0.621           0.830         0.838                0.834                 0.840
Chosen prediction  0.885           0.906         0.962                0.918                 0.923
Chosen error       0.009           0.012         0.068                0.024                 0.029

Unlike with decision trees and random forests, the unconstrained deep tree performed much worse! It also took longer to fit. However, increasing the depth from 3 (the default) to 10 improved the scores. The other constraints each led to further improvements, showing that all these hyperparameters can play a role.

learning_rate

GBTs work by updating the predictions after each iteration based on the error. How much of a correction is applied (the gradient, a.k.a. the learning rate) determines how much the predictions change between iterations.
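That update rule can be checked directly: the final prediction is the initial estimate plus learning_rate times the sum of the individual trees' outputs. A sketch on synthetic data (this identity holds for the default squared-error loss):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=300, n_features=4, random_state=42)
gb = GradientBoostingRegressor(n_estimators=10, random_state=42).fit(X, y)

row = X[:1]
# Start from the initial estimator (the training mean for squared error) …
pred = gb.init_.predict(row)[0]
# … then apply each tree's correction, scaled by the learning rate
for tree in gb.estimators_[:, 0]:
    pred += gb.learning_rate * tree.predict(row)[0]

print(np.isclose(pred, gb.predict(row)[0]))  # True
```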

There's a clear trade-off with the learning rate. Comparing learning rates of 0.01 (slow), 0.1 (the default), and 0.5 (fast), over 100 iterations:

Fast learning rates can get to the right value sooner, but are more likely to overshoot and jump past the true value (think of oversteering a car), which can lead to oscillations. Slow learning rates may never reach the right value (think of … not turning the steering wheel enough and driving straight into a tree). As for the stats:

Model              Default  Fast   Slow
Fit time (s)       2.159    2.288  2.166
Predict time (s)   0.005    0.004  0.015
MAE                0.370    0.338  0.629
MedAE              0.216    0.197  0.427
MSE                0.289    0.247  0.661
RMSE               0.538    0.497  0.813
R²                 0.779    0.811  0.495
Chosen prediction  0.803    0.949  1.44
Chosen error       0.091    0.055  0.546

Clearly, the slow-learning model was terrible. Overall, the fast model was actually slightly better than the default. However, we can see from the plot that, at least for the chosen block, it was only around iteration 90 that the fast model became more accurate than the default one; if we had stopped at 40 iterations, the default model would have been far better. Fun to see!
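The trade-off above can be sketched by tracking the test RMSE at every stage for each learning rate (synthetic data; the slow/default/fast values match the post's 0.01/0.1/0.5):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=8, noise=10, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

curves = {}
for name, lr in [("slow", 0.01), ("default", 0.1), ("fast", 0.5)]:
    gb = GradientBoostingRegressor(learning_rate=lr, random_state=42).fit(X_tr, y_tr)
    # Test RMSE after each of the 100 boosting iterations
    curves[name] = [mean_squared_error(y_te, p) ** 0.5
                    for p in gb.staged_predict(X_te)]

# With only 100 iterations, the slow learner is typically still far behind
print({name: round(c[-1], 1) for name, c in curves.items()})
```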

n_estimators

As mentioned above, the number of estimators interacts with the learning rate. In general, more estimators is better, as it gives more chances to measure and correct the error, though this comes at the cost of extra time.

As seen above, a higher number of estimators matters most with a low learning rate, to make sure the right value is actually reached. Increasing the number of estimators to 500:

Sure enough, the slow-learning GBT got to the true value. In fact, they all ended up pretty close. The stats confirm this:

Model              Default  Fast   Slow
Fit time (s)       12.254   12.489  11.918
Predict time (s)   0.018    0.014   0.022
MAE                0.323    0.319   0.410
MedAE              0.187    0.185   0.248
MSE                0.232    0.228   0.338
RMSE               0.482    0.477   0.581
R²                 0.823    0.826   0.742
Chosen prediction  0.841    0.921   0.858
Chosen error       0.053    0.027   0.036

Clearly, increasing the number of estimators fivefold increased the time to fit by about as much (in this case closer to sixfold, but that could be a one-off). However, we still haven't beaten the scores of the constrained trees above; I suspect we'd need a hyperparameter search to see if we can beat them. Also, for the chosen block, as can be seen in the plot, after around 300 models none of the models really improved. If this holds across the whole dataset, then the last 200 iterations weren't needed. I mentioned at the start how it's possible to avoid wasting time without improving; now it's time to look at that.

n_iter_no_change, validation_fraction

It's possible to keep fitting additional estimators which don't improve the final result, yet still take time to run. This is where early stopping comes in.

There are three relevant hyperparameters. First, n_iter_no_change: how many iterations there need to be with "no change" before the fitting stops early. Then tol[erance]: how small the change in the validation score has to be to count as "no change". And validation_fraction: how much of the training data should be used as a held-out set to produce the validation score (note this is separate from the test data).
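A sketch of those three knobs in action (synthetic data; the values mirror the aggressive settings the post uses: n_iter_no_change=5, validation_fraction=0.1, tol=0.005):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=1000, n_features=8, noise=10, random_state=42)

gb = GradientBoostingRegressor(
    n_estimators=500,         # upper limit; early stopping may cut this short
    n_iter_no_change=5,       # patience: this many iterations with "no change"
    validation_fraction=0.1,  # held out from the training data (not the test set)
    tol=0.005,                # minimum improvement that counts as a change
    random_state=42,
).fit(X, y)

# n_estimators_ (trailing underscore) is how many were actually fitted
print(gb.n_estimators_)
```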

Comparing the 500-estimator GBT with one with aggressive early stopping (n_iter_no_change=5, validation_fraction=0.1, tol=0.005), the latter stopped after just 61 iterations (and consequently took only around 5% of the time to fit):

As expected, the results were somewhat worse:

Model              Default  Early stopping
Fit time (s)       24.843   1.304
Predict time (s)   0.042    0.003
MAE                0.313    0.396
MedAE              0.181    0.236
MSE                0.222    0.321
RMSE               0.471    0.566
R²                 0.830    0.755
Chosen prediction  0.837    0.805
Chosen error       0.057    0.089

But, as always, the question you have to ask is: is it worth investing 20x the time to improve R² by 10%, or reduce the errors by 20%?

Bayes search

You probably saw this coming. The search spaces:

search_spaces = {
    'learning_rate': (0.01, 0.5),
    'max_depth': (1, 100),
    'max_features': (0.1, 1.0, 'uniform'),
    'max_leaf_nodes': (2, 20000),
    'min_samples_leaf': (1, 100),
    'min_samples_split': (2, 100),
    'n_estimators': (50, 1000),
}

Most are the same as in my previous post; the only additional hyperparameter is learning_rate.
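For reference, here's a hedged sketch of wiring a similar space into a search. To keep it dependency-light and quick, I'm using scikit-learn's own RandomizedSearchCV (random, not Bayesian, search) on synthetic data with a tiny budget; the post's actual search used skopt's BayesSearchCV with the spaces above:

```python
from scipy.stats import loguniform, randint, uniform
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=300, n_features=8, noise=10, random_state=42)

# Rough RandomizedSearchCV translation of the skopt spaces above
param_distributions = {
    "learning_rate": loguniform(0.01, 0.5),
    "max_depth": randint(1, 100),
    "max_features": uniform(0.1, 0.9),   # i.e. 0.1 to 1.0
    "max_leaf_nodes": randint(2, 20000),
    "min_samples_leaf": randint(1, 100),
    "min_samples_split": randint(2, 100),
    "n_estimators": randint(50, 1000),
}

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=42),
    param_distributions,
    n_iter=2,   # a real search would use far more candidates
    cv=3,
    random_state=42,
)
search.fit(X, y)
print(sorted(search.best_params_))
```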

This took by far the longest so far, at 96 minutes (~50% longer than the random forest!). The best hyperparameters:

best_parameters = OrderedDict({
    'learning_rate': 0.04345459461297153,
    'max_depth': 13,
    'max_features': 0.4993693929975871,
    'max_leaf_nodes': 20000,
    'min_samples_leaf': 1,
    'min_samples_split': 83,
    'n_estimators': 325,
})

max_features, max_leaf_nodes, and min_samples_leaf are all very similar to the random forest's. n_estimators matches too, and it lines up with what the chosen block suggested above, where improvements stopped after around 300 iterations. However, compared to the random forest, the trees are only about a third as deep, and min_samples_split is much higher than anything we've seen so far. The learning_rate value isn't too surprising given what we saw above.

Along with the cross-validated scores:

Metric  Mean    Std
MAE     -0.289  0.005
MedAE   -0.161  0.004
MSE     -0.200  0.008
RMSE    -0.448  0.009
R²      0.849   0.006

Of all the models so far, this is the best performer, with the lowest errors, the highest R², and low variance!

And finally, our old friend, the box plots:

Conclusion

And so we come to the end of my mini-series on the three most common types of tree-based models.

My hope is that, by seeing the different ways the trees can be visualised, you now (a) better understand how the different models work, beyond just looking at the scores, and (b) can apply these techniques to your own projects. It can also help with stakeholder management; many people prefer nice images to tables of numbers, so showing them a tree diagram can help them understand why something is or isn't possible.

Based on this data, and these models, the gradient-boosted tree was superior to the random forest, and both were far superior to a single decision tree. However, this could be down to the GBT getting ~50% more time to search for better hyperparameters (it's generally more expensive to fit; after all, it was the same number of search iterations). It's also good to know that GBTs have a higher tendency to overfit than random forests. And while the single decision tree performed poorly, it is far quicker; in some use cases, this matters more. Furthermore, as mentioned, there are other libraries, each with their pros and cons, for example around categorical data, which some GBT libraries require to be encoded beforehand (e.g. …). Or, if you're feeling really brave, how about feeding different tree types into an ensemble for even better performance …

Until next time, anyway!
