The Machine Learning “Advent Calendar” Day 19: Bagging in Excel

So far in this series, we have explored most of the core machine learning models, organized into three major families: distance- and density-based models, tree- or rule-based models, and weight-based models.
Up to now, each article has focused on a single model, trained on its own. Ensemble learning changes this perspective completely. It is not a standalone model. Instead, it is a way of combining these base models to build something new.
As shown in the diagram below, an ensemble is a meta-model. It sits on top of the individual models and combines their predictions.
Voting: the simplest ensemble idea
The simplest form of ensemble learning is voting.
The idea is almost trivial: train several models, take their predictions, and compute the average. If one model is wrong in one direction and another is wrong in the opposite direction, the errors should cancel out. At least, that is the intuition.
On paper, this sounds reasonable. In practice, things are very different.
As soon as you try voting with real models, one fact becomes clear: voting is not magic. Simply averaging predictions does not guarantee better performance. In many cases, it actually makes things worse.
The reason is simple. When you combine models that behave very differently, you also combine their weaknesses. If the models do not make complementary errors, averaging can dilute useful structure instead of reinforcing it.
To see this clearly, consider a very simple example. Take a decision tree and a linear regression trained on the same dataset. The decision tree captures local, non-linear patterns. The linear regression captures a global linear trend. When you average their predictions, you do not get a better model. You get a compromise that is often worse than each model taken on its own.

This illustrates an important point: ensemble learning requires more than averaging. It requires a strategy, a way of combining models that actually improves stability or generalization.
Moreover, if we treat the ensemble as a single model, it must be trained as one. A simple average offers no parameter to tune. There is nothing to learn and nothing to optimize.
One possible improvement to voting is to give the models different weights. Instead of giving each model the same importance, we could try to learn which ones should matter more. But as soon as we introduce weights, a new question arises: how do we train them? At that point, the ensemble itself becomes a model that needs to be fitted.
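As a rough illustration of plain versus weighted voting, the sketch below averages one prediction per model. The function name `weighted_vote`, the two prediction values, and the weights are all made up for this example. Nothing here is learned from data, which is exactly the limitation discussed above.

```python
def weighted_vote(predictions, weights=None):
    """Combine one prediction per model into a single value.

    With weights=None every model counts equally (plain voting).
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Hypothetical predictions: the true value is 10, one model overshoots
# by 2 and the other undershoots by 2.
predictions = [12.0, 8.0]

plain = weighted_vote(predictions)               # equal weights: errors cancel
tilted = weighted_vote(predictions, [3.0, 1.0])  # first model dominates
print(plain, tilted)
```

The hand-picked weights only move the result toward one model. Choosing them well would require a fitting procedure of its own.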
This observation leads naturally to more structured ensemble methods.
In this article, we start with one statistical way of resampling the training dataset before averaging: Bagging.
The intuition behind Bagging
What is bagging?
The answer is actually hidden in the name itself.
Bagging = Bootstrap + Aggregating.
You can tell right away that a statistician or a mathematician named it. 🙂
Behind this slightly intimidating name, the idea is very simple. Bagging is about doing two things: first, creating many versions of the dataset using the bootstrap, and second, aggregating the results obtained on these datasets.
So the core idea is not about changing the model. It is about changing the data.
Bootstrapping a dataset
Bootstrapping means sampling the dataset with replacement. Each bootstrap sample has the same size as the original dataset, but not the same observations. Some rows appear several times. Others disappear.
In Excel, this is very easy to implement and, more importantly, very easy to see.
You start by adding an ID column to your dataset, one unique identifier per row. Then, using the RANDBETWEEN function, you draw row indices at random. Each draw corresponds to one row in the bootstrap sample. By repeating this process, you generate a full dataset that looks familiar but is slightly different from the original.
This step alone already makes the idea of bagging concrete. You can literally see the duplicates. You can see which observations are missing. Nothing is abstract.
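For readers who prefer code to spreadsheets, here is a minimal sketch of the same resampling step. It assumes rows are identified by IDs 1 to `n_rows`. The helper name `bootstrap_ids`, the seed, and the dataset size of 10 are illustrative choices, not taken from the Excel file. `random.randint` is inclusive on both ends, which mirrors `RANDBETWEEN(1, n_rows)`.

```python
import random


def bootstrap_ids(n_rows, rng):
    """Draw n_rows row IDs with replacement, one draw per output row."""
    return [rng.randint(1, n_rows) for _ in range(n_rows)]


rng = random.Random(0)  # fixed seed so the draw is reproducible
n_rows = 10             # hypothetical dataset size

sample = bootstrap_ids(n_rows, rng)

# IDs drawn more than once, and IDs that were never drawn.
duplicated = sorted({i for i in sample if sample.count(i) > 1})
missing = sorted(set(range(1, n_rows + 1)) - set(sample))

print("sample:    ", sample)
print("duplicated:", duplicated)
print("missing:   ", missing)
```

Changing the seed produces a different bootstrap sample from the same original rows.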
Below, you can see examples of bootstrap samples generated from the same original dataset. Each sample tells a slightly different story, even though they all come from the same data.
These alternative datasets are the foundation of bagging.

Bagging linear regression: understanding the principle
The bagging process
Yes, this is probably the first time you have heard of bagging linear regression.
In theory, there is nothing wrong with it. As we said earlier, bagging is an ensemble method that can be applied to any base model. Linear regression is a model, so technically it qualifies.
In practice, however, you will quickly see that this is not very useful.
But nothing stops us from doing it. And precisely because it is not very useful, it makes a very good learning example. So let us do it.
For each bootstrap sample, we fit a linear regression. In Excel, this is straightforward. We can use the LINEST function directly to estimate the coefficients. Each color in the figure corresponds to one bootstrap sample and its associated regression line.
So far, everything behaves as expected. The lines are close to each other, but not identical. Each bootstrap sample slightly changes the coefficients, and therefore the fitted line.

Now comes the key observation.
You may notice that one additional model is plotted in black. It corresponds to the standard linear regression fitted on the original dataset, without bootstrapping.
What happens when we compare it with the bagged models?
When we average the predictions of all these linear regressions, the final result is still a linear regression: the average of several lines is itself a line, whose slope and intercept are the averages of the individual slopes and intercepts. The shape of the prediction does not change. The relationship between the variables remains linear. We have not created a more expressive model.
More importantly, the bagged model ends up very close to the standard linear regression trained on the original data.
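The claim that averaging linear regressions gives back a linear regression can be checked with a small sketch. The noisy-line dataset, the seed, and the helper names `fit_line` and `bagged_predict` are assumptions made for illustration. The count of 8 models is also an arbitrary choice here, and `fit_line` is a hand-written one-feature least squares standing in for what LINEST does in the spreadsheet.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(42)

# Hypothetical dataset: a noisy straight line.
xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 2.0) for x in xs]

# Fit one line per bootstrap sample.
lines = []
for _ in range(8):
    rows = [rng.randrange(len(xs)) for _ in xs]  # sample rows with replacement
    lines.append(fit_line([xs[i] for i in rows], [ys[i] for i in rows]))

mean_slope = sum(s for s, _ in lines) / len(lines)
mean_intercept = sum(b for _, b in lines) / len(lines)


def bagged_predict(x):
    """Average of the individual line predictions at x."""
    return sum(s * x + b for s, b in lines) / len(lines)


# Reference model fitted on the original data, without bootstrapping.
full_slope, full_intercept = fit_line(xs, ys)
print("bagged:  ", mean_slope, mean_intercept)
print("original:", full_slope, full_intercept)
```

At any input, `bagged_predict(x)` equals `mean_slope * x + mean_intercept` up to rounding, so the bagged model is just one more straight line.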
We can push the example further by using a dataset with a clearly non-linear structure. In this case, each linear regression fitted on a bootstrap sample struggles in its own way. Some lines tilt slightly upward, others downward, depending on which observations were duplicated or missing in the sample.

Bootstrap confidence intervals
From a prediction performance point of view, bagging linear regression is not very useful.
However, bootstrapping remains very useful for one important statistical idea: estimating the confidence interval of the predictions.
Instead of looking only at the average prediction, we can look at the distribution of the predictions produced by all the bootstrapped models. For each input value, we now have many predicted values, one per bootstrap sample.
A simple and practical way to quantify uncertainty is to compute the standard deviation of these predictions. This standard deviation tells us how sensitive the prediction is to changes in the data. A small value means the prediction is stable. A large value means it is uncertain.
This idea works naturally in Excel. Once you have all the predictions from the bootstrapped models, computing their standard deviation is straightforward. The result can be interpreted as a confidence band around the prediction.
This is clearly visible in the figure below. The interpretation is straightforward: in regions where the training data is sparse or highly scattered, the confidence interval becomes wider, because the predictions vary a lot across the bootstrap samples.
Conversely, where the data is dense, the predictions are stable and the confidence interval narrows.
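The spread of the bootstrap predictions can be sketched as follows. Everything specific here is an assumption for illustration: the noisy-line dataset, the seed, the 50 bootstrap models, and the two query points. `statistics.stdev` is the sample standard deviation, and the `fit_line` helper is a hand-written one-feature least squares.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(7)

# Hypothetical dataset: a noisy straight line observed for x = 1..20.
xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 2.0) for x in xs]

# One fitted line per bootstrap sample.
lines = []
for _ in range(50):
    rows = [rng.randrange(len(xs)) for _ in xs]
    lines.append(fit_line([xs[i] for i in rows], [ys[i] for i in rows]))


def prediction_spread(x):
    """Mean and sample standard deviation of the bootstrap predictions at x."""
    preds = [s * x + b for s, b in lines]
    return statistics.mean(preds), statistics.stdev(preds)


center_mean, center_sd = prediction_spread(10.0)  # inside the observed range
far_mean, far_sd = prediction_spread(40.0)        # far outside the observed range
print("spread near the data:", center_sd)
print("spread far from it:  ", far_sd)
```

A band such as mean plus or minus two standard deviations is one common way to draw the interval. That multiplier is a convention assumed here, not something specified in the article.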

Now, when we apply this to non-linear data, something becomes very clear. In regions where the linear model struggles to fit the data, the predictions from the different bootstrap samples spread out much more. The confidence interval becomes wider.
This is an important insight. Even when bagging does not improve prediction accuracy, it provides valuable information about uncertainty. It tells us where the model is reliable and where it is not.
Seeing these confidence intervals emerge directly from the bootstrap samples in Excel makes this statistical concept very concrete and intuitive.

Decision trees: from weak learners to a strong model
Now we move on to decision trees.
The principle of bagging stays exactly the same. We generate several bootstrap samples, train one model on each of them, and then aggregate their predictions.
I improved the Excel implementation to make the splitting process automatic. To keep things manageable in Excel, we restrict the trees to a single split. Building deeper trees is possible, but it quickly becomes cumbersome in a spreadsheet.
Below, you can see two of the bootstrapped trees. In total, I built eight of them by simply copying and pasting the formulas, which keeps the process straightforward and easy to reproduce.

Since decision trees are non-linear models and their predictions are piecewise constant, averaging their outputs has a smoothing effect.
As a result, bagging naturally smooths the predictions. Instead of the sharp jumps produced by individual trees, the aggregated model produces more gradual transitions.
In Excel, this effect is very easy to see. The bagged predictions are clearly smoother than the predictions of any single tree.
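The smoothing effect can be reproduced with a short sketch of bagged one-split trees. The curved dataset, the seed, and the helper names `fit_stump`, `predict_stump`, and `bagged_predict` are assumptions for illustration. The split search below minimizes the sum of squared errors, which is a common choice and may differ from the exact rule used in the spreadsheet.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree; returns (threshold, left_mean, right_mean)."""
    pairs = sorted(zip(xs, ys))
    best = None  # (sse, threshold, left_mean, right_mean)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    if best is None:  # all x identical: fall back to a constant prediction
        mean_y = sum(ys) / len(ys)
        return (float("inf"), mean_y, mean_y)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


rng = random.Random(3)

# Hypothetical dataset: a noisy increasing curve.
xs = [float(i) for i in range(1, 21)]
ys = [x * x / 20.0 + rng.gauss(0.0, 0.5) for x in xs]

# Eight stumps, one per bootstrap sample.
stumps = []
for _ in range(8):
    rows = [rng.randrange(len(xs)) for _ in xs]
    stumps.append(fit_stump([xs[i] for i in rows], [ys[i] for i in rows]))


def bagged_predict(x):
    """Average of the stump predictions at x."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


# Count the distinct prediction levels along a fine grid of inputs.
grid = [1.0 + 0.25 * i for i in range(77)]  # 1.0 to 20.0
single_levels = len({predict_stump(stumps[0], x) for x in grid})
bagged_levels = len({round(bagged_predict(x), 9) for x in grid})
print("levels, single stump:", single_levels)
print("levels, bagged model:", bagged_levels)
```

A single stump can only output two values. The bagged average takes one step at each distinct threshold, so one large jump is replaced by several smaller ones.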

Some of you may have already heard of decision stumps, which are decision trees with a depth of one. That is exactly what we are using here. Each model is extremely simple. On its own, a stump is a weak learner.
The question here is:
is a set of decision stumps sufficient when combined with bagging?
We will come back to this later in my Machine Learning “Advent Calendar”.
Random Forest: extending bagging
What about Random Forest?
This is probably one of the favorite models among data scientists.
So why not talk about it here, even in Excel?
In fact, what we have just built is already very close to a Random Forest!
To understand why, recall that Random Forest introduces two sources of randomness.
- The first is the bootstrap of the dataset. This is exactly what we have already done with bagging.
- The second is randomness in the splitting process. At each split, only a random subset of the features is considered.
In our case, however, we have only one feature. That means there is nothing to choose from. Feature randomness does not apply.
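The second source of randomness can be sketched in a few lines. The helper name `candidate_features`, the parameter name `max_features`, and the feature names are hypothetical and chosen only for this example. The sketch shows why the step has no effect when the dataset has a single feature.

```python
import random


def candidate_features(features, max_features, rng):
    """Pick the random subset of features that one split is allowed to consider."""
    k = min(max_features, len(features))
    return rng.sample(features, k)


rng = random.Random(0)

# With a single feature, the subset is always that feature.
one = candidate_features(["x"], 1, rng)

# With several features, each split sees a different random subset.
many = candidate_features(["x1", "x2", "x3", "x4"], 2, rng)

print(one, many)
```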
As a result, what we obtain here can be seen as a simplified Random Forest.
Once this idea is clear, extending it to several features is just an additional layer of randomness, not a new concept.
And you may even ask: can we apply this principle to Linear Regression, and make a random
Conclusion
Ensemble learning is less about complex models and more about controlling instability.
Simple voting rarely works. Bagging linear regression changes little and remains mostly pedagogical, although it helps quantify uncertainty. With decision trees, however, bagging really matters: averaging unstable models leads to smoother and more robust predictions.
Random Forest naturally extends this idea by adding extra randomness, without changing the core principle. Seen in Excel, ensemble methods stop being black boxes and become a logical next step.
Thank you for your support of my Machine Learning “Advent Calendar”.
People usually talk a lot about supervised learning, but unsupervised learning is sometimes overlooked, even though it can reveal structure that no label could ever show.
If you want to explore these ideas further, here are three articles that dive into powerful unsupervised models.
Gaussian Mixture Model
An improved and more flexible version of k-means.
Unlike k-means, GMM allows clusters to stretch, rotate, and adapt to the true shape of the data.
But when do k-means and GMM actually produce different results?
Take a look at this article to see concrete examples and visual comparisons.
Local Outlier Factor (LOF)
A clever method that compares the local density of each point with that of its neighbors to detect anomalies.
All the Excel files are available through this Kofi link. Your support means a lot to me. The price will increase during the month, so early supporters get the best value.




