Machine Learning

What to do when your credit risk model works today but breaks six months later

Machine learning has a deceptive secret. Organizations deploy models that reach 98% accuracy in validation, then watch them quietly degrade in production. The team calls it "concept drift" and moves on. But what if this is not a mysterious phenomenon? What if it is a predictable consequence of how we optimize?

I started asking this question after watching yet another production model fail. The answer led somewhere unexpected: the geometry we use for optimization determines whether models stay stable as distributions shift. Not the data. Not the hyperparameters. The space itself.

I realized that credit risk is fundamentally a ranking problem, not a classification problem. You don't need to predict "default" or "no default" with 98% accuracy. You need to order borrowers by risk: is borrower A riskier than borrower B? If the economy deteriorates, who defaults first?

Standard approaches miss this completely. Here is what gradient-boosted trees (XGBoost) achieve:

  • Accuracy: 98.7% ← looks impressive
  • AUC (ranking ability): 60.7% ← barely better than random
  • 12 months later: 96.6% accuracy, but the ranking degrades
  • 36 months later: 93.2% accuracy, AUC of 66.7% (essentially useless)

XGBoost reaches impressive accuracy but fails at the actual task: ordering risk. And it degrades over time.

Now compare this with what I developed (presented in a paper accepted at IEEE DSA2025):

  • Initial AUC: 80.3%
  • 12 months later: 76.4%
  • 36 months later: 69.7%
  • 60 months later: 69.7%

The difference: XGBoost loses 32 AUC points over 60 months. Our approach? 10.6 AUC points. AUC (Area Under the Curve) is the metric that tells us how well our trained algorithm will rank risk on unseen data.

Why does this happen? It comes down to something unexpected: the geometry of the optimization itself.

Why this matters (even if you're not in finance)

This is not just about credit scores. Any system where ranking matters more than exact prediction faces this problem:

  • Medical risk stratification: who needs urgent care first?
  • Customer churn prediction: which customers should retention efforts focus on?
  • Content recommendation: what should we show next?
  • Fraud detection: which transactions should be flagged for human review?
  • Supply chain prioritization: which disruptions should be addressed first?

When your context shifts gradually (and whose doesn't?), the accuracy metric lies to you. A model can hold 95% accuracy while completely scrambling the ordering of who is actually at the highest risk.
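The gap between accuracy and ranking quality is easy to reproduce on a toy example. The sketch below uses a hypothetical portfolio of 100 loans with a 5% default rate (made-up numbers, not the Freddie Mac data) and two invented score vectors. Both models predict "no default" for everyone at a 0.5 threshold, so their accuracy is identical, yet one ranks defaulters perfectly and the other ranks them exactly backwards.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of loans whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def auc(labels, scores):
    """Probability that a random defaulter outscores a random non-defaulter.

    Computed by brute-force pair comparison; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical portfolio: 95 performing loans (0) and 5 defaults (1).
labels = [0] * 95 + [1] * 5

# Model A scores every defaulter above every performing loan.
good_ranking = [0.10] * 95 + [0.40] * 5
# Model B scores every defaulter below every performing loan.
bad_ranking = [0.30] * 95 + [0.05] * 5

for name, scores in [("good ranking", good_ranking), ("bad ranking", bad_ranking)]:
    print(name, "accuracy:", accuracy(labels, scores), "AUC:", auc(labels, scores))
```

With imbalanced classes, accuracy is dominated by the majority class, so it barely reacts when the ordering inside the portfolio changes.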

That is not a model degradation problem. That is an optimization problem.

What physics teaches us about stability

Think about GPS navigation. If you optimize only for "the shortest route right now", you may send someone down a road that is about to close. But if you preserve the structure of how traffic flows (the relationships between routes), you can keep giving good guidance as conditions change. That is what we need for credit models. But how do you preserve structure?

NASA has faced exactly this problem for years. When simulating planetary orbits over millions of years, standard computational methods make the planets slowly drift, not because of physics, but because of accumulated numerical errors. Mercury slowly spirals into the Sun. Jupiter drifts outward. They solved this with symplectic integrators: algorithms that preserve the geometric structure of the system. The orbits stay stable because the method respects what physicists call "phase space volume": it preserves the relationships between positions and velocities.

Now here is the surprising part: credit risk has a similar structure.

The Geometry of Rankings

Standard gradient descent optimizes in Euclidean space. It finds local minima for your training distribution. But Euclidean geometry does not preserve relative orderings when distributions shift.

What does?

Symplectic manifolds.

In Hamiltonian mechanics (a formalism used in physics), conservative systems (no energy loss) evolve on symplectic manifolds: spaces equipped with a 2-form structure that preserves phase space volume (Liouville's theorem).

The symplectic 2-form
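For reference, in canonical coordinates (positions q_i and momenta p_i) the standard textbook expression for the symplectic 2-form is the one below. The sign and ordering convention varies between texts, so treat the notation as an assumption rather than the exact form used in the paper.

```latex
\omega = \sum_{i=1}^{n} dq_i \wedge dp_i
```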

In this phase space, symplectic transformations preserve relative distances. Not absolute positions, but orderings. That is exactly what we need for ranking under distribution shift. When you simulate a frictionless pendulum with standard integration methods, the energy drifts. The pendulum in Figure 1 slowly speeds up or slows down, not because of physics, but because of numerical approximation. Symplectic integrators do not have this problem because they preserve the Hamiltonian structure exactly. The same principle can be applied to neural network optimization.

Figure 1. The frictionless pendulum is the most basic example of a Hamiltonian system. The pendulum experiences no air friction, since friction would dissipate energy. The Hamiltonian formalism in physics applies to conservative (non-dissipative) systems, in which energy is conserved. The image on the left shows the pendulum's trajectory in phase space, represented by velocity and angle (center image). Image by author.
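The pendulum drift is easy to check numerically. The following is a minimal sketch, assuming a unit pendulum (mass, length and gravity all set to 1) and an arbitrary step size of 0.05; these values are illustrative choices, not taken from the paper. It compares the standard explicit Euler method with the symplectic Euler method and reports the largest relative energy error each one accumulates.

```python
import math

def energy(theta, omega):
    """Energy of a unit pendulum (m = g = l = 1): kinetic plus potential."""
    return 0.5 * omega ** 2 + (1.0 - math.cos(theta))

def explicit_euler(theta, omega, dt):
    """Standard Euler: both updates use the old state."""
    return theta + dt * omega, omega - dt * math.sin(theta)

def symplectic_euler(theta, omega, dt):
    """Symplectic Euler: update the velocity first, then reuse the NEW velocity."""
    omega_new = omega - dt * math.sin(theta)
    return theta + dt * omega_new, omega_new

def simulate(step, steps=10_000, dt=0.05, theta0=1.0, omega0=0.0):
    """Run one integrator and return the largest relative energy error seen."""
    theta, omega = theta0, omega0
    e0 = energy(theta, omega)
    worst = 0.0
    for _ in range(steps):
        theta, omega = step(theta, omega, dt)
        worst = max(worst, abs(energy(theta, omega) - e0) / e0)
    return worst

drift_explicit = simulate(explicit_euler)
drift_symplectic = simulate(symplectic_euler)
print(f"explicit Euler   max relative energy error: {drift_explicit:.3f}")
print(f"symplectic Euler max relative energy error: {drift_symplectic:.3f}")
```

The only difference between the two integrators is which velocity feeds the angle update, yet the explicit version keeps gaining energy while the symplectic version stays within a small bounded error.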

Protein folding simulations face the same problem. You are modeling thousands of atoms interacting over milliseconds, which means billions of integration steps. Standard integrators accumulate energy: molecules heat up artificially, bonds break that should not, and the simulation blows up.

Figure 2: The equivalence between Hamiltonian mechanics in physical systems and its application to NN parameter spaces. The position q corresponds to the NN parameters. Although we could call this "physics-inspired", it is really applied differential geometry: symplectic forms, Liouville's theorem, structure-preserving integration. But I think the Hamiltonian analogy makes more sense for explanatory purposes. Image by author.

Implementation: Structure-Preserving Optimization

Here is what I actually did:

A Hamiltonian framework for neural networks

I reformulated neural network training as a Hamiltonian system:

The Hamiltonian equation for mechanical systems

In mechanical systems, T(p) is the kinetic energy term and V(q) is the potential energy. In this analogy, T(p) represents the cost of changing the model parameters, and V(q) represents the loss function at the current model state.
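The two terms described above correspond to the standard separable Hamiltonian of classical mechanics. Written out (assuming the separable form, which is what the description of T(p) and V(q) suggests):

```latex
H(q, p) = T(p) + V(q)
```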

Symplectic Euler optimizer (not Adam/SGD):

Instead of Adam or SGD for optimization, I use symplectic integration:

I used the symplectic Euler method for a Hamiltonian system with position q and momentum p:
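In its textbook form, the symplectic Euler update reads as follows. This is one of the two standard variants of the method; that the paper uses exactly this variant is an assumption, chosen because it matches the remark that the new momentum appears in both updates.

```latex
\begin{aligned}
p_{t+1} &= p_t - \Delta t \, \nabla_q H(q_t, p_{t+1}) \\
q_{t+1} &= q_t + \Delta t \, \nabla_p H(q_t, p_{t+1})
\end{aligned}
```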

Where:

  • H is the Hamiltonian (an energy function derived from the loss)
  • Δt is the time step (analogous to the learning rate)
  • q is the network weights (position coordinates), and
  • p is the momentum variables (velocity coordinates)

Note that p_{t+1} appears in both updates. This coupling is essential: it is what preserves the symplectic structure. This is not just momentum; it is structure-preserving integration.
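To make the coupling concrete, here is a minimal sketch of a symplectic Euler step for a separable Hamiltonian H(q, p) = 0.5·|p|² + V(q). The quadratic toy loss, the names `grad_loss` and `symplectic_euler_step`, and the step size are all hypothetical and chosen for illustration; this is not the author's optimizer. It contains no damping or regularization, so it oscillates around the minimum instead of converging, which is enough to show that the energy stays bounded.

```python
STIFFNESS = [1.0, 4.0]  # hypothetical curvature of the toy loss in each direction

def loss(q):
    """Toy potential V(q) = 0.5 * sum(k_i * q_i^2), standing in for a training loss."""
    return 0.5 * sum(k * qi ** 2 for k, qi in zip(STIFFNESS, q))

def grad_loss(q):
    """Gradient of the toy loss with respect to the weights q."""
    return [k * qi for k, qi in zip(STIFFNESS, q)]

def hamiltonian(q, p):
    """H(q, p) = T(p) + V(q) with kinetic term T(p) = 0.5 * |p|^2."""
    return 0.5 * sum(pi ** 2 for pi in p) + loss(q)

def symplectic_euler_step(q, p, dt):
    """One coupled update: compute the new momentum, then reuse it to move q."""
    p_new = [pi - dt * gi for pi, gi in zip(p, grad_loss(q))]
    q_new = [qi + dt * pi for qi, pi in zip(q, p_new)]
    return q_new, p_new

q, p = [1.0, -0.5], [0.0, 0.0]
h0 = hamiltonian(q, p)
max_rel_error = 0.0
for _ in range(5000):
    q, p = symplectic_euler_step(q, p, dt=0.05)
    max_rel_error = max(max_rel_error, abs(hamiltonian(q, p) - h0) / h0)
print(f"max relative energy error over 5000 steps: {max_rel_error:.4f}")
```

Replacing `p_new` with the old `p` in the position update turns this into plain explicit Euler, which loses the bounded-energy behavior.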

Hamiltonian loss

In addition, I built a loss based on the Hamiltonian formalism:
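A form consistent with the terms defined below is the usual additive regularized loss. The additive structure and the symbol on the left-hand side are assumptions; only the three terms themselves are given in the text.

```latex
\mathcal{L}_H(\theta) = L_{\text{base}}(\theta) + \lambda \, R(\theta)
```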

Where:

  • L_base(θ) is the binary cross-entropy loss
  • R(θ) is the regularization term (L2 penalty on the weights), and
  • λ is the regularization coefficient

The regularization term penalizes deviations from energy conservation, constraining the optimization to low-dimensional manifolds in parameter space.
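As a rough illustration of how these three pieces combine, the sketch below computes a binary cross-entropy base loss plus an L2 penalty scaled by a coefficient. The function names, the sample numbers and the default `lam=0.01` are hypothetical, and the simple additive combination is an assumption based on the term definitions above, not the paper's exact loss.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def l2_penalty(weights):
    """Sum of squared weights, standing in for R(theta)."""
    return sum(w ** 2 for w in weights)

def regularized_loss(labels, probs, weights, lam=0.01):
    """L_base(theta) + lambda * R(theta); additive form and lam are assumptions."""
    return binary_cross_entropy(labels, probs) + lam * l2_penalty(weights)

# Made-up predictions and weights, for illustration only.
labels = [0, 0, 1, 0]
probs = [0.1, 0.2, 0.7, 0.05]
weights = [0.5, -1.0, 2.0]
print(regularized_loss(labels, probs, weights))
```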

How it works

The mechanism has three components:

  1. Symplectic structure → volume preservation → parameter exploration
  2. Hamiltonian constraint → energy conservation → long-term stability
  3. Coupled updates → preservation of the geometric structure relevant to ranking

This structure is represented in the following algorithm:

Figure 3: The algorithm, which applies both the momentum update and the Hamiltonian optimization.

Results: Better temporal stability

As described, I tested this framework on the Freddie Mac Single-Family Loan-Level Dataset, the only long-running credit dataset with proper temporal splits spanning economic cycles.

Logic tells us that accuracy should decrease across the three datasets (from 12 to 60 months). Long-horizon predictions tend to be less accurate than short-horizon ones. But what we see is that XGBoost does not follow this pattern (AUC values from 0.61 to 0.67; this is the signature of optimizing in the wrong space). Our symplectic optimizer, despite showing lower accuracy, does follow it (AUC values decrease from 0.84 to 0.70). For example, what gives you more confidence that a 36-month prediction is realistic: the 0.97 accuracy of XGBoost, or the 0.77 AUC of the Hamiltonian-inspired approach? XGBoost's 36-month AUC is 0.63 (very close to random prediction).
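Tracking AUC per prediction horizon, rather than a single pooled accuracy figure, is what makes this pattern visible. Below is a minimal sketch of such an evaluation. The `auc_by_horizon` helper and the toy records are hypothetical and are not Freddie Mac data; in practice each record would be a scored loan with its observed outcome at that horizon.

```python
from collections import defaultdict

def auc(labels, scores):
    """Probability that a random defaulter outscores a random non-defaulter."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_by_horizon(records):
    """Group (horizon_months, label, score) records and compute AUC per horizon."""
    grouped = defaultdict(lambda: ([], []))
    for horizon, label, score in records:
        grouped[horizon][0].append(label)
        grouped[horizon][1].append(score)
    return {h: auc(ys, ss) for h, (ys, ss) in sorted(grouped.items())}

# Hypothetical toy records: (horizon in months, defaulted?, model score).
records = [
    (12, 1, 0.9), (12, 0, 0.2), (12, 0, 0.4), (12, 1, 0.7),
    (36, 1, 0.6), (36, 0, 0.3), (36, 0, 0.7), (36, 1, 0.8),
    (60, 1, 0.4), (60, 0, 0.5), (60, 0, 0.45), (60, 1, 0.6),
]

for horizon, value in auc_by_horizon(records).items():
    print(f"{horizon:>2} months: AUC = {value:.2f}")
```

A horizon-by-horizon table like this shows directly whether ranking quality decays smoothly with the horizon or behaves erratically.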

What each component contributes

In our ablation study, all components contribute, with momentum in symplectic space providing the largest gains. This aligns with the theoretical background: the symplectic 2-form is preserved through the coupled position-momentum updates.

Table. Ablation study. Standard NN with Adam optimizer vs. our approach (full Hamiltonian model)

When to use this approach

Use symplectic optimization as an alternative to gradient descent optimizers when:

  • Ranking matters more than classification accuracy
  • Distribution shift is gradual and predictable (economic cycles, not black swans)
  • Temporal stability is critical (financial risk, medical prognosis over time)
  • Retraining is expensive (regulatory validation, approval overhead)
  • You can afford 2-3x the training time in exchange for production stability
  • You have <10K features (it works well up to ~10K dimensions)

Do not use it when:

  • Distribution shift is abrupt or unpredictable (market crashes, regime changes)
  • You need interpretability for compliance (this does not help with explainability)
  • You work in very high dimensions (>10K features, where the cost becomes prohibitive)
  • You have real-time training constraints (it is 2-3x slower than Adam)

What this actually means for production systems

For organizations deploying credit models or facing similar challenges:

Problem: You retrain every quarter. Each time, you validate on holdout data, see high accuracy, deploy, and watch AUC degrade over 12-18 months. You blame "market conditions" and retrain again.

Solution: Use symplectic optimization. Accept somewhat lower peak accuracy (80% vs 98%) in exchange for 3x better temporal stability. Your model stays reliable for longer. You retrain less often. Regulatory explanations are simpler: "Our model preserves ranking stability under distribution shift."

Cost: 2-3x longer training time. For monthly or quarterly retraining, this is acceptable: you are trading hours of compute for months of stability.

This is engineering, not magic. We optimize in a space that preserves what actually matters for the business problem.

The bigger picture

Model degradation is not inevitable. It is a consequence of optimizing in the wrong space. Standard gradient descent finds solutions that work for your current distribution. Symplectic optimization finds solutions that preserve structure: the relationships between examples that determine rankings. Our proposed approach will not solve every problem in ML. But for the practitioner watching a production model decay, and for the organization facing regulatory questions about model stability, it is a solution that works today.

Next steps

Code available: https://towardsdatascience.com/your-credit-risk-model-works-today-it-breaks-in-six-months/

Full paper: Will be available soon. Contact me if you would like a copy ([email protected]).

Questions or collaboration: If you are working on ranking problems with temporal stability requirements, I would be interested to hear about your use case.


Thanks for reading, and for sharing!

Need help implementing this kind of system?

Javier Marin
Applied AI Consultant | Production AI Systems + Regulatory Compliance
[email protected]

