What to Do When Your Credit Risk Model Works Today, but Breaks Six Months Later

Machine learning has a dirty secret. Organizations deploy models that reach 98% accuracy in validation, then watch them quietly degrade in production. The team calls it "concept drift" and moves on. But what if this is not a mysterious phenomenon? What if it is a predictable consequence of how we optimize?
I started asking this question after watching yet another production model fail. The answer led somewhere unexpected: the geometry we optimize in determines whether models stay stable as distributions shift. Not the data. Not the hyperparameters. The space itself.
I realized that credit risk is fundamentally a ranking problem, not a classification problem. You don't need to predict "default" or "no default" with 98% accuracy. You need to order borrowers by risk: is borrower A riskier than borrower B? If the economy deteriorates, who defaults first?
Standard approaches miss this completely. Here is what gradient-boosted trees (XGBoost) achieve:
- Accuracy: 98.7% ← looks impressive
- AUC (ranking ability): 60.7% ← barely better than random
- 12 months later: 96.6% accuracy, but the ranking degrades
- 36 months later: 93.2% accuracy, AUC of 66.7% (essentially useless)
XGBoost achieves impressive accuracy but fails at the actual task: ordering risk. And it degrades.
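To make the gap between accuracy and ranking ability concrete, here is a minimal sketch on synthetic data. The portfolio size, the 2% default rate, and the uninformative scores are all assumptions chosen for illustration; none of it comes from the Freddie Mac data.

```python
import random

def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def auc(labels, scores):
    """Probability that a random defaulter outranks a random non-defaulter.

    Ties count as half a win. This is the pairwise definition of ROC AUC.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

random.seed(0)
# Hypothetical portfolio: 5,000 borrowers, 2% of whom default.
labels = [1 if i < 100 else 0 for i in range(5000)]
# A "model" whose scores are all below 0.5 and unrelated to the labels.
scores = [random.uniform(0.0, 0.4) for _ in labels]

print(f"accuracy = {accuracy(labels, scores):.3f}")
print(f"AUC      = {auc(labels, scores):.3f}")
```

Because defaults are rare, predicting "no default" for everyone is already right 98% of the time, while the AUC stays near 0.5: the scores say nothing about who is riskier than whom.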
Now compare this with what I developed (presented in a paper accepted at IEEE DSA2025):
- Initial AUC: 80.3%
- 12 months later: 76.4%
- 36 months later: 69.7%
- 60 months later: 69.7%
The difference: XGBoost loses 32 AUC points over 60 months. Our approach? 10.6 AUC points. AUC (Area Under the Curve) is what tells us how well our trained algorithm will rank risk on unseen data.
Why does this happen? It comes down to something unexpected: the geometry of optimization itself.
Why This Matters (Even If You're Not in Finance)
This is not just about credit scores. Any system where ranking matters more than exact prediction faces this problem:
- Medical risk stratification: Who needs urgent care first?
- Customer churn prediction: Which customers should we focus retention efforts on?
- Content recommendation: What should we show next?
- Fraud detection: Which transactions merit human review?
- Supply chain prioritization: Which disruptions should we address first?
When your context changes gradually (and whose doesn't?), accuracy metrics lie to you. A model can maintain 95% accuracy while completely scrambling the ordering of who is actually at highest risk.
That is not a model degradation problem. That is an optimization problem.
What Physics Teaches Us About Stability
Think about GPS navigation. If you optimize only for the "shortest current route," you might send someone down a road that is about to close. But if you preserve the structure of how traffic flows, the relationships between routes, you can keep giving good guidance as conditions change. That is what we need for credit models. But how do you preserve structure?
NASA has faced exactly this problem for years. When you simulate planetary orbits over millions of years, standard numerical methods make the planets slowly drift, not because of the physics but because of accumulated numerical error. Mercury gradually spirals into the Sun. Jupiter drifts outward. They solved this with symplectic integrators: algorithms that preserve the geometric structure of the system. The orbits stay stable because the method respects what physicists call "phase space volume"; it maintains the relationships between positions and velocities.
Now here is the surprising part: credit risk has a similar structure.
The Geometry of Rankings
Standard gradient descent optimizes in Euclidean space. It finds local minima for your training distribution. But Euclidean geometry does not preserve relative orderings when the distribution shifts.
What does?
Symplectic manifolds.
In Hamiltonian mechanics (a formalism used in physics), conservative systems (no energy loss) evolve on symplectic manifolds: spaces equipped with a 2-form structure that preserves phase space volume (Liouville's theorem).
In this phase space, symplectic transformations preserve relative distances. Not absolute positions, but orderings. That is exactly what we need for ranking under distribution shift.
When you simulate a frictionless pendulum with standard integration methods, the energy drifts. The pendulum in Figure 1 slowly speeds up or slows down, not because of the physics but because of numerical approximation. Symplectic integrators do not have this problem because they preserve the Hamiltonian structure exactly. The same principle can be applied to neural network optimization.
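The pendulum effect is easy to reproduce. The sketch below compares explicit Euler with symplectic Euler on a frictionless pendulum; the unit mass, unit length, g = 1, step size, and starting angle are illustrative assumptions.

```python
import math

def energy(q, p):
    """Total energy of a unit-mass, unit-length pendulum with g = 1."""
    return 0.5 * p * p - math.cos(q)

def explicit_euler(q, p, dt):
    """Both updates use the old state; phase-space area is not preserved."""
    return q + dt * p, p - dt * math.sin(q)

def symplectic_euler(q, p, dt):
    """Update the momentum first, then move the position with the NEW momentum."""
    p_new = p - dt * math.sin(q)
    q_new = q + dt * p_new
    return q_new, p_new

def max_energy_drift(step, q0=1.0, p0=0.0, dt=0.05, n_steps=5000):
    """Largest |E(t) - E(0)| observed along the simulated trajectory."""
    q, p = q0, p0
    e0 = energy(q, p)
    worst = 0.0
    for _ in range(n_steps):
        q, p = step(q, p, dt)
        worst = max(worst, abs(energy(q, p) - e0))
    return worst

drift_explicit = max_energy_drift(explicit_euler)
drift_symplectic = max_energy_drift(symplectic_euler)
print(f"explicit Euler   max energy drift: {drift_explicit:.4f}")
print(f"symplectic Euler max energy drift: {drift_symplectic:.4f}")
```

The only difference between the two steppers is which momentum the position update uses. With explicit Euler the energy grows steadily; with symplectic Euler the energy error stays small and bounded, oscillating instead of accumulating.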

Protein folding simulations face the same problem. You are modeling thousands of atoms interacting over microseconds to milliseconds, which means billions of integration steps. Standard integrators accumulate energy: molecules heat up artificially, bonds break that should not, and the simulation blows up.

The Implementation: Structure-Preserving Optimization
Here is what I actually did:
Hamiltonian Framework for Neural Networks
I reformulated neural network training as a Hamiltonian system:
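Written out, and assuming the standard separable form implied by the two terms described next (this is a reconstruction, not a formula copied from the paper):

$$H(q, p) = T(p) + V(q)$$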

In mechanical systems, T(p) is the kinetic energy term and V(q) is the potential energy. In this analogy, T(p) represents the cost of changing the model parameters, and V(q) represents the loss function at the current model state.
Symplectic Euler Optimizer (Not Adam/SGD):
Instead of using Adam or SGD for optimization, I use symplectic integration:

I used the symplectic Euler method for a Hamiltonian system with position q and momentum p.
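For reference, the textbook symplectic Euler scheme for a Hamiltonian H(q, p) with step Δt is shown here. That this is exactly the update used in the paper is an assumption.

$$p_{t+1} = p_t - \Delta t \, \nabla_q H(q_t, p_{t+1}), \qquad q_{t+1} = q_t + \Delta t \, \nabla_p H(q_t, p_{t+1})$$

For a separable Hamiltonian $H = T(p) + V(q)$, the first update reduces to $p_{t+1} = p_t - \Delta t \, \nabla V(q_t)$, so the scheme is fully explicit.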
Where:
- H is the Hamiltonian (the energy function derived from the loss)
- Δt is the time step (analogous to the learning rate)
- q are the network weights (position coordinates), and
- p are the momentum variables (conjugate momentum coordinates)
Note that p_{t+1} appears in both updates. This coupling is critical: it is what preserves the symplectic structure. This is not just momentum; it is structure-preserving integration.
Hamiltonian Loss
In addition, I built a loss based on the Hamiltonian formalism:
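A form consistent with the terms defined next would be the following. The exact expression in the paper may differ, so treat this as an assumed reconstruction.

$$\mathcal{L}(\theta) = L_{\text{base}}(\theta) + \lambda \, R(\theta)$$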

Where:
- L_base(θ) is the binary cross-entropy loss
- R(θ) is the regularization term (an L2 penalty on the weights), and
- λ is the regularization coefficient
The regularization term penalizes deviations from energy conservation, constraining optimization to low-dimensional manifolds in parameter space.
How It Works
The mechanism has three components:
- Symplectic structure → volume preservation → bounded parameter exploration
- Hamiltonian constraint → energy conservation → stable long-term dynamics
- Coupled updates → preservation of the geometric structure relevant for ranking
This structure is represented in the following algorithm:
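As a rough illustration of these components, here is a minimal sketch of a symplectic Euler training loop for logistic regression in plain Python. It is not the paper's algorithm. The quadratic kinetic term T(p) = ||p||²/2, the synthetic data, and every hyperparameter are assumptions. In particular, the `friction` parameter is an addition of this sketch: a purely conservative update oscillates around a minimum instead of settling, and with friction the step behaves much like classical momentum.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def potential_and_grad(q, X, y, lam):
    """V(q) = mean binary cross-entropy + (lam / 2) * ||q||^2, with its gradient."""
    n, d = len(X), len(q)
    loss, grad = 0.0, [lam * w for w in q]
    for xi, yi in zip(X, y):
        prob = sigmoid(sum(w * x for w, x in zip(q, xi)))
        prob = min(max(prob, 1e-12), 1.0 - 1e-12)  # guard the logarithms
        loss -= (yi * math.log(prob) + (1 - yi) * math.log(1.0 - prob)) / n
        for j in range(d):
            grad[j] += (prob - yi) * xi[j] / n
    loss += 0.5 * lam * sum(w * w for w in q)
    return loss, grad

def symplectic_euler_step(q, p, grad_v, dt, friction):
    """Momentum update first, then position update using the NEW momentum."""
    # friction is an assumption of this sketch; friction = 0 is the conservative case.
    p_new = [(1.0 - friction) * pj - dt * gj for pj, gj in zip(p, grad_v)]
    q_new = [qj + dt * pj for qj, pj in zip(q, p_new)]
    return q_new, p_new

def train(X, y, dt=0.2, friction=0.1, lam=1e-3, n_steps=500):
    """Run the coupled (q, p) updates and record V(q) before each step."""
    q = [0.0] * len(X[0])  # weights play the role of positions
    p = [0.0] * len(X[0])  # conjugate momenta start at rest
    history = []
    for _ in range(n_steps):
        loss, grad = potential_and_grad(q, X, y, lam)
        history.append(loss)
        q, p = symplectic_euler_step(q, p, grad, dt, friction)
    return q, history

random.seed(0)
# Synthetic data (assumption): the label depends on feature 0; the last column is a bias.
X = [[random.gauss(0, 1), random.gauss(0, 1), 1.0] for _ in range(200)]
y = [1 if xi[0] + 0.3 * random.gauss(0, 1) > 0 else 0 for xi in X]

weights, history = train(X, y)
print(f"initial loss: {history[0]:.4f}")
print(f"final loss:   {history[-1]:.4f}")
print("weights:", [round(w, 3) for w in weights])
```

The position update reads `p_new`, not `p`; that coupling is the point of the scheme. Swapping in the old momentum would turn it back into an ordinary explicit Euler step.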

Results: Better Temporal Stability
As described, I tested this framework on the Freddie Mac Single-Family Loan-Level Dataset, the only long-term credit dataset with proper temporal splits spanning economic cycles.

Logic tells us that accuracy should decrease across the three datasets (from 12 to 60 months). Long-horizon predictions tend to be less accurate than short-horizon ones. But what we see is that XGBoost does not follow this pattern (AUC values from 0.61 to 0.67; this is the signature of optimization in the wrong space). Our symplectic optimizer, despite showing lower accuracy, does (AUC values decrease from 0.84 to 0.70). For example, what gives you more confidence that a 36-month prediction is meaningful: the 0.97 accuracy of XGBoost, or the 0.77 AUC of the Hamiltonian-inspired approach? XGBoost has a 36-month AUC of 0.63 (very close to random prediction).
What Each Component Contributes
In our ablation study, every component contributes, with momentum in symplectic space providing the largest gains. This is consistent with the theoretical background: the symplectic 2-form is preserved through the coupled position-momentum updates.

When to Use This Approach
Use symplectic optimization as an alternative to gradient descent optimizers when:
- Ranking matters more than classification accuracy
- Distribution shift is gradual and predictable (economic cycles, not black swans)
- Temporal stability is critical (financial risk, medical prognosis over time)
- Retraining is expensive (regulatory validation, approval overhead)
- You can afford 2-3x the training time in exchange for production stability
- You have fewer than 10K features (it works well up to roughly 10K dimensions)
Do not use it when:
- Distribution shift is abrupt or unpredictable (market crashes, regime changes)
- You need interpretability for compliance (this does not help with explainability)
- You are in very high dimensions (more than 10K features; the cost becomes prohibitive)
- You have real-time training constraints (it is 2-3x slower than Adam)
What This Actually Means for Production Systems
For organizations deploying credit models or facing similar challenges:
Problem: You retrain quarterly. Each time, you validate on holdout data, see high accuracy, deploy, and watch the AUC degrade over 12-18 months. You blame "market conditions" and retrain again.
Solution: Use symplectic optimization. Accept somewhat lower peak accuracy (80% vs 98%) in exchange for 3x better temporal stability. Your model stays reliable for longer. You retrain less often. Regulatory explanations get simpler: "Our model maintains ranking stability under distribution shift."
Cost: 2-3x longer training time. For monthly or quarterly retraining this is acceptable: you are trading hours of compute for months of stability.
This is engineering, not magic. We are optimizing in a space that preserves what actually matters for the business problem.
The Bigger Picture
Model degradation is not inevitable. It is a consequence of optimizing in the wrong space. Standard gradient descent finds solutions that work for your current distribution. Symplectic optimization finds solutions that preserve structure: the relationships between examples that determine rankings. Our proposed approach will not solve every problem in ML. But for the practitioner watching a production model decay, and for the organization facing regulatory questions about model stability, it is a solution that works today.
Next Steps
Code available: https://towardsdatascience.com/your-credit-risk-model-works-today-it-breaks-in-six-months/
Full paper: Available soon. Contact me if you would like a copy ([email protected]).
Questions or collaboration: If you are working on ranking problems with temporal stability requirements, I would be interested to hear about your use case.
Thanks for reading, and for sharing!
Need help implementing this kind of system?
Javier Marin
Applied AI Consultant | Production AI Systems + Regulatory Compliance
[email protected]



