Federated Learning, Part 1: The Basics of Training Models Where the Data Lives

I first came across the concept of federated learning (FL) through a comic by Google in 2019. It was a brilliant piece and did a great job of explaining how products can improve without sending user data to the cloud. Lately, I have wanted to understand the technical side of this field in more detail. Training data has become a valuable commodity because it is essential for building good models, yet much of it goes unused because it is fragmented, unstructured, or locked inside silos.
When I started exploring this field, I found the Flower framework to be the most straightforward and beginner-friendly way to get started with FL. It is open source, the documentation is clear, and the community around it is active and helpful. It is one of the reasons for my renewed interest in this field.
This article is the first part of a series in which I explore federated learning in more depth: what it is, how it is implemented, the open problems it faces, and why it matters in privacy-sensitive settings. In the next installments, I will go deeper into practical implementation with the Flower framework, discuss privacy in federated learning, and examine how these ideas extend to more advanced use cases.
When centralized machine learning is not ideal
We know that AI models depend on large amounts of data, yet much of the most useful data is sensitive, distributed, and hard to access. Think of data inside hospitals, phones, cars, sensors, and other edge systems. Privacy concerns, local regulations, limited storage, and network constraints make moving this data to a central place very difficult or impossible. As a result, large amounts of valuable data remain unused. In healthcare, this problem is especially visible. Hospitals generate tens of petabytes of data every year, yet studies estimate that up to 97% of this data goes unused.
Traditional machine learning assumes that all training data can be collected in one place, usually on a centralized server or in a data center. This works when data can be moved freely, but it breaks down when data is private or protected. In practice, centralized training also depends on stable connectivity, sufficient bandwidth, and low latency, which are difficult to guarantee in distributed or edge environments.
In such cases, two common choices appear. One option is to not use the data at all, which means valuable information stays locked inside silos.
The other option is to let each local entity train a model on its own data and share only what the model learns, while the raw data never leaves its original location. This second option forms the basis of federated learning, which allows models to learn from distributed data without moving it. A well-known example is Google Gboard on Android, where features such as next-word prediction and Smart Compose run across hundreds of millions of devices.
Federated Learning: Moving the Model to the Data
Federated learning can be thought of as a collaborative machine learning setup in which training happens without collecting data in one central place. Before looking at how it works under the hood, let's look at a few real-world examples that show why this approach matters in high-stakes settings, in domains ranging from healthcare to security-sensitive environments.
Healthcare
In healthcare, federated learning enabled early COVID screening through Curial AI, a system trained across multiple NHS hospitals using routine vital signs and blood tests. Because patient data could not be shared across hospitals, training was done locally at each site and only model updates were exchanged. The resulting global model generalized better than models trained at individual hospitals, especially when evaluated on unseen sites.
Medical Imaging

Federated learning is also being explored in medical imaging. Researchers at UCL and Moorfields Eye Hospital are using it to fine-tune large vision foundation models on sensitive eye scans that cannot be centralized.
Defense
Beyond healthcare, federated learning is also being applied in security-sensitive domains such as defense and aviation. Here, models are trained on distributed physiological and operational data that must remain local.
Different types of federated learning
At a high level, federated learning can be grouped into a few common types based on who the clients are and how the data is split.
• Cross-Device vs Cross-Silo Federated Learning
Cross-device federated learning involves a very large number of clients, potentially millions, such as personal devices or phones, each with a small amount of local data and unreliable connectivity. In any given round, however, only a small fraction of these devices take part. Google Gboard is a typical example of this setup.
Cross-silo federated learning, on the other hand, involves a much smaller number of clients, usually organizations such as hospitals or banks. Each client holds a large dataset and has stable compute and connectivity. Most real-world enterprise and healthcare use cases look like cross-silo federated learning.
• Horizontal vs Vertical Federated Learning

Horizontal federated learning describes one way the data can be split across clients. In this case, all clients share the same feature space, but each holds different samples. For example, multiple hospitals may record the same medical variables, but for different patients. This is the most common form of federated learning.
Vertical federated learning is used when clients share the same set of entities but hold different features. For example, a hospital and an insurance provider may both have data about the same individuals, but with different attributes. Training in this case requires secure coordination because the feature spaces differ, and this setup is less common than horizontal federated learning.
These categories are not mutually exclusive. A real system is usually described along both axes, for example as a cross-silo, horizontal federated learning setup.
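To make the horizontal/vertical distinction concrete, here is a small sketch with a toy NumPy table. The table, the party names, and the split points are all made up for illustration. Real vertical setups also need a secure way to align records across parties, which is not shown here.

```python
import numpy as np

# Toy dataset: 6 entities (rows) x 4 features (columns). Values are arbitrary.
data = np.arange(24).reshape(6, 4)

# Horizontal split: same columns (features), different rows (samples).
hospital_a = data[:3, :]   # entities 0-2, all 4 features
hospital_b = data[3:, :]   # entities 3-5, all 4 features

# Vertical split: same rows (entities), different columns (features).
hospital = data[:, :2]     # all 6 entities, features 0-1
insurer = data[:, 2:]      # all 6 entities, features 2-3

print("horizontal:", hospital_a.shape, hospital_b.shape)
print("vertical:  ", hospital.shape, insurer.shape)
```

Stacking the horizontal parts row-wise, or the vertical parts column-wise, gives back the original table. That is exactly the step federated learning avoids doing in one place.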
How Federated Learning works
Federated learning follows a simple, repeated process that is coordinated by a central server and executed by multiple clients that hold their data locally, as shown in the diagram below.

Training in federated learning proceeds iteratively, in federated learning rounds. In each round, the server selects a small random subset of clients, sends them the current model weights, and waits for their updates. Each client trains the model locally using stochastic gradient descent, usually for a few local epochs over its own mini-batches, and returns only the updated weights. At a high level, the process follows these five steps:
1. Initialization
A global model is created on the server, which acts as the coordinator. The model may be randomly initialized or start from a pretrained state.
2. Model distribution
In each round, the server selects a set of clients (by random sampling or a predefined strategy) to take part in training and sends them the current global model weights. These clients can be phones, IoT devices, or individual hospitals.
3. Local training
Each selected client then trains the model locally using its own data. The data never leaves the client, and all computation happens on the device or inside the organization, such as a hospital or a bank.
4. Model update communication
After local training, clients send only the updated model parameters (weights or gradients) back to the server. No raw data is shared at any point.
5. Aggregation
The server aggregates the client updates to produce a new global model. Federated Averaging (FedAvg) is a common aggregation approach, but other strategies are also used. The updated model is then sent back to the clients, and the process repeats until convergence.
Federated learning is an iterative process, and each pass through this loop is called a round. Training a federated model usually requires many rounds, sometimes hundreds, depending on factors such as model size, data distribution, and the problem being solved.
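The five steps above can be sketched end to end in a few lines of NumPy. This is a toy simulation under stated assumptions, not how a production system is built: the clients are just in-memory arrays, the model is a two-parameter linear regression on synthetic data, and the helper name `local_train` and all hyperparameters (10 clients, 4 per round, 20 rounds, learning rate 0.05) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup (assumption): 10 clients, each holding private data
# drawn from the same linear model y = X @ true_w + noise.
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(10):
    n = int(rng.integers(20, 100))
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.1 * rng.normal(size=n)
    clients.append((X, y))


def local_train(w, X, y, epochs=5, lr=0.05, batch_size=10):
    """Step 3: a few epochs of mini-batch SGD on one client's data."""
    w = w.copy()
    for _ in range(epochs):
        order = rng.permutation(len(y))
        for start in range(0, len(y), batch_size):
            idx = order[start:start + batch_size]
            # Gradient of the mean squared error on this mini-batch.
            grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
            w -= lr * grad
    return w


# Step 1: initialize the global model on the server.
w_global = np.zeros(2)

for round_idx in range(20):
    # Step 2: sample a subset of clients and send them the current weights.
    selected = rng.choice(len(clients), size=4, replace=False)
    updates, sizes = [], []
    for k in selected:
        X, y = clients[k]
        # Steps 3-4: train locally, return only weights and sample count.
        updates.append(local_train(w_global, X, y))
        sizes.append(len(y))
    # Step 5: sample-weighted average of the returned weights (FedAvg).
    w_global = np.average(updates, axis=0, weights=sizes)

print("learned:", w_global, "target:", true_w)
```

Note that the server loop only ever touches `updates` and `sizes`. The arrays `X` and `y` are read only inside `local_train`, which stands in for code running on the client.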
Mathematical Intuition behind Federated Averaging
The workflow described above can also be written more formally. The figure below shows the original Federated Averaging (FedAvg) algorithm from Google's seminal paper. The algorithm demonstrated that federated learning can work in practice, and its formulation became the reference point for most federated learning systems today.

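The original Federated Averaging algorithm, showing the server-client training loop and the weighted aggregation of local models.
At the core of Federated Averaging is the aggregation step, where the server updates the global model by taking a weighted average of the locally trained client models. This can be written as:

$$w_{t+1} = \sum_{k \in S_t} \frac{n_k}{m_t} \, w_{t+1}^{k}$$

Here $S_t$ is the set of clients selected in round $t$, $w_{t+1}^{k}$ is the model returned by client $k$ after local training, $n_k$ is the number of samples at client $k$, and $m_t = \sum_{k \in S_t} n_k$ is the total number of samples across the selected clients.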

This equation makes it clear how each client contributes to the global model. Clients with more local data have a larger influence, while those with fewer samples contribute proportionally less. In practice, this simple idea is the reason FedAvg became the default baseline for federated learning.
A simple NumPy implementation
Let's look at a minimal example in which five clients have been selected. For simplicity, we assume that each client has already finished local training and returned its updated model weights along with the number of samples it used. Using these values, the server computes a weighted sum that produces the new global model for the next round. This mirrors the FedAvg equation directly, without introducing training or client-side details.
import numpy as np

# Client models after local training (w_{t+1}^k)
client_weights = [
    np.array([1.0, 0.8, 0.5]),    # client 1
    np.array([1.2, 0.9, 0.6]),    # client 2
    np.array([0.9, 0.7, 0.4]),    # client 3
    np.array([1.1, 0.85, 0.55]),  # client 4
    np.array([1.3, 1.0, 0.65]),   # client 5
]

# Number of samples at each client (n_k)
client_sizes = [50, 150, 100, 300, 4000]

# m_t = total number of samples across selected clients S_t
m_t = sum(client_sizes)  # 50 + 150 + 100 + 300 + 4000 = 4600

# Initialize global model w_{t+1}
w_t_plus_1 = np.zeros_like(client_weights[0])

# FedAvg aggregation:
# w_{t+1} = sum_{k in S_t} (n_k / m_t) * w_{t+1}^k
# (50/4600) * w_1 + (150/4600) * w_2 + ...
for w_k, n_k in zip(client_weights, client_sizes):
    w_t_plus_1 += (n_k / m_t) * w_k

print("Aggregated global model w_{t+1}:", w_t_plus_1)
-------------------------------------------------------------
Aggregated global model w_{t+1}: [1.27173913 0.97826087 0.63478261]
How the aggregation is computed
To put things into perspective, we can expand the aggregation step for just two clients and see how the numbers line up.
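As one possible worked expansion, the sketch below takes clients 1 and 2 from the NumPy example above and aggregates only those two. Restricting the round to these two clients is my own choice for illustration. The weights and sample counts are copied from that example, and the variable names are hypothetical.

```python
import numpy as np

# Clients 1 and 2 from the example above (values copied, not new data).
w_1, n_1 = np.array([1.0, 0.8, 0.5]), 50
w_2, n_2 = np.array([1.2, 0.9, 0.6]), 150

m = n_1 + n_2        # 200 samples in total
share_1 = n_1 / m    # 50 / 200  = 0.25
share_2 = n_2 / m    # 150 / 200 = 0.75

# Element by element:
#   0.25 * 1.0 + 0.75 * 1.2 = 1.15
#   0.25 * 0.8 + 0.75 * 0.9 = 0.875
#   0.25 * 0.5 + 0.75 * 0.6 = 0.575
w_new = share_1 * w_1 + share_2 * w_2
print(w_new)
```

Client 2 holds three times as much data as client 1, so every entry of the result sits three quarters of the way from client 1's value to client 2's.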

Challenges in Federated Learning Environments
Federated learning comes with its own set of challenges. One of the major issues in deployment is that the data across clients is often non-IID (not independent and identically distributed). This means that different clients may see very different data distributions, which can slow down training and make the global model less stable. For instance, hospitals in a federation may serve different populations that follow different patterns.
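To get a feel for what non-IID means, the sketch below splits the same toy label set across five clients in two ways. Everything here is an assumption for illustration: 1,000 synthetic labels over 10 classes, five clients, and a deliberately extreme label-skewed split made by sorting on the label before dividing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labels (assumption): 1,000 samples spread evenly over 10 classes.
labels = np.repeat(np.arange(10), 100)
rng.shuffle(labels)
indices = np.arange(len(labels))

num_clients = 5

# IID split: shuffle sample indices, then deal them out evenly.
iid_parts = np.array_split(rng.permutation(indices), num_clients)

# Non-IID (label-skewed) split: sort by label first, so each client
# ends up holding only a couple of classes.
sorted_idx = indices[np.argsort(labels, kind="stable")]
skewed_parts = np.array_split(sorted_idx, num_clients)

for name, parts in [("IID", iid_parts), ("label-skewed", skewed_parts)]:
    print(name)
    for k, part in enumerate(parts):
        counts = np.bincount(labels[part], minlength=10)
        print(f"  client {k}: {counts}")
```

In the IID split every client sees all ten classes in roughly equal proportion. In the skewed split each client sees only two classes, so its local update pulls the model toward those classes alone.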
Federated systems can involve anything from a handful of organizations to millions of devices, and managing participation, dropouts, and aggregation becomes harder as the system scales.
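One simple way to cope with dropouts is to average over whichever clients actually reported back, renormalizing the weights so they still sum to one. The sketch below illustrates that idea only. The function name `aggregate_responders`, the client ids, and the use of `None` to mark a dropout are all hypothetical, and real frameworks may handle failures differently.

```python
import numpy as np


def aggregate_responders(updates, sizes):
    """Weighted average over the clients that actually reported back.

    `updates` maps a (hypothetical) client id to its weight vector, or to
    None if the client dropped out; `sizes` maps client id to sample count.
    """
    reported = [cid for cid, w in updates.items() if w is not None]
    if not reported:
        raise ValueError("no client reported an update this round")
    # Renormalize over responders only, so the shares still sum to 1.
    total = sum(sizes[cid] for cid in reported)
    return sum((sizes[cid] / total) * updates[cid] for cid in reported)


updates = {
    "a": np.array([1.0, 2.0]),
    "b": None,                  # dropped out mid-round
    "c": np.array([3.0, 4.0]),
}
sizes = {"a": 100, "b": 500, "c": 300}

print(aggregate_responders(updates, sizes))
```

Here client "b" holds the most data but contributes nothing this round, so the result is 0.25 times client "a" plus 0.75 times client "c".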
Although federated learning keeps raw data local, it does not fully solve privacy on its own. Model updates can still leak private information if they are not protected, so additional privacy mechanisms are often needed. Finally, communication can become a bottleneck, since networks can be slow or unreliable and sending frequent updates can be costly.
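A rough back-of-the-envelope calculation shows why communication adds up. The sketch below assumes that every selected client downloads the full model and uploads an update of the same size, with no compression. The helper name and all numbers (10 million float32 parameters, 100 clients per round, 200 rounds) are made up for illustration.

```python
def round_traffic_bytes(num_params, clients_per_round, bytes_per_param=4):
    """Bytes moved in one round: each selected client downloads the
    global model and uploads an update of the same size."""
    model_bytes = num_params * bytes_per_param
    return 2 * model_bytes * clients_per_round


# Made-up example: 10M float32 parameters, 100 clients per round, 200 rounds.
per_round = round_traffic_bytes(10_000_000, 100)
total = per_round * 200
print(f"per round: {per_round / 1e9:.1f} GB, total: {total / 1e12:.1f} TB")
```

Under these assumptions a single round moves 8 GB and the full run moves 1.6 TB, which is why the number of rounds and the size of each update matter so much in practice.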
Conclusion and what's next
In this article, we saw how federated learning works at a high level and walked through a simple NumPy implementation. Rather than writing the core logic by hand, however, we can use frameworks such as Flower, which provide a simple and flexible way to build federated learning systems. In the next part, we will let Flower do the heavy lifting so that we can focus on the model and the data rather than on the mechanics of federated learning. We will also look at federated LLMs, where model size, communication cost, and privacy constraints matter even more.
Note: All images, unless otherwise stated, were created by the author.



