The Machine Learning "Advent Calendar" Day 24: Transformers for Text in Excel

This is Day 24 of my Machine Learning Advent Calendar.
Before closing this series, I would like to sincerely thank everyone who followed it, shared feedback, and supported it, in particular the Towards Data Science team.
Ending this calendar with Transformers is not a coincidence. The Transformer is not just a fancy name. It is the backbone of modern Large Language Models.
There is a lot to say about RNNs, LSTMs, and GRUs. They played a key historical role in sequence modeling. But today, LLMs are overwhelmingly based on Transformers.
The name Transformer itself marks a rupture. From a naming perspective, the authors could have chosen something like Attention Neural Networks, in line with Recurrent Neural Networks or Convolutional Neural Networks. As a Cartesian mind, I would have appreciated a more consistent naming structure. But naming aside, the conceptual shift introduced by Transformers fully justifies the distinction.
Transformers can be used in different ways. Encoder architectures are commonly used for classification. Decoder architectures are used for next-token prediction, and therefore for text generation.
In this article, we will focus on one core idea only: how the attention matrix transforms input embeddings into something more meaningful.
In the previous article, we introduced 1D Convolutional Neural Networks for text. We saw that a CNN scans a sentence using small windows and reacts when it recognizes local patterns. This approach is already very powerful, but it has a clear limitation: a CNN only looks locally.
Today, we move one step further.
The Transformer answers a fundamentally different question.
What if every word could look at all the other words at once?
1. The same word in two different contexts
To understand why attention is needed, we will start with a simple idea.
We will use two different input sentences, both containing the word "mouse", but used in different contexts.
In the first input, "mouse" appears in a sentence with "cat". In the second input, "mouse" appears in a sentence with "keyboard".
At the input level, we deliberately use the same embedding for the word "mouse" in both cases. This is important. At this stage, the model does not know which meaning is intended.
The embedding of "mouse" contains both:
- a strong animal component
- a strong tech component
This ambiguity is intentional. Without context, "mouse" could refer to an animal or to a computer device.
All the other words provide clearer signals. "cat" is strongly animal. "keyboard" is strongly tech. Words like "and" or "are" mainly carry grammatical information. Words like "friends" and "useful" are only weakly informative on their own.
At this point, nothing in the input embeddings allows the model to decide which meaning of "mouse" is correct.
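To make this setup concrete, here is a minimal sketch in Python with NumPy. The three dimensions (animal, tech, grammar), the numeric values, the word order of the two sentences, and the names `EMBEDDINGS` and `embed` are all assumptions chosen for illustration; they are not the values used in the spreadsheet.

```python
import numpy as np

# Hypothetical 3-dimensional embeddings: [animal, tech, grammar].
# Illustrative values only, not taken from the article's spreadsheet.
EMBEDDINGS = {
    "cat":      [1.0, 0.0, 0.0],
    "keyboard": [0.0, 1.0, 0.0],
    "mouse":    [1.0, 1.0, 0.0],  # deliberately ambiguous: animal AND tech
    "and":      [0.0, 0.0, 1.0],  # grammatical information only
    "are":      [0.0, 0.0, 1.0],  # grammatical information only
    "friends":  [0.2, 0.0, 0.0],  # weak signal on its own
    "useful":   [0.0, 0.2, 0.0],  # weak signal on its own
}

def embed(sentence):
    """Stack one embedding row per word (rows = words, columns = dimensions)."""
    return np.array([EMBEDDINGS[w] for w in sentence.split()])

X_cat = embed("cat and mouse are friends")
X_kbd = embed("keyboard and mouse are useful")

# The row for "mouse" (index 2) is the same in both inputs.
print(np.array_equal(X_cat[2], X_kbd[2]))
```

Since the "mouse" row is identical in both matrices, any difference in its final representation has to come from the other rows.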
In the next section, we will see how the attention matrix performs this transformation, step by step.
2. Self-attention: how context is injected into embeddings
2.1 Self-attention, not just attention
We first clarify what kind of attention we are using here. This section focuses on self-attention.
Self-attention means that each word looks at the other words of the same input sequence.
In this simplified example, we make an additional pedagogical choice. We assume that Queries and Keys are directly equal to the input embeddings. In other words, there are no learned weight matrices for Q and K in this section.
This is a deliberate simplification. It lets us focus entirely on the attention mechanism, without introducing extra parameters. Similarity between words is computed directly from their embeddings.
Conceptually, this means:
Q = Input
K = Input
Only the Value vectors are used later to propagate information to the output.
In real Transformer models, Q, K, and V are all obtained through learned linear projections. Those projections add flexibility, but they do not change the logic of attention itself. The simplified version shown here captures the core idea.
Here is the full picture, which we will break down step by step.
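In compact notation, and under the simplification described above, the whole computation fits in one line. The symbols are shorthand introduced here for illustration: X is the input embedding matrix (one row per word), d is the number of embedding dimensions, A is the attention matrix, and the softmax is applied row by row. Queries, Keys, and Values are all assumed equal to X.

```latex
\[
A = \operatorname{softmax}\!\left(\frac{X X^{\top}}{\sqrt{d}}\right),
\qquad
\text{Output} = A\,X
\]
```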

2.2 From input embeddings to raw attention scores
We start from the input embedding matrix, where each row corresponds to a word and each column corresponds to a semantic dimension.
The first operation is to compare every word with every other word. This is done by computing dot products between Queries and Keys.
Because Queries and Keys are equal to the input embeddings in this example, this step reduces to computing dot products between input vectors.
All dot products are computed at once using a matrix multiplication of the input by its own transpose:
Scores = Input × Inputᵀ
Each cell of this matrix answers a simple question: how similar are these two words, given their embeddings?
At this stage, the values are raw scores. They are not probabilities, and they do not yet have a direct interpretation as weights.
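The raw score matrix can be sketched in a few lines of NumPy. The embedding values below (dimensions: animal, tech, grammar) and the word order are made-up assumptions for illustration, not the values from the spreadsheet.

```python
import numpy as np

words = ["cat", "and", "mouse", "are", "friends"]

# Hypothetical embeddings, one row per word: [animal, tech, grammar].
X = np.array([
    [1.0, 0.0, 0.0],  # cat
    [0.0, 0.0, 1.0],  # and
    [1.0, 1.0, 0.0],  # mouse
    [0.0, 0.0, 1.0],  # are
    [0.2, 0.0, 0.0],  # friends
])

# With Q = K = X, all pairwise dot products come from one matrix product.
scores = X @ X.T  # shape (5, 5): one row and one column per word

mouse = words.index("mouse")
for word, score in zip(words, scores[mouse]):
    print(f"mouse . {word}: {score:.1f}")
```

Because Queries and Keys are the same matrix here, the score matrix is symmetric: the score of "mouse" against "cat" equals the score of "cat" against "mouse".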

2.3 Scaling and normalization
Raw dot products can grow large as the embedding dimension increases. To keep values in a stable range, the scores are divided by the square root of the embedding dimension d.
ScaledScores = Scores / √d
This scaling step is not conceptually deep, but it matters in practice. It prevents the next step, the softmax, from becoming too sharp.

Once the scores are scaled, a softmax is applied row by row. This converts the raw scores into positive values that sum to one in each row.
The result is the attention matrix.
Each row of this matrix describes how much attention a given word pays to every word in the sentence, itself included.
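Scaling and the row-wise softmax can be sketched as follows. The embedding values are again illustrative assumptions, and `softmax_rows` is a helper name introduced here; subtracting the row maximum before exponentiating is a common numerical-stability trick that does not change the result.

```python
import numpy as np

def softmax_rows(z):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = z - z.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical embeddings for "cat and mouse are friends": [animal, tech, grammar].
X = np.array([
    [1.0, 0.0, 0.0],  # cat
    [0.0, 0.0, 1.0],  # and
    [1.0, 1.0, 0.0],  # mouse
    [0.0, 0.0, 1.0],  # are
    [0.2, 0.0, 0.0],  # friends
])

d = X.shape[1]                    # embedding dimension
scaled = (X @ X.T) / np.sqrt(d)   # ScaledScores = Scores / sqrt(d)
attention = softmax_rows(scaled)  # each row becomes a set of weights

print(attention.sum(axis=1))      # every row sums to 1 (up to rounding)
```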

2.4 Interpreting the attention matrix
The attention matrix is the central object of self-attention.
For a given word, its row in the attention matrix answers the question: when updating this word, which other words matter, and how much?
For example, the row corresponding to "mouse" assigns higher weights to words that are semantically related in the current context. In the sentence with "cat" and "friends", "mouse" attends more to animal-related words. In the sentence with "keyboard" and "useful", it attends more to technical words.
The mechanism is identical in both cases. Only the surrounding words change the outcome.
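One way to read a row of the attention matrix is to ask which other word receives the largest weight. The sketch below does this for "mouse" in both sentences. The embedding values and the helper names `attention_matrix` and `top_other_word` are assumptions made for illustration.

```python
import numpy as np

# Hypothetical embeddings: [animal, tech, grammar]. Illustrative values only.
EMB = {
    "cat": [1.0, 0.0, 0.0], "keyboard": [0.0, 1.0, 0.0],
    "mouse": [1.0, 1.0, 0.0],
    "and": [0.0, 0.0, 1.0], "are": [0.0, 0.0, 1.0],
    "friends": [0.2, 0.0, 0.0], "useful": [0.0, 0.2, 0.0],
}

def attention_matrix(words):
    """Simplified self-attention weights with Q = K = input embeddings."""
    X = np.array([EMB[w] for w in words])
    scaled = (X @ X.T) / np.sqrt(X.shape[1])
    e = np.exp(scaled - scaled.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def top_other_word(words, target):
    """Word, other than the target itself, with the largest weight in the target's row."""
    i = words.index(target)
    row = attention_matrix(words)[i].copy()
    row[i] = -np.inf  # ignore the weight the word gives to itself
    return words[int(np.argmax(row))]

s_cat = "cat and mouse are friends".split()
s_kbd = "keyboard and mouse are useful".split()
print(top_other_word(s_cat, "mouse"))  # cat
print(top_other_word(s_kbd, "mouse"))  # keyboard
```

With these particular values, "mouse" also gives a large weight to itself, which is why the helper excludes the diagonal entry before ranking.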
2.5 From attention weights to output embeddings
The attention matrix itself is not the final result. It is a set of weights.
To produce the output embeddings, we combine these weights with the Value vectors.
Output = Attention × V
In this simplified example, the Value vectors are taken directly from the input embeddings. Each output word vector is therefore a weighted average of the input vectors, with weights given by the corresponding row of the attention matrix.
For a word like "mouse", this means that its final representation becomes a mixture of:
- its own embedding
- the embeddings of the words it attends to most
This is the precise moment where context is injected into the representation.

At the end of self-attention, the embeddings are no longer ambiguous.
The word "mouse" no longer has the same representation in both sentences. Its output vector reflects its context. In one case, it behaves like an animal. In the other, it behaves like a technical object.
Nothing in the embedding table has changed. What changed is how information was combined across words.
This is the core idea of self-attention, and the foundation on which Transformer models are built.
If we now compare the two examples, "cat" and "mouse" on the left and "keyboard" and "mouse" on the right, the effect of self-attention becomes explicit.
In both cases, the input embedding of "mouse" is identical. Yet the final representation differs. In the sentence with "cat", the output embedding of "mouse" is dominated by the animal dimension. In the sentence with "keyboard", the technical dimension becomes more prominent. The difference comes entirely from how attention redistributed the weights across words before mixing the values.
This comparison highlights the role of self-attention: it does not change words in isolation, but reshapes their representations by taking the full context into account.
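The full simplified pipeline, with Q = K = V = input embeddings, can be sketched end to end to reproduce this comparison. The embedding values (dimensions: animal, tech, grammar), the sentences' word order, and the function name `self_attention` are assumptions for illustration only.

```python
import numpy as np

# Hypothetical embeddings: [animal, tech, grammar]. Illustrative values only.
EMB = {
    "cat": [1.0, 0.0, 0.0], "keyboard": [0.0, 1.0, 0.0],
    "mouse": [1.0, 1.0, 0.0],
    "and": [0.0, 0.0, 1.0], "are": [0.0, 0.0, 1.0],
    "friends": [0.2, 0.0, 0.0], "useful": [0.0, 0.2, 0.0],
}

def self_attention(words):
    """Simplified self-attention: Q, K and V are all the input embeddings."""
    X = np.array([EMB[w] for w in words])
    scaled = (X @ X.T) / np.sqrt(X.shape[1])               # scaled raw scores
    e = np.exp(scaled - scaled.max(axis=1, keepdims=True))
    attention = e / e.sum(axis=1, keepdims=True)           # row-wise softmax
    return attention @ X                                   # Output = Attention x V

s_cat = "cat and mouse are friends".split()
s_kbd = "keyboard and mouse are useful".split()

mouse_cat = self_attention(s_cat)[s_cat.index("mouse")]
mouse_kbd = self_attention(s_kbd)[s_kbd.index("mouse")]

print("mouse next to cat:     ", mouse_cat.round(2))
print("mouse next to keyboard:", mouse_kbd.round(2))
```

With these values, the animal component of the output vector is the largest in the first sentence and the tech component is the largest in the second, even though both runs start from the same input row for "mouse".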

3. Learning how to mix information

3.1 Introducing learned weights for Q, K, and V
Until now, we have focused on the mechanics of self-attention itself. We now introduce an important element: learned weights.
In a real Transformer, Queries, Keys, and Values are not taken directly from the input embeddings. Instead, they are produced by learned linear transformations.
For each word embedding, the model computes:
Q = Input × W_Q
K = Input × W_K
V = Input × W_V
These weight matrices are learned during training.
At this stage, we keep the same dimensionality everywhere. The input embeddings, Q, K, V, and the output embeddings all have the same number of dimensions. This makes the role of attention easier to understand: it modifies representations without changing the space they live in.
Conceptually, these weights allow the model to decide:
- which aspects of a word matter when it is compared with others (Q and K)
- which aspects of a word should be transmitted to others (V)
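The projections can be sketched as three extra matrix multiplications placed before the attention computation. In the sketch below, W_Q, W_K, and W_V are random matrices standing in for weights that a real model would learn during training, and the embedding values are illustrative assumptions, so the resulting numbers carry no meaning. Only the shapes and the flow of the computation are the point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embeddings for "cat and mouse are friends": [animal, tech, grammar].
X = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.2, 0.0, 0.0],
])
d = X.shape[1]

# Random stand-ins for learned weights (assumption: same dimensionality d everywhere).
W_Q = rng.normal(size=(d, d))
W_K = rng.normal(size=(d, d))
W_V = rng.normal(size=(d, d))

Q, K, V = X @ W_Q, X @ W_K, X @ W_V

scaled = (Q @ K.T) / np.sqrt(d)
e = np.exp(scaled - scaled.max(axis=1, keepdims=True))
attention = e / e.sum(axis=1, keepdims=True)  # row-wise softmax
output = attention @ V

print(Q.shape, K.shape, V.shape, output.shape)  # all (5, 3)
```

Once Q and K come from different projections, the score matrix is in general no longer symmetric: how much "mouse" attends to "cat" can differ from how much "cat" attends to "mouse".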

3.2 What the model actually learns
The attention mechanism itself is fixed. Dot products, scaling, softmax, and matrix multiplications always work the same way. What the model actually learns are the projections.
By adjusting the Q and K weights, the model learns how to measure relationships between words for a given task. By adjusting the V weights, it learns what information should be propagated when attention is high. The structure defines how information flows, while the weights define what information flows.
Because the attention matrix depends on Q and K, it is partially interpretable. We can inspect which words attend to which others and observe patterns that often align with syntax or semantics.
This becomes clear when comparing the same word in two different contexts. In both examples, the word "mouse" starts with exactly the same input embedding, containing both an animal and a tech component. On its own, it is ambiguous.
What changes is not the word, but the context it attends to. In the sentence with "cat" and "friends", attention emphasizes animal-related words. In the sentence with "keyboard" and "useful", attention shifts to technical words. The mechanism and the weights are identical in both cases, yet the output embeddings differ. The difference comes entirely from how the learned projections interact with the surrounding context.
This is precisely why the attention matrix is interpretable: it reveals which relationships the model has learned to treat as meaningful for the task.

3.3 Changing the dimensionality on purpose
Nothing, however, forces Q, K, and V to have the same dimensionality as the input.
The Value projection, in particular, can map embeddings into a space of a different size. When this happens, the output embeddings inherit the dimensionality of the Value vectors.
This is not a theoretical curiosity. It is exactly what happens in real models, especially in multi-head attention. Each head operates in its own subspace, often with a smaller dimension, and the results are later concatenated into a larger representation.
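A minimal sketch of this idea follows: each head projects the input into a smaller space, runs attention there, and the head outputs are concatenated. The function name `multi_head_self_attention`, the sizes, and the random weights (standing in for learned ones) are assumptions for illustration. Real implementations usually apply one more learned output projection after the concatenation, which is omitted here.

```python
import numpy as np

def softmax_rows(z):
    """Row-wise softmax with max subtraction for numerical stability."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def multi_head_self_attention(X, n_heads, d_head, rng):
    """Run n_heads independent attention heads and concatenate their outputs."""
    d_model = X.shape[1]
    head_outputs = []
    for _ in range(n_heads):
        # Random stand-ins for learned projections into a d_head-sized subspace.
        W_Q = rng.normal(size=(d_model, d_head))
        W_K = rng.normal(size=(d_model, d_head))
        W_V = rng.normal(size=(d_model, d_head))
        Q, K, V = X @ W_Q, X @ W_K, X @ W_V
        attention = softmax_rows((Q @ K.T) / np.sqrt(d_head))
        head_outputs.append(attention @ V)       # shape (n_words, d_head)
    return np.concatenate(head_outputs, axis=1)  # shape (n_words, n_heads * d_head)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))  # 5 words, 4 input dimensions (arbitrary values)

out = multi_head_self_attention(X, n_heads=2, d_head=2, rng=rng)
print(out.shape)  # (5, 4)
```

The output width is set by the Value projections: two heads of size 2 give 4 columns, while two heads of size 3 would give 6, regardless of the 4 input dimensions.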
So attention can do two things:
- mix information across words
- reshape the space in which this information lives
This explains why Transformers work so well.
They do not rely on fixed features. They learn:
- how to compare words
- how to route information
- how to project meaning into different spaces

The attention matrix controls where information flows.
The learned projections control what information flows and how it is represented.
Together, they form the core mechanism behind modern language models.
Conclusion
This Advent Calendar was built around a simple idea: understanding machine learning models by looking at how they actually transform data.
Transformers are a fitting way to close this journey. They do not rely on fixed rules or local patterns, but on learned relationships between all elements of a sequence. Through attention, they turn static embeddings into contextual representations, which is the foundation of modern language models.
Thank you again to everyone who followed this series, shared feedback, and supported it, especially the Towards Data Science team.
Merry Christmas 🎄



