Large Language Models: A Short Introduction | by Carolina Bento | Jan, 2025

There's an acronym you've probably heard nonstop over the past few years: LLM, which stands for Large Language Model.

In this article we'll take a brief look at what LLMs are, why they're such an exciting piece of technology, and why they matter to you and me.

Note: in this article, we'll use Large Language Model, LLM, and model interchangeably.

A Large Language Model, usually called an LLM since the full name is a bit of a tongue twister, is a mathematical model that generates text, for example by filling in the next word in a sentence [1].

For instance, when you feed it the sentence The quick brown fox jumps over the lazy ____, it doesn't know for certain that the next word is dog. What the model produces instead is a list of possible next words, each with its probability of coming next in a sentence that starts with those exact words.

Example of predicting the next word in a sentence. Image by author.
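To make the idea of "a list of possible next words with probabilities" concrete, here is a minimal sketch in Python. The candidate words and their raw scores are made up for illustration, and a real model scores tens of thousands of candidates. The softmax step that turns scores into probabilities is a common choice that I'm assuming here, not something specific to any one product.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(s - peak) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

# Hypothetical raw scores for the word that follows
# "The quick brown fox jumps over the lazy ____".
candidate_scores = {"dog": 4.0, "cat": 2.5, "fox": 1.0, "river": 0.2}

next_word_probs = softmax(candidate_scores)

# Rank the candidates from most to least likely.
ranked = sorted(next_word_probs.items(), key=lambda kv: kv[1], reverse=True)

for word, prob in ranked:
    print(f"{word:>6}: {prob:.3f}")
```

The model never commits to dog as the one right answer. It only ranks dog above the alternatives, and the application decides how to pick from that ranking.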

The reason LLMs are so good at predicting the next word in a sentence is that they are trained on an incredibly large amount of text, typically scraped from the Internet. So if a model happens to be ingesting the text of this article, Hi 👋

On the other hand, suppose you're building an LLM for a specific domain, for example a chatbot that converses with you as if it were a character in one of Shakespeare's plays. The Internet will certainly contain plenty of snippets, or even his complete works, but it will also contain a ton of other text that isn't relevant to the task at hand. In this case, you would feed the LLM behind the chatbot only Shakespeare context, i.e., all of his plays and sonnets.

Although LLMs are trained on a gigantic amount of data, that's not what the Large in Large Language Models stands for. Besides the size of the training data, the other large quantity in these models is the number of parameters they have, each one of which can be adjusted, i.e., tuned.

One of the simplest statistical models is Simple Linear Regression, which has only two parameters: the slope and the intercept. Even with just two parameters, the model output can take a few different shapes.

Different shapes of a linear regression. Image by author.
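As a quick illustration of how much two parameters can already change a model's output, the sketch below evaluates the same linear model with three slope and intercept pairs. The pairs are illustrative values that I picked, not parameters fitted to any data.

```python
def predict(x, slope, intercept):
    """Simple linear regression: y = slope * x + intercept."""
    return slope * x + intercept

xs = [0, 1, 2, 3]

# Illustrative (slope, intercept) pairs, not fitted to any data.
settings = {
    "rising": (2.0, 1.0),
    "flat": (0.0, 3.0),
    "falling": (-1.5, 6.0),
}

# Evaluate the same model under each parameter setting.
lines = {
    name: [predict(x, slope, intercept) for x in xs]
    for name, (slope, intercept) in settings.items()
}

for name, ys in lines.items():
    print(name, ys)
```

An LLM follows the same principle, only with billions of adjustable numbers instead of two.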

As a comparison, when GPT-3 was released in 2020 it had 175B parameters. Yes, billion! [3] LLaMA, Meta's open source LLM, came in several sizes ranging from 7B to 65B parameters when it was released in 2023.

These billions of parameters all start with random values at the beginning of the training process, and it's during the Backpropagation part of the training phase that they are continually tweaked and adjusted.

As with any other Machine Learning model, during the training phase the output of the model is compared with the expected value for that output in order to calculate the error. While there's still room for improvement, Backpropagation ensures that the model parameters are adjusted so that the model predicts with a little less error the next time.
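The compare, measure, and adjust loop can be sketched on the two-parameter linear model. This is a simplified illustration with a few assumptions of mine: the error is mean squared error, the update rule is plain gradient descent, and the gradients are written out by hand. In a deep network, Backpropagation is the procedure that computes these gradients layer by layer. The training data, learning rate, and step count are all hypothetical.

```python
import random

def train(xs, ys, learning_rate=0.05, steps=2000, seed=0):
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    rng = random.Random(seed)
    # Parameters start at random values, as described above.
    slope = rng.uniform(-1.0, 1.0)
    intercept = rng.uniform(-1.0, 1.0)
    n = len(xs)
    for _ in range(steps):
        # Compare the model output with the expected output.
        errors = [(slope * x + intercept) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error for each parameter.
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        # Nudge each parameter in the direction that lowers the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept

# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = train(xs, ys)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

Each pass through the loop reduces the error a little, which is the same principle applied to billions of parameters when an LLM is trained.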

But this is just what's called pre-training, where the model becomes proficient at predicting the next word in a sentence.

For the model to interact really well with a human, to the point that you, the human, can ask the chatbot a question and get a response that seems structurally accurate, the underlying LLM has to go through a step called Reinforcement Learning with Human Feedback. This is literally the human in the loop that is so often talked about in the context of Machine Learning models.

In this phase, humans tag predictions that are not as good. Taking in that feedback, the model parameters are updated and the model is trained again, as many times as needed, until it reaches the desired level of prediction quality.

It's clear by now that these models are extremely complex and need to perform millions, if not billions, of computations. This high-intensity compute required novel architectures: Transformers at the model level, and GPUs for the computation itself.

A GPU is a class of graphics processor used when you need to perform an incredibly large number of computations in a short period of time, for instance while smoothly rendering characters in a video game. Compared to the traditional CPUs found in your laptop or tower PC, GPUs can effortlessly run many computations in parallel.

The breakthrough for LLMs came when researchers realized that GPUs can also be applied to non-graphical problems. Both Machine Learning and Computer Graphics rely on linear algebra, running operations on matrices, so both benefit from the ability to execute many computations in parallel.

The Transformer is a new type of architecture developed by Google, which makes it possible to parallelize each operation performed during model training. For instance, while predicting the next word in a sentence, a model that uses a Transformer architecture doesn't need to read the sentence from start to end. It processes the entire text at the same time, in parallel. It associates each word it processes with a long array of numbers that gives meaning to that word. Thinking about Linear Algebra again for a second, instead of processing and transforming one data point at a time, the combination of Transformers and GPUs can process tons of points at the same time by leveraging matrices.
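Here is a small sketch of what "leveraging matrices" means. Stacking the data points as the rows of one matrix lets a single matrix multiplication transform all of them. The points and the weight matrix are made-up values, and this pure Python version still runs sequentially. It only shows the formulation that a GPU can then spread across many cores.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]

# Three data points with two features each (illustrative values).
points = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# A hypothetical 2 x 2 weight matrix that transforms each point.
weights = [[1.0, 0.0], [0.5, 2.0]]

# One data point at a time: three separate 1 x 2 multiplications.
one_at_a_time = [matmul([p], weights)[0] for p in points]

# All points at once: a single 3 x 2 multiplication.
all_at_once = matmul(points, weights)

print(all_at_once)
```

Both routes give the same numbers. The difference is that the second one expresses the whole batch as a single operation, and every row of the result can be computed independently of the others.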

In addition to parallelized computation, what distinguishes Transformers is a unique operation called Attention. Put very simply, Attention makes it possible to look at all the context around a word, even when that word occurs multiple times in different sentences, like:

At the end of the show, the singer took a bow multiple times.

Jack wanted to go to the store to buy a new bow for target practice.

If we focus on the word bow, you can see how different the context and the actual meaning of the word are in each sentence.

Attention allows the model to refine the meaning each word encodes based on the context around it.
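The sketch below shows a stripped-down version of this idea, assuming scaled dot-product attention in which the same vectors act as queries, keys, and values. Real Transformers apply learned projections before this step, which I leave out to keep the example short. The two-dimensional word vectors are hypothetical: the first dimension loosely stands for "performance" and the second for "archery".

```python
import math

def softmax(values):
    """Turn a list of scores into weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product attention with queries = keys = values = vectors."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Score the query against every word in the context.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The new vector is a weighted mix of all vectors in the context.
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        )
    return outputs

# Hypothetical embeddings: [performance-ness, archery-ness].
embeddings = {"singer": [1.0, 0.0], "archery": [0.0, 1.0], "bow": [0.5, 0.5]}

# The same starting vector for "bow", placed in two different contexts.
bow_on_stage = self_attention([embeddings["singer"], embeddings["bow"]])[1]
bow_at_range = self_attention([embeddings["archery"], embeddings["bow"]])[1]

print(bow_on_stage, bow_at_range)
```

The word bow starts from the same ambiguous vector in both cases, but after attention it leans toward the performance dimension next to singer and toward the archery dimension next to archery.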

This, plus some additional steps such as training a Feedforward Neural Network, all repeated multiple times, lets the model gradually refine its capacity to encode the right information. All these steps are intended to make the model more accurate, so that it doesn't mix up bow (the motion) with bow (the object related to archery) when it runs a prediction task.

A basic flow diagram depicting various stages of LLMs from pre-training to prompting/utilization. Prompting LLMs to generate responses is possible at different training stages like pre-training, instruction-tuning, or alignment tuning. "RL" stands for reinforcement learning, "RM" represents reward-modeling, and "RLHF" represents reinforcement learning with human feedback. Image and caption taken from paper referenced in [2]

The development of Transformers and GPUs allowed LLMs to explode in usage and application compared with earlier language models, which needed to read one word at a time. Knowing that a model gets better the more quality data it learns from, you can see how processing one word at a time was a huge bottleneck.

With the capacity described above, LLMs can process enormous numbers of text examples and then predict the next word in a sentence with high accuracy. Combined with other powerful Artificial Intelligence frameworks, this has made many natural language and information retrieval tasks much easier to implement and productize.

In essence, Large Language Models (LLMs) have emerged as cutting-edge artificial intelligence systems that can process and generate text with coherent communication and generalize to multiple tasks [2].

Think about tasks like translating from English to Spanish, summarizing a set of documents, identifying certain passages in documents, or having a chatbot answer your questions about a particular topic.

These tasks were possible before, but the effort required to build a model was far higher, and the rate of improvement of those models was much slower because of technology bottlenecks. LLMs came in and supercharged all of these tasks and applications.

You've probably interacted, or seen someone interact, directly with products that use LLMs at their core.

These products are much more than a simple LLM that accurately predicts the next word in a sentence. They leverage LLMs together with other Machine Learning techniques and frameworks to understand what you're asking, search through all the contextual information they've seen so far, and present you with a human-like and, most of the time, coherent answer. Or at least some of them provide guidance about what to look into next.

There are tons of Artificial Intelligence (AI) products that leverage LLMs, from Facebook's Meta AI, Google's Gemini, and OpenAI's ChatGPT, which borrows its name from the Generative Pre-trained Transformer technology under the hood, to Microsoft's Copilot, among many, many others, covering a wide range of tasks they can assist you with.

For instance, a few weeks ago, I was wondering how many studio albums Incubus had released. Six months ago, I'd probably have Googled it or gone straight to Wikipedia. Nowadays, I tend to ask Gemini.

Example of a question I asked Gemini 🤣 Image by author.

This is only a simple example. There are many other types of questions or prompts you can give these Artificial Intelligence products, like asking for a summary of a particular text or document or, if you're like me and you're traveling to Melbourne, asking for recommendations about what to do there.

Example of a question I asked Gemini 🤣 Image by author.

It cut straight to the point and gave me a variety of pointers on what to do. Then I was off to the races, able to dig a bit further into the specific places that seemed most interesting to me.

You can see how this saved me a bunch of time that I would probably have spent going back and forth between Yelp and TripAdvisor reviews, YouTube videos, or blog posts about iconic and recommended places in Melbourne.

LLMs are, without a doubt, a nascent area of research that has been evolving at a lightning-fast pace, as you can see in the timeline below.

Chronological display of LLM releases: blue cards represent 'pre-trained' models, while orange cards correspond to 'instruction-tuned' models. Models on the upper half signify open-source availability, whereas those on the bottom are closed-source. The chart illustrates the increasing trend towards instruction-tuned and open-source models, highlighting the evolving landscape and trends in natural language processing research. Image and caption taken from paper referenced in [2]

We're still in the very early days of productization, or product application. More and more companies are applying LLMs to their domain areas in order to streamline tasks that would otherwise take them several years, and an incredible amount of funding, to research, develop, and bring to market.

When applied in ethical and consumer-conscious ways, LLMs and products that have LLMs at their core provide a massive opportunity for everyone. For researchers, it's a cutting-edge field with a wealth of both theoretical and practical problems to untangle.

For example, in Genomics, gLMs or Genomic Language Models, i.e., Large Language Models trained on DNA sequences, are used to accelerate our general understanding of genomes and of how DNA works and interacts with other functions [4]. These are big questions for which scientists don't have definitive answers, but LLMs are proving to be a tool that can help them make progress at a much bigger scale and iterate on their findings much faster. To make steady progress in science, fast feedback loops are crucial.

For companies, there's a monumental shift and an opportunity to do more for customers and address more of their problems and pain points, making it easier for customers to see the value in products. Be it for effectiveness, ease of use, cost, or all of the above.

For consumers, we get to experience products and tools that assist us with day-to-day tasks, help us do our jobs a little better, and give us faster access to knowledge or pointers to where we can search and dig deeper for that information.

To me, the most exciting part is the speed at which these products evolve and outdate themselves. I'm personally curious to see what these products will look like in the next 5 years and how they can become more accurate and reliable.
