Google AI Releases DiffusionGemma, a 26B MoE Open Model That Uses Text Diffusion for Up to 4x Faster Generation

The Google AI team, including researchers from Google DeepMind, has just released DiffusionGemma, an experimental open model for text generation. It uses text diffusion instead of standard autoregressive decoding. The model ships under the permissive Apache 2.0 license. Google is positioning it for developers and researchers exploring speed-critical, interactive local workflows. Examples include inline editing, rapid iteration, and generating non-linear text structures.
Most language models in use today are autoregressive. They generate one token at a time, from left to right, and each new token depends on the tokens before it. DiffusionGemma works differently. It generates entire blocks of text at once, in parallel. On dedicated GPUs, this delivers up to 4x faster generation.
What Is DiffusionGemma
DiffusionGemma is a 26B Mixture of Experts (MoE) model. It activates only 3.8B parameters during inference. It is built on the Gemma 4 backbone, specifically the 26B-A4B architecture, with a diffusion head that Google attached to that base.
The model is multimodal. It processes interleaved text, image, and video inputs and generates text output from them. The context window is 256K tokens, and the model supports 140+ languages.
Quantized, the model fits within 18GB of VRAM. That puts it within the limits of high-end consumer GPUs. On a single NVIDIA H100, it reaches 1000+ tokens per second. On an NVIDIA GeForce RTX 5090, it reaches 700+ tokens per second.
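As a rough sanity check on the 18GB figure, the arithmetic below estimates the average number of bits available per parameter. It assumes, purely for illustration, that the whole 18GB budget holds weights and that a gigabyte is 10^9 bytes. The article does not state the quantization format, and a real deployment also needs room for the KV cache and activations.

```python
def bits_per_param(total_params: float, vram_bytes: float) -> float:
    """Average bits available per parameter if all VRAM held weights."""
    return vram_bytes * 8 / total_params

TOTAL_PARAMS = 26e9  # 26B total parameters (from the article)
VRAM_BYTES = 18e9    # 18 GB in decimal gigabytes (assumption)

budget = bits_per_param(TOTAL_PARAMS, VRAM_BYTES)
bf16_gb = TOTAL_PARAMS * 2 / 1e9  # unquantized 16-bit weights, for comparison

print(f"{budget:.2f} bits per parameter at 18 GB")
print(f"{bf16_gb:.0f} GB for 16-bit weights")
```

Under these assumptions the budget works out to between 5 and 6 bits per parameter, compared with 52 GB for 16-bit weights, which is why the 18GB figure only applies to a quantized build.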
Google is direct about the tradeoff. DiffusionGemma prioritizes speed and parallel structured generation. Its overall output quality is lower than that of standard Gemma 4. For high-quality production work, Google still recommends autoregressive Gemma 4.
How Text Diffusion Works
Text diffusion borrows its core idea from AI image generators. Those models start with visual static and refine it iteratively. DiffusionGemma applies the same pattern to text generation.
The process runs in three conceptual stages. First, the model starts with a canvas of random placeholder tokens. Second, it makes multiple passes over that canvas, locking in high-confidence tokens and using them as context. Third, the text converges into the final output.
Google calls the core mechanism Uniform State Diffusion. The most confident tokens help resolve neighboring positions during denoising. The full sequence then converges over several passes.
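The three stages can be sketched as a toy loop. This is a minimal illustration, not Google's sampler: `toy_forward` is a hypothetical stand-in for a model pass that already knows the target text and assigns random confidences, and `per_pass` is an assumed parameter for how many tokens get locked on each pass.

```python
import random

def toy_forward(canvas, locked, target, rng):
    """Hypothetical stand-in for one model pass: propose a (token, confidence)
    pair for every position that is not locked yet."""
    return {i: (target[i], rng.random())
            for i in range(len(canvas)) if i not in locked}

def denoise(target, vocab, per_pass=3, seed=0):
    rng = random.Random(seed)
    # Stage 1: start from a canvas of random placeholder tokens.
    canvas = [rng.choice(vocab) for _ in target]
    locked = set()
    passes = 0
    # Stage 2: repeated passes, locking the most confident proposals.
    while len(locked) < len(canvas):
        proposals = toy_forward(canvas, locked, target, rng)
        ranked = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)
        for i in ranked[:per_pass]:
            canvas[i] = proposals[i][0]
            locked.add(i)
        passes += 1
    # Stage 3: every position is locked, so the canvas is the final output.
    return canvas, passes

target = "the quick brown fox jumps over the lazy dog today".split()
result, passes = denoise(target, vocab=sorted(set(target)))
print(passes, " ".join(result))
```

With 10 tokens and 3 locked per pass, the loop finishes in 4 passes instead of the 10 that token-by-token generation would need. Positions are filled in confidence order, not left to right.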
In practice, the model denoises a 256-token canvas in parallel. It finalizes roughly 15-20 tokens per forward pass. That parallelism is what drives the throughput gains.
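The figures above translate into a simple pass count. The sketch below uses only the numbers quoted in the article (a 256-token canvas, 15 to 20 tokens finalized per pass); the helper name `forward_passes` is hypothetical.

```python
import math

def forward_passes(num_tokens: int, tokens_per_pass: int) -> int:
    """Forward passes needed to finalize num_tokens tokens."""
    return math.ceil(num_tokens / tokens_per_pass)

CANVAS_TOKENS = 256  # canvas size quoted in the article

autoregressive = forward_passes(CANVAS_TOKENS, 1)   # one token per pass
diffusion_slow = forward_passes(CANVAS_TOKENS, 15)  # low end of the range
diffusion_fast = forward_passes(CANVAS_TOKENS, 20)  # high end of the range

print(autoregressive, diffusion_slow, diffusion_fast)  # 256 18 13
```

Each diffusion pass processes the whole canvas and therefore does more work than a single autoregressive step, so the ratio of pass counts should not be read as the end-to-end speedup, which the article puts at up to 4x.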
The model uses bidirectional attention during denoising. Every token on the canvas can attend to every other token. This is a sharp break from autoregressive models, which can only look back at earlier tokens.
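The difference between the two attention patterns is easiest to see as boolean masks, where entry `[i][j]` says whether position `i` may attend to position `j`. This is a generic illustration of causal versus bidirectional masking, not code from the model.

```python
def causal_mask(n: int) -> list[list[bool]]:
    """Position i may attend only to positions j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]

def bidirectional_mask(n: int) -> list[list[bool]]:
    """Every position may attend to every other position."""
    return [[True] * n for _ in range(n)]

def show(mask):
    for row in mask:
        print(" ".join("x" if allowed else "." for allowed in row))

show(causal_mask(4))
print()
show(bidirectional_mask(4))
```

For a canvas of 4 tokens, the causal mask allows 10 of the 16 query-key pairs (a lower triangle), while the bidirectional mask allows all 16.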
That bidirectional context enables real-time self-correction. If a token's confidence drops, the sampler can re-noise it, and the model then replaces that token in a later pass. Autoregressive models cannot do this, because they commit each token once.
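The re-noising decision can be sketched as a simple threshold rule. This is an illustrative assumption: the article does not describe the sampler's actual criterion, and `renoise` and `threshold` are hypothetical names.

```python
def renoise(locked: set[int], confidence: dict[int, float], threshold: float = 0.5):
    """Reopen locked positions whose confidence has fallen below threshold.

    Returns (still_locked, reopened). Reopened positions would be
    predicted again on a later denoising pass.
    """
    reopened = {i for i in locked if confidence[i] < threshold}
    return locked - reopened, reopened

# Position 1 looked good earlier, but new context has lowered its confidence.
locked = {0, 1, 2, 3}
confidence = {0: 0.90, 1: 0.30, 2: 0.80, 3: 0.95}

still_locked, reopened = renoise(locked, confidence)
print(sorted(still_locked), sorted(reopened))  # [0, 2, 3] [1]
```

An autoregressive decoder has no equivalent step, since a token that has been emitted is never revisited.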
The Architecture
The technical advance here is hardware utilization. For local GPU inference, the main bottleneck is memory bandwidth. Autoregressive models reload the weights from memory for every token. During single-user serving, the GPU spends most of its time waiting.
DiffusionGemma shifts the bottleneck from memory bandwidth to compute. It drafts and refines a 256-token canvas in parallel. This gives otherwise idle tensor cores a large parallel workload.
The model alternates between two attention modes during inference. Prefill uses causal attention to ingest the prompt and write the KV cache. Denoising uses bidirectional attention to refine the canvas.
For longer outputs, DiffusionGemma uses Block Autoregressive Diffusion. Once a 256-token block is fully denoised, it is committed to the KV cache. The model then starts a new canvas conditioned on the prior history. This pairs the speed of parallel blocks with the stability of sequential autoregressive generation.
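The block-by-block control flow can be sketched as follows. This is a structural illustration only: `history` is a plain list standing in for the KV cache, and `fake_denoise_block` is a hypothetical placeholder for the parallel denoising of one canvas.

```python
BLOCK_SIZE = 256  # canvas size quoted in the article

def generate(total_tokens, denoise_block, block_size=BLOCK_SIZE):
    """Generate total_tokens tokens one block at a time."""
    history = []      # stands in for the KV cache
    block_sizes = []  # recorded only to show how the output was split
    while len(history) < total_tokens:
        size = min(block_size, total_tokens - len(history))
        # Denoise a fresh canvas conditioned on everything committed so far.
        block = denoise_block(history, size)
        # Commit the finished block; it is never revised afterwards.
        history.extend(block)
        block_sizes.append(size)
    return history, block_sizes

def fake_denoise_block(history, size):
    """Hypothetical placeholder: returns labelled dummy tokens."""
    start = len(history)
    return [f"tok{start + k}" for k in range(size)]

tokens, block_sizes = generate(600, fake_denoise_block)
print(len(tokens), block_sizes)  # 600 [256, 256, 88]
```

Within a block, positions are resolved in parallel; across blocks, generation is strictly sequential, which is what makes the scheme autoregressive at the block level.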
The architecture shares the same backbone as Gemma 4 26B A4B, so engineers mainly need to implement the denoising step. That makes integration into existing serving frameworks simpler.
A clear example is the Sudoku demo from Google's developer guide. Autoregressive models struggle with strict, multi-constraint puzzles. The base DiffusionGemma model solves roughly 0% of Sudoku puzzles. After a simple supervised fine-tuning recipe in JAX, accuracy climbs to 80%. The fine-tuned model also stops early, cutting inference steps.
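To make the multi-constraint nature of the task concrete, the sketch below checks whether a completed 9x9 grid satisfies the row, column, and box rules, and computes a solve rate over a batch. It is a hypothetical scoring helper, not the evaluation used in Google's guide, and it does not check that a solution preserves the puzzle's given digits.

```python
def is_valid_solution(grid):
    """True if every row, column, and 3x3 box contains the digits 1-9 once."""
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(group == digits for group in rows + cols + boxes)

def solve_rate(grids):
    """Fraction of grids that are valid solutions."""
    return sum(is_valid_solution(g) for g in grids) / len(grids)

# A known-valid grid built from a standard shifting pattern.
good = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]

# Swapping two cells in a row keeps the row valid but breaks two columns.
bad = [row[:] for row in good]
bad[0][0], bad[0][1] = bad[0][1], bad[0][0]

print(solve_rate([good, bad]))  # 0.5
```

A single misplaced digit violates several constraints at once, which is why committing tokens one at a time with no chance to revise them is a poor fit for this kind of puzzle.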
Interactive Demo: How DiffusionGemma Decodes in Parallel
The interactive visualizer below shows how DiffusionGemma decodes text compared with a standard autoregressive model. Toggle between the two modes and press Run. In Autoregressive mode, tokens fill in one at a time, from left to right, taking one forward pass per token, which is how most LLMs work today. In Diffusion mode, the model starts from a canvas of masked placeholder tokens and resolves several of them in parallel on each pass, in no fixed order, converging in far fewer passes. The animation also shows a brief re-noising step, in which a low-confidence token is reset and refined again. This mirrors the real model's self-correction, which autoregressive decoding cannot do once a token is committed. Note that this is a conceptual animation, not live model output: the real DiffusionGemma resolves a 256-token canvas and finalizes roughly 15-20 tokens per forward pass.
Use Cases
DiffusionGemma targets specific workloads, not general production quality. Google and its ecosystem partners highlight several practical applications:
- Inline editing and code infilling: Bidirectional attention suits non-linear text structures well.
- Rapid iteration: Low local latency supports interactive, single-user developer loops.
- Long-context document analysis: The 256K window supports processing large inputs.
- OCR and document parsing: Multimodal input handles images and scanned documents.
- Code generation, tool calling, and agentic workflows: Unsloth lists these as supported tasks.
- Constrained generation: Sudoku, mathematical graphs, and amino acid sequences benefit from attending to all positions in parallel.
One caveat frames all of these. The speedup is designed for local, low-batch inference. In high-QPS cloud deployments, autoregressive models already saturate compute efficiently. There, parallel decoding offers diminishing returns and may raise serving costs.

DiffusionGemma vs Standard Gemma 4
| Attribute | DiffusionGemma (26B-A4B) | Standard Gemma 4 (26B A4B) |
|---|---|---|
| Generation method | Discrete text diffusion (parallel) | Autoregressive (token by token) |
| Decode bottleneck | Compute-bound | Memory-bandwidth-bound |
| Parallel unit | 256-token canvas per pass | One token per step |
| Attention during decoding | Bidirectional | Causal (backward only) |
| Self-correction | Yes, via re-noising | No, tokens are committed once |
| Speed on dedicated GPUs | Up to 4x faster | Baseline |
| H100 throughput | 1000+ tokens/sec | Lower (baseline) |
| RTX 5090 throughput | 700+ tokens/sec | Lower (baseline) |
| Output quality | Lower than Gemma 4 | Higher; recommended for production |
| Best fit | Local, low-batch, interactive | High-quality and high-QPS cloud serving |
| License | Apache 2.0 | Gemma terms |
Key Takeaways
- DiffusionGemma is a 26B MoE open model (3.8B active) that generates text through parallel diffusion, not token by token.
- It runs up to 4x faster on dedicated GPUs: 1000+ tokens/sec on an H100, 700+ on an RTX 5090.
- Bidirectional attention over a 256-token canvas enables real-time self-correction, unlike autoregressive models.
- Quantized, it fits in 18GB of VRAM, with day-zero support in vLLM, Transformers, MLX, and Unsloth.
- It is experimental and lower in quality than standard Gemma 4; Google recommends Gemma 4 for production.




