
Google AI Research Introduces PaperOrchestra: A Multi-Agent Framework for Automated AI Research Paper Writing

Writing a research paper is brutal. Even after the experiments are done, a researcher still faces weeks of turning messy lab notes, scattered results tables, and half-formed ideas into a polished, logically coherent manuscript formatted precisely to conference specifications. For many junior researchers, that translation work is where papers go to die.

A team at Google Cloud AI Research proposes 'PaperOrchestra', a multi-agent system that autonomously converts unstructured pre-writing materials (a rough idea summary and raw experimental logs) into a submission-ready LaTeX manuscript, complete with a literature review, generated figures, and API-verified citations.

The Core Problem It Solves

Earlier automated writing systems, such as PaperRobot, could generate incremental text sequences but could not handle the full complexity of data-driven scientific narratives. More recent autonomous research frameworks such as AI Scientist-v1 (which introduced automated experimentation and writing via code templates) and its successor AI Scientist-v2 (which increases autonomy using agentic tree search) automate the entire research loop, but their writing modules are tightly coupled to their own internal experimentation pipelines. You cannot simply hand them your data and expect a paper. They are not standalone writers.

Meanwhile, systems specialized for literature reviews, such as AutoSurvey2 and LiRA, produce comprehensive surveys but lack the contextual awareness to write a targeted Related Work section that clearly positions a specific new method against prior art. CycleResearcher requires a pre-built, structured BibTeX reference list as input, an artifact that is rarely available at the start of writing, and it fails completely on unstructured inputs.

The result is a gap: no existing tool could take unconstrained, human-provided materials (the kind of thing a real researcher might have after finishing experiments) and produce a complete, rigorous manuscript on its own. PaperOrchestra is built specifically to fill that gap.

How the Pipeline Works

PaperOrchestra orchestrates five specialized agents that work in sequence, with two of them running in parallel:

Step 1: Outline Agent. This agent reads the idea summary, the experimental log, the LaTeX conference template, and the conference guidelines, then produces a structured JSON outline. The outline includes a visualization plan (specifying which plots and diagrams to generate), a targeted literature search strategy that separates macro-level context for the Introduction from micro-level methodology clusters for Related Work, and a section-level writing plan with citation hints for every dataset, optimizer, metric, and baseline method mentioned in the materials.
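To make the shape of such an outline concrete, here is a minimal sketch of what the JSON could look like. Every key name (`visualization_plan`, `literature_search`, `section_plan`, and the nested fields) is hypothetical; the description above only states which three kinds of information the outline carries, not how they are keyed.

```python
import json

# Hypothetical outline. All key names are assumptions made for illustration;
# only the three top-level kinds of content come from the description above.
outline = {
    "visualization_plan": [
        {"figure_id": "fig_overview", "kind": "diagram", "goal": "Show the overall method"},
        {"figure_id": "fig_main_results", "kind": "plot", "goal": "Compare against baselines"},
    ],
    "literature_search": {
        # Macro-level context feeds the Introduction.
        "introduction_topics": ["broad problem area"],
        # Micro-level methodology clusters feed Related Work.
        "related_work_clusters": ["closest prior methods", "alternative approaches"],
    },
    "section_plan": [
        {
            "section": "Experiments",
            "key_points": ["report the main table from the experimental log"],
            # Citation hints cover datasets, optimizers, metrics, and baselines.
            "citation_hints": ["dataset", "optimizer", "metric", "baseline"],
        }
    ],
}

REQUIRED_KEYS = {"visualization_plan", "literature_search", "section_plan"}


def validate_outline(obj: dict) -> bool:
    """Return True if the outline has the three top-level parts described above."""
    return REQUIRED_KEYS.issubset(obj)


# Round-trip through JSON, since the outline is handed between agents as JSON.
serialized = json.dumps(outline, indent=2)
restored = json.loads(serialized)
```

A downstream agent could call `validate_outline(restored)` before doing any work, so that a malformed outline fails early rather than halfway through drafting.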

Steps 2 & 3: Plotting Agent and Literature Review Agent (in parallel). The Plotting Agent executes the visualization plan using PaperBanana, an academic illustration tool that uses a Vision-Language Model (VLM) critic to evaluate generated images against design objectives and iteratively revise them. At the same time, the Literature Review Agent runs a two-phase citation pipeline: it uses a search-grounded LLM to identify candidate papers, then verifies each one through the Semantic Scholar API, checking for a valid fuzzy title match using Levenshtein distance, retrieving the abstract and metadata, and enforcing a temporal cutoff tied to the conference's submission deadline. Hallucinated or unverifiable references are discarded. The verified citations are compiled into a BibTeX file, and the agent uses them to draft the Introduction and Related Work sections, with a hard constraint that at least 90% of the gathered literature pool must be cited.
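The verification checks in the citation pipeline can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the similarity threshold of 0.9, the normalization by the longer title, and the `record` dictionary (a stand-in for whatever a Semantic Scholar lookup returns) are all hypothetical. No real API call is made.

```python
from datetime import date
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via row-by-row dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def title_similarity(candidate: str, retrieved: str) -> float:
    """Similarity in [0, 1]; the normalization scheme is an assumption."""
    a, b = candidate.lower().strip(), retrieved.lower().strip()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def is_verified(candidate_title: str, record: Optional[dict], cutoff: date,
                min_similarity: float = 0.9) -> bool:
    """Keep a candidate only if a record exists, its title matches, and it predates the cutoff.

    `record` is a hypothetical dict with "title" and "published" (a date) keys.
    """
    if record is None:  # nothing retrieved: treat as hallucinated or unverifiable
        return False
    if title_similarity(candidate_title, record["title"]) < min_similarity:
        return False
    return record["published"] <= cutoff


def coverage_ok(cited_keys, pool_keys, min_fraction: float = 0.9) -> bool:
    """Check the constraint that at least 90% of the gathered pool is cited."""
    pool = set(pool_keys)
    if not pool:
        return True
    return len(set(cited_keys) & pool) / len(pool) >= min_fraction
```

Rejecting on a missing record first means a hallucinated title never reaches the fuzzy-match step, and the date check runs last because it only makes sense once the record is known to be the right paper.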

Step 4: Section Writing Agent. This agent takes everything generated so far (the outline, the verified citations, the generated figures) and writes the remaining sections: abstract, methodology, experiments, and conclusion. It extracts numeric values directly from the experimental log to construct tables and integrates the generated figures into the LaTeX source.

Step 5: Content Refinement Agent. Using AgentReview, a simulated peer-review system, this agent iteratively optimizes the manuscript. After each revision, the manuscript is accepted only if the overall AgentReview score increases, or if it ties with net non-negative sub-axis gains. Any decrease in the overall score triggers an immediate revert and halt. Ablation results show that this step is critical: refined manuscripts dominate unrefined drafts with 79%–81% win rates in automated side-by-side comparisons, and deliver absolute acceptance-rate gains of +19% on CVPR and +22% on ICLR in AgentReview simulations.
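The acceptance rule for each revision can be written down as a small decision function. This is a sketch under assumptions: the function name, the axis names in the usage example, and the score scale are hypothetical, and the handling of a tie with a net negative sub-axis change (returned here as `"reject"`) is not specified in the description above.

```python
def refinement_decision(prev_overall: float, new_overall: float,
                        prev_axes: dict, new_axes: dict) -> str:
    """Decide what to do with a revised manuscript.

    Returns "accept", "revert_and_halt", or "reject".
    """
    if new_overall > prev_overall:
        return "accept"               # overall score improved
    if new_overall < prev_overall:
        return "revert_and_halt"      # any overall decrease stops refinement
    # Overall score tied: accept only if sub-axis changes sum to >= 0.
    net_gain = sum(new_axes[name] - prev_axes[name] for name in prev_axes)
    # Assumption: a tie with a net negative change is simply not accepted.
    return "accept" if net_gain >= 0 else "reject"


# Usage with hypothetical sub-axes: overall ties, clarity rises, novelty is flat.
decision = refinement_decision(
    6, 6,
    {"clarity": 5, "novelty": 6},
    {"clarity": 6, "novelty": 6},
)
```

Checking the overall score before the sub-axes mirrors the rule as stated: sub-axis gains only matter as a tie-breaker and can never rescue a revision whose overall score dropped.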

The full pipeline makes roughly 60–70 LLM API calls and completes in a mean of 39.6 minutes per paper, only about 4.5 minutes more than AI Scientist-v2's 35.1 minutes, despite making significantly more LLM calls (40–45 for AI Scientist-v2 vs. 60–70 for PaperOrchestra).
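The control flow of the five steps (sequential overall, with the plotting and literature review steps running concurrently) can be sketched with `asyncio`. The agent functions below are hypothetical stubs that return placeholder values; they stand in for LLM-backed agents and are not taken from the PaperOrchestra code.

```python
import asyncio


async def outline_agent(materials: dict) -> dict:
    # Stub: a real agent would read the idea summary, log, template, and guidelines.
    return {"sections": ["abstract", "methodology", "experiments", "conclusion"]}


async def plotting_agent(outline: dict) -> list:
    await asyncio.sleep(0)  # yield control, as a real network call would
    return ["figure_1", "figure_2"]


async def literature_review_agent(outline: dict) -> dict:
    await asyncio.sleep(0)
    return {"bibtex_entries": 3, "drafted": ["introduction", "related_work"]}


async def section_writing_agent(outline: dict, figures: list, literature: dict) -> dict:
    # Stub: combine the earlier outputs into a draft description.
    return {
        "sections": literature["drafted"] + outline["sections"],
        "figures": figures,
        "refined": False,
    }


async def content_refinement_agent(draft: dict) -> dict:
    return {**draft, "refined": True}


async def run_pipeline(materials: dict) -> dict:
    outline = await outline_agent(materials)                       # Step 1
    figures, literature = await asyncio.gather(                    # Steps 2 & 3, in parallel
        plotting_agent(outline),
        literature_review_agent(outline),
    )
    draft = await section_writing_agent(outline, figures, literature)  # Step 4
    return await content_refinement_agent(draft)                   # Step 5
```

Calling `asyncio.run(run_pipeline({"idea_summary": "...", "experimental_log": "..."}))` runs the whole flow. `asyncio.gather` returns results in the order the awaitables were passed, so `figures` and `literature` are unpacked reliably even if the literature step finishes first.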

The Benchmark: PaperWritingBench

The research team also introduces PaperWritingBench, described as the first standardized benchmark built specifically for AI research paper writing. It contains 200 accepted papers from CVPR 2025 and ICLR 2025 (100 from each venue), selected to test adaptation to different conference formats: double-column for CVPR versus single-column for ICLR.

For each paper, an LLM was used to reverse-engineer two inputs from the published PDF: a Sparse Idea Summary (a high-level conceptual description, with no math or LaTeX) and a Dense Idea Summary (retaining formal definitions, loss functions, and LaTeX equations), along with an Experimental Log derived by extracting all numeric data and converting the insights from figures into standalone factual observations. All materials were fully anonymized, stripping author names, titles, citations, and figure references.
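One way to picture a single benchmark item is as a small record holding both idea summaries and the experimental log, with the evaluation setting choosing which summary the writer sees. The class name, field names, and the `writer_inputs` helper are hypothetical; the benchmark's actual storage format is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkEntry:
    """Hypothetical record for one anonymized paper in the benchmark."""
    venue: str                 # "CVPR 2025" or "ICLR 2025"
    sparse_idea_summary: str   # high-level description, no math or LaTeX
    dense_idea_summary: str    # keeps formal definitions and LaTeX equations
    experimental_log: str      # numeric data plus factual observations

    def writer_inputs(self, setting: str) -> dict:
        """Select the materials handed to a writing system for a given setting."""
        if setting not in ("sparse", "dense"):
            raise ValueError(f"unknown setting: {setting!r}")
        idea = self.sparse_idea_summary if setting == "sparse" else self.dense_idea_summary
        return {
            "venue": self.venue,
            "idea_summary": idea,
            "experimental_log": self.experimental_log,
        }


# Placeholder content only; no real benchmark data is reproduced here.
entry = BenchmarkEntry(
    venue="ICLR 2025",
    sparse_idea_summary="A high-level description of the method.",
    dense_idea_summary=r"A description with a loss $\mathcal{L}$ defined formally.",
    experimental_log="Observation: method A scores higher than baseline B.",
)
```

Keeping both summaries on the same record makes the Sparse versus Dense comparison a matter of changing one argument, with the experimental log held fixed.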

This design isolates the writing task from any specific experimental pipeline, using real accepted papers as ground truth, and it reveals something important. For Overall Paper Quality, the Dense idea setting substantially outperforms Sparse (43%–56% win rates vs. 18%–24%), since more precise methodology descriptions enable more rigorous section writing. But for Literature Review Quality, the two settings are nearly equivalent (Sparse: 32%–40%; Dense: 28%–39%), which means the Literature Review Agent can autonomously identify research gaps and relevant citations without relying on detailed human input.

Results

In automated side-by-side (SxS) evaluations using both Gemini-3.1-Pro and GPT-5 as judge models, PaperOrchestra dominated on literature review quality, achieving absolute win margins of 88%–99% over the AI baselines. For overall paper quality, it outperformed AI Scientist-v2 by 39%–86% and the Single Agent baseline by 52%–88% across all settings.

Human evaluation, conducted by 11 AI researchers across 180 paired manuscript comparisons, confirmed the automated results. PaperOrchestra achieved absolute win-rate margins of 50%–68% over the AI baselines in literature review quality, and 14%–38% in overall manuscript quality. It also achieved a 43% tie/win rate against the human-written ground truth in literature synthesis, a striking result for a fully automated system.
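Side-by-side figures like these come from tallying pairwise judgments. The sketch below shows one plausible way to compute them. It assumes that a "tie/win rate" is the share of comparisons that are ties or wins, and that an "absolute win margin" is the win rate minus the loss rate; the second definition is an assumption, since the text above does not spell it out.

```python
from collections import Counter


def sxs_summary(judgments: list) -> dict:
    """Summarize pairwise outcomes, each one of "win", "tie", or "loss"."""
    total = len(judgments)
    if total == 0:
        raise ValueError("no judgments to summarize")
    counts = Counter(judgments)
    wins, ties, losses = counts["win"], counts["tie"], counts["loss"]
    if wins + ties + losses != total:
        raise ValueError("judgments must be 'win', 'tie', or 'loss'")
    return {
        "win_rate": wins / total,
        "tie_rate": ties / total,
        "loss_rate": losses / total,
        "tie_or_win_rate": (wins + ties) / total,
        # Assumed definition: margin = win rate minus loss rate.
        "win_margin": (wins - losses) / total,
    }


# Toy data for illustration only; these are not results from the evaluation.
toy = sxs_summary(["win"] * 6 + ["tie"] * 2 + ["loss"] * 2)
```

Under this definition a margin can be large even when the raw win rate is moderate, as long as losses are rare, which is why margins and win rates should not be compared directly.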

The citation coverage numbers tell a clear story. The AI baselines averaged only 9.75–14.18 citations per paper, which inflated their F1 scores on the must-cite (P0) reference category while leaving “good-to-cite” (P1) recall near zero. PaperOrchestra generated an average of 45.73–47.98 citations, closely mirroring the ~59 citations found in human-written papers, and improved P1 Recall by 12.59%–13.75% over the strongest baseline.
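A toy calculation shows why a short citation list can look strong on the must-cite category while missing the broader literature. The sketch uses the standard set-based definitions of precision, recall, and F1; whether the evaluation computes its P0 and P1 scores in exactly this way is an assumption, and the citation keys are made up.

```python
def precision_recall_f1(cited: set, reference: set) -> tuple:
    """Standard set-based precision, recall, and F1 of `cited` against `reference`."""
    true_positives = len(cited & reference)
    precision = true_positives / len(cited) if cited else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical keys: a short list that cites only the obvious references.
must_cite_p0 = {"a", "b", "c", "d"}
good_to_cite_p1 = {"e", "f", "g", "h", "i"}
short_list = {"a", "b", "c"}

p0_precision, p0_recall, p0_f1 = precision_recall_f1(short_list, must_cite_p0)
_, p1_recall, _ = precision_recall_f1(short_list, good_to_cite_p1)
```

In this toy case every cited item is a must-cite reference, so P0 precision is perfect and P0 F1 is high, yet none of the good-to-cite set is covered, so P1 recall is zero. Raising P1 recall requires a longer list that reaches beyond the obvious references.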

Under the ScholarPeer evaluation framework, PaperOrchestra achieved simulated acceptance rates of 84% on CVPR and 81% on ICLR, compared with human-written ground-truth rates of 86% and 94% respectively. It outperformed the strongest autonomous baseline by absolute acceptance gains of 13% on CVPR and 9% on ICLR.

Notably, even when PaperOrchestra generates its own figures autonomously from scratch (PlotOn mode) rather than using human-authored figures (PlotOff mode), it achieves ties or wins in 51%–66% of side-by-side comparisons. This holds even though PlotOff has an inherent information advantage, since human-authored figures often embed supplementary results that are not present in the raw experimental logs.

Key Takeaways

  • It is a standalone writer, not a research bot. PaperOrchestra is specifically designed to work with your own materials (a rough idea summary and raw experimental logs) without needing to run experiments itself. This is a direct fix for the biggest limitation of existing systems like AI Scientist-v2, which only write papers as part of their own internal research loops.
  • Citation quality, not just citation count, is the real differentiator. Competing systems averaged 9–14 citations per paper, which sounds acceptable until you realize they were almost entirely obvious “must-cite” references. PaperOrchestra averaged 45–48 citations per paper, comparable to human-written papers (~59), with dramatically better coverage of the broader academic landscape: the “good-to-cite” references that signal genuine scholarly depth.
  • Multi-agent specialization consistently beats single-agent prompting. The Single Agent baseline (one monolithic LLM call given all the same raw materials) was outperformed by PaperOrchestra by 52%–88% in overall paper quality. The framework's five specialized agents, parallel execution, and iterative refinement loop do work that no single prompt, regardless of quality, can do.
  • The Content Refinement Agent is not optional. Ablations show that removing the iterative peer-review loop causes a dramatic drop in quality. Refined manuscripts beat unrefined drafts 79%–81% of the time in side-by-side comparisons, with simulated acceptance rates jumping +19% on CVPR and +22% on ICLR. This step alone is responsible for elevating a functional draft into something submission-ready.
  • Human researchers are still in the loop, and they must be. The system explicitly cannot fabricate new experimental results, and its refinement agent is instructed to ignore reviewer requests for data that does not exist in the experimental log. The authors position PaperOrchestra as an advanced assistive tool, with human researchers retaining full accountability for the accuracy, originality, and validity of the final manuscript.
