
Proxy-Pointer Framework for Structure-Aware Enterprise Document Intelligence

Among the most valuable enterprise AI use cases today, document comparison ranks alongside conversational chatbots. Organizations spend an enormous number of human hours comparing contracts, policies, technical specifications, legal filings, research papers and much more to identify differences, risks, revisions and semantic inconsistencies.

However, document comparison is far more complex than an ordinary text diff. First, these tools are meant to be effective assistants to legal and commercial professionals, scientists and others, who expect the analysis to reach the depth and use the language one would expect from a junior expert in the domain.

The harder problem is that meaning in enterprise documents is usually not contained in isolated chunks. It is embedded across sections, clauses, groups of subclauses and the relationships between them. These sections can also be scattered across many pages of a document that runs to more than 100 pages. For example, a credit agreement may define collateral restrictions in one section, the exceptions to them a few pages later, and the enforcement rights under an entirely different article. When another agreement is compared against it on criteria such as "collateral structure, security interests, and loan requirements," the system must identify, retrieve, and assemble all of these structurally scattered sections before any meaningful comparison can take place.

Proxy-Pointer, a structure-aware yet low-cost retrieval pipeline that preserves document hierarchy during retrieval and comparison, is well suited to this task. Using a combination of hierarchical breadcrumb embeddings and lightweight LLM re-ranking, it can accurately extract contextually aligned regions across documents before comparative reasoning begins.

In this article, I share the design and real-world results of a multi-domain document comparator that can analyze both complex financial Credit Agreements and academic research papers. As you will notice in the architecture described in the next section, the core comparison engine is decoupled from upstream document processing and from downstream formatting and report generation, which allows the system to be adapted easily to any new document domain (such as insurance policies, medical guidelines, or tax codes). All that is needed is an upstream extraction pipeline that produces the input for section-tree generation, plus a downstream update to the LLM analyst persona and the report format. The core multi-stage retrieval and comparison pipeline stays completely untouched.

I have also added the full code to my existing open-source Proxy-Pointer GitHub repository, along with a 5-minute quick start.

Document Comparator Architecture

Here is an overview of the logical architecture. The LLM used is gemini-3-flash, with gemini-embedding-001 (dimension: 1536) for vector embeddings.

Architecture Tiers

Upstream Domain Layer

Converts any raw document format into a standard, machine-readable hierarchy.

Scripts Involved

  • extract_pdf_to_md.py: Handles upstream ingestion, converting PDFs into clean, hierarchically structured Markdown.
  • build_doc_index.py: Parses the Markdown headings, filters out administrative noise, and builds a high-level JSON structure map (_structure.json).
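
The indexing step can be pictured with a small sketch. This is not the repository's build_doc_index.py; it is a minimal illustration, assuming ATX-style Markdown headings (#, ##, ...) and a hypothetical NOISE_TITLES set, of how headings might be turned into nodes that each carry a full breadcrumb path. The node field names are also assumptions made for this example.

```python
import json
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
# Hypothetical noise list; the article does not describe the real filter rules.
NOISE_TITLES = {"table of contents", "signature page"}


def build_structure(markdown_text):
    """Return a flat list of section nodes, each with its breadcrumb path."""
    nodes = []
    stack = []  # (level, title) of the currently open ancestor headings
    for line_no, line in enumerate(markdown_text.splitlines(), start=1):
        match = HEADING_RE.match(line)
        if not match:
            continue
        level, title = len(match.group(1)), match.group(2)
        if title.lower() in NOISE_TITLES:
            continue  # skipped; sub-headings of noise are not handled specially here
        # Close every open heading at the same or a deeper level.
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, title))
        nodes.append({
            "id": len(nodes),
            "level": level,
            "title": title,
            "breadcrumb": " > ".join(t for _, t in stack),
            "start_line": line_no,
        })
    return nodes


sample = """# Credit Agreement
## Table of Contents
## Article VI Negative Covenants
### Section 6.01 Liens
## Article VII Events of Default
"""
nodes = build_structure(sample)
print(json.dumps(nodes, indent=2))
```

Keeping the breadcrumb on every node means that whatever text is later embedded for a section can carry its position in the hierarchy, rather than only its local wording.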

Core Comparison Engine

Orchestrates semantic search over hierarchical document nodes.

Scripts Involved

  • criteria_validator.py: Dynamically detects the doc_type (e.g., Academic vs. Legal) and runs a preliminary feasibility check on the user's comparison criteria, to confirm that the criteria fit the detected document type.
  • section_selector.py: Implements Stage 1 Proxy-Pointer retrieval. Identifies and extracts the most relevant sections of Document 1 for the user's criteria, using FAISS semantic search and LLM re-ranking.
  • cross_retriever.py: Implements Stage 2 Proxy-Pointer retrieval. Performs a targeted semantic search within Document 2's vector space, using the context of the selected Document 1 sections (the Doc 1 section content combined with the user's criteria forms the query). The Proxy-Pointer pipeline is highly accurate at identifying the right semantically matching sections to compare.
  • section_comparator.py: Orchestrates the pairwise evaluation of matched sections, passing them to the LLM to analyze alignments and divergences.
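
The two retrieval stages can be sketched as follows. This is an illustration under stated assumptions, not the repository code: a toy bag-of-words embed function stands in for gemini-embedding-001, a brute-force cosine ranking stands in for the FAISS index, and the LLM re-ranking step is left out entirely. All function names and the section dictionary layout are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_k(query, sections, k):
    """Rank sections by similarity of the query to breadcrumb + text."""
    q = embed(query)
    scored = [(cosine(q, embed(s["breadcrumb"] + " " + s["text"])), s)
              for s in sections]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:k] if score > 0]


def two_stage_retrieve(criteria, doc1, doc2, k1=2, k2=3):
    pairs = []
    for sec1 in top_k(criteria, doc1, k1):        # Stage 1: search Doc 1 only
        query = criteria + " " + sec1["text"]     # criteria + Doc 1 section content
        pairs.append((sec1, top_k(query, doc2, k2)))  # Stage 2: search Doc 2 only
    return pairs


doc1 = [
    {"breadcrumb": "Article VI > Liens",
     "text": "The borrower shall not create any lien on collateral."},
    {"breadcrumb": "Article II > Interest",
     "text": "Interest accrues at the applicable rate."},
]
doc2 = [
    {"breadcrumb": "Article 7 > Negative Pledge",
     "text": "No lien on any collateral may be granted by the borrower."},
    {"breadcrumb": "Article 3 > Fees",
     "text": "Fees are payable quarterly."},
]
pairs = two_stage_retrieve("collateral and lien restrictions", doc1, doc2)
for sec1, matches in pairs:
    print(sec1["breadcrumb"], "->", [m["breadcrumb"] for m in matches])
```

The point of the second stage is that the query is no longer just the user's criteria: it also contains the Document 1 section, so the search in Document 2 is steered toward the counterpart of that specific section.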

Downstream Presentation Layer

Tailors the analysis output to the target audience and formats the final view.

Scripts Involved

  • build_comparison_prompt (in criteria_validator.py): The prompt assigns an appropriate persona (e.g., Expert Academic Researcher or Senior Legal Counsel) based on the detected doc_type.
  • report_builder.py: Renders the final side-by-side comparison report using professional CSS colors and highly readable structured formatting. The report can also be downloaded as a Markdown file.
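
Persona selection of this kind can be sketched in a few lines. The function below is deliberately named build_prompt_sketch to make clear it is not the repository's build_comparison_prompt; the doc_type keys ("legal", "academic") and the prompt wording are assumptions for illustration, and only the two persona names come from the description above.

```python
# Hypothetical doc_type keys; only the persona names come from the article.
PERSONAS = {
    "legal": "Senior Legal Counsel",
    "academic": "Expert Academic Researcher",
}


def build_prompt_sketch(doc_type, criteria, section_1, section_2):
    """Assemble a comparison prompt whose persona depends on doc_type."""
    persona = PERSONAS.get(doc_type)
    if persona is None:
        raise ValueError(f"unsupported doc_type: {doc_type!r}")
    return "\n".join([
        f"Persona: {persona}",
        "Compare the two sections strictly from the perspective of Document 1.",
        f"Criteria: {criteria}",
        "",
        "Document 1 section:",
        section_1,
        "",
        "Document 2 section:",
        section_2,
    ])


prompt = build_prompt_sketch(
    "legal",
    "events of default and cure periods",
    "Section 7.01 (placeholder text)",
    "Section 8.01 (placeholder text)",
)
print(prompt)
```

Because the persona is looked up from doc_type, supporting a new domain in a design like this would mean adding one dictionary entry rather than touching the retrieval code.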

Dataset Used

For illustration, the publicly available Credit Agreements of Emerson (136 pages) and Texas Roadhouse (190 pages) are used. These were chosen deliberately because they have different structures and belong to different industries. Emerson is a utility, and its agreement reads like a standalone corporate treasury document anchored on credit agency ratings, whereas the Texas Roadhouse agreement is far more bespoke, structured specifically around restaurant leases, multi-subsidiary structures, and variable leverage ratios.

In addition, I added a feature to compare research papers, for which I chose VectorFusion and VectorPainter, the papers used in my Multimodal Answers RAG article. Both belong to the highly specialized field of text-to-vector image synthesis. Although they share the same technical foundation, using differentiable rendering (such as DiffVG) to optimize Scalable Vector Graphics (SVG) paths with diffusion models, they differ significantly in how they apply the method. This narrow, shared domain makes for a hard test case for our comparison engine: it has to go beyond surface-level similarity and assess subtle architectural differences, as we will see in a later section.

Credit Agreement Comparison

I ran several queries with different sets of criteria; the detailed reports are included in full in the repository, and a summary is shared below. The Streamlit UI accepts two documents (in either .pdf or .md format) as input, and the comparison is performed strictly from the perspective of Document 1. For example, if Document 1 is Emerson and Document 2 is Texas Roadhouse, the final comparison is anchored on Emerson.

There are three steps in the process. First, the system selects all sections in the Emerson agreement that are relevant to the user's criteria. Second, for each selected section, it retrieves up to three comparison sections from Texas Roadhouse. Third, it performs a side-by-side analysis. Along with the detailed analysis, the system provides a functional role, a Divergence Rating, and a Risk Indicator (or a Methodological Tradeoff for academic papers).
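
The three steps and the per-section outputs can be sketched as a small orchestration loop. This is a hypothetical outline, not the repository's implementation: the SectionComparison fields mirror the outputs named above, while the select, retrieve and analyze callables are placeholders that a real system would back with the retrieval stages and an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_MATCHES = 3  # "up to three" comparison sections per selected section


@dataclass
class SectionComparison:
    doc1_section: str
    doc2_sections: List[str]
    functional_role: str
    divergence_rating: str
    risk_indicator: str  # or a methodological tradeoff for academic papers
    analysis: str


def run_comparison(
    criteria: str,
    select: Callable[[str], List[str]],
    retrieve: Callable[[str, str], List[str]],
    analyze: Callable[[str, List[str]], SectionComparison],
) -> List[SectionComparison]:
    results = []
    for sec1 in select(criteria):                          # step 1: Doc 1 sections
        matches = retrieve(criteria, sec1)[:MAX_MATCHES]   # step 2: Doc 2 matches
        if not matches:
            continue  # nothing to compare against
        results.append(analyze(sec1, matches))             # step 3: side-by-side
    return results


# Stub callables so the sketch runs end to end; all values are placeholders.
def stub_select(criteria):
    return ["Doc1 Section A", "Doc1 Section B"]


def stub_retrieve(criteria, sec1):
    if sec1.endswith("A"):
        return ["Doc2 S1", "Doc2 S2", "Doc2 S3", "Doc2 S4"]
    return []


def stub_analyze(sec1, matches):
    return SectionComparison(sec1, matches, "placeholder role",
                             "placeholder rating", "placeholder risk",
                             "placeholder analysis")


results = run_comparison("collateral structure", stub_select,
                         stub_retrieve, stub_analyze)
print(results)
```

Passing the three steps in as callables keeps the loop itself independent of any document domain, which matches the decoupling described in the architecture section.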

In the following four cases, Document 1 is Emerson and Document 2 is Texas Roadhouse.

Criterion 1: collateral structure, security interests, guarantees, and loan requirements

Criterion 2: events of default, lender remedies, acceleration rights, and cure periods

Criterion 3: financial covenants, leverage ratio requirements, and borrower compliance obligations

Criterion 4a: representations and warranties, material adverse effect clauses, and disclosure obligations

As a critical test case, here is the "warranties" criterion above with the documents swapped. In the following, Document 1 is Texas Roadhouse and Document 2 is Emerson.

Criterion 4b: representations and warranties, material adverse effect clauses, and disclosure obligations

Analysis of the Credit Agreement Comparison

What the above results show is that Proxy-Pointer does not merely match sections by keywords or partial fragments. It takes on the persona of a legal analyst, someone who understands how credit works, across these two very different industries. One is an investment-grade utility, while the other is a mid-sized restaurant chain. For example, it identifies economic and legal implications hidden beneath nearly identical language, such as structural subordination risk within a negative pledge, preservation of enterprise value within status covenants, or litigation exposure within information disclosure.

Another observation is that the analysis remained consistent when the documents were swapped. It did not stay anchored to Emerson as Document 1, but instead re-analyzed the agreements from the Texas Roadhouse perspective. It correctly identified which agreement placed more restrictions on the borrower, which gave lenders greater control during a default, which was more exposed to assets being moved out of reach, and which required the company to disclose more information. None of this is stated explicitly in either agreement. It becomes apparent to a legal analyst only when multiple clauses, exceptions, thresholds, and definitions are read together. The result feels less like a simple clause comparison and more like an understanding of how risk and control are allocated between the borrower and the lenders.

Research Paper Comparison

For the VectorFusion and VectorPainter papers, I compared them using the following criteria: Compare how each paper approaches style control and primitive initialization in vector graphics synthesis. Specifically, analyze how VectorFusion uses path reinitialization and raster-sample-based initialization versus how VectorPainter extracts and rearranges vectorized strokes from a reference image using stroke imitation learning and a style preservation loss.

Here is one comparison:

The analysis demonstrates a deep, domain-level comparison, the kind of tool a researcher could use to compare the two papers without reading them in full. Proxy-Pointer goes beyond matching surface-level structures and identifies the deeper design philosophy behind both papers. It correctly recognizes that VectorFusion treats SVG generation as a continuous optimization problem with ongoing path updates, whereas VectorPainter approaches it as a style-guided composition problem focused on artistic consistency and learned stroke priors. What was most interesting is that it was able to connect ideas spread across entirely different sections of the papers and weigh the underlying tradeoffs. This demonstrates a nuanced analysis of two systems that sit in the same narrow domain but work differently.

Open-Source Repository

Proxy-Pointer is fully open source (MIT License) and can be accessed at the Proxy-Pointer GitHub repository. The Document Comparator has been added to the repo on top of the existing text-only answering and Multimodal Answering bots.

A 5-minute quick start will let you experiment right away with the available data.

DocComparator/
├── src/
│   ├── comparison/
│   │   ├── cross_retriever.py    # Stage 2 PP Retrieval (Doc 2)
│   │   ├── section_comparator.py # Pairwise LLM evaluation engine
│   │   └── section_selector.py   # Stage 1 PP Retrieval (Doc 1)
│   ├── extraction/
│   │   └── extract_pdf_to_md.py  # LlamaParse PDF ingestion & formatting
│   ├── indexing/
│   │   └── build_doc_index.py    # Skeleton tree & FAISS vector builder
│   ├── report/
│   │   └── report_builder.py     # Markdown report generation logic
│   ├── validation/
│   │   └── criteria_validator.py # Persona injection & criteria feasibility
│   └── config.py                 # Core configurations and model definitions
├── data/                         # Unified Data Hub
│   └── uploads/                  # Raw PDFs and test documents
├── results/                      # Artifact reports for the test cases tried
└── app.py                        # Streamlit Comparator UI

Conclusion

Comparing documents with a Chunk-Embed-Match approach is unlikely to give good results. In a complex enterprise document such as the Terms and Conditions of a contract, semantic meaning is embedded in sections and clauses that contain dense text. Each of these sections can run for pages and be part of a much longer document. For effective comparison and analysis, the clauses, definitions, exceptions, and structural relationships need to be extracted together so that they make sense when read together.

Proxy-Pointer, with its precise two-stage retrieval pipeline, is well suited to this task. As the results above show, even with a budget LLM such as gemini-flash, one can compare agreements or research papers while preserving the underlying intent and the tradeoffs hidden across structurally separate sections.

The 3-tier architecture of the Document Comparator can extend to other domains without changes to the comparison engine itself. This makes structure-aware retrieval more broadly useful than a custom-built tool that works on only one specific document type. Organizations can adapt it to their own industries and use cases with minimal upstream engineering effort.

Clone the repo. Try your own documents. Let me know your thoughts.

Connect with me and share your comments at www.linkedin.com/in/partha-sarkar-lets-talk-AI

All research papers used in this article are available at VectorFusion and VectorPainter under a CC-BY license. The credit agreements are publicly available at SEC.gov. The code and benchmark results are open source under the MIT license. The images used in this article were generated using Google Gemini.
