Reactive Machines

Yakha i-Amazon Bedrock Bedrock Batch Jobflow Workflow Workchestration usebenzisa imisebenzi yezinyathelo ze-AWS

Njengoba izinhlangano ziqhubeka zithola amamodeli weSisekelo (FMS) ngobuhlakani babo bokufakelwa kanye nokufunda komshini (AI / ML), ukuphatha imisebenzi yokuqashwa kwamandla amakhulu, ukuphathwa kwemisebenzi eminingi yokuqashwa kwamandla amakhulu iba nzima. I-Amazon Bedrock isekela izinhlobo ezimbili ezijwayelekile zamaphethini wokulinganisa amakhulu: Ukuzithoba kwesikhathi sangempela kanye ne-batch ukutholwa kwamacala asebenzisa ukucubungula imininingwane eminingi lapho imiphumela esheshayo ingadingeki.

I-Amazon Bedrock Batch Incence iyikhambi elisebenzisekayo elinikeza isaphulelo esingu-50% ngokuqhathaniswa nokucutshungulwa okufunwayo, okwenza kube kuhle kakhulu ngomthamo wevolumu, isikhathi esivelayo. Kodwa-ke, ukusebenzisa i-batch ukutholwa esikalini kuza nezinselelo zalo zezinselelo, kubandakanya nokuphatha ukufomatha kokufaka kanye nezilinganiso zomsebenzi, ukuhlela ukubulawa kwabantu kanye nokuphatha imisebenzi yasemuva, kanye nokuphatha imisebenzi yangemva kokuthumela. Onjiniyela badinga uhlaka oluqinile lokuhambisa le misebenzi.

Kulokhu okuthunyelwe, sethula isixazululo esiguquguqukayo nesingenakahle esikwenza lula ukuhamba komsebenzi we-batch infence. Lesi sixazululo sihlinzeka ngendlela enobuthana ekuphatheni izidingo zakho zokuhlasela kwe-FM BATCH, njengokukhiqiza ukushumeka kwezigidi zamadokhumenti noma imisebenzi yokuhlola ngokwezifiso noma ukuqeda ngama-dataset amakhulu.

Ukubuka konke

Umdwebo olandelayo imininingwane ebanzi Ukubuka konke okubanzi kokuhamba komsebenzi okuzenzakalelayo, okubandakanya izigaba ezintathu eziphambili: ukufakwa kolwazi lokufaka (ngokwesibonelo, ukufomatha), ukwenziwa kwemisebenzi ye-batch yokuphakanyiswa ngokufana, futhi kwangezelela ukuhlanganisa imiphumela yemodeli.

Lesi sixazululo sinikezela ngohlaka oluguquguqukayo futhi olunakwehlekile ukuze kube lula i-Batch Orchestration. Ngokunikezwa kokufakwa okulula kokucushwa, isinyathelo semisebenzi yombuso sithunyelwe kulesi sitabhu se-AWS Cloud Development Kit (AWS CDK) izibambo ze-stack zisingatha i-dataset, sethula imisebenzi ye-patch ye-parallel, futhi kwangezelela imisebenzi ehambisanayo.

Esimweni sethu esiqondile sokusetshenziswa, sisebenzisa imigqa eyizigidi ezingama-2.2 yezigidi ezivela kumthombo ovulekile wedatha yedatha elula. I-SimpleCot Dataset ekugwingweni kobuso iqoqo lezibonelo ezahlukene ezenzelwe imisebenzi eyenzelwe ukukhombisa nokuqeqesha i-chain-of-tempenting (cot) Ukubonisana ngezilimi. Le datha ihlanganisa izinhlobo eziningi zezinkinga, kufaka phakathi ukuqondisisa kokufunda, ukucabanga kwezibalo, ukuncishiswa okunengqondo, kanye nemisebenzi yezemvelo yokusebenza (NLP). Idathasethi ihlelwe ngokungena ngakunye okuqukethe incazelo yomsebenzi, umbuzo, impendulo efanele, kanye nencazelo eningiliziwe yenqubo yokubonisana.

Umdwebo olandelayo ukhombisa ukwakhiwa kwesixazululo.

Umdwebo wezakhiwo

Iphethini ye-Amazon Bedrock Batch Batch isebenzisa izingxenye ezi-scwable futhi ezingenasikali futhi zimboze ukucatshangelwa okusemqoka kwezakhiwo eziqondene nokuhamba komsebenzi we-batch:

  • Ifomethi yefayela nesitoreji – Ukufakwa komsebenzi kufanele kuhlelwe njengamafayela we-JSONL agcinwe kwi-Amazon Simple Service Service (I-Amazon S3), ngomugqa ngamunye omele irekhodi elilodwa lokufaka elihambisana nesakhiwo sesicelo se-APM salowo mhlinzeki noma umhlinzeki. Isibonelo, amamodeli ka-Anthropic's Claude anesakhiwo esihlukile se-JSON ngokuqhathaniswa ne-Amazon Titan Umbhalo Eshuddings V2. Kukhona futhi izilinganiso okufanele zicatshangelwe: Ngesikhathi sokubhala, ubuncane be-1 000 kanye nobukhulu bamarekhodi angama-50 000 nge-batch ngayinye. Ungacela ukukhuphuka kwesilinganiso usebenzisa izilinganiso zensizakalo ngokuya ngezidingo zakho zecala.
  • Isinyathelo semisebenzi umshini wombuso – Ukufakwa kwama-orchestation kwe-asynchronous, imisebenzi ende esebenza isikhathi kudinga uhlelo lokugeleza kokulawula oluqinile. Ubuciko bethu busebenzisa imisebenzi yesinyathelo ukudidiyela inqubo ephelele, nge-Amazon DynanomDB ukugcina ukusungula imisebenzi yomuntu ngamunye kanye nezifundazwe zazo. Futhi, kunesilinganiso esibalulekile sokucatshangelwa: ngokwesibonelo, isamba esiphezulu sokuthuthuka futhi sithumele imisebenzi ye-batch yokuthambisa usebenzisa imodeli eyisisekelo ye-Amazon Titan Umbhalo Isifunda Sombhalo-njengamanje. Usebenzisa izifundazwe zomsebenzi wemephu, imisebenzi yesinyathelo ingasiza ukukhulisa ukukhishwa ngokulawula ukuhanjiswa komsebenzi nokuqapha ukuqedwa kwesimo.
  • Ukuthumela emuva – Ekugcineni, cishe uzofuna ukwenza okuthile okukhanyayo kokuphuma kwe-batch (namafayela e-JSONL e-Amazon S3) ukuhlanganisa izimpendulo futhi ujoyine okuphumayo emuva kokufaka kokuqala. Isibonelo, lapho udala ukushumeka kombhalo, kufanele ube nendlela yokuthola imephu yokukhipha imephu emuva embhalweni wabo womthombo. Le misebenzi ye-AWS Lambda elungisekayo ibangelwa njengengxenye yemisebenzi yesinyathelo isebenza ngomsebenzi ngemuva kwemiphumela ye-batch ifika e-Amazon S3.

Ezingxenyeni ezilandelayo, sihamba ngezinyathelo zokuhambisa isitaki se-AWS CDK kwimvelo yakho ye-AWS.

Izimfuneko

Qedela lezi zinyathelo zokuqala ezilandelayo:

  1. Faka i-node ne-npm.
  2. Faka i-AWS CDK:
  1. Clone the github indawo yokuhlala endaweni yakho yentuthuko yendawo:
git clone 
cd poc-to-prod/bedrock-batch-orchestrator

Sebenzisa Isixazululo

Faka amaphakheji adingekayo ngekhodi elandelayo:npm i

Bheka prompt_templates.py ifayela bese wengeza ithempulethi entsha entsha ukuze prompt_id_to_template icala lakho lokusebenzisa oyifunayo.

prompt_id_to_template kuyinto lapho ukhiye yi- prompt_id (ukuvumela ukuthi uhlobanise umsebenzi onikezwe ngesikhashana). Ufometha okhiye kwithempulethi ye-Prompt string kumele futhi ibe khona kufayela lakho lokufaka. Isibonelo, cabanga ngethempulethi elandelayo elandelayo:

You are an AI assistant tasked with providing accurate and justified answers to users' questions.
    
You will be given a task, and you should respond with a chain-of-thought surrounded by  tags, then a final answer in  tags.

For example, given the following task:


You are given an original reference as well as a system generated reference. Your task is to judge the naturaleness of the system generated reference. If the utterance could have been produced by a native speaker output 1, else output 0. System Reference: may i ask near where? Original Reference: where do you need a hotel near?.



The utterance "may i ask near where?" is not natural. 
This utterance does not make sense grammatically.
Thus we output 0.


0

Your turn. Please respond to the following task:


{source}

Kufanele uqiniseke ukuthi idatha yakho yokufaka inekholomu yokhiye ngamunye wokufometha (ngokwesibonelo, source kwikhodi eyisibonelo eyedlule).

Amathempulethi asheshayo awasetshenziselwa imisebenzi esekwe kumodelinpm run cdk deploy

Qaphela imiphumela ye-AWS Cloudformation Events ebonisa amagama webhakede kanye nemisebenzi yesinyathelo somsebenzi ukuhamba komsebenzi:

✅ BedrockBatchOrchestratorStack

✨ Deployment time: 23.16s

Outputs:
BedrockBatchOrchestratorStack.bucketName = batch-inference-bucket-
BedrockBatchOrchestratorStack.stepFunctionName = bedrockBatchOrchestratorSfnE5E2B976-4yznxekguxxm
Stack ARN:
arn:aws:cloudformation:us-east-1::stack/BedrockBatchOrchestratorStack/0787ba80-b0cb-11ef-a481-0affd4b49c99

✨ Total time: 26.74s

Isakhiwo sokufaka umsebenzi

Njengodathasethi yakho yokufaka, ungasebenzisa i-ID yobuso be-hugging yobuso noma iphuzu ngqo kudathabhethi ku-Amazon S3 (CSV noma i-Parquet Fomakhiwo kusekelwa ngesikhathi sokubhala). Umthombo wedathafathi yokufaka kanye nohlobo lwemodeli (ukukhiqizwa kombhalo noma ukushumeka) Qamba ukwakheka kwemisebenzi ye-Step Inctions.

Ukuqabula phansi dataset yobuso

Ukuze uthole imininingwane yobuso be-hugging, ireferensi I-ID yedatha (ngokwesibonelo, w601sxs/simpleCoT) nokuhlukaniswa (ngokwesibonelo, train), futhi i-dataset yakho izodonswa ngokuqondile kusuka ekugobeni kobuso.

Ukugoba Imephu Yobuso

Le khasi question_answering template esheshayo ngaphakathi prompt_templates.py inenkinobho yokufometha ebizwa ngokuthi source Ukufanisa igama lekholomu efanelekile kudathabhethi ekhonjiswayo (bheka isibonelo esandulele). Sisebenzisa lokhu ngokushesha ukukhiqiza umgomo kanye nempendulo yemigqa ngayinye eyizigidi eziyi-2.2 kudathabhethi. Bona ikhodi elandelayo:

{
  "job_name_prefix": "full-cot-job",
  "model_id": "us.anthropic.claude-3-5-haiku-20241022-v1:0",
  "prompt_id": "question_answering",
  "dataset_id": "w601sxs/simpleCoT",
  "split": "train",
  "max_records_per_job": 50000
}

Siphinde sibe nezinkinobho zokuzikhethela max_num_jobs (ukukhawulela inani eliphelele lemisebenzi, eliwusizo ekuhlolweni kwesilinganiso esincane) kanye max_records_per_batch.

Idathasethi ye-Amazon S3

Faka ifayela le-CSV noma le-Parquet kubhakede le-S3 bese ukopisha i-S3 URI. Ngokwesibonelo:aws s3 cp topics.csv s3://batch-inference-bucket-/inputs/jokes/topics.csv

Vula izinyathelo zakho zisebenza ngomshini wombuso we-Step Console bese uthumela okokufaka ngesakhiwo esilandelayo. Kufanele unikeze i- s3_uri kuma-s3 datasets.

Isibonelo, amamodeli we-anthropic ngokufakwa kwe-Amazon S3, sebenzisa ikhodi elandelayo:

{
"s3_uri": "s3://batch-inference-bucket-/inputs/jokes/topics.csv",
"job_name_prefix": "test-joke-job1",
"model_id": "anthropic.claude-3-haiku-20240307-v1:0",
"prompt_id": "joke_about_topic"
}

Le khasi prompt_id iwa- joke_about_topic Amamephu ku-template esheshayo ngaphakathi prompt_templates.pyonokhiye wokufomatha we topicokumele kube ngomunye wamakholomu kufayela le-CSV lokufaka.

Khiqiza ukushumeka kwe-batch

Ukukhiqiza ukushumeka ngemodeli efana ne-amazon titan umbhalo oshumekiwe v2, awudingi ukunikeza a prompt_idkepha udinga ukuqiniseka ukuthi ifayela lakho le-CSV linekholomu ebizwa ngokuthi input_text ngombhalo ofuna ukushumeka. Ngokwesibonelo:

{
"s3_uri": "s3://batch-inference-bucket-/inputs/embeddings/embedding_input.csv",
"job_name_prefix": "test-embeddings-job1",
"model_id": "amazon.titan-embed-text-v2:0",
"prompt_id": null
}

Imisebenzi yesinyathelo iyasebenza

Umdwebo olandelayo ukhombisa isibonelo semisebenzi yesinyathelo ephumelelayo ukusebenza kwenziwa kwenziwa.

Ukugeleza kwenqubo

Lapho kuqalwa umshini wesinyathelo, uqeda lezi zinyathelo ezilandelayo:

  1. Ukufakwa kokufakwa kwangaphambili ukulungiselela okokufaka kwe-batch umsebenzi we-ID yakho ethile yemodeli kanye nethempulethi esheshayo. Le khasi BaseProcessor Isigaba esingabonakali singanwetshwa ngokushesha kwabanye abahlinzeki bemodeli, njengeMeta Llama 3 noma i-Amazon Nova.
  2. Imisebenzi ye-archestrate batch imfashini eqhutshwa ngumcimbi. Sigcina ukusungulwa kwangaphakathi kwemisebenzi etafuleni elishukumisayo futhi sikugcine kuvuselelwa lapho i-Amazon Bedrock ikhipha imicimbi ehlobene nemicimbi yesimo somsebenzi. Lokhu kubuyekezwa bese kudluliselwa emuva kumsebenzi wesinyathelo usebenzisa Lindela ukubuyiselwa emuva komsebenzi iphethini yokuhlanganisa. Usebenzisa imephu ye-SFN, siyaqiniseka ukuthi umthamo omkhulu wemisebenzi efanayo ugcinwa kuze kube yilapho amarekhodi eseculwe.
  3. Gijimani ngemuva kokuphuma kwe-batch okuphumayo ukwenza ezinye izimpendulo ezikhanyayo bese uhlanganisa izimpendulo zemodeli emuva kudatha yokufaka yokuqala usebenzisa inkambu eqoshiwe njengokhiye wokujoyina njengokhiye wokujoyina njengokhiye wokujoyina njengokhiye wokujoyina njengokhiye wokujoyina njengokhiye wokujoyina njengokhiye wokujoyina njengokhiye wokujoyina. Imininingwane yokukhipha incike ngohlobo lwemodeli oyisebenzisayo. Amamodeli asuselwa embhalweni, intambo yokuphuma izoba kwikholamu entsha ebizwa ngokuthi response.

Gada umshini wakho wombuso njengoba ugijimisa imisebenzi. Inani eliphakeme lemisebenzi efanayo lilawulwa yi-AWS CDK Context Demotion in cdk.json (Ukhiye: maxConcurrentJobs). Izindlela eziya kumafayili wakho we-PARQUET amafayela azohlanganiswa emikhawulweni evela ekubulaweni.

Amafayela we-Parquet akhiphayo azoqukatha amakholomu afanayo nefayela lakho lokufaka eceleni kwezimpendulo ezikhiqizwayo.

Ngemodeli yesizukulwane sombhalo, intambo yokuphuma izoba kwikholamu entsha ebizwa ngokuthi responsenjengoba kukhonjisiwe kusikrini esilandelayo sokukhipha isampula.

Isicelo sesampula nempendulo

Amamodeli wokushumeka, okukhiphayo (uhlu lwezintanta) kuzoba kukholamu entsha ebizwa ngokuthi embeddingnjengoba kukhonjisiwe ku-skrini elandelayo.

Isampula yesampula

Awekho ama-slas aqinisekisiwe e-batch aphahlo apifere. Izikhathi ze-rusttimes zizohluka ngokuya ngesidingo semodeli oyifunayo ngesikhathi sesicelo sakho. Isibonelo, ukucubungula amarekhodi ayizigidi ezingama-2.2 kwidathashi ye-Simplecot, ukubulawa kwasakazwa emisebenzini emi-45 yokucubungula ngakunye, okuphezulu kwemisebenzi engama-20 evamile ngesikhathi esinikeziwe. Ekuhlolweni kwethu nge-anthropic's Claude Haiku 3.5 ku us-east-1 Isifunda, ukubulawa komsebenzi ngamunye kuthathe isilinganiso samahora ayi-9, isikhathi esiphelele sokuqeda ukuqeda cishe amahora angama-27.

Hlanza

Ukugwema ukufuya izindleko ezingezekile, ungahlanza izinsizakusebenza ze-Stack ngokusebenza cdk destroy.

Ukugcina

Kulokhu okuthunyelwe, sichaze ukwakhiwa okungenasici kokwenza ukucubungula ama-batch amakhulu ngokusebenzisa i-Amazon Bedrock Batch Ipharent. Sihlole ukusebenzisa ikhambi lamacala ahlukahlukene okusetshenziswa, kufaka phakathi ilebula ledatha enkulu kanye nokushumeka isizukulwane. Ungakha futhi idatha enkulu yokwenziwa yedatha kusuka kumodeli yothisha esetshenziselwa ukuqeqesha imodeli yabafundi njengengxenye yenqubo yokuhlonza imodeli.

Isixazululo sitholakala esidlangalaleni e-GitHub Repo. Ngeke silinde ukubona ukuthi ukubeke kanjani lo mbuso ukuze usebenzele amacala akho okusebenzisa.


Mayelana nababhali

Swagat Kulkarni Ingabe ukwakhiwa kwezixazululo eziphezulu ku-AWS kanye nodokotela osebenzayo we-AI ai. Unothando lokusiza amakhasimende axazulule izinselelo zangempela zomhlaba ezisebenzisa izinsizakalo ze-Cloud-Natitity Services kanye nokufunda ngomshini. Ngemuva eliqinile ekushayeleni uguquko lwedijithali emikhakheni ehlukahlukene, i-swagat ilethe izixazululo ezinomthelela ezinika amandla okusha kanye nesilinganiso. Ngaphandle komsebenzi, uyakujabulela ukuhamba, ukufunda nokupheka.

Evan veimald Ungunjiniyela wokufunda wedatha noMshini nge-AWS Professional Services, lapho asiza khona amakhasimende ama-AWS athuthukise futhi athumele izixazululo ze-ML ezinhlobonhlobo zezimboni vermicals. Ngaphambi kokujoyina ama-AWS, wathola ama-MS avela eCarnegie Mellon University, lapho aqhuba khona ucwaningo ekuxhumaneni kwe-Advanced Everatuf nase-AI. Ngaphandle komsebenzi, uyakujabulela ukuhamba ngebhayisekili nentaba ekhuphuka.

Shreyas Subramannian Ungusosayensi wedatha oyinhloko futhi usiza amakhasimende ngokusebenzisa i-AI ekhiqizayo kanye nokufunda okujulile ukuxazulula izinselelo zebhizinisi labo kusetshenziswa izinsizakalo ze-AWS njenge-Amazon Bedrock ne-Agentcore. UDkt Subramannian unomthelela ocwaningweni onqenqemeni olujulile ekufundeni okujulile, ama-agentic ai, amamodeli wesisekelo kanye namasu wokusiza ngezincwadi eziningana, amaphepha kanye namalungelo obunikazi egameni lakhe. Eqhakaza lakhe njengamanje e-Amazon, uDkt Subramannian usebenza nabaholi abahlukahlukene besayensi kanye namaqembu acwaninga ngaphakathi nangaphandle kwama-Amazon, asiza ukuqondisa amakhasimende ukuba axazulule izinkinga ze-algorithms ezibucayi. Ngaphandle kwe-AWS, uDkt Subramannian ungumbuyekezi wesazi wamaphepha we-AI kanye nokuxhaswa ngezimali ngezinhlangano ezinjenge-neurips, iCML, iCLR, NSF.

Source link

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top button