From Prompt to Deployed Hugging Face Model

Most ML projects do not fail because of model selection. They fail in the messy middle: finding the right dataset, checking that it is usable, writing training code, fixing errors, reading logs, debugging weak results, evaluating outputs, and packaging the model for others.
That is where ML Intern comes in. It is not just AutoML for model selection and tuning. It supports the broader ML engineering workflow: research, dataset inspection, coding, job execution, debugging, and preparation for Hugging Face. In this article, we test whether ML Intern can turn an idea into a working ML artifact faster, and whether it deserves a place in your AI stack.
What Is ML Intern
ML Intern is an open-source assistant for machine learning work, built around the Hugging Face ecosystem. It can use docs, papers, datasets, repos, jobs, and cloud compute to move an ML task forward.
Unlike typical AutoML, it does not focus only on model selection and training. It also helps with the messy parts around training: researching approaches, inspecting data, writing scripts, fixing errors, and preparing results for sharing.
Think of AutoML as a model-building machine. ML Intern is closer to a junior ML teammate. It can help read, plan, code, run, and report, but it still needs supervision.
Project Goal
For this walkthrough, I gave ML Intern one practical machine learning task: build a text classification model that labels customer support tickets by issue type.
The project had to use a public Hugging Face dataset, fine-tune a lightweight transformer, evaluate results with accuracy, macro F1, and a confusion matrix, and prepare the final model for publishing on the Hugging Face Hub.
To test ML Intern properly, I used one complete project instead of demonstrating isolated features. The goal was not just to see whether it could generate code, but whether it could move through the full ML workflow: research, dataset inspection, script generation, debugging, training, evaluation, publishing, and demo creation.
This kept the experiment close to a real ML project, where success depends on more than model choice.
Now, let's go through the walkthrough step by step:
Step 1: Start with a clear project prompt
I started by giving ML Intern a specific task instead of a vague request.
Build a text classification model that labels customer support tickets by issue type.
1. Use a public Hugging Face dataset.
2. Use a lightweight transformer model.
3. Evaluate the model using accuracy, macro F1, and a confusion matrix.
4. Prepare the final model for publishing on the Hugging Face Hub.
Do not run any expensive training job without my approval.
This prompt defined the goal, the model type, the evaluation method, the final deliverable, and a safety rule for compute.

Step 2: Dataset research and selection
ML Intern searched for suitable public datasets and selected the Bitext customer support dataset. It identified the useful fields: instruction as the input text, category as the classification label, and intent as a finer-grained intent label.
It then summarized the dataset:
| Dataset detail | Result |
| Dataset | bitext/Bitext-customer-support-llm-chatbot-training-dataset |
| Rows | 26,872 |
| Categories | 11 |
| Intents | 27 |
| Average text length | 47 characters |
| Missing values | None |
| Duplicates | 8.3% |
| Main issue | Moderate class imbalance |
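
A summary like the one above can be reproduced with a few lines of plain Python. The sketch below is an assumption about how such a check might look, not ML Intern's actual script: `summarize` is a hypothetical helper, and only the field names `instruction` and `category` come from the dataset described above.

```python
from collections import Counter


def summarize(rows, text_field="instruction", label_field="category"):
    """Return basic health statistics for a list of dict rows."""
    texts = [row.get(text_field) for row in rows]
    labels = [row.get(label_field) for row in rows]
    present = [text for text in texts if text]
    # A row counts as missing if either its text or its label is empty.
    missing = sum(1 for text, label in zip(texts, labels) if not text or not label)
    # Duplicates: every repeat of a text beyond its first occurrence.
    duplicates = len(present) - len(set(present))
    return {
        "rows": len(rows),
        "classes": len({label for label in labels if label}),
        "avg_chars": sum(len(text) for text in present) / max(len(present), 1),
        "missing": missing,
        "duplicate_rate": duplicates / max(len(rows), 1),
        "class_counts": Counter(label for label in labels if label),
    }
```

To run this on the real data, the rows could be loaded with `datasets.load_dataset` using the dataset ID from the table (the split name would need to be checked on the dataset page) and passed to `summarize` as a list of dicts. The `class_counts` entry is what makes class imbalance visible.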

Step 3: Smoke test and debugging
Before training the full model, ML Intern wrote a training script and tested it on a small sample.
The smoke test surfaced two issues. The label column needed to be converted to ClassLabel, and the metric function needed to handle cases where a small evaluation set did not contain all 11 classes.
ML Intern fixed both issues and confirmed that the script ran to completion.
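
The article does not show the fixes themselves, so the following is only a sketch of what they might look like. In the `datasets` library a string column can be turned into a ClassLabel with `Dataset.class_encode_column`; the first helper below mimics the name-to-id mapping that results. The second helper is one possible way (an assumption, not the confirmed fix) to compute macro F1 over a fixed label space without failing when some classes are absent from a small sample: classes that appear in neither labels nor predictions are skipped.

```python
def build_label_maps(names):
    """Map class names to stable integer ids, as a ClassLabel feature would."""
    ordered = sorted(set(names))
    label2id = {name: i for i, name in enumerate(ordered)}
    id2label = {i: name for name, i in label2id.items()}
    return label2id, id2label


def macro_f1(y_true, y_pred, num_classes):
    """Macro F1 that tolerates classes missing from a small evaluation set."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        if tp + fp + fn == 0:
            # Class absent from both labels and predictions: skip it
            # instead of dividing by zero or counting it as a failure.
            continue
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0
```

Skipping absent classes keeps a tiny smoke-test sample from deflating the score. On the full test set, where all 11 classes appear, the result matches the usual macro average.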

Step 4: Training plan and approval
After the script passed the smoke test, ML Intern created a training plan.
| Item | Plan |
| Model | distilbert/distilbert-base-uncased |
| Parameters | 67M |
| Classes | 11 |
| Learning rate | 2e-5 |
| Epochs | 5 |
| Batch size | 32 |
| Best-model metric | Macro F1 |
| Estimated GPU cost | About $0.20 |
This was the approval checkpoint. ML Intern did not start the training job automatically.
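
The idea of an approval checkpoint can be made concrete with a small gate in front of the launch step. This is a hypothetical sketch, not how ML Intern implements it: `TrainingPlan` and `launch` are illustrative names, and the default values are copied from the plan table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingPlan:
    model: str = "distilbert/distilbert-base-uncased"
    num_classes: int = 11
    learning_rate: float = 2e-5
    epochs: int = 5
    batch_size: int = 32
    best_model_metric: str = "macro_f1"
    estimated_cost_usd: float = 0.20


def launch(plan, approved, budget_usd):
    """Return a launch decision; never start work without explicit approval."""
    if not approved:
        return "blocked: waiting for human approval"
    if plan.estimated_cost_usd > budget_usd:
        return "blocked: estimate exceeds budget"
    return f"launching {plan.model} for {plan.epochs} epochs"
```

Making the plan immutable (`frozen=True`) means the thing a human approved is exactly the thing that runs.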


Step 5: Pre-training review
Before approving training, I asked ML Intern to do a final review.
Before proceeding, do a final pre-training review.
Check:
1. any risk of data leakage
2. whether class imbalance needs handling
3. whether hyperparameters are reasonable
4. expected baseline performance vs fine-tuned performance
5. any potential failure cases
Then confirm if the setup is ready for training.

ML Intern checked for leakage, class imbalance, hyperparameter choices, baseline performance, and potential failure cases. It concluded that the setup was ready for training.
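
Two of these checks are easy to express in code. With 8.3% duplicates in the dataset, the same ticket text can land in both the training and test splits, which inflates test scores. The sketch below shows one way to detect that overlap and to compute inverse-frequency class weights for the imbalance check. Both helpers are illustrative assumptions; the article does not say which method ML Intern used.

```python
from collections import Counter


def normalize(text):
    """Lowercase and collapse whitespace so near-identical texts compare equal."""
    return " ".join(text.lower().split())


def leaked_texts(train_texts, test_texts):
    """Return normalized test texts that also occur in the training split."""
    train_seen = {normalize(text) for text in train_texts}
    return sorted({normalize(text) for text in test_texts} & train_seen)


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    total, num_classes = len(labels), len(counts)
    return {label: total / (num_classes * count) for label, count in counts.items()}
```

A non-empty result from `leaked_texts` suggests deduplicating before splitting. Weights far from 1.0 indicate classes that may need reweighting or resampling.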

Step 6: Compute control and CPU fallback
ML Intern tried to launch the training job on Hugging Face GPU hardware, but the job was rejected because the namespace had no available credits.
Instead of stopping, ML Intern switched to a free CPU sandbox. This was slower, but it let the project continue without any compute charges.
I then used a stricter training prompt:
Proceed with the training job using the approved plan, but keep compute cost low.
While running:
1. log training loss and validation metrics
2. monitor for overfitting
3. save the best checkpoint
4. use early stopping if validation macro F1 stops improving
5. stop the job immediately if errors or abnormal loss appear
6. keep the run within the estimated budget
ML Intern optimized the CPU run and continued safely.
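
Item 5 of the prompt, stopping on abnormal loss, can be approximated with a small guard that runs after every logging step. This is a minimal sketch under stated assumptions: `abnormal_loss` is a hypothetical helper, and the spike factor and window size are arbitrary illustrative defaults, not values from the article.

```python
import math


def abnormal_loss(history, new_loss, spike_factor=3.0, window=5):
    """Return True if new_loss is NaN/inf or far above the recent average."""
    if not math.isfinite(new_loss):
        return True
    recent = history[-window:]
    if not recent:
        # Nothing to compare against yet, so accept the first finite value.
        return False
    return new_loss > spike_factor * (sum(recent) / len(recent))
```

A training loop would call this with the logged loss history before appending each new value, and abort the job when it returns True.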


Step 7: Training progress
During training, ML Intern monitored the loss and validation metrics.
The loss dropped quickly during the first epoch, which showed that the model was learning. ML Intern also checked for overfitting across all epochs.
| Epoch | Accuracy | Macro F1 | Status |
| 1 | 99.76% | 99.78% | Strong start |
| 2 | 99.68% | 99.68% | Slight dip |
| 3 | 99.88% | 99.88% | Best checkpoint |
| 4 | 99.80% | 99.80% | Slight drop |
| 5 | 99.80% | 99.80% | Best checkpoint kept |
The best checkpoint came from epoch 3.
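
The best-checkpoint logic can be replayed on the macro F1 values from the table. The sketch below is an illustration, not ML Intern's code: `track_best` is a hypothetical helper, and the patience values are assumptions, since the article does not state which early-stopping patience was used.

```python
def track_best(scores, patience=2):
    """Return (best_epoch, stop_epoch) for 1-indexed per-epoch scores.

    stop_epoch is None if early stopping never triggers.
    """
    best_epoch, best_score, stale = 0, float("-inf"), 0
    for epoch, score in enumerate(scores, start=1):
        if score > best_score:
            best_epoch, best_score, stale = epoch, score, 0
        else:
            stale += 1
            if stale >= patience:
                return best_epoch, epoch
    return best_epoch, None


# Validation macro F1 per epoch, copied from the table above.
MACRO_F1_BY_EPOCH = [99.78, 99.68, 99.88, 99.80, 99.80]
```

With a patience of 2, the tracker keeps epoch 3 as the best checkpoint and only triggers at epoch 5. With a patience of 1, it would have stopped after the dip at epoch 2 and never reached the epoch 3 checkpoint, which is why the patience setting matters even on a short run.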


Step 8: Final training report
After training, ML Intern reported the final result.
| Metric | Result |
| Test accuracy | 100.00% |
| Macro F1 | 100.00% |
| Training time | 59.6 minutes |
| Total time | 60.1 minutes |
| Hardware | CPU sandbox |
| Compute cost | $0.00 |
| Best checkpoint | Epoch 3 |
| Model repo | Janvi17/customer-support-ticket-classifier |
This showed that the complete project could be finished even without GPU credits.


Step 9: Thorough evaluation
Next, I asked ML Intern to go beyond the standard metrics.
Evaluate the final model thoroughly.
Include:
1. accuracy
2. macro F1
3. per-class precision, recall, F1
4. confusion matrix analysis
5. 5 examples where the model is wrong
6. explanation of failure patterns
The model achieved perfect results on the held-out test set. Every class had precision, recall, and F1 of 1.0.
But ML Intern also looked deeper. It analyzed prediction confidence and cases near the decision boundaries to understand where the model might be fragile.
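
For readers who want to see what the per-class numbers are made of, here is a dependency-free sketch of a confusion matrix and the precision, recall, and F1 derived from it. It illustrates the standard definitions and is not the evaluation script ML Intern produced. The function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def per_class_report(matrix, labels):
    """Derive precision, recall, and F1 for each class from the matrix."""
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report
```

A perfect model produces a purely diagonal matrix, which is what the reported 1.0 scores for every class imply. Off-diagonal cells show which pairs of categories get confused.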

Step 10: Failure analysis
Because the test set produced no errors, ML Intern probed the model with hard examples.
| Failure type | Example | Issue |
| Negation | “Don't refund me, just fix the product” | Model focused on “refund” |
| Ambiguous input | “How do I contact someone about my shipping issue?” | Several labels could apply |
| Heavy typos | “I don't want to talk to a man” (misspellings lost in extraction) | Typos confuse the model |
| Gibberish | “asdfghjkl” | There is no unknown class |
| Multiple intents | “Your delivery service is terrible, I want to complain” | Forced to pick a single label |
This mattered because it made the evaluation more honest. The model did very well on the test set, but it still carried production risks.

Step 11: Improvement suggestions
After the evaluation, I asked ML Intern to suggest improvements without launching another training job.
It recommended:
| Improvement | Why it helps |
| Typo and paraphrase augmentation | Improves robustness on messy real-world text |
| UNKNOWN class | Handles gibberish and unrelated inputs |
| Label smoothing | Reduces overconfidence |
The UNKNOWN class was especially important because the model currently must always pick one of the known support categories.
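
Training a real UNKNOWN class requires new data and another training run. A cheaper stopgap, shown below, is to reject low-confidence predictions at inference time. This is a different technique from the one ML Intern recommended, and everything in the sketch is an assumption: the threshold value, the `predict_with_unknown` helper, and the three placeholder label names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_unknown(logits, labels, threshold=0.7):
    """Return (label, confidence), falling back to UNKNOWN below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "UNKNOWN", probs[best]
    return labels[best], probs[best]
```

The caveat is that thresholding only works when the model is uncertain on bad inputs. An overconfident model can assign a high score to gibberish, which is one reason label smoothing and a trained UNKNOWN class appear in the list above.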

Step 12: Model card and Hugging Face publishing
Next, I asked ML Intern to prepare the model for publishing.
Prepare the model for publishing on Hugging Face Hub.
Create:
1. model card
2. inference example
3. dataset attribution
4. evaluation summary
5. limitations and risks
ML Intern created a complete model card. It included dataset attribution, metrics, per-class results, training details, inference examples, limitations, and risks.
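
A Hugging Face model card is a README.md file with a YAML metadata block followed by Markdown. The sketch below assembles a minimal one from values reported earlier in this article. It is a hypothetical outline, not the card ML Intern wrote: `build_model_card` is an illustrative helper, the section layout is an assumption, and fields such as the license are left out because the article does not state them.

```python
def build_model_card(repo_id, base_model, dataset_id, metrics, limitations):
    """Render a README.md string: YAML metadata block, then Markdown sections."""
    lines = [
        "---",
        "pipeline_tag: text-classification",
        f"base_model: {base_model}",
        "datasets:",
        f"- {dataset_id}",
        "metrics:",
        "- accuracy",
        "- f1",
        "---",
        "",
        f"# {repo_id}",
        "",
        f"Fine-tuned from `{base_model}` on `{dataset_id}`.",
        "",
        "## Evaluation",
        "",
    ]
    lines += [f"- {name}: {value}" for name, value in metrics.items()]
    lines += ["", "## Limitations and risks", ""]
    lines += [f"- {item}" for item in limitations]
    return "\n".join(lines) + "\n"


CARD = build_model_card(
    repo_id="Janvi17/customer-support-ticket-classifier",
    base_model="distilbert/distilbert-base-uncased",
    dataset_id="bitext/Bitext-customer-support-llm-chatbot-training-dataset",
    metrics={"Test accuracy": "100.00%", "Macro F1": "100.00%"},
    limitations=[
        "No UNKNOWN class: gibberish is forced into a known category.",
        "Negation, heavy typos, and multi-intent tickets can mislead the model.",
    ],
)
```

Putting the limitations in the card keeps the failure analysis next to the perfect test scores, so readers see both.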

Step 13: Gradio demo
Finally, I asked ML Intern to build a demo.
Create a simple Gradio demo for this model.
The app should:
1. take a support ticket as input
2. return predicted category
3. show confidence score
4. include example inputs
ML Intern created a Gradio app and deployed it as a Hugging Face Space.
The demo included a text box, the predicted category, a confidence score, a per-class breakdown, and example inputs.
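
The app's source is not shown in the article, so the following is only a sketch of how a demo with these elements might be wired together. It assumes recent versions of `gradio` and `transformers`, and that the model repo named in the training report is publicly loadable. The example tickets are borrowed from the failure analysis, and `to_label_scores` and `build_demo` are hypothetical names.

```python
MODEL_ID = "Janvi17/customer-support-ticket-classifier"

# Example inputs shown under the text box (illustrative choices).
EXAMPLES = [
    ["Don't refund me, just fix the product"],
    ["How do I contact someone about my shipping issue?"],
]


def to_label_scores(raw):
    """Convert [{'label': ..., 'score': ...}, ...] into {label: score}."""
    return {item["label"]: float(item["score"]) for item in raw}


def build_demo():
    # Heavy imports stay inside the function so the helpers above
    # can be imported and tested without gradio or transformers installed.
    import gradio as gr
    from transformers import pipeline

    # top_k=None asks the pipeline for scores of every class, not just the best.
    classifier = pipeline("text-classification", model=MODEL_ID, top_k=None)

    def classify(ticket):
        # A list input returns one list of {label, score} dicts per ticket.
        return to_label_scores(classifier([ticket])[0])

    return gr.Interface(
        fn=classify,
        inputs=gr.Textbox(lines=3, label="Support ticket"),
        outputs=gr.Label(num_top_classes=3, label="Predicted category"),
        examples=EXAMPLES,
        title="Customer support ticket classifier",
    )


if __name__ == "__main__":
    build_demo().launch()
```

The `gr.Label` output accepts a dictionary of class names to confidences, so a single component covers the predicted category, its confidence score, and the per-class breakdown.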
Demo link:


Here is the deployed model: Janvi17/customer-support-ticket-classifier

ML Intern did not just train a model. It went through the full ML engineering loop: planning, testing, debugging, adapting to compute constraints, evaluating, documenting, and deploying.
Strengths and Risks of ML Intern
As the walkthrough shows, ML Intern is impressive. But it comes with its own share of strengths and risks:
| Strengths | Risks |
| Researches before coding | May pick the wrong data |
| Writes and tests scripts | May trust misleading metrics |
| Fixes common errors | May propose weak fixes |
| Helps publish artifacts | May expose cost or data risks |
The safest approach is simple. Let ML Intern do the repetitive work, but keep a human in control of data, compute, evaluation, and publishing.
ML Intern vs AutoML
AutoML usually starts with a prepared dataset. You define the target column and a metric, and AutoML searches for a good model.
ML Intern starts earlier. It can begin from a natural language goal. It helps with research, planning, dataset inspection, code generation, debugging, training, evaluation, and publishing.
| Area | AutoML | ML Intern |
| Starting point | Prepared dataset | Natural language goal |
| Main focus | Model training | Full ML workflow |
| Dataset work | Limited | Searches for and inspects data |
| Debugging | Limited | Handles errors and fixes |
| Output | Model or pipeline | Code, metrics, model card, demo |
AutoML is best for well-structured tasks. ML Intern is better suited to messy ML engineering workflows.
ML Intern is not limited to text classification. It can also support Kaggle-style experimentation. Here are some other use cases for ML Intern:
| Use case | Why ML Intern helps |
| Image and video fine-tuning | Handles research, code, and evaluation |
| Medical classification | Helps with dataset search and model adaptation |
| Kaggle workflows | Supports iteration, debugging, and submission |
These examples show the broader promise. ML Intern is useful when the work combines reading, planning, coding, evaluating, improving, and shipping.
Conclusion
ML Intern is most useful when we stop treating it like magic and start treating it like a junior ML engineering assistant. It can help with planning, coding, debugging, training, evaluation, packaging, and deployment. But it still needs a human to supervise decisions about data, compute, evaluation, and publishing. In this project, a human stayed in control of the key checkpoints, while ML Intern handled much of the repetitive engineering work. That is the real value: not replacing ML engineers, but helping more ML ideas move from a prompt to a working artifact.
Frequently Asked Questions
Q1. What is ML Intern?
A. ML Intern is an open-source assistant that helps with ML research, coding, debugging, training, evaluation, and publishing.
Q2. How is ML Intern different from AutoML?
A. AutoML focuses mainly on model training, while ML Intern supports the full ML engineering workflow.
Q3. Does ML Intern replace ML engineers?
A. No. It handles repetitive tasks, but humans still need to direct the data, compute, evaluation, and publishing decisions.