
How to Build Effective Technical Guardrails for AI Applications

with control and an assurance of safety. Guardrails provide that for AI applications. But how can they be built into applications?

A few guardrails are established even before coding begins. First, there are legal guardrails set by governments, such as the EU AI Act, which defines acceptable and prohibited use cases for AI. Then there are guardrails set by the company. These indicate which use cases the company considers acceptable for AI, in terms of both security and ethics. Together, these two guardrails filter the use cases for AI adoption.

After passing the first two types of guardrails, an acceptable use case reaches the engineering team. When the engineering team implements the use case, they add technical guardrails to ensure safe use of data and to maintain the expected behavior of the application. This article explores this third type of guardrail.

Top technical guardrails at different layers of an AI application

Guardrails are created at the input (data), model, and output layers. Each serves a unique purpose:

  • Data layer: Guardrails at the data layer ensure that no sensitive, problematic, or incorrect data enters the system.
  • Model layer: Guardrails at this layer make sure the model is working as expected.
  • Output layer: Output layer guardrails ensure the model does not give incorrect answers with high confidence, a common threat with AI systems.
Image by author

1. Data layer

Let's go through the must-have guardrails at the data layer:

(i) Input validation and sanitization

The first thing to check in any AI application is whether the input data is in the correct format and free of inappropriate or offensive language. This is quite easy to do, since most databases offer built-in SQL functions for pattern matching. For instance, if a column is supposed to be alphanumeric, you can validate that the values are in the expected format using a simple regex pattern. Similarly, functions for profanity checks (inappropriate or offensive language) are available in cloud platforms like Microsoft Azure. But you can always build a custom function if your database doesn't have one.

Data validation:
-- The query below only takes entries from the customers table where customer_email_id is in a valid format
-- The 'i' parameter makes the match case-insensitive (regex parameter support varies by database)
SELECT * FROM customers WHERE REGEXP_LIKE(customer_email_id, '^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}$', 'i');
------------------------------------------------------------------------------------------
Data sanitization:
-- Creating a custom offensive_language_check function to detect offensive language
CREATE OR REPLACE FUNCTION offensive_language_check(INPUT VARCHAR)
RETURNS BOOLEAN
LANGUAGE SQL
AS $$
 SELECT REGEXP_LIKE(
   INPUT,
   '\b(abc|...)\b'  -- list of offensive words separated by pipe
 )
$$;
-- Using the custom offensive_language_check function to filter out comments with offensive language
SELECT user_comments FROM customer_feedback WHERE NOT offensive_language_check(user_comments);
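
The same two checks can also run in application code before a record reaches the database. The following is a minimal Python sketch rather than a definitive implementation: the function names, the simplified email pattern (modeled on the SQL regex above), and the placeholder blocklist are all assumptions made for illustration.

```python
import re

# Simplified email pattern (an assumption; it mirrors the SQL regex above
# and is not a full RFC 5322 validator).
EMAIL_PATTERN = re.compile(r"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}", re.IGNORECASE)

# Placeholder blocklist; replace with a maintained list of offensive terms.
BLOCKED_WORDS = {"badword", "slur"}


def is_valid_email(value: str) -> bool:
    """Return True if the whole string looks like an email address."""
    return EMAIL_PATTERN.fullmatch(value) is not None


def contains_blocked_word(text: str) -> bool:
    """Return True if any blocklisted word appears as a whole word."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return any(word in BLOCKED_WORDS for word in words)


def clean_comments(comments: list[str]) -> list[str]:
    """Keep only the comments that pass the blocklist check."""
    return [c for c in comments if not contains_blocked_word(c)]
```

Matching on whole words avoids flagging harmless words that merely contain a blocked term, at the cost of missing deliberate misspellings.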

(ii) Sensitive data protection

Another key consideration in building a secure AI application is ensuring that no PII data reaches the model layer. Most data engineers work with cross-functional teams to flag all PII columns in tables. There are also automated PII identification tools that profile the data and flag PII columns with the help of ML models. Common PII columns are: name, email address, phone number, date of birth, social security number (SSN), passport number, driver's license number, and biometric data. Other examples of PII include health information and financial information.

A common way to prevent this data from entering the system is to apply a de-identification mechanism. This can be as simple as removing the data entirely, or it can involve masking or pseudonymization techniques using hashing, which produces something the model can't interpret.

-- Hashing PII data of customers for data privacy
SELECT SHA2(customer_name, 256) AS hashed_customer_name, SHA2(customer_email, 256) AS hashed_customer_email, … FROM customer_data;
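
One caveat with the plain hash above: low-entropy values such as email addresses can be recovered by hashing guessed values and comparing the results. A keyed hash (HMAC) is a common mitigation. The sketch below shows one way this might look in Python; the key, the column list, and the function names are hypothetical, and in practice the key would come from a secrets manager.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would come from a secrets manager,
# never from source code.
SECRET_KEY = b"replace-with-a-secret-key"

# Hypothetical list of PII columns flagged for this table.
PII_COLUMNS = {"customer_name", "customer_email"}


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest (HMAC) of a normalized value."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify_record(record: dict) -> dict:
    """Replace PII fields with pseudonyms and leave other fields untouched."""
    return {
        column: pseudonymize(str(value)) if column in PII_COLUMNS else value
        for column, value in record.items()
    }
```

Because the same input always maps to the same pseudonym, records can still be joined on the hashed column without exposing the original value.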

(iii) Bias detection and mitigation

Before data enters the model layer, another checkpoint is to validate that it is accurate and free of bias. Some common types of bias are:

  • Selection bias: The input data is incomplete and does not accurately represent the full target audience.
  • Survivorship bias: There is more data for the happy path, making it hard for the model to handle failure scenarios.
  • Racial or association bias: The data favors a certain gender or race because of past patterns or prejudices.
  • Measurement or label bias: The data is incorrect because of a labeling mistake or bias in the person who recorded it.
  • Rare event bias: The input data lacks edge cases, giving an incomplete picture.
  • Temporal bias: The input data is outdated and does not accurately represent the current world.

I wish there were a simple system to detect such biases, but this is grunt work. A data scientist has to sit down, run queries, and test the data for every scenario to detect any bias. For example, if you are building a health app and do not have sufficient data for a specific age group or BMI range, there is a high chance of bias.

-- Identifying if any age group or BMI group has too little data (groups with no rows do not appear in the result)
SELECT age_group, COUNT(*) FROM users_data GROUP BY age_group;
SELECT BMI, COUNT(*) FROM users_data GROUP BY BMI;
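
A grouped count does not list groups that are missing altogether, so it can help to compare the counts against the groups you expect to see. The sketch below is one possible way to do that in Python; the function name, the 5% minimum share, and the group labels in the usage example are assumptions, not values from any standard.

```python
from collections import Counter


def underrepresented_groups(records, column, expected_groups, min_share=0.05):
    """Return expected groups whose share of the records is below min_share.

    Groups that never appear in the data are reported with a share of 0.0.
    """
    counts = Counter(record[column] for record in records)
    total = sum(counts.values())
    flagged = {}
    for group in expected_groups:
        share = counts.get(group, 0) / total if total else 0.0
        if share < min_share:
            flagged[group] = share
    return flagged


# Usage with made-up data: two of the four expected age groups get flagged.
users = (
    [{"age_group": "18-29"}] * 60
    + [{"age_group": "30-49"}] * 38
    + [{"age_group": "50-64"}] * 2
)
flagged = underrepresented_groups(users, "age_group", ["18-29", "30-49", "50-64", "65+"])
```

The right threshold depends on the use case; the point is only to make thin or missing segments visible before training.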

(iv) On-time data availability

Another aspect to verify is data timeliness. The right and relevant data must be available for the models to function well. Some models need real-time data, a few need near real-time data, and for others, batch is enough. Whatever your requirements are, you need a system that monitors whether the latest required data is available.

For instance, if category managers refresh product prices every midnight based on market dynamics, then your model must have data that was last refreshed after midnight. You can have systems in place that alert whenever data is stale, or you can build proactive alerting around the data orchestration layer, monitoring the ETL pipelines for timeliness.

-- Creating an alert if today's data is not available
SELECT CASE WHEN TO_DATE(MAX(last_updated_timestamp)) = TO_DATE(CURRENT_TIMESTAMP()) THEN 'FRESH' ELSE 'STALE' END AS table_freshness_status FROM product_data;
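
The midnight-refresh rule can also be checked in a pipeline step before the model runs. This is a small sketch under the assumption that timestamps are timezone-aware and in UTC; the function name is hypothetical.

```python
from datetime import datetime, timezone


def freshness_status(last_updated, now=None):
    """Return 'FRESH' if the data was refreshed since the most recent midnight.

    Both timestamps are assumed to be timezone-aware and in UTC.
    """
    now = now or datetime.now(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return "FRESH" if last_updated >= midnight else "STALE"
```

A 'STALE' result could then block the model run or raise an alert, depending on how strict the use case is.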

(v) Data integrity

Maintaining integrity is also crucial for model accuracy. Data integrity refers to the accuracy, completeness, and reliability of data. Any old, irrelevant, or incorrect data in the system will make the output go haywire. For instance, if you are building a customer-facing chatbot, it must have access only to the latest company policy files. Access to outdated or incorrect documents may result in hallucinations, where the model merges terms from multiple files and gives the customer a completely inaccurate answer. And you will still be held legally liable for it, as when Air Canada had to compensate a customer after its chatbot wrongly promised a refund.

There are no easy methods to verify integrity. It requires data analysts and engineers to get their hands dirty, verify the files and data, and ensure that only the latest, relevant data is sent to the model layer. Maintaining data integrity is also the best way to control hallucinations, so the model does not fall into garbage in, garbage out.

2. Model layer

After the data layer, the following checkpoints can be built into the model layer:

(i) Role-based user permissions

Safeguarding the AI model layer is important to prevent unauthorized changes that may introduce bugs or bias into the system. It is also necessary to prevent data leakage. You must control who has access to this layer. A standard approach is role-based access control (RBAC), where only employees in authorized roles, such as machine learning engineers, data scientists, or data engineers, can access the model layer.

For instance, DevOps engineers can have read-only access, since they are not supposed to change the model logic. ML engineers can have read-write permissions. Establishing RBAC is an important security practice for maintaining model integrity.
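
In practice RBAC is usually enforced by the platform (cloud IAM, database grants, or a model registry), but the core idea fits in a few lines. The sketch below is illustrative only: the role names, the permission mapping, and the update_model_config helper are hypothetical, with the DevOps and ML engineer permissions taken from the example above.

```python
from enum import Enum


class Permission(Enum):
    READ = "read"
    WRITE = "write"


# Hypothetical role-to-permission mapping; the data scientist entry is an
# assumption added for illustration.
ROLE_PERMISSIONS = {
    "devops_engineer": {Permission.READ},
    "ml_engineer": {Permission.READ, Permission.WRITE},
    "data_scientist": {Permission.READ, Permission.WRITE},
}


def is_allowed(role: str, permission: Permission) -> bool:
    """Unknown roles get no access (deny by default)."""
    return permission in ROLE_PERMISSIONS.get(role, set())


def update_model_config(role: str, config: dict, changes: dict) -> dict:
    """Return an updated copy of the config, but only if the role may write."""
    if not is_allowed(role, Permission.WRITE):
        raise PermissionError(f"role '{role}' may not modify the model layer")
    return {**config, **changes}
```

Denying by default means a new or misspelled role cannot gain write access by accident.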

(ii) Bias audits

Bias handling is a continuous process. Bias can creep into the system over time, even if you did all the necessary checks at the input layer. In fact, some biases, particularly confirmation bias, tend to develop at the model layer. This bias occurs when a model has overfitted to the data, leaving no room for nuance. When a model overfits, it needs a slight calibration. Spline calibration is a popular method for calibrating models. It fits a smooth curve through the data and makes small adjustments so that all the points connect.

import numpy as np
import scipy.interpolate as interpolate
import matplotlib.pyplot as plt
from sklearn.metrics import brier_score_loss


# High-level steps:
# 1. Define input (x) and output (y) data for spline fitting.
# 2. Set B-spline parameters: degree and number of knots.
# 3. Use splrep to compute the B-spline representation.
# 4. Evaluate the spline over a range of x to generate a smooth curve.
# 5. Plot the original data and the spline curve for visual comparison.
# 6. Calculate the Brier score to assess prediction accuracy.
# 7. Use eval_spline_calibration to evaluate the spline on new x values.
# 8. Analyze the plot: check fit quality (good fit, overfitting, underfitting),
#    validate consistency with expected trends, and interpret the Brier score.


######## Sample Code for the steps above ########


# Sample data: Adjust with your actual data points
x_data = np.array([...])  # Input x values, replace '...' with actual data
y_data = np.array([...])  # Corresponding output y values, replace '...' with actual data


# Fit a B-Spline to the data
k = 3  # Degree of the spline, typically cubic spline (cubic is commonly used, hence k=3)
num_knots = 10  # Number of knots for spline interpolation, adjust based on your data complexity
knots = np.linspace(x_data.min(), x_data.max(), num_knots)  # Equally spaced knot vector over data range


# Compute the spline representation
# The function 'splrep' computes the B-spline representation of a 1-D curve
tck = interpolate.splrep(x_data, y_data, k=k, t=knots[1:-1])


# Evaluate the spline at the desired points
x_spline = np.linspace(x_data.min(), x_data.max(), 100)  # Generate x values for smooth spline curve
y_spline = interpolate.splev(x_spline, tck)  # Evaluate spline at x_spline points


# Plot the results
plt.figure(figsize=(8, 4))
plt.plot(x_data, y_data, 'o', label='Data Points')  # Plot original data points
plt.plot(x_spline, y_spline, '-', label='B-Spline Calibration')  # Plot spline curve
plt.xlabel('x') 
plt.ylabel('y')
plt.title('Spline Calibration') 
plt.legend() 
plt.show()  


# Calculate the Brier score for comparison
# The Brier score measures the accuracy of probabilistic predictions; it expects
# binary (0/1) outcomes in y_data and predicted probabilities between 0 and 1
y_pred = np.clip(interpolate.splev(x_data, tck), 0, 1)  # Evaluate spline at original data points, clipped to [0, 1]
brier_score = brier_score_loss(y_data, y_pred)  # Brier score between observed outcomes and predicted probabilities
print("Brier Score:", brier_score)


# Calibration function
# This function allows for the evaluation of the spline at arbitrary x values
def eval_spline_calibration(x_val):
    return interpolate.splev(x_val, tck)  # Return the evaluated spline for input x_val

(iii) LLM as a judge

LLM (large language model) as a judge is an interesting approach to validating models, in which one LLM is used to assess the output of another LLM. It replaces manual intervention and makes it possible to validate responses at scale.

To use an LLM as a judge, you need to build a prompt that evaluates the output. The result of the prompt must be a measurable criterion, such as a score or a rank.

A sample prompt for reference:
Assign a helpfulness score for the response based on the company’s policies, where 1 is the highest score and 5 is the lowest

The output of this prompt can be used to trigger a monitoring framework whenever the results are unexpected.
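
To make that concrete, here is a rough Python sketch of how a judge's reply might be parsed and turned into a review flag. It is not tied to any particular LLM client: call_judge is a hypothetical callable that sends a prompt to whichever model you use and returns its text reply, and the worst_ok cutoff of 2 is an arbitrary assumption.

```python
import re
from typing import Callable, Optional

JUDGE_PROMPT = (
    "Assign a helpfulness score for the response based on the company's "
    "policies, where 1 is the highest score and 5 is the lowest. "
    "Reply with the score only.\n\nResponse:\n{response}"
)


def parse_score(judge_reply: str) -> Optional[int]:
    """Extract the first standalone digit from 1 to 5, or None if absent."""
    match = re.search(r"\b([1-5])\b", judge_reply)
    return int(match.group(1)) if match else None


def needs_review(response: str, call_judge: Callable[[str], str], worst_ok: int = 2) -> bool:
    """Flag a response when the judge's score is missing or worse than worst_ok."""
    reply = call_judge(JUDGE_PROMPT.format(response=response))
    score = parse_score(reply)
    return score is None or score > worst_ok
```

Treating an unparseable reply as a failure keeps the check conservative: a judge that goes off-script leads to a review rather than a silent pass.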

Tip: The best part of recent technological advances is that you don't even have to build an LLM from scratch. There are plug-and-play solutions, such as Meta Llama, which you can download and run on-premises.

(iv) Continuous fine-tuning

For the long-term success of any model, continuous fine-tuning is essential. This is where the model is regularly refined for accuracy. A simple way to achieve this is reinforcement learning from human feedback (RLHF), where human reviewers rate the model's output and the model learns from the ratings. But this process is resource-intensive. To do it at scale, you need automation.

A common fine-tuning method is Low-Rank Adaptation (LoRA). In this technique, you create a separate trainable layer that holds the optimization logic. You can increase output accuracy without modifying the base model. For example, suppose you are building a recommendation system for a streaming platform, and the current recommendations are not resulting in clicks. In the LoRA layer, you build separate logic that groups viewers with similar viewing habits into clusters and uses the cluster data to make recommendations. This layer can be used to make recommendations until the desired accuracy is reached.
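
In its usual formulation, LoRA keeps the base weight matrix W frozen and trains only a small low-rank update, so the effective weights become W + (alpha / r) * B * A, where r is the rank. The pure-Python sketch below shows only that forward pass, with no training loop; the class name, the rank, and the alpha value are assumptions chosen for illustration, and real projects would typically rely on a deep learning framework instead.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen base weights W plus a trainable low-rank update B @ A."""

    def __init__(self, base_weights, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(base_weights), len(base_weights[0])
        self.base_weights = base_weights  # frozen: never updated
        # A starts with small random values and B with zeros, so the
        # adapter initially leaves the base model's output unchanged.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def forward(self, x):
        """Return W x + scale * B (A x)."""
        base_out = matvec(self.base_weights, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base_out, update)]
```

Only A and B would be updated during fine-tuning, which is why the adapter can be trained, swapped, or removed without touching the base model.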

3. Output layer

These are some final checks performed at the output layer for safety:

(i) Content filtering for language, profanity, and keyword blocking

As at the input layer, filtering is also performed at the output layer to detect any offensive language. This double check ensures that the end user does not have a bad experience.
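
A keyword block at the output layer can be as simple as swapping a flagged response for a safe fallback. The following is a minimal sketch of that idea; the keyword list, the fallback message, and the function name are placeholders invented for this example.

```python
import re

# Placeholder terms; a real deployment would load a maintained list.
BLOCKED_KEYWORDS = ["badword", "internal-only"]

FALLBACK_MESSAGE = "Sorry, I can't share that response."


def filter_output(text: str, blocked=BLOCKED_KEYWORDS) -> str:
    """Return the text unchanged, or a fallback if it has a blocked keyword."""
    for keyword in blocked:
        # Lookarounds act as word boundaries that also work for hyphenated terms.
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return FALLBACK_MESSAGE
    return text
```

Returning a fallback rather than an edited response avoids sending the user a partially redacted answer that no longer makes sense.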

(ii) Response validation

Some basic checks on model responses can be done with a simple rule-based framework. These can include simple checks, such as verifying the output format, acceptable values, and more. This is easy to do in both Python and SQL.

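-- Simple rule-based checking to flag invalid responses
-- <condition_1> and <condition_2> are placeholders for your own validation rules
SELECT
CASE
WHEN <condition_1> THEN 'INVALID'
WHEN <condition_2> THEN 'INVALID'
ELSE 'VALID' END AS output_status
FROM
output_table;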
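
Since the same checks can be written in Python, here is a hedged sketch of a rule-based validator. The three rules (the response is a JSON object, status is one of a few allowed values, amount is a non-negative number) are hypothetical examples, not rules taken from any real application.

```python
import json

ALLOWED_STATUSES = {"approved", "rejected", "pending"}  # hypothetical values


def validate_response(raw: str) -> str:
    """Return 'VALID' or 'INVALID' for a raw model response."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return "INVALID"  # rule 1: the output must be valid JSON
    if not isinstance(payload, dict):
        return "INVALID"  # rule 1 (continued): it must be a JSON object
    status = payload.get("status")
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        return "INVALID"  # rule 2: status must be an accepted value
    amount = payload.get("amount")
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount < 0:
        return "INVALID"  # rule 3: amount must be a non-negative number
    return "VALID"
```

Each rule returns early, so new checks can be added one at a time without restructuring the function.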

(iii) Confidence thresholding and human-in-the-loop triggers

No AI model is perfect, and that's fine as long as you can involve a human wherever needed. Some AI tools let you hardcode when to use AI and when to trigger a human in the loop. It is also possible to automate this decision by introducing a confidence threshold. Whenever the model shows low confidence in its output, reroute the request to a human for an accurate answer.

import numpy as np
import scipy.interpolate as interpolate
# One option to generate a confidence score is to evaluate the B-spline (or its derivatives) on the input data
# scipy's interpolate.splev function takes two main inputs:
# 1. x: The x values at which you want to evaluate the spline
# 2. tck: The tuple (t, c, k) representing the knots, coefficients, and degree of the spline. This can be generated using make_splrep (or the older function splrep) or manually constructed
# input_data and tck are assumed to be defined already (for example, tck from the spline calibration step)
# Generate the confidence scores and clip any values outside the range 0 to 1
predicted_probs = np.clip(interpolate.splev(input_data, tck), 0, 1)

# Zip the score with input data
confidence_results = list(zip(input_data, predicted_probs))

# Come up with a threshold and identify all inputs that do not meet the threshold, and use it for manual verification
threshold = 0.5
filtered_results = [(i, score) for i, score in confidence_results if score <= threshold]

# Records that can be routed for manual/human verification
for i, score in filtered_results:
   print(f"x: {i}, Confidence Score: {score}")

(iv) Continuous monitoring and alerting

Like any software application, AI models need a logging and alerting framework that can detect expected (and unexpected) errors. With this guardrail, you have a detailed log of every action and an automated alert when things go wrong.
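
As a rough illustration, the sketch below logs every request outcome with Python's standard logging module and fires an alert when the recent error rate crosses a limit. The class name, the window size, the 20% limit, and the send_alert callable (which might wrap a pager or chat webhook) are all assumptions.

```python
import logging

logger = logging.getLogger("ai_app.guardrails")


class ErrorRateMonitor:
    """Log each request outcome and alert when recent failures pile up."""

    def __init__(self, send_alert, window=20, max_error_rate=0.2):
        self.send_alert = send_alert  # hypothetical callable, e.g. a pager hook
        self.window = window
        self.max_error_rate = max_error_rate
        self.outcomes = []

    def record(self, request_id, ok, detail=""):
        """Log one outcome and alert if the rolling error rate is too high."""
        level = logging.INFO if ok else logging.ERROR
        logger.log(level, "request=%s ok=%s %s", request_id, ok, detail)
        # Keep only the most recent `window` outcomes.
        self.outcomes = (self.outcomes + [bool(ok)])[-self.window:]
        error_rate = self.outcomes.count(False) / len(self.outcomes)
        if len(self.outcomes) == self.window and error_rate > self.max_error_rate:
            self.send_alert(f"error rate {error_rate:.0%} over last {self.window} requests")
```

Waiting for a full window before alerting avoids paging someone over a single failure right after startup.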

(v) Regulatory compliance

Much of the compliance handling happens well before the output layer. Legally acceptable use cases are finalized in the initial requirements-gathering phase itself. Any sensitive data is hashed at the input layer. Beyond this, if there are any regulatory requirements, such as encrypting certain data, they can be handled at the output layer with a simple rule-based framework.

Balance AI with human expertise

Guardrails help you make the most of AI automation while retaining some control over the process. I've covered the common types of guardrails you may need to set at the different layers of a model.

Beyond this, if you encounter any factor that could affect the model's expected output, you can set a guardrail for that too. This article is not a fixed formula, but a guide to identifying (and fixing) the common roadblocks. In the end, your AI application must do what it was meant to do: automate the busy work without any headaches. And guardrails help achieve that.
