How to Design a Fully Functional Business AI Assistant with Retrieval-Augmented Generation and Policy Guardrails Using Open-Source AI

In this tutorial, we explore how to build a compact but powerful business AI assistant that runs entirely on Google Colab. We start by implementing Retrieval-Augmented Generation (RAG) using FAISS for retrieval and Flan-T5 for text generation, both completely open source and free. As we develop, we embed business policies such as data handling, access control, and PII protection directly into the workflow, ensuring the system is both smart and compliant.
!pip -q install faiss-cpu transformers==4.44.2 accelerate sentence-transformers==3.0.1
from typing import List, Dict, Tuple
import re, textwrap, numpy as np, torch
from sentence_transformers import SentenceTransformer
import faiss
from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM
GEN_MODEL = "google/flan-t5-base"
EMB_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
gen_tok = AutoTokenizer.from_pretrained(GEN_MODEL)
gen_model = AutoModelForSeq2SeqLM.from_pretrained(GEN_MODEL, device_map="auto")
generate = pipeline("text2text-generation", model=gen_model, tokenizer=gen_tok)
emb_device = "cuda" if torch.cuda.is_available() else "cpu"
emb_model = SentenceTransformer(EMB_MODEL, device=emb_device)
We start by setting up our environment and loading the necessary models. We load Flan-T5 for text generation and a MiniLM Sentence Transformer for embeddings. We ensure that both models automatically use the GPU when available, so the pipeline runs smoothly.
DOCS = [
{"id":"policy_sec_001","title":"Data Security Policy",
"text":"All customer data must be encrypted at rest (AES-256) and in transit (TLS 1.2+). Access is role-based (RBAC). Secrets are stored in a managed vault. Backups run nightly with 35-day retention. PII includes name, email, phone, address, PAN/Aadhaar."},
{"id":"policy_ai_002","title":"Responsible AI Guidelines",
"text":"Use internal models for confidential data. Retrieval sources must be logged. No customer decisioning without human-in-the-loop. Redact PII in prompts and outputs. All model prompts and outputs are stored for audit for 180 days."},
{"id":"runbook_inc_003","title":"Incident Response Runbook",
"text":"If a suspected breach occurs, page on-call SecOps. Rotate keys, isolate affected services, perform forensic capture, notify DPO within regulatory SLA. Communicate via the incident room only."},
{"id":"sop_sales_004","title":"Sales SOP - Enterprise Deals",
"text":"For RFPs, use the approved security questionnaire responses. Claims must match policy_sec_001. Custom clauses need Legal sign-off. Keep records in CRM with deal room links."}
]
def chunk(text: str, chunk_size=600, overlap=80):
    w = text.split()
    if len(w) <= chunk_size:
        return [text]
    out = []
    i = 0
    while i < len(w):
        j = min(i + chunk_size, len(w))
        out.append(" ".join(w[i:j]))
        if j == len(w):
            break
        i = j - overlap
    return out
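To see how the sliding windows overlap, here is a standalone sketch of the same logic with a toy chunk size (the tutorial's code uses 600-word chunks with an 80-word overlap):

```python
def chunk_words(text: str, chunk_size: int = 5, overlap: int = 2):
    """Split text into word windows of chunk_size, carrying `overlap` words over."""
    w = text.split()
    if len(w) <= chunk_size:
        return [text]
    out, i = [], 0
    while i < len(w):
        j = min(i + chunk_size, len(w))
        out.append(" ".join(w[i:j]))
        if j == len(w):
            break
        i = j - overlap  # step back so consecutive chunks share `overlap` words
    return out

pieces = chunk_words("one two three four five six seven eight", chunk_size=5, overlap=2)
print(pieces)
# → ['one two three four five', 'four five six seven eight']
```

Note how "four five" appears at the end of the first chunk and the start of the second: the overlap prevents a fact straddling a chunk boundary from being lost at retrieval time.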
CORPUS = []
for d in DOCS:
    for i, c in enumerate(chunk(d["text"])):
        CORPUS.append({"doc_id": d["id"], "title": d["title"], "chunk_id": i, "text": c})
We create a small corporate-style document set to model internal policies and procedures. We then break these documents into manageable chunks so they can be embedded and retrieved efficiently. This chunking helps our AI assistant handle context with better accuracy.
def build_index(chunks: List[Dict]) -> Tuple[faiss.IndexFlatIP, np.ndarray]:
    vecs = emb_model.encode([c["text"] for c in chunks], normalize_embeddings=True, convert_to_numpy=True)
    index = faiss.IndexFlatIP(vecs.shape[1])
    index.add(vecs)
    return index, vecs
INDEX, VECS = build_index(CORPUS)
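Because the embeddings are L2-normalized, the inner-product search in `IndexFlatIP` ranks by cosine similarity. A NumPy-only illustration of the same ranking with toy 2-D vectors (no FAISS needed):

```python
import numpy as np

# Toy "corpus" of 3 vectors, normalized to unit length (stand-ins for MiniLM embeddings)
vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.707, 0.707]], dtype="float32")
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)

query = np.array([0.9, 0.1], dtype="float32")
query /= np.linalg.norm(query)

scores = vecs @ query          # inner product == cosine similarity on unit vectors
order = np.argsort(-scores)    # descending ranking, like index.search
print(order.tolist())          # → [0, 2, 1]
```

The query points mostly along the first axis, so vector 0 ranks first, the diagonal vector 2 second, and the orthogonal vector 1 last, exactly the order FAISS would return.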
PII_PATTERNS = [
    (re.compile(r"\b\d{10}\b"), "<PHONE>"),
    (re.compile(r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b", re.I), "<EMAIL>"),
    (re.compile(r"\b\d{12}\b"), "<AADHAAR>"),
    (re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"), "<PAN>")
]

def redact(t: str) -> str:
    for p, r in PII_PATTERNS:
        t = p.sub(r, t)
    return t

POLICY_DISALLOWED = [
    re.compile(r"\b(share|exfiltrate)\b.*\b(raw|all)\b.*\bdata\b", re.I),
    re.compile(r"\bdisable\b.*\bencryption\b", re.I),
]

def policy_check(q: str):
    for r in POLICY_DISALLOWED:
        if r.search(q):
            return False, "Request violates security policy (data exfiltration/encryption tampering)."
    return True, ""
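The redaction and policy gate can be exercised on their own. A minimal standalone sketch, assuming placeholder tokens such as `<PHONE>` and `<EMAIL>` for the redacted values:

```python
import re

# Two of the PII patterns from above, with assumed placeholder replacement tokens
PII = [
    (re.compile(r"\b\d{10}\b"), "<PHONE>"),
    (re.compile(r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b", re.I), "<EMAIL>"),
]
# Disallowed-intent pattern: sharing/exfiltrating raw or bulk data
BLOCKED = re.compile(r"\b(share|exfiltrate)\b.*\b(raw|all)\b.*\bdata\b", re.I)

def redact(t: str) -> str:
    for p, r in PII:
        t = p.sub(r, t)
    return t

print(redact("Call 9876543210 or mail ops@example.com"))
# → Call <PHONE> or mail <EMAIL>
print(bool(BLOCKED.search("please share all raw customer data externally")))
# → True
```

Because the gate runs before retrieval and generation, a disallowed query never reaches the model or the index.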
We embed all the chunks with the Sentence Transformer and store them in a FAISS index for quick retrieval. We then introduce PII redaction rules and policy checks to prevent data misuse. By doing this, we ensure that our assistant adheres to data protection and compliance guidelines.
def retrieve(query: str, k=4) -> List[Dict]:
    qv = emb_model.encode([query], normalize_embeddings=True, convert_to_numpy=True)
    scores, idxs = INDEX.search(qv, k)
    return [{**CORPUS[i], "score": float(s)} for s, i in zip(scores[0], idxs[0])]

SYSTEM = ("You are an enterprise AI assistant.\n"
          "- Answer strictly from the provided CONTEXT.\n"
          "- If missing info, say what is unknown and suggest the correct policy/runbook.\n"
          "- Keep it concise and cite titles + doc_ids inline like [Title (doc_id:chunk)].")

def build_prompt(user_q: str, ctx_blocks: List[Dict]) -> str:
    ctx = "\n\n".join(f"[{i+1}] {b['title']} (doc:{b['doc_id']}:{b['chunk_id']})\n{b['text']}"
                      for i, b in enumerate(ctx_blocks))
    uq = redact(user_q)
    return (f"SYSTEM:\n{SYSTEM}\n\nCONTEXT:\n{ctx}\n\nUSER QUESTION:\n{uq}\n\n"
            "INSTRUCTIONS:\n- Cite sources inline.\n- Keep to 5-8 sentences.\n- Preserve redactions.")

def answer(user_q: str, k=4, max_new_tokens=220) -> Dict:
    ok, msg = policy_check(user_q)
    if not ok:
        return {"answer": f"❌ {msg}", "ctx": []}
    ctx = retrieve(user_q, k=k)
    prompt = build_prompt(user_q, ctx)
    out = generate(prompt, max_new_tokens=max_new_tokens, do_sample=False)[0]["generated_text"].strip()
    return {"answer": out, "ctx": ctx}
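The prompt assembly can be checked without running the model. A standalone sketch with one made-up context block and an abbreviated system message:

```python
# Abbreviated system message and a single hypothetical context block
SYSTEM = "You are an enterprise AI assistant.\n- Answer strictly from the provided CONTEXT."

ctx_blocks = [
    {"title": "Data Security Policy", "doc_id": "policy_sec_001", "chunk_id": 0,
     "text": "All customer data must be encrypted at rest."},
]

# Same layout as build_prompt: numbered context entries with doc/chunk ids for citation
ctx = "\n\n".join(
    f"[{i+1}] {b['title']} (doc:{b['doc_id']}:{b['chunk_id']})\n{b['text']}"
    for i, b in enumerate(ctx_blocks)
)
prompt = f"SYSTEM:\n{SYSTEM}\n\nCONTEXT:\n{ctx}\n\nUSER QUESTION:\nHow is data stored?"
print(prompt)
```

Embedding the `doc_id:chunk_id` pair next to each context block is what lets the model cite sources inline, as the system message instructs.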
We design a retrieval function that fetches the most relevant document chunks for each user query. We then build a structured prompt that combines the retrieved context with the question and ask Flan-T5 to generate a grounded answer. This step ensures that our assistant produces responses backed by the corpus.
def eval_query(user_q: str, ctx: List[Dict]) -> Dict:
    terms = [w.lower() for w in re.findall(r"[a-zA-Z]{4,}", user_q)]
    ctx_text = " ".join(c["text"].lower() for c in ctx)
    hits = sum(t in ctx_text for t in terms)
    return {"terms": len(terms), "hits": hits, "hit_rate": round(hits / max(1, len(terms)), 2)}
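The hit-rate heuristic is easy to verify in isolation. A standalone sketch of the same term-overlap check, using a made-up query and context string:

```python
import re

def hit_rate(query: str, ctx_texts: list) -> float:
    """Fraction of query terms (4+ letters) that appear in the retrieved context."""
    terms = [w.lower() for w in re.findall(r"[a-zA-Z]{4,}", query)]
    ctx = " ".join(t.lower() for t in ctx_texts)
    hits = sum(t in ctx for t in terms)
    return round(hits / max(1, len(terms)), 2)

print(hit_rate("What encryption rules apply?", ["All data uses AES-256 encryption."]))
# → 0.25  (only "encryption" out of what/encryption/rules/apply is found)
```

This is a crude proxy for retrieval quality, as it uses substring matching with no stemming, but it is useful as a quick sanity signal in a demo.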
QUERIES = [
    "What encryption and backup rules do we follow for customer data?",
    "Can we auto-answer RFP security questionnaires? What should we cite?",
    "If there is a suspected breach, what are the first three steps?",
    "Is it allowed to share all raw customer data externally for testing?"
]

for q in QUERIES:
    res = answer(q, k=3)
    print("\n" + "=" * 100)
    print("Q:", q)
    print("\nA:", res["answer"])
    if res["ctx"]:
        ev = eval_query(q, res["ctx"])
        print("\nRetrieved Context (top 3):")
        for r in res["ctx"]:
            print(f"- {r['title']} [{r['doc_id']}:{r['chunk_id']}] score={r['score']:.3f}")
        print("Eval:", ev)
We test our system with sample enterprise queries covering encryption, RFPs, incident procedures, and a deliberately disallowed request. We display the retrieved documents, the generated answers, and a simple term hit rate to check relevance. In this demo, we see our business AI assistant respond safely and accurately.
In conclusion, we successfully created a self-contained enterprise assistant that retrieves, reasons over, and responds to business questions while maintaining strong guardrails. We saw how FAISS for retrieval, Sentence Transformers for embeddings, and Flan-T5 for generation combine into a simulated internal knowledge engine. As we conclude, we recognize that this lightweight open-source stack can serve as a blueprint for scalable, auditable, and compliant business assistants.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the power of artificial intelligence for social good. His most recent endeavor is the AI media platform Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over two million monthly views, illustrating its popularity among audiences.



