How to Build an Agentic Decision-Tree RAG System with Intelligent Query Routing, Self-Checking, and Iterative Refinement?

nimda October 27, 2025

In this tutorial, we build an advanced agentic Retrieval-Augmented Generation (RAG) system that goes beyond simple question answering. We design it to route each query to the right retrieval strategy, run self-checks on the quality of its answers, and iteratively refine them for better accuracy. We implement the entire system using open-source tools such as FAISS, SentenceTransformers, and Flan-T5. As we progress, we examine how routing, retrieval, generation, and self-evaluation combine into a decision-tree-style RAG pipeline that mimics real-world agentic applications.

print("🔧 Setting up dependencies...")
import subprocess
import sys
def install_packages():
   packages = ['sentence-transformers', 'transformers', 'torch', 'faiss-cpu', 'numpy', 'accelerate']
   for package in packages:
       print(f"Installing {package}...")
       subprocess.check_call([sys.executable, '-m', 'pip', 'install', '-q', package])
try:
   import faiss
except ImportError:
   install_packages()
   print("✓ All dependencies installed! Importing modules...\n")
import torch
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import pipeline
import faiss
from typing import List, Dict, Tuple
import warnings
warnings.filterwarnings('ignore')
print("✓ All modules loaded successfully!\n")

We start by installing the necessary dependencies, including Transformers, FAISS, and SentenceTransformers, so that everything runs locally. The script installs the packages only if importing faiss fails, then imports the core modules such as NumPy, PyTorch, and FAISS for embedding, retrieval, and generation. We confirm that all libraries load successfully before moving on to the main pipeline.
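The setup script above triggers installation only when `import faiss` fails. As a hedged alternative, the sketch below checks every package up front with the standard-library `importlib.util.find_spec`. The `PIP_TO_MODULE` mapping and the `missing_packages` helper are illustrative names, not part of the original code; the mapping is needed because some pip names differ from their import names (for example, `faiss-cpu` provides the `faiss` module).

```python
import importlib.util
from typing import Dict, List

# Hypothetical mapping from pip package name to importable module name.
PIP_TO_MODULE: Dict[str, str] = {
    "sentence-transformers": "sentence_transformers",
    "transformers": "transformers",
    "torch": "torch",
    "faiss-cpu": "faiss",
    "numpy": "numpy",
    "accelerate": "accelerate",
}

def missing_packages(pip_to_module: Dict[str, str]) -> List[str]:
    """Return the pip names whose top-level module cannot be found."""
    return [
        pip_name
        for pip_name, module in pip_to_module.items()
        if importlib.util.find_spec(module) is None
    ]

if __name__ == "__main__":
    # Only the packages listed here would need to be passed to pip.
    print("Missing:", missing_packages(PIP_TO_MODULE))
```

Installing only what is reported missing avoids re-running pip for packages that are already present.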

class VectorStore:
   def __init__(self, embedding_model="all-MiniLM-L6-v2"):
       print(f"Loading embedding model: {embedding_model}...")
       self.embedder = SentenceTransformer(embedding_model)
       self.documents = []
       self.index = None
   def add_documents(self, docs: List[str], sources: List[str]):
       self.documents = [{"text": doc, "source": src} for doc, src in zip(docs, sources)]
       embeddings = self.embedder.encode(docs, show_progress_bar=False)
       dimension = embeddings.shape[1]
       self.index = faiss.IndexFlatL2(dimension)
       self.index.add(embeddings.astype('float32'))
       print(f"✓ Indexed {len(docs)} documents\n")
   def search(self, query: str, k: int = 3) -> List[Dict]:
       query_vec = self.embedder.encode([query]).astype('float32')
       distances, indices = self.index.search(query_vec, k)
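       # FAISS pads results with -1 when k exceeds the number of indexed vectors; skip those slots
       return [self.documents[i] for i in indices[0] if i >= 0]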

We design the VectorStore class to store and retrieve documents efficiently using FAISS similarity search. We embed each document with a sentence-transformer model and build a flat L2 index over the embeddings. This lets us quickly retrieve the most relevant context for any incoming query.
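To make the retrieval step concrete, here is a minimal pure-Python sketch of what an exact (flat) L2 search computes: the squared Euclidean distance from the query to every stored vector, followed by a top-k selection. The `l2_search` function and the toy 2-D vectors are assumptions for illustration only; the tutorial itself relies on `faiss.IndexFlatL2`, which performs the same exhaustive comparison far more efficiently.

```python
from typing import List, Sequence, Tuple

def l2_search(
    vectors: Sequence[Sequence[float]],
    query: Sequence[float],
    k: int,
) -> List[Tuple[int, float]]:
    """Brute-force nearest neighbours: return (index, squared L2 distance) pairs."""
    scored = []
    for i, vec in enumerate(vectors):
        # Squared Euclidean distance between the stored vector and the query.
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, i))
    scored.sort()  # smallest distance first
    return [(i, dist) for dist, i in scored[:k]]

if __name__ == "__main__":
    toy_vectors = [[0, 0], [1, 0], [0, 2], [3, 3]]  # stand-ins for document embeddings
    hits = l2_search(toy_vectors, query=[2, 0], k=2)
    print(hits)  # indices of the two closest toy vectors, nearest first
```

One difference worth noting: when `k` is larger than the number of stored vectors, this sketch simply returns fewer results, whereas FAISS fills the unused result slots with index `-1`.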

class QueryRouter:
   def __init__(self):
       self.categories = {
           'technical': ['how', 'implement', 'code', 'function', 'algorithm', 'debug'],
           'factual': ['what', 'who', 'when', 'where', 'define', 'explain'],
           'comparative': ['compare', 'difference', 'versus', 'vs', 'better', 'which'],
           'procedural': ['steps', 'process', 'guide', 'tutorial', 'how to']
       }
   def route(self, query: str) -> str:
       query_lower = query.lower()
       scores = {}
       for category, keywords in self.categories.items():
           score = sum(1 for kw in keywords if kw in query_lower)
           scores[category] = score
       best_category = max(scores, key=scores.get)
       return best_category if scores[best_category] > 0 else 'factual'

We introduce the QueryRouter class to classify each query by intent: technical, factual, comparative, or procedural. We use keyword matching to find the category that best fits the input query and fall back to factual when no keyword matches. This routing step lets the retrieval strategy adapt to different query styles.
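Because the router above tests `kw in query_lower`, it matches substrings: a query containing "show" counts a hit for `how`, and "whole" counts a hit for `who`. The sketch below is one possible variant, not the original method, that matches keywords on word boundaries using the standard `re` module. The function name `route_whole_words` is hypothetical; the keyword table is copied from the tutorial.

```python
import re
from typing import Dict, List

CATEGORIES: Dict[str, List[str]] = {
    "technical": ["how", "implement", "code", "function", "algorithm", "debug"],
    "factual": ["what", "who", "when", "where", "define", "explain"],
    "comparative": ["compare", "difference", "versus", "vs", "better", "which"],
    "procedural": ["steps", "process", "guide", "tutorial", "how to"],
}

def route_whole_words(query: str, default: str = "factual") -> str:
    """Score each category by whole-word keyword hits and return the best one."""
    text = query.lower()
    scores = {}
    for category, keywords in CATEGORIES.items():
        scores[category] = sum(
            1
            for kw in keywords
            # \b anchors keep "how" from matching inside "show".
            if re.search(r"\b" + re.escape(kw) + r"\b", text)
        )
    best = max(scores, key=scores.get)  # ties go to the first category listed
    return best if scores[best] > 0 else default

if __name__ == "__main__":
    for q in [
        "Compare neural networks and deep learning",
        "What are the steps in the process?",
        "Show me the whole picture",
    ]:
        print(f"{q!r} -> {route_whole_words(q)}")
```

Ties are still resolved by dictionary order, so the order of categories in the table remains a design decision.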

class AnswerGenerator:
   def __init__(self, model_name="google/flan-t5-base"):
       print(f"Loading generation model: {model_name}...")
       self.generator = pipeline('text2text-generation', model=model_name, device=0 if torch.cuda.is_available() else -1, max_length=256)
       device_type = "GPU" if torch.cuda.is_available() else "CPU"
       print(f"✓ Generator ready (using {device_type})\n")
   def generate(self, query: str, context: List[Dict], query_type: str) -> str:
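       context_text = "\n\n".join([f"[{doc['source']}]: {doc['text']}" for doc in context])
       # NOTE: the opening of this prompt is missing in the source; the instruction line is an assumed reconstruction
       prompt = f"""Answer the question using the context below.

Context:
{context_text}

Question: {query}

Answer:"""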
       answer = self.generator(prompt, max_length=200, do_sample=False)[0]['generated_text']
       return answer.strip()
   def self_check(self, query: str, answer: str, context: List[Dict]) -> Tuple[bool, str]:
       if len(answer) < 10:
           return False, "Answer too short - needs more detail"
       context_keywords = set()
       for doc in context:
           context_keywords.update(doc['text'].lower().split()[:20])
       answer_words = set(answer.lower().split())
       overlap = len(context_keywords.intersection(answer_words))
       if overlap < 2:
           return False, "Answer not grounded in context - needs more evidence"
       query_keywords = set(query.lower().split())
       if len(query_keywords.intersection(answer_words)) < 1:
           return False, "Answer doesn't address the query - rephrase needed"
       return True, "Answer quality acceptable"

We create the AnswerGenerator class to handle answer generation and self-checking. Using the Flan-T5 model, we generate a text answer grounded in the retrieved context. We then run a self-check that tests three things: the length of the answer, its word overlap with the context, and its word overlap with the query. These are lightweight heuristics intended to catch answers that are empty, ungrounded, or off-topic.
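The self-check heuristics are easy to exercise without loading any model. The sketch below copies the logic of `self_check` into a standalone function and runs it on hand-written toy strings (the queries, answers, and context are invented for illustration). The last case highlights a property of whitespace tokenization: punctuation stays attached to words, so `rag?` in the query does not match `rag` in the answer.

```python
from typing import Dict, List, Tuple

def self_check(query: str, answer: str, context: List[Dict]) -> Tuple[bool, str]:
    """Standalone copy of the tutorial's three heuristic checks."""
    if len(answer) < 10:
        return False, "Answer too short - needs more detail"
    context_keywords = set()
    for doc in context:
        # Only the first 20 whitespace-separated words of each document are used.
        context_keywords.update(doc["text"].lower().split()[:20])
    answer_words = set(answer.lower().split())
    if len(context_keywords & answer_words) < 2:
        return False, "Answer not grounded in context - needs more evidence"
    if len(set(query.lower().split()) & answer_words) < 1:
        return False, "Answer doesn't address the query - rephrase needed"
    return True, "Answer quality acceptable"

if __name__ == "__main__":
    ctx = [{"source": "toy", "text": "RAG retrieves relevant documents and uses "
            "them as context for generating accurate answers."}]
    good = "RAG retrieves relevant documents and uses them as context."

    print(self_check("How does RAG work", "RAG.", ctx))   # rejected: too short
    print(self_check("How does RAG work", good, ctx))      # accepted
    print(self_check("How does RAG work",
                     "Paris is the capital of France.", ctx))  # rejected: not grounded
    print(self_check("What is RAG?", good, ctx))           # rejected: 'rag?' != 'rag'
```

Stripping punctuation before splitting would be one way to make the query-overlap test less brittle, at the cost of departing from the original code.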

class AgenticRAG:
   def __init__(self):
       self.vector_store = VectorStore()
       self.router = QueryRouter()
       self.generator = AnswerGenerator()
       self.max_iterations = 2
   def add_knowledge(self, documents: List[str], sources: List[str]):
       self.vector_store.add_documents(documents, sources)
   def query(self, question: str, verbose: bool = True) -> Dict:
       if verbose:
           print(f"\n{'='*60}")
           print(f"🤔 Query: {question}")
           print(f"{'='*60}")
       query_type = self.router.route(question)
       if verbose:
           print(f"📍 Route: {query_type.upper()} query detected")
       k_docs = {'technical': 2, 'comparative': 4, 'procedural': 3}.get(query_type, 3)
       iteration = 0
       answer_accepted = False
       while iteration < self.max_iterations and not answer_accepted:
           iteration += 1
           if verbose:
               print(f"\n🔄 Iteration {iteration}")
           context = self.vector_store.search(question, k=k_docs)
           if verbose:
               print(f"📚 Retrieved {len(context)} documents from sources:")
               for doc in context:
                   print(f"   - {doc['source']}")
           answer = self.generator.generate(question, context, query_type)
           if verbose:
               print(f"💡 Generated answer: {answer[:100]}...")
           answer_accepted, feedback = self.generator.self_check(question, answer, context)
           if verbose:
               status = "✓ ACCEPTED" if answer_accepted else "✗ REJECTED"
               print(f"🔍 Self-check: {status}")
               print(f"   Feedback: {feedback}")
           if not answer_accepted and iteration < self.max_iterations:
               question = f"{question} (provide more specific details)"
               k_docs += 1
       return {'answer': answer, 'query_type': query_type, 'iterations': iteration, 'accepted': answer_accepted, 'sources': [doc['source'] for doc in context]}

We integrate all components into the AgenticRAG class, which orchestrates routing, retrieval, generation, and quality checking. The system revises its answers based on the self-check feedback, rephrasing the query and retrieving one more document when an answer is rejected, for up to two iterations. This creates a decision-tree-driven RAG workflow that tries to improve its own answers automatically.
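The control flow of the refinement loop can be isolated from the models. The following sketch restates it as a plain function that accepts the three stages as callables; `refine_loop` and the `fake_*` stubs are hypothetical names used only to trace the loop, and the stubbed checker accepts an answer once at least four documents are retrieved.

```python
from typing import Callable, Dict, List, Tuple

def refine_loop(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str, List[str]], str],
    check: Callable[[str, str, List[str]], Tuple[bool, str]],
    k: int = 3,
    max_iterations: int = 2,
) -> Dict:
    """Retrieve, generate, self-check; on rejection, rephrase and widen k."""
    iteration, accepted, answer = 0, False, ""
    while iteration < max_iterations and not accepted:
        iteration += 1
        context = retrieve(question, k)
        answer = generate(question, context)
        accepted, _feedback = check(question, answer, context)
        if not accepted and iteration < max_iterations:
            # Same refinement as the tutorial: nudge the query, fetch one more doc.
            question = f"{question} (provide more specific details)"
            k += 1
    return {"answer": answer, "iterations": iteration, "accepted": accepted, "final_k": k}

# Stub stages standing in for the vector store, Flan-T5, and the self-check.
k_history: List[int] = []

def fake_retrieve(q: str, k: int) -> List[str]:
    k_history.append(k)
    return [f"doc{i}" for i in range(k)]

def fake_generate(q: str, ctx: List[str]) -> str:
    return f"answer from {len(ctx)} docs"

def fake_check(q: str, a: str, ctx: List[str]) -> Tuple[bool, str]:
    ok = len(ctx) >= 4
    return ok, ("ok" if ok else "need more context")

if __name__ == "__main__":
    print(refine_loop("What is RAG?", fake_retrieve, fake_generate, fake_check))
    print("k per iteration:", k_history)
```

Passing the stages as callables also makes it straightforward to unit-test the loop without downloading any model.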

def main():
   print("\n" + "="*60)
   print("🚀 AGENTIC RAG WITH ROUTING & SELF-CHECK")
   print("="*60 + "\n")
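   # NOTE: only the last document survives in the source; the first five are assumed
   # stand-ins written to match the six source labels below.
   documents = [
       "Python is a high-level, interpreted programming language known for its readable syntax and large standard library.",
       "Machine learning is a subset of AI in which systems learn patterns from data instead of following explicitly programmed rules.",
       "Neural networks are computing systems made of layers of interconnected nodes, loosely inspired by biological neurons.",
       "Deep learning uses neural networks with many layers to learn hierarchical representations of data.",
       "Transformers are neural network architectures built around self-attention for processing sequences.",
       "RAG (Retrieval-Augmented Generation) combines information retrieval with text generation. It retrieves relevant documents and uses them as context for generating accurate answers."
   ]
   sources = ["Python Documentation", "ML Textbook", "Neural Networks Guide", "Deep Learning Paper", "Transformer Architecture", "RAG Research Paper"]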
   rag = AgenticRAG()
   rag.add_knowledge(documents, sources)
   test_queries = ["What is Python?", "How does machine learning work?", "Compare neural networks and deep learning"]
   for query in test_queries:
       result = rag.query(query, verbose=True)
       print(f"\n{'='*60}")
       print(f"📊 FINAL RESULT:")
       print(f"   Answer: {result['answer']}")
       print(f"   Query Type: {result['query_type']}")
       print(f"   Iterations: {result['iterations']}")
       print(f"   Accepted: {result['accepted']}")
       print(f"{'='*60}\n")
if __name__ == "__main__":
   main()

We finish the demo by loading a small knowledge base and running test queries through the agentic RAG pipeline. We watch how the system routes each query, retrieves documents, and refines its answers step by step, printing intermediate results for transparency. Finally, we confirm that the system produces self-checked answers using only local computation.

In conclusion, we build a working agentic RAG framework that autonomously retrieves, reasons, and evaluates its own answers. We show how the system routes different types of queries, checks their answers, and improves them through iterative feedback, all within a lightweight, local environment. Through this exercise, we deepen our understanding of RAG architectures and see how agentic components can turn static retrieval systems into self-improving intelligent agents.


Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an artificial intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform draws more than two million monthly views, which shows its popularity among readers.
