A code implementation of an agentic AI framework that performs literature review, hypothesis generation, experimental design, discovery, and scientific reporting

In this tutorial, we build a complete scientific discovery agent step by step and learn how each component works together to create a unified research workflow. We start by installing our dependencies and initializing the retrieval module and the LLM, then watch the agents retrieve research papers, generate hypotheses, design experiments, and produce formal reports. Through the snippets below, we see how a powerful pipeline emerges naturally, allowing us to take a scientific question from initial curiosity to a full analysis within a single, integrated system.
import sys, subprocess

def install_deps():
    pkgs = ["transformers", "scikit-learn", "numpy"]
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q"] + pkgs)

try:
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    import numpy as np
except ImportError:
    install_deps()
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    import numpy as np
from dataclasses import dataclass
from typing import List, Dict, Any
np.random.seed(42)
LITERATURE = [
    {"id": "P1", "title": "Self-Supervised Protein Language Models for Structure Prediction", "field": "computational biology",
     "abstract": "We explore transformer-based protein language models trained on millions of sequences. The models learn residue-level embeddings that improve secondary structure prediction and stability estimation."},
    {"id": "P2", "title": "CRISPR Off-Target Detection Using Deep Learning", "field": "genome editing",
     "abstract": "We propose a convolutional neural network architecture for predicting CRISPR-Cas9 off-target effects directly from genomic sequences, achieving state-of-the-art accuracy on GUIDE-seq datasets."},
    {"id": "P3", "title": "Foundation Models for Scientific Equation Discovery", "field": "scientific ML",
     "abstract": "Large language models are combined with symbolic regression to recover governing equations from noisy experimental observations in physics and fluid dynamics."},
    {"id": "P4", "title": "Active Learning for Materials Property Optimization", "field": "materials science",
     "abstract": "We integrate Bayesian optimization with graph neural networks to actively select candidate materials that maximize target properties while reducing experimental cost."},
    {"id": "P5", "title": "Graph-Based Retrieval for Cross-Domain Literature Review", "field": "NLP for science",
     "abstract": "We construct a heterogeneous citation and concept graph over multi-domain scientific papers and show that graph-aware retrieval improves cross-domain literature exploration."},
]
corpus_texts = [p["abstract"] + " " + p["title"] for p in LITERATURE]
vectorizer = TfidfVectorizer(stop_words="english")
corpus_matrix = vectorizer.fit_transform(corpus_texts)
MODEL_NAME = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
def generate_text(prompt: str, max_new_tokens: int = 256) -> str:
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
We lay the foundation for our scientific agent by loading libraries, preparing a small corpus of papers, and initializing our language model. We fit the TF-IDF vectorizer over the titles and abstracts so that queries can later be matched against the right documents, and we define a helper that generates text with the seq2seq model. With the model loaded and the data vectorized, we have the computational backbone for everything that follows.
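As a quick standalone sanity check of this retrieval backbone, the sketch below (using a made-up three-sentence toy corpus of our own, not the LITERATURE entries) confirms that TF-IDF vectors plus cosine similarity rank the lexically closest document first:

```python
# Minimal standalone sketch: TF-IDF retrieval over a toy corpus.
# The three sentences are illustrative placeholders of our own.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

docs = [
    "protein language models predict secondary structure",
    "convolutional networks detect CRISPR off-target effects",
    "graph retrieval improves cross-domain literature review",
]
vec = TfidfVectorizer(stop_words="english")
matrix = vec.fit_transform(docs)

query = "off-target prediction with CRISPR"
q_vec = vec.transform([query])          # project the query into the same space
sims = cosine_similarity(q_vec, matrix)[0]
best = int(np.argmax(sims))
print(best, docs[best])
```

Because `stop_words="english"` drops filler terms like "with", only content words such as "crispr" and "target" drive the similarity scores, which is exactly the behavior the agent's search will rely on.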
@dataclass
class PaperHit:
    paper: Dict[str, Any]
    score: float

class LiteratureAgent:
    def __init__(self, vectorizer, corpus_matrix, papers: List[Dict[str, Any]]):
        self.vectorizer = vectorizer
        self.corpus_matrix = corpus_matrix
        self.papers = papers

    def search(self, query: str, k: int = 3) -> List[PaperHit]:
        q_vec = self.vectorizer.transform([query])
        sims = cosine_similarity(q_vec, self.corpus_matrix)[0]
        idxs = np.argsort(-sims)[:k]
        hits = [PaperHit(self.papers[i], float(sims[i])) for i in idxs]
        return hits
We add the literature search capability of our agent. We transform user queries into the same TF-IDF vector space and identify the most relevant scientific papers using cosine similarity. With this, we give our agent the ability to ground its reasoning in closely related prior work.
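Inside `search`, the top-k ranking hinges on one idiom: negating the similarity scores so that `np.argsort`, which sorts ascending, yields a descending order. A tiny standalone check with made-up scores illustrates it:

```python
import numpy as np

# Made-up similarity scores for five papers (illustrative only).
sims = np.array([0.12, 0.87, 0.05, 0.55, 0.30])
k = 3

# Negating the scores turns ascending argsort into a descending ranking,
# so the first k indices point at the k highest-scoring papers.
idxs = np.argsort(-sims)[:k]
print(idxs.tolist())
```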
@dataclass
class ExperimentPlan:
    system: str
    hypothesis: str
    variables: Dict[str, Any]
    protocol: List[str]

@dataclass
class ExperimentResult:
    plan: ExperimentPlan
    metrics: Dict[str, float]

class ExperimentAgent:
    def design_experiment(self, question: str, hypothesis: str, hits: List[PaperHit]) -> ExperimentPlan:
        top_field = hits[0].paper["field"] if hits else "computational science"
        protocol = [
            f"Construct dataset combining ideas from: {', '.join(h.paper['id'] for h in hits)}.",
            "Split data into train/validation/test.",
            "Compare baseline model vs. augmented model implementing the hypothesis.",
            "Evaluate using appropriate metrics and perform ablation analysis.",
        ]
        variables = {
            "baseline_model": "sequence CNN",
            "augmented_model": "protein language model + CNN",
            "n_train_samples": 5000,
            "n_validation_samples": 1000,
            "metric": "AUROC",
        }
        system = f"{top_field} system related to: {question}"
        return ExperimentPlan(system=system, hypothesis=hypothesis, variables=variables, protocol=protocol)

    def run_experiment(self, plan: ExperimentPlan) -> ExperimentResult:
        base = 0.78 + 0.02 * np.random.randn()
        gain = abs(0.05 + 0.01 * np.random.randn())
        metrics = {
            "baseline_AUROC": round(base, 3),
            "augmented_AUROC": round(base + gain, 3),
            "estimated_gain": round(gain, 3),
        }
        return ExperimentResult(plan=plan, metrics=metrics)
We design and simulate experiments based on the retrieved literature and the generated hypothesis. We define the experimental variables, lay out a protocol, and generate synthetic metrics that mimic the output of a real scientific experiment. This allows us to move from theoretical ideas to practical testing.
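The simulated metrics can be reproduced in isolation. This sketch mirrors the arithmetic of `run_experiment`; the explicit generator and seed here are our own choice for a self-contained demo (the document instead seeds globally with `np.random.seed(42)`):

```python
import numpy as np

# Standalone sketch of the simulated experiment metrics.
rng = np.random.default_rng(0)  # our own seed, for reproducibility
base = 0.78 + 0.02 * rng.standard_normal()   # noisy baseline AUROC
gain = abs(0.05 + 0.01 * rng.standard_normal())  # abs() keeps the gain non-negative
metrics = {
    "baseline_AUROC": round(base, 3),
    "augmented_AUROC": round(base + gain, 3),
    "estimated_gain": round(gain, 3),
}
print(metrics)
```

Because the gain is wrapped in `abs()`, the augmented model always scores at least as well as the baseline; the simulation is a stand-in for real evaluation, not evidence that the hypothesis holds.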
class ReportAgent:
    def write_report(self, question: str, hits: List[PaperHit], plan: ExperimentPlan, result: ExperimentResult) -> str:
        # Join with real newlines ("\n") so each bullet lands on its own line.
        related_work = "\n".join(f"- {h.paper['title']} ({h.paper['field']})" for h in hits)
        protocol_str = "\n".join(f"- {step}" for step in plan.protocol)
        prompt = f"""
You are an AI research assistant writing a concise research-style report.
Research question:
{question}
Hypothesis:
{plan.hypothesis}
Relevant prior work:
{related_work}
Planned experiment:
System: {plan.system}
Variables: {plan.variables}
Protocol:
{protocol_str}
Simulated results:
{result.metrics}
Write a clear report with the following sections:
1. Background
2. Proposed Approach
3. Experimental Setup
4. Results and Discussion
5. Limitations and Future Work
"""
        return generate_text(prompt.strip(), max_new_tokens=320)
We produce a full research-style report using the LLM. We feed the hypothesis, protocol, results, and related work into an organized prompt with clearly defined sections. This allows us to convert the pipeline's raw outputs into scientifically structured communication.
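One detail worth checking in isolation is the bullet formatting that feeds the prompt: the entries must be joined with a real newline escape (`"\n".join`) so each related-work item appears on its own line. A standalone sketch with placeholder titles of our own:

```python
# Standalone sketch: formatting related-work entries as newline-separated bullets.
# The titles below are placeholders, not entries from LITERATURE.
hits = [
    {"title": "Paper A", "field": "NLP"},
    {"title": "Paper B", "field": "biology"},
]
related_work = "\n".join(f"- {h['title']} ({h['field']})" for h in hits)
print(related_work)
```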
class ScientificAgent:
    def __init__(self):
        self.lit_agent = LiteratureAgent(vectorizer, corpus_matrix, LITERATURE)
        self.exp_agent = ExperimentAgent()
        self.report_agent = ReportAgent()

    def propose_hypothesis(self, question: str, hits: List[PaperHit]) -> str:
        context = " ".join(h.paper["abstract"] for h in hits)
        prompt = f"""
You are an AI scientist. Given a research question and related abstracts,
propose a single, testable hypothesis in 2-3 sentences.
Research question:
{question}
Related abstracts:
{context}
"""
        return generate_text(prompt.strip(), max_new_tokens=96)

    def run_pipeline(self, question: str) -> str:
        hits = self.lit_agent.search(question, k=3)
        hypothesis = self.propose_hypothesis(question, hits)
        plan = self.exp_agent.design_experiment(question, hypothesis, hits)
        result = self.exp_agent.run_experiment(plan)
        report = self.report_agent.write_report(question, hits, plan, result)
        return report

if __name__ == "__main__":
    research_question = (
        "How can protein language model embeddings improve CRISPR off-target "
        "prediction compared to sequence-only CNN baselines?"
    )
    agent = ScientificAgent()
    final_report = agent.run_pipeline(research_question)
    print(final_report)
We orchestrate the entire pipeline: search the literature, propose a hypothesis, design an experiment, run a simulation, and write a report. We then run the agent on an actual research question and watch the complete workflow in action. This step brings all the modules together into a unified scientific agent.
In conclusion, we see how a compact codebase can become an agentic AI researcher capable of searching, hypothesizing, simulating, and summarizing. We understand how each snippet contributes to the full pipeline and how the agentic components reinforce each other when combined. We also put ourselves in a strong position to extend the agent with richer literature, stronger models, and more complex logic, pushing our scientific experiments further with every iteration.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is Marktechpost, an AI media platform that stands out for its in-depth coverage of machine learning and deep learning news that is technically sound yet easily understandable by a wide audience. The platform counts over two million monthly views, illustrating its popularity among readers.



