A Coding Guide to Building a Brain-Inspired Hierarchical Reasoning AI Agent with Hugging Face Models

In this tutorial, we recreate the spirit of a Hierarchical Reasoning Model (HRM) using a free Hugging Face model that runs locally. We walk through a lightweight but organized setup in which the model plays distinct roles as planner, solver, critic, and synthesizer. By breaking problems into subgoals, solving them with Python, and synthesizing the final response, we see how hierarchical planning and reflection can improve reasoning performance. This process lets us observe, in real time, how a brain-inspired loop can work without requiring a massive model or expensive APIs. Check out the Paper and Full Codes.
!pip -q install -U transformers accelerate bitsandbytes rich
import os, re, json, textwrap, traceback
from typing import Dict, Any, List
from rich import print as rprint
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
MODEL_NAME = "Qwen/Qwen2.5-1.5B-Instruct"
DTYPE = torch.bfloat16 if torch.cuda.is_available() else torch.float32
We start by installing the required libraries and loading the Qwen2.5-1.5B Instruct model from Hugging Face. We set the data type based on GPU availability to ensure efficient performance on Colab.
tok = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    device_map="auto",
    torch_dtype=DTYPE,
    load_in_4bit=True
)
gen = pipeline(
    "text-generation",
    model=model,
    tokenizer=tok,
    return_full_text=False
)
We load the tokenizer and model, quantized to 4-bit so it fits comfortably in memory, and wrap everything in a text-generation pipeline so we can interact with the model easily on Colab.
def chat(prompt: str, system: str = "", max_new_tokens: int = 512, temperature: float = 0.3) -> str:
    msgs = []
    if system:
        msgs.append({"role": "system", "content": system})
    msgs.append({"role": "user", "content": prompt})
    inputs = tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
    out = gen(inputs, max_new_tokens=max_new_tokens, do_sample=(temperature > 0), temperature=temperature, top_p=0.9)
    return out[0]["generated_text"].strip()
def extract_json(txt: str) -> Dict[str, Any]:
    m = re.search(r"\{[\s\S]*\}$", txt.strip())
    if not m:
        m = re.search(r"\{[\s\S]*?\}", txt)
    try:
        return json.loads(m.group(0)) if m else {}
    except Exception:
        # fallback: strip code fences
        s = re.sub(r"^```.*?\n|\n```$", "", txt, flags=re.S)
        try:
            return json.loads(s)
        except Exception:
            return {}
These are the core interaction utilities: the chat function lets us send a prompt to the model with a system role and sampling controls, while extract_json reliably parses structured JSON out of the model's reply, no matter how the answer is wrapped.
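To sanity-check the parsing behavior, here is a standalone sketch of that logic (the helper is duplicated so the snippet runs on its own) applied to two typical model replies, one plain and one wrapped in a markdown code fence:

```python
import json
import re
from typing import Any, Dict

def extract_json(txt: str) -> Dict[str, Any]:
    # Prefer a JSON object that closes the string, else take the first one found.
    m = re.search(r"\{[\s\S]*\}$", txt.strip())
    if not m:
        m = re.search(r"\{[\s\S]*?\}", txt)
    try:
        return json.loads(m.group(0)) if m else {}
    except Exception:
        # Fallback: strip surrounding markdown code fences and retry.
        s = re.sub(r"^```.*?\n|\n```$", "", txt, flags=re.S)
        try:
            return json.loads(s)
        except Exception:
            return {}

# A bare JSON reply and a fenced one both parse to the same dict.
plain = '{"subgoals": ["a", "b"], "final_format": "Answer: ..."}'
fenced = "```json\n" + plain + "\n```"
print(extract_json(plain)["subgoals"])   # ['a', 'b']
print(extract_json(fenced)["subgoals"])  # ['a', 'b']
```

Anything that cannot be parsed falls through to an empty dict, so downstream code can use `.get(...)` with defaults instead of crashing on a malformed reply.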
def extract_code(txt: str) -> str:
    m = re.search(r"```(?:python)?\s*([\s\S]*?)```", txt, flags=re.I)
    return (m.group(1) if m else txt).strip()
def run_python(code: str, env: Dict[str, Any] | None = None) -> Dict[str, Any]:
    import io, contextlib
    g = {"__name__": "__main__"}; l = {}
    if env: g.update(env)
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, g, l)
        out = l.get("RESULT", g.get("RESULT"))
        return {"ok": True, "result": out, "stdout": buf.getvalue()}
    except Exception as e:
        return {"ok": False, "error": str(e), "trace": traceback.format_exc(), "stdout": buf.getvalue()}
PLANNER_SYS = """You are the HRM Planner.
Decompose the TASK into 2–4 atomic, code-solvable subgoals.
Return compact JSON only: {"subgoals":[...], "final_format":""}."""
SOLVER_SYS = """You are the HRM Solver.
Given SUBGOAL and CONTEXT vars, output a single Python snippet.
Rules:
- Compute deterministically.
- Set a variable RESULT to the answer.
- Keep code short; stdlib only.
Return only a Python code block."""
CRITIC_SYS = """You are the HRM Critic.
Given TASK and LOGS (subgoal results), decide if final answer is ready.
Return JSON only: {"action":"submit"|"revise","critique":"...", "fix_hint":""}."""
SYNTH_SYS = """You are the HRM Synthesizer.
Given TASK, LOGS, and final_format, output only the final answer (no steps).
Follow final_format exactly."""
We put two important pieces in place: extract_code pulls Python snippets out of the model's replies, while run_python executes those snippets and captures their results without crashing the notebook. We then define four role prompts, planner, solver, critic, and synthesizer, which guide the model to decompose tasks, verify correctness, and ultimately produce a clean response.
def plan(task: str) -> Dict[str, Any]:
    p = f"TASK:\n{task}\nReturn JSON only."
    return extract_json(chat(p, PLANNER_SYS, temperature=0.2, max_new_tokens=300))

def solve_subgoal(subgoal: str, context: Dict[str, Any]) -> Dict[str, Any]:
    prompt = f"SUBGOAL:\n{subgoal}\nCONTEXT vars: {list(context.keys())}\nReturn Python code only."
    code = extract_code(chat(prompt, SOLVER_SYS, temperature=0.2, max_new_tokens=400))
    res = run_python(code, env=context)
    return {"subgoal": subgoal, "code": code, "run": res}

def critic(task: str, logs: List[Dict[str, Any]]) -> Dict[str, Any]:
    pl = [{"subgoal": L["subgoal"], "result": L["run"].get("result"), "ok": L["run"]["ok"]} for L in logs]
    out = chat("TASK:\n"+task+"\nLOGS:\n"+json.dumps(pl, ensure_ascii=False, indent=2)+"\nReturn JSON only.",
               CRITIC_SYS, temperature=0.1, max_new_tokens=250)
    return extract_json(out)

def refine(task: str, logs: List[Dict[str, Any]]) -> Dict[str, Any]:
    sys = "Refine subgoals minimally to fix issues. Return same JSON schema as planner."
    out = chat("TASK:\n"+task+"\nLOGS:\n"+json.dumps(logs, ensure_ascii=False)+"\nReturn JSON only.",
               sys, temperature=0.2, max_new_tokens=250)
    j = extract_json(out)
    return j if j.get("subgoals") else {}

def synthesize(task: str, logs: List[Dict[str, Any]], final_format: str) -> str:
    packed = [{"subgoal": L["subgoal"], "result": L["run"].get("result")} for L in logs]
    return chat("TASK:\n"+task+"\nLOGS:\n"+json.dumps(packed, ensure_ascii=False)+
                f"\nfinal_format: {final_format}\nOnly the final answer.",
                SYNTH_SYS, temperature=0.0, max_new_tokens=120).strip()
def hrm_agent(task: str, context: Dict[str, Any] | None = None, budget: int = 2) -> Dict[str, Any]:
    ctx = dict(context or {})
    trace, plan_json = [], plan(task)
    for round_id in range(1, budget+1):
        logs = [solve_subgoal(sg, ctx) for sg in plan_json.get("subgoals", [])]
        for L in logs:
            ctx_key = f"g{len(trace)}_{abs(hash(L['subgoal']))%9999}"
            ctx[ctx_key] = L["run"].get("result")
        verdict = critic(task, logs)
        trace.append({"round": round_id, "plan": plan_json, "logs": logs, "verdict": verdict})
        if verdict.get("action") == "submit": break
        plan_json = refine(task, logs) or plan_json
    final = synthesize(task, trace[-1]["logs"], plan_json.get("final_format", "Answer: "))
    return {"final": final, "trace": trace}
We assemble the complete HRM loop: we plan subgoals, solve each one by generating and running Python, critique the results, and synthesize the final answer. The hrm_agent function orchestrates these rounds, carrying intermediate results forward in the context and using the critic's verdict as the gate that decides whether to submit or refine the plan.
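The control flow of this loop can be exercised without loading a model by stubbing the four roles. The sketch below (with hypothetical stub behavior standing in for the LLM calls) shows how intermediate results flow through the context and how a "submit" verdict ends the loop before the round budget is spent:

```python
from typing import Any, Dict, List

def fake_plan(task: str) -> Dict[str, Any]:
    # Stand-in for the planner LLM call: a fixed two-step decomposition.
    return {"subgoals": ["square the input", "add ten"], "final_format": "Answer: <int>"}

def fake_solve(subgoal: str, ctx: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for the model writing and executing Python for one subgoal.
    if "square" in subgoal:
        result = ctx.get("X", 0) ** 2
    else:
        result = ctx.get("sq", 0) + 10
    return {"subgoal": subgoal, "run": {"ok": True, "result": result}}

def fake_critic(logs: List[Dict[str, Any]]) -> Dict[str, Any]:
    # Submit as soon as every subgoal ran cleanly, mirroring the real critic's gate.
    return {"action": "submit" if all(L["run"]["ok"] for L in logs) else "revise"}

def mini_hrm(task: str, ctx: Dict[str, Any], budget: int = 2) -> Dict[str, Any]:
    trace = []
    for round_id in range(1, budget + 1):
        logs = []
        for sg in fake_plan(task)["subgoals"]:
            L = fake_solve(sg, ctx)
            logs.append(L)
            if "square" in sg:
                ctx["sq"] = L["run"]["result"]  # carry intermediate result forward
        verdict = fake_critic(logs)
        trace.append({"round": round_id, "logs": logs, "verdict": verdict})
        if verdict["action"] == "submit":
            break  # critic is satisfied; stop before exhausting the budget
    return {"final": logs[-1]["run"]["result"], "trace": trace}

out = mini_hrm("square X then add ten", {"X": 6})
print(out["final"], len(out["trace"]))  # 46 1
```

With deterministic stubs the agent finishes in a single round; in the real loop, a "revise" verdict would instead trigger refine and another pass.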
ARC_TASK = textwrap.dedent("""
Infer the transformation rule from train examples and apply to test.
Return exactly: "Answer: ", where the value after "Answer: " is a Python list of lists of ints.
""").strip()
ARC_DATA = {
    "train": [
        {"inp": [[0,0],[1,0]], "out": [[1,1],[0,1]]},
        {"inp": [[0,1],[0,0]], "out": [[1,0],[1,1]]}
    ],
    "test": [[0,0],[0,1]]
}
res1 = hrm_agent(ARC_TASK, context={"TRAIN": ARC_DATA["train"], "TEST": ARC_DATA["test"]}, budget=2)
rprint("n[bold]Demo 1 — ARC-like Toy[/bold]")
rprint(res1["final"])
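For reference, the rule hidden in this toy task appears to be elementwise bit inversion (each cell v maps to 1 - v), so we can compute what a correct agent should return independently of the model run:

```python
# Training pairs from ARC_DATA: each output cell is 1 minus the input cell.
train = [
    {"inp": [[0, 0], [1, 0]], "out": [[1, 1], [0, 1]]},
    {"inp": [[0, 1], [0, 0]], "out": [[1, 0], [1, 1]]},
]

def invert(grid):
    return [[1 - v for v in row] for row in grid]

# The rule holds on both training examples, so apply it to the test grid.
assert all(invert(ex["inp"]) == ex["out"] for ex in train)
print(invert([[0, 0], [0, 1]]))  # [[1, 1], [1, 0]]
```

This gives a ground-truth target to compare against res1["final"] when judging whether the agent solved the demo.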
WM_TASK = "A tank holds 1200 L. It leaks 2% per hour for 3 hours, then is refilled by 150 L. Return exactly: 'Answer: '."
res2 = hrm_agent(WM_TASK, context={}, budget=2)
rprint("n[bold]Demo 2 — Word Math[/bold]")
rprint(res2["final"])
rprint("n[dim]Rounds executed (Demo 1):[/dim]", len(res1["trace"]))
We validate the agent with two demos: an ARC-style task, where we infer a transformation rule from two training pairs and apply it to a test grid, and a word-math problem that requires careful arithmetic. We call hrm_agent for each task, print the final answers, and display the number of reasoning rounds the ARC run used.
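As a cross-check on the second demo, one natural reading of the word problem (the 2% leak compounds on the remaining volume each hour) gives a reference value to compare against the agent's answer:

```python
# 1200 L tank, losing 2% of the current volume each hour for 3 hours, then +150 L refill.
volume = 1200.0
for _ in range(3):
    volume *= 0.98   # 2% of what's left leaks out each hour
volume += 150
print(round(volume, 4))  # 1279.4304
```

If the leak is instead read as a flat 2% of the original 1200 L (24 L) per hour, the answer would be 1278 L; the prompt is ambiguous, so either reading could be defended when grading the agent's output.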
In conclusion, what we have built is more than a simple demo; it is a window into how hierarchical reasoning can help small models punch above their weight. By planning, solving, and critiquing, we enable a free Hugging Face model to handle tasks with surprising stability. We come away with a deeper appreciation for brain-inspired architectures and for how, paired with open tools, they let us explore reasoning benchmarks and cognitive-style tests without incurring high costs. This hands-on journey shows that advanced, cognition-like workflows are accessible to anyone willing to tinker, test, and learn.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of Marktechpost, an Artificial Intelligence media platform that stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a broad audience. The platform draws more than two million monthly views, illustrating its popularity among readers.



