How to Build Autonomous Agents That Align Goal-Driven Actions with Ethical Deliberation and Adaptive Decision-Making Using Open Models

In this tutorial, we explore how to create an agent that aligns its actions with ethical and organizational values. We use open-source Hugging Face models running in Colab to simulate a decision-making process that balances goal attainment with ethical deliberation. With this setup, we show how to combine a "policy" model that proposes actions with a "judge" model that reviews and corrects them, so we can see how value alignment works end to end without relying on any external APIs. Check out the full codes here.
!pip install -q transformers torch accelerate sentencepiece
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, AutoModelForCausalLM
def generate_seq2seq(model, tokenizer, prompt, max_new_tokens=128):
    # Tokenize the prompt and move it to the same device as the model.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            top_p=0.9,
            temperature=0.7,
            pad_token_id=tokenizer.eos_token_id if tokenizer.eos_token_id is not None else tokenizer.pad_token_id,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
def generate_causal(model, tokenizer, prompt, max_new_tokens=128):
    # Tokenize the prompt and move it to the same device as the model.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            top_p=0.9,
            temperature=0.7,
            pad_token_id=tokenizer.eos_token_id if tokenizer.eos_token_id is not None else tokenizer.pad_token_id,
        )
    full_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    # Causal LMs echo the prompt, so return only the newly generated continuation.
    return full_text[len(prompt):].strip()
We start by setting up our environment and importing the libraries we need from Hugging Face. We define two text-generation helpers, one for sequence-to-sequence models and one for causal language models, so we can easily generate both candidate actions and reviews later in the tutorial. Check out the full codes here.
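To make the difference between the two helpers concrete, here is a minimal sketch (the strings below are hypothetical and used only for illustration): causal language models echo the prompt before the continuation, which is why generate_causal slices the prompt off, while seq2seq models return only the generated answer.
# Minimal sketch (hypothetical strings): why generate_causal strips the prompt.
_prompt = "Goal: grow adoption\nContext: bank outreach\nAction:"
_full_text = _prompt + " Draft an honest email that explains fees."  # pretend decoded output
_continuation = _full_text[len(_prompt):].strip()
print(_continuation)  # -> "Draft an honest email that explains fees."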
policy_model_name = "distilgpt2"
judge_model_name = "google/flan-t5-small"
policy_tokenizer = AutoTokenizer.from_pretrained(policy_model_name)
policy_model = AutoModelForCausalLM.from_pretrained(policy_model_name)
judge_tokenizer = AutoTokenizer.from_pretrained(judge_model_name)
judge_model = AutoModelForSeq2SeqLM.from_pretrained(judge_model_name)
device = "cuda" if torch.cuda.is_available() else "cpu"
policy_model = policy_model.to(device)
judge_model = judge_model.to(device)
if policy_tokenizer.pad_token is None:
    policy_tokenizer.pad_token = policy_tokenizer.eos_token
if judge_tokenizer.pad_token is None:
    judge_tokenizer.pad_token = judge_tokenizer.eos_token
We load two small open models, distilgpt2 as our action generator and flan-t5-small as our ethics reviewer. We prepare both models and tokenizers for CPU or GPU execution, ensuring smooth operation in Colab. This setup provides the foundation for agent reasoning and ethical evaluation. Check out the full codes here.
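Before building the agent, we can optionally run a quick smoke test of both helpers with the models we just loaded; the prompts below are only illustrative, and the sampled outputs will vary from run to run.
# Optional sanity check (illustrative prompts): confirm both helpers generate text on the chosen device.
print("Causal sample:", generate_causal(policy_model, policy_tokenizer, "The agent's first step is", max_new_tokens=20))
print("Seq2seq sample:", generate_seq2seq(judge_model, judge_tokenizer, "Answer YES or NO: Is honesty important?", max_new_tokens=10))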
class EthicalAgent:
    def __init__(self, policy_model, policy_tok, judge_model, judge_tok):
        self.policy_model = policy_model
        self.policy_tok = policy_tok
        self.judge_model = judge_model
        self.judge_tok = judge_tok

    def propose_actions(self, user_goal, context, n_candidates=3):
        # Ask the policy model for several candidate next actions.
        base_prompt = (
            "You are an autonomous operations agent. "
            "Given the goal and context, list a specific next action you will take:\n\n"
            f"Goal: {user_goal}\nContext: {context}\nAction:"
        )
        candidates = []
        for _ in range(n_candidates):
            action = generate_causal(self.policy_model, self.policy_tok, base_prompt, max_new_tokens=40)
            action = action.split("\n")[0]
            candidates.append(action.strip())
        # Deduplicate while preserving order.
        return list(dict.fromkeys(candidates))

    def judge_action(self, action, org_values):
        # Ask the judge model to review a candidate action against the org values.
        judge_prompt = (
            "You are the Ethics & Compliance Reviewer.\n"
            "Evaluate the proposed agent action.\n"
            "Return fields:\n"
            "RiskLevel (LOW/MED/HIGH),\n"
            "Issues (short bullet-style text),\n"
            "Recommendation (approve / modify / reject).\n\n"
            f"ORG_VALUES:\n{org_values}\n\n"
            f"ACTION:\n{action}\n\n"
            "Answer in this format:\n"
            "RiskLevel: ...\nIssues: ...\nRecommendation: ..."
        )
        verdict = generate_seq2seq(self.judge_model, self.judge_tok, judge_prompt, max_new_tokens=128)
        return verdict.strip()

    def align_action(self, action, verdict, org_values):
        # Ask the judge model to rewrite the action so it complies with the org values.
        align_prompt = (
            "You are an Ethics Alignment Assistant.\n"
            "Your job is to FIX the proposed action so it follows ORG_VALUES.\n"
            "Keep it effective but safe, legal, and respectful.\n\n"
            f"ORG_VALUES:\n{org_values}\n\n"
            f"ORIGINAL_ACTION:\n{action}\n\n"
            f"VERDICT_FROM_REVIEWER:\n{verdict}\n\n"
            "Rewrite ONLY IF NEEDED. If original is fine, return it unchanged. "
            "Return just the final aligned action:"
        )
        aligned = generate_seq2seq(self.judge_model, self.judge_tok, align_prompt, max_new_tokens=128)
        return aligned.strip()
We define the main agent class that proposes, reviews, and aligns actions. Here, we design methods to propose candidate actions, evaluate their ethical risk, and rewrite them to match organizational values. This structure lets us separate reasoning, judgment, and correction into distinct steps. Check out the full codes here.
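Before wiring the full pipeline together, we can exercise these methods one at a time; the sketch below uses placeholder goal, context, and values strings (the full demo scenario comes later) to inspect a couple of candidates and their reviews.
# Optional sketch (placeholder strings): try propose_actions and judge_action in isolation.
_agent = EthicalAgent(policy_model, policy_tokenizer, judge_model, judge_tokenizer)
_values = "- Be honest.\n- Follow the law."
for _act in _agent.propose_actions("Grow product adoption", "Small-business banking outreach", n_candidates=2):
    print("Candidate:", _act)
    print("Review:", _agent.judge_action(_act, _values))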
# decide belongs to EthicalAgent; we define it here and attach it to the class below
# so this cell stays runnable on its own.
def decide(self, user_goal, context, org_values, n_candidates=3):
    # Generate candidates, review each one, and align it with the org values.
    proposals = self.propose_actions(user_goal, context, n_candidates=n_candidates)
    scored = []
    for act in proposals:
        verdict = self.judge_action(act, org_values)
        aligned_act = self.align_action(act, verdict, org_values)
        scored.append({"original_action": act, "review": verdict, "aligned_action": aligned_act})

    def extract_risk(vtext):
        # Map the reviewer's RiskLevel line to a numeric score (unknown = 3).
        for line in vtext.splitlines():
            if "RiskLevel" in line:
                lvl = line.split(":", 1)[-1].strip().upper()
                if "LOW" in lvl:
                    return 0
                if "MED" in lvl:
                    return 1
                if "HIGH" in lvl:
                    return 2
        return 3

    # Pick the candidate with the lowest risk score.
    scored_sorted = sorted(scored, key=lambda x: extract_risk(x["review"]))
    final_choice = scored_sorted[0]
    report = {
        "goal": user_goal,
        "context": context,
        "org_values": org_values,
        "candidates_evaluated": scored,
        "final_plan": final_choice["aligned_action"],
        "final_plan_rationale": final_choice["review"],
    }
    return report

EthicalAgent.decide = decide
We implement the complete decision-making pipeline that connects generation, judgment, and alignment. We assign a risk score to each candidate action and automatically select the lowest-risk aligned option. This section captures how an agent can evaluate and improve its decisions before finalizing an action. Check out the full codes here.
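To make the risk-ranking step concrete, here is a small standalone sketch with made-up reviewer outputs; it mirrors the mapping used in extract_risk (LOW=0, MED=1, HIGH=2, unknown=3) and shows how sorting surfaces the lowest-risk candidate first.
# Illustrative sketch (made-up verdicts): sorting by mapped risk puts the safest candidate first.
_sample_reviews = [
    "RiskLevel: HIGH\nIssues: misleading claims\nRecommendation: reject",
    "RiskLevel: LOW\nIssues: none\nRecommendation: approve",
    "RiskLevel: MED\nIssues: vague fee disclosure\nRecommendation: modify",
]
_order = {"LOW": 0, "MED": 1, "HIGH": 2}
_ranked = sorted(_sample_reviews, key=lambda v: _order.get(v.splitlines()[0].split(":", 1)[-1].strip().upper(), 3))
print(_ranked[0].splitlines()[0])  # -> RiskLevel: LOW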
org_values_text = (
    "- Respect privacy; do not access personal data without consent.\n"
    "- Follow all laws and safety policies.\n"
    "- Avoid discrimination, harassment, or harmful manipulation.\n"
    "- Be transparent and truthful with stakeholders.\n"
    "- Prioritize user well-being and long-term trust over short-term gain."
)
demo_goal = "Increase customer adoption of the new financial product."
demo_context = (
"The agent works for a bank outreach team. The target customers are small family businesses. "
"Regulations require honest disclosure of risks and fees. Cold-calling minors or lying about terms is illegal."
)
agent = EthicalAgent(policy_model, policy_tokenizer, judge_model, judge_tokenizer)
report = agent.decide(demo_goal, demo_context, org_values_text, n_candidates=4)
def pretty_report(r):
    # Print the goal, context, values, every candidate evaluation, and the final plan.
    print("=== ETHICAL DECISION REPORT ===")
    print(f"Goal: {r['goal']}\n")
    print(f"Context: {r['context']}\n")
    print("Org Values:")
    print(r["org_values"])
    print("\n--- Candidate Evaluations ---")
    for i, cand in enumerate(r["candidates_evaluated"], 1):
        print(f"\nCandidate {i}:")
        print("Original Action:")
        print("  ", cand["original_action"])
        print("Ethics Review:")
        print(cand["review"])
        print("Aligned Action:")
        print("  ", cand["aligned_action"])
    print("\n--- Final Plan Selected ---")
    print(r["final_plan"])
    print("\nWhy this plan is acceptable (review snippet):")
    print(r["final_plan_rationale"])

pretty_report(report)
We define the organization's values, create a realistic scenario, and run the agent to generate its final plan. Finally, we print a detailed report that shows the candidate actions, their ethics reviews, and the selected decision. At this point, we see how our agent integrates values directly into its decision process.
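Since the report is a plain Python dictionary, one optional extension (not part of the pipeline above, just a sketch) is to persist it as JSON so every decision leaves an auditable trace; the file name below is arbitrary.
# Optional sketch: save the decision report as JSON for later auditing (file name is arbitrary).
import json

with open("ethical_decision_report.json", "w") as f:
    json.dump(report, f, indent=2)
print("Saved ethical_decision_report.json")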
In conclusion, we now understand how an agent can reason not only about what it can do but also about whether it should do it. We see how the system identifies risks, corrects them, and aligns its actions with human and organizational principles. This exercise shows that value alignment is not an abstract constraint but a practical layer we can add to agentic systems to make them safer and more reliable.



