
How to Create an Effective Custom GPT-Style Conversational AI Assistant

In this tutorial, we build a GPT-style chat system from scratch using a local Hugging Face model. We start by loading a lightweight, instruction-tuned model and wrap it inside a structured conversational framework that includes a system role, the user's conversation memory, and the assistant's responses. We show how the agent interprets context, constructs prompts, and optionally uses small built-in tools to retrieve simulated search results or documentation extracts. By the end, we have a fully functional, locally running model that behaves like a fully customizable GPT.

!pip install transformers accelerate sentencepiece --quiet
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from typing import List, Tuple, Optional
import textwrap, json, os

We start by installing the essential libraries and importing the required modules. We make sure the environment has everything it needs, including transformers, torch, and sentencepiece, ready to use. This setup lets us work seamlessly with Hugging Face models inside Google Colab.

MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
BASE_SYSTEM_PROMPT = (
   "You are a custom GPT running locally. "
   "Follow user instructions carefully. "
   "Be concise and structured. "
   "If something is unclear, say it is unclear. "
   "Prefer practical examples over corporate examples unless explicitly asked. "
   "When asked for code, give runnable code."
)
MAX_NEW_TOKENS = 256

We set the model name, define the system prompt that governs the assistant's behavior, and cap the number of new tokens per reply. The prompt establishes how the assistant should respond: concise, structured, and practical. This section lays the foundation for the model's personality and instruction-following style.
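Since Phi-3-mini-4k-instruct has a 4,096-token context window, the prompt and the MAX_NEW_TOKENS reserved for the reply must fit inside it together. A minimal sketch of that budget check (the `prompt_budget` helper is hypothetical, not part of the tutorial code):

```python
# Hypothetical helper: check how much room a prompt leaves in the context
# window once MAX_NEW_TOKENS are reserved for the generated reply.
CONTEXT_WINDOW = 4096   # Phi-3-mini-4k context length
MAX_NEW_TOKENS = 256

def prompt_budget(prompt_tokens: int,
                  context: int = CONTEXT_WINDOW,
                  reserve: int = MAX_NEW_TOKENS) -> int:
    """Remaining headroom in tokens; negative means the prompt is too long."""
    return context - reserve - prompt_tokens

print(prompt_budget(1000))   # 2840 tokens of headroom left
```

If the budget goes negative during a long chat, the oldest non-system turns can be dropped from the history before building the next prompt.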

print("Loading model...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
if tokenizer.pad_token_id is None:
   tokenizer.pad_token_id = tokenizer.eos_token_id
model = AutoModelForCausalLM.from_pretrained(
   MODEL_NAME,
   torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
   device_map="auto"
)
model.eval()
print("Model loaded.")

We load the tokenizer and model from the Hugging Face Hub into memory and prepare them for inference. The device map is resolved automatically based on the available hardware, enabling GPU acceleration when possible. Once loaded, our model is ready to generate responses.

ConversationHistory = List[Tuple[str, str]]
history: ConversationHistory = [("system", BASE_SYSTEM_PROMPT)]


def wrap_text(s: str, w: int = 100) -> str:
    return "\n".join(textwrap.wrap(s, width=w))


def build_chat_prompt(history: ConversationHistory, user_msg: str) -> str:
    prompt_parts = []
    for role, content in history:
        if role == "system":
            prompt_parts.append(f"<|system|>\n{content}\n")
        elif role == "user":
            prompt_parts.append(f"<|user|>\n{content}\n")
        elif role == "assistant":
            prompt_parts.append(f"<|assistant|>\n{content}\n")
    prompt_parts.append(f"<|user|>\n{user_msg}\n")
    prompt_parts.append("<|assistant|>\n")
    return "".join(prompt_parts)

We initialize the conversation history with the system role and create a prompt builder that formats the messages. Each turn is tagged with its role, so user and assistant messages are arranged in a consistent chat structure. This ensures the model always receives the conversation context in the format it expects.
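To see the resulting prompt format without loading the model, here is a condensed version of the same builder (the three role branches are merged into one f-string, since the tags only differ by role name) applied to a one-message history:

```python
from typing import List, Tuple

ConversationHistory = List[Tuple[str, str]]

def build_chat_prompt_demo(history: ConversationHistory, user_msg: str) -> str:
    # Same layout as build_chat_prompt above, with the role branches merged.
    parts = [f"<|{role}|>\n{content}\n" for role, content in history]
    parts.append(f"<|user|>\n{user_msg}\n")
    parts.append("<|assistant|>\n")
    return "".join(parts)

prompt = build_chat_prompt_demo([("system", "Be terse.")], "Hi")
print(prompt)
# <|system|>
# Be terse.
# <|user|>
# Hi
# <|assistant|>
```

The trailing `<|assistant|>` tag is what cues the model to continue the transcript in the assistant's voice.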

def local_tool_router(user_msg: str) -> Optional[str]:
    msg = user_msg.strip().lower()
    if msg.startswith("search:"):
        query = user_msg.split(":", 1)[-1].strip()
        return f"Search results about '{query}':\n- Key point 1\n- Key point 2\n- Key point 3"
    if msg.startswith("docs:"):
        topic = user_msg.split(":", 1)[-1].strip()
        return f"Documentation extract on '{topic}':\n1. The agent orchestrates tools.\n2. The model consumes output.\n3. Responses become memory."
    return None

We add a lightweight tool router that extends our GPT's capabilities by simulating tasks such as search or document retrieval. It watches for special prefixes such as "search:" or "docs:" in user queries and injects the tool output as extra context. This simple agentic design gives our assistant a degree of tool awareness.
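The router's behavior is easy to verify in isolation: any message without a recognized prefix falls through to the model untouched. A standalone check (using a condensed copy of the router above, trimmed to one branch):

```python
from typing import Optional

def local_tool_router(user_msg: str) -> Optional[str]:
    # Condensed copy of the router above, kept to the "search:" branch.
    msg = user_msg.strip().lower()
    if msg.startswith("search:"):
        query = user_msg.split(":", 1)[-1].strip()
        return f"Search results about '{query}':\n- Key point 1"
    return None

print(local_tool_router("search: local LLMs"))  # simulated tool output
print(local_tool_router("hello"))               # None: no tool fires
```

Because the router returns `None` for ordinary messages, adding new tools is just a matter of adding new prefix branches.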

def generate_reply(history: ConversationHistory, user_msg: str) -> str:
    tool_context = local_tool_router(user_msg)
    if tool_context:
        user_msg = user_msg + "\n\nUseful context:\n" + tool_context
    prompt = build_chat_prompt(history, user_msg)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=MAX_NEW_TOKENS,
            do_sample=True,
            top_p=0.9,
            temperature=0.6,
            pad_token_id=tokenizer.eos_token_id
        )
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    reply = decoded.split("<|assistant|>")[-1].strip() if "<|assistant|>" in decoded else decoded[len(prompt):].strip()
    history.append(("user", user_msg))
    history.append(("assistant", reply))
    return reply


def save_history(history: ConversationHistory, path: str = "chat_history.json") -> None:
   data = [{"role": r, "content": c} for (r, c) in history]
   with open(path, "w") as f:
       json.dump(data, f, indent=2)


def load_history(path: str = "chat_history.json") -> ConversationHistory:
   if not os.path.exists(path):
       return [("system", BASE_SYSTEM_PROMPT)]
   with open(path, "r") as f:
       data = json.load(f)
   return [(item["role"], item["content"]) for item in data]

We define the main reply function, which injects tool context, builds the prompt from the history, and samples from the model to produce a coherent response. We also add functions to save and load the conversation history for persistence. This snippet forms the working core of our custom GPT.
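Because the history is a plain list of (role, content) tuples, the save/load pair round-trips losslessly through JSON. A quick standalone check of that round trip (writing to a temp file rather than chat_history.json):

```python
import json
import os
import tempfile

history = [("system", "You are a custom GPT running locally."),
           ("user", "Hi"),
           ("assistant", "Hello! How can I help?")]

path = os.path.join(tempfile.gettempdir(), "chat_history_demo.json")

# Mirror save_history: tuples -> list of role/content dicts.
with open(path, "w") as f:
    json.dump([{"role": r, "content": c} for r, c in history], f, indent=2)

# Mirror load_history: dicts -> list of tuples again.
with open(path) as f:
    restored = [(item["role"], item["content"]) for item in json.load(f)]

assert restored == history   # lossless round trip
os.remove(path)
```

Storing role/content dicts (rather than raw tuples) keeps the file readable and matches the message schema most chat APIs use.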

print("\n--- Demo turn 1 ---")
demo_reply_1 = generate_reply(history, "Explain what this custom GPT setup is doing in 5 bullet points.")
print(wrap_text(demo_reply_1))


print("\n--- Demo turn 2 ---")
demo_reply_2 = generate_reply(history, "search: agentic ai with local models")
print(wrap_text(demo_reply_2))


def interactive_chat():
    print("\nChat ready. Type 'exit' to stop.")
    while True:
        try:
            user_msg = input("\nUser: ").strip()
        except EOFError:
            break
        if user_msg.lower() in ("exit", "quit", "q"):
            break
        reply = generate_reply(history, user_msg)
        print("\nAssistant:\n" + wrap_text(reply))


# interactive_chat()
print("\nCustom GPT initialized successfully.")

We test the full setup by running two demo turns and printing the generated responses. We also include an optional interactive chat loop for talking with the assistant directly. By the end, we confirm that our custom GPT runs in the local environment and responds intelligently in real time.

In conclusion, we designed and ran a custom GPT-style conversational assistant without relying on any external services. We saw how local models can be made interactive through prompt orchestration, lightweight tool routing, and conversation memory management. This approach helps us understand the internal logic behind GPT-like systems and gives us the freedom to test our own rules, behaviors, and integrations in a transparent, fully offline way.




Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an AI media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform draws more than two million monthly views, illustrating its popularity among readers.
