Building a LangGraph Agent from Scratch

The term "AI agent" is one of the most popular in the field right now. Agents emerged after the LLM hype, when people realized that the capabilities of the latest LLMs, while impressive, are limited to the tasks they were explicitly trained for. In that sense, traditional LLMs lack the tools to do anything outside their scope of knowledge.
RAG
To deal with this, Retrieval-Augmented Generation (RAG) was later introduced: it retrieves additional context from external data sources and injects it into the prompt, so that the LLM is aware of more context. We could say that RAG makes an LLM more informed, but for complex problems the LLM + RAG approach still fails when the solution path is not known in advance.
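To make the RAG idea concrete, here is a toy sketch (not a production retriever): it picks the document with the highest word overlap with the query and prepends it to the prompt as context. The documents and the scoring rule are invented purely for illustration.

```python
# Toy RAG sketch: retrieve the most relevant document by word overlap,
# then inject it into the prompt as extra context for the LLM.
# Real systems use embeddings and vector search instead of this rule.

def retrieve(query: str, documents: list) -> str:
    """Return the document sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(documents, key=lambda d: len(query_words & set(d.lower().split())))

def build_prompt(query: str, documents: list) -> str:
    context = retrieve(query, documents)
    return f"Context: {context}\n\nQuestion: {query}"

docs = [
    "Haaland plays for Manchester City and scores many goals.",
    "The 2026 World Cup will be hosted in North America.",
]
print(build_prompt("How many goals does Haaland score?", docs))
```

The augmented prompt would then be sent to the LLM in place of the raw question.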
Agents
Agents are a powerful concept built around LLMs: they add tool use, decision making, and memory. An agent can be thought of as an LLM equipped with a set of predefined tools; it calls them, analyzes the results, and stores them in memory for later use before producing a final response.
LangGraph
LangGraph is a popular framework used to create agents. As the name suggests, agents are constructed using graphs with nodes and edges.
Nodes represent processing steps that read and update the agent's state, which changes over time. Edges define control flow by specifying transition rules and conditions between nodes.
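To make these semantics concrete, here is a deliberately simplified plain-Python model of what the framework does under the hood (the real LangGraph engine adds typed state, conditional edges, and parallel branches): each node returns a partial state update that gets merged into the shared state before control follows an edge to the next node. The node functions and the name-extraction rule here are toy stand-ins.

```python
# Simplified model of LangGraph execution: nodes are functions that take
# the current state and return a partial update; edges map each node to
# its successor. This is an illustration, not the real framework.

def extract_name(state: dict) -> dict:
    # Toy rule: assume the last word of the question is the player name.
    return {"name": state["question"].split()[-1].strip("?")}

def fetch_rating(state: dict) -> dict:
    ratings = {"Haaland": 92}  # dummy data
    return {"rating": ratings.get(state["name"], 0)}

NODES = {"extract_name": extract_name, "fetch_rating": fetch_rating}
EDGES = {"START": "extract_name", "extract_name": "fetch_rating", "fetch_rating": "END"}

def run(state: dict) -> dict:
    node = EDGES["START"]
    while node != "END":
        state = {**state, **NODES[node](state)}  # merge the partial update
        node = EDGES[node]
    return state

print(run({"question": "What is the rating of Haaland?"}))
```

LangGraph follows the same pattern, but with a declarative builder API and a typed state object instead of a raw dict.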
To better understand LangGraph in practice, we will go through a detailed example. Although LangGraph may seem too verbose for the problem below, it is often more effective for complex problems with large graphs.
First, we need to install the required libraries.
langgraph==1.0.5
langchain-community==0.4.1
jupyter==1.1.1
notebook==7.5.1
langchain[openai]
Then we import the required modules.
import os
from dotenv import load_dotenv
import json
import random
from pydantic import BaseModel
from typing import Optional, List, Dict, Any
from langgraph.graph import StateGraph, START, END
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from langchain.chat_models import init_chat_model
from langchain.tools import tool
from IPython.display import Image, display
We will also need to create a .env file and add the OPENAI_API_KEY variable to it:
OPENAI_API_KEY=...
Then, with load_dotenv(), we can load the environment variables into the program.
load_dotenv()
Helper function
The function below will be useful for us to display the constructed graphs.
def display_graph(graph):
    return display(Image(graph.get_graph().draw_mermaid_png()))
Agent
Let's initialize a GPT-5-nano-based LLM for our agent with a simple command:
llm = init_chat_model("openai:gpt-5-nano")
State
In our example, we will create an agent that can answer questions about soccer. Its reasoning process will be based on the statistics returned about the players.
To do that, we need to define the state. In our case, it will be an entity that contains all the information the LLM needs about a player. To define the state, we write a class that inherits from pydantic.BaseModel:
class PlayerState(BaseModel):
    question: str
    selected_tools: Optional[List[str]] = None
    name: Optional[str] = None
    club: Optional[str] = None
    country: Optional[str] = None
    number: Optional[int] = None
    rating: Optional[int] = None
    goals: Optional[List[int]] = None
    minutes_played: Optional[List[int]] = None
    summary: Optional[str] = None
When moving between LangGraph nodes, each node takes an instance of PlayerState as input and returns the fields it has updated. Our task will be to define how that state is processed.
Tools
First, we'll describe some of the tools the agent can use. A tool can be thought of as an additional function that the agent can call to get the information needed to answer a user's query.
To define a tool, we write a function with the @tool decorator. It is important to use clear parameter names and function docstrings, as the agent considers them when deciding whether to call the tool based on the input context.
To simplify our examples, we will use dummy data instead of real data obtained from external sources, which is often the case for production applications.
In the first tool, we will return information about the player's club and country by name.
@tool
def fetch_player_information_tool(name: str):
    """Returns the football club of a player and the player's country"""
    data = {
        'Haaland': {
            'club': 'Manchester City',
            'country': 'Norway'
        },
        'Kane': {
            'club': 'Bayern',
            'country': 'England'
        },
        'Lautaro': {
            'club': 'Inter',
            'country': 'Argentina'
        },
        'Ronaldo': {
            'club': 'Al-Nassr',
            'country': 'Portugal'
        }
    }
    if name in data:
        print(f"Returning player information: {data[name]}")
        return data[name]
    else:
        return {
            'club': 'unknown',
            'country': 'unknown'
        }

def fetch_player_information(state: PlayerState):
    return fetch_player_information_tool.invoke({'name': state.name})
You might ask why we wrapped the tool inside another function, which seems like over-engineering. In fact, these two functions have different responsibilities.
The function fetch_player_information() takes a state as its parameter, which is what the LangGraph framework expects. It extracts the name field and calls the tool, which operates at the parameter level.
This provides a clear separation of concerns and makes it easy to reuse the same tool across multiple graph nodes.
Next, there is a similar pair of functions that gets the player's jersey number:
@tool
def fetch_player_jersey_number_tool(name: str):
    """Returns player jersey number"""
    data = {
        'Haaland': 9,
        'Kane': 9,
        'Lautaro': 10,
        'Ronaldo': 7
    }
    if name in data:
        print(f"Returning player number: {data[name]}")
        return {'number': data[name]}
    else:
        return {'number': 0}

def fetch_player_jersey_number(state: PlayerState):
    return fetch_player_jersey_number_tool.invoke({'name': state.name})
The third tool retrieves the player's FIFA rating:
@tool
def fetch_player_rating_tool(name: str):
    """Returns player rating in FIFA"""
    data = {
        'Haaland': 92,
        'Kane': 89,
        'Lautaro': 88,
        'Ronaldo': 90
    }
    if name in data:
        print(f"Returning rating data: {data[name]}")
        return {'rating': data[name]}
    else:
        return {'rating': 0}

def fetch_player_rating(state: PlayerState):
    return fetch_player_rating_tool.invoke({'name': state.name})
Now, let's write a few more graph node functions that retrieve external data. We won't mark them as tools like before, which means the agent won't decide whether to call them: they will always run.
def retrieve_goals(state: PlayerState):
    name = state.name
    data = {
        'Haaland': [25, 40, 28, 33, 36],
        'Kane': [33, 37, 41, 38, 29],
        'Lautaro': [19, 25, 27, 24, 25],
        'Ronaldo': [27, 32, 28, 30, 36]
    }
    if name in data:
        return {'goals': data[name]}
    else:
        return {'goals': [0]}
Here is a graph node that returns the number of minutes played over the last few seasons.
def retrieve_minutes_played(state: PlayerState):
    name = state.name
    data = {
        'Haaland': [2108, 3102, 3156, 2617, 2758],
        'Kane': [2924, 2850, 3133, 2784, 2680],
        'Lautaro': [2445, 2498, 2519, 2773],
        'Ronaldo': [3001, 2560, 2804, 2487, 2771]
    }
    if name in data:
        return {'minutes_played': data[name]}
    else:
        return {'minutes_played': [0]}
Below is a node that retrieves the player name from the user query.
def extract_name(state: PlayerState):
    question = state.question
    prompt = f"""
    You are a football name extractor assistant.
    Your goal is to extract the surname of a footballer from the following question.
    User question: {question}
    You have to output just a string containing one word - the footballer's surname.
    """
    response = llm.invoke([HumanMessage(content=prompt)]).content
    print(f"Player name: {response}")
    return {'name': response}
Now things get interesting. Remember the three tools we described above? Thanks to them, we can create a planner node that asks the agent to choose which tools to call based on the context of the question:
def planner(state: PlayerState):
    question = state.question
    prompt = f"""
    You are a football player summary assistant.
    You have the following tools available: ["fetch_player_jersey_number", "fetch_player_information", "fetch_player_rating"]
    User question: {question}
    Decide which tools are required to answer.
    Return a JSON list of tool names, e.g. ["fetch_player_jersey_number", "fetch_player_rating"]
    """
    response = llm.invoke([HumanMessage(content=prompt)]).content
    try:
        selected_tools = json.loads(response)
    except json.JSONDecodeError:
        selected_tools = []
    return {'selected_tools': selected_tools}
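One caveat: models often wrap JSON in markdown code fences or emit single-quoted, Python-style lists, which json.loads rejects, silently producing an empty tool list. A more forgiving parser (a sketch of one possible approach, not part of the original code) could look like this:

```python
import ast
import json

def parse_tool_list(response: str) -> list:
    """Best-effort parse of a tool-name list from an LLM response.

    Handles plain JSON, single-quoted Python-style lists, and answers
    wrapped in markdown code fences. Returns [] when nothing parses.
    """
    text = response.strip()
    if text.startswith("```"):
        # Drop the fences; the first line may carry a language tag.
        text = text.strip("`")
        text = text.split("\n", 1)[-1] if "\n" in text else text
    # Keep only the bracketed part, if one is present.
    start, end = text.find("["), text.rfind("]")
    if start != -1 and end != -1:
        text = text[start:end + 1]
    for parser in (json.loads, ast.literal_eval):
        try:
            result = parser(text)
            if isinstance(result, list):
                return [str(item) for item in result]
        except (ValueError, SyntaxError):
            continue
    return []
```

Swapping this in for the json.loads call in planner() would make the routing step more robust to formatting quirks in the model output.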
In our case, we ask the agent to create a summary of a soccer player. It decides by itself which tools to call to get more data. Tool docstrings play an important role here: they provide the agent with additional context about each tool.
Below is our final graph node, which takes most of the fields filled in during the previous steps and calls the LLM to produce the final summary.
def write_summary(state: PlayerState):
    question = state.question
    data = {
        'name': state.name,
        'country': state.country,
        'number': state.number,
        'rating': state.rating,
        'goals': state.goals,
        'minutes_played': state.minutes_played,
    }
    prompt = f"""
    You are a football reporter assistant.
    Given the following data and statistics of the football player, you will have to create a markdown summary of that player.
    Player data:
    {json.dumps(data, indent=4)}
    The markdown summary has to include the following information:
    - Player full name (if only first name or last name is provided, try to guess the full name)
    - Player country (also add flag emoji)
    - Player number (also add the number in the emoji(-s) form)
    - FIFA rating
    - Total number of goals in last 3 seasons
    - Average number of minutes required to score one goal
    - Response to the user question: {question}
    """
    response = llm.invoke([HumanMessage(content=prompt)]).content
    return {"summary": response}
Graph construction
Now we have all the elements to build the graph. First, we initialize it using the StateGraph builder. Then, we add nodes one by one with the add_node() method. It takes two parameters: a string that assigns a name to the node, and a callable associated with the node that takes the graph state as its only parameter.
graph_builder = StateGraph(PlayerState)
graph_builder.add_node('extract_name', extract_name)
graph_builder.add_node('planner', planner)
graph_builder.add_node('fetch_player_jersey_number', fetch_player_jersey_number)
graph_builder.add_node('fetch_player_information', fetch_player_information)
graph_builder.add_node('fetch_player_rating', fetch_player_rating)
graph_builder.add_node('retrieve_goals', retrieve_goals)
graph_builder.add_node('retrieve_minutes_played', retrieve_minutes_played)
graph_builder.add_node('write_summary', write_summary)
Currently, our graph contains only nodes. We need to add edges to it. Edges in LangGraph are directed and are added with the add_edge() method, which takes the names of the source and destination nodes.
The only node we need to treat specially is the planner, which behaves a little differently from the others. As shown above, it returns a selected_tools field containing anywhere from 0 to 3 tool names, each corresponding to an outgoing branch.
In this case, we need to use the add_conditional_edges() method, which takes three parameters:
- The name of the source node;
- A callable routing function that takes the graph state and returns a list of strings naming the branches to follow;
- A dictionary mapping those strings to node names.
In our case, we define the route_tools() function to simply return the state.selected_tools field as the routing result.
def route_tools(state: PlayerState):
    return state.selected_tools or []
After that, we can add the edges:
graph_builder.add_edge(START, 'extract_name')
graph_builder.add_edge('extract_name', 'planner')
graph_builder.add_conditional_edges(
    'planner',
    route_tools,
    {
        'fetch_player_jersey_number': 'fetch_player_jersey_number',
        'fetch_player_information': 'fetch_player_information',
        'fetch_player_rating': 'fetch_player_rating'
    }
)
graph_builder.add_edge('fetch_player_jersey_number', 'retrieve_goals')
graph_builder.add_edge('fetch_player_information', 'retrieve_goals')
graph_builder.add_edge('fetch_player_rating', 'retrieve_goals')
graph_builder.add_edge('retrieve_goals', 'retrieve_minutes_played')
graph_builder.add_edge('retrieve_minutes_played', 'write_summary')
graph_builder.add_edge('write_summary', END)
START and END are LangGraph constants used to define the beginning and end of a graph.
The last step is to compile the graph. We can choose to visualize it using the helper function described above.
graph = graph_builder.compile()
display_graph(graph)

Example
Now we can finally use our graph! To do so, we call the invoke() method and pass a dictionary containing the question field with the user's query:
result = graph.invoke({
    'question': 'Will Haaland be able to win the FIFA World Cup for Norway in 2026 based on his recent performance and stats?'
})
And here is an example of the result we can get!
{'question': 'Will Haaland be able to win the FIFA World Cup for Norway in 2026 based on his recent performance and stats?',
'selected_tools': ['fetch_player_information', 'fetch_player_rating'],
'name': 'Haaland',
'club': 'Manchester City',
'country': 'Norway',
'rating': 92,
'goals': [25, 40, 28, 33, 36],
'minutes_played': [2108, 3102, 3156, 2617, 2758],
 'summary': '- Full name: Erling Haaland\n- Country: Norway 🇳🇴\n- Number: N/A\n- FIFA rating: 92\n- Total goals in last 3 seasons: 97 (28 + 33 + 36)\n- Average minutes per goal (last 3 seasons): 87.95 minutes per goal\n- Will Haaland win the FIFA World Cup for Norway in 2026 based on recent performance and stats?\n  - Short answer: Not guaranteed. Haaland remains among the world’s top forwards (92 rating, elite goal output), and he could be a key factor for Norway. However, World Cup success is a team achievement dependent on Norway’s overall squad quality, depth, tactics, injuries, and tournament context. Based on statistics alone, he strengthens Norway’s chances, but a World Cup title in 2026 cannot be predicted with certainty.'}
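As a sanity check, we can reproduce the summary's arithmetic ourselves from the raw state returned by the graph:

```python
# Verify the summary's figures against the raw goals and minutes data
# that the graph collected for Haaland.
goals = [25, 40, 28, 33, 36]
minutes = [2108, 3102, 3156, 2617, 2758]

total_goals = sum(goals[-3:])       # last 3 seasons: 28 + 33 + 36
total_minutes = sum(minutes[-3:])   # last 3 seasons
minutes_per_goal = total_minutes / total_goals

print(total_goals)                  # 97
print(round(minutes_per_goal, 2))   # 87.95
```

Both numbers match the LLM's summary, which confirms the model used the retrieved statistics correctly.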
The nice thing is that we can see the whole state of the graph and analyze the tools the agent has chosen to make the final answer. The final snapshot looks great!
Conclusion
In this article, we explored AI agents, which have opened a new chapter for LLMs. Equipped with tools and decision-making capabilities, LLMs gain a far greater ability to solve complex tasks.
The example in this article introduced LangGraph - one of the most popular frameworks for building agents. Its simplicity and elegance make it possible to build complex decision chains. Although LangGraph may seem like overkill for our simple example, it becomes much more useful in larger projects where the state and graph structures are more complex.



