How Agents Organize Jobs with To-Do Lists

We all do it, naturally and regularly. In our personal lives, we keep to-do lists to organize vacations, chores, and everything in between.
At work, we rely on task trackers and project plans to keep teams aligned. For engineers, it is also common to leave TODO comments in code as reminders of future changes.
Unsurprisingly, LLM agents also benefit from a clear to-do list to guide their planning.
To-do lists help agents plan and track complex tasks more effectively, making them especially useful for multi-tool workflows and long-running tasks where progress needs to be visible.
Coding agents like OpenAI Codex, Cline, and Claude Code (which I use regularly) are prime examples of this concept.
They break down complex requests into an initial set of steps, organize them into a structured to-do list, and update the plan in real time as tasks are completed or new information emerges.
This clarity enables agents to manage long sequences of actions, coordinate disparate tools, and track progress in a comprehensible manner.
In this article, we look at how agents use to-do list capabilities, analyze the basic components of the planning process, and demonstrate its implementation with LangChain.
Contents
(1) Example of a Planning Agent Scenario
(2) Key Components of the Planning Capability
(3) Integrating it into Middleware
The corresponding code is available in this GitHub repo.
(1) Example of a Planning Agent Scenario
Let's walk through an example scenario to set the stage.
We will set up a single agent to handle travel planning and booking activities. The agent has access to a set of travel-related tools, such as researching visa requirements, searching flights and hotels, and making bookings.
In this example, these tools are mocked and do not make real reservations; they are included to illustrate the agent's planning behavior and how it uses the to-do list.
Here is the code to implement our planning agent with LangChain:
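A minimal sketch of the setup is shown below. The travel tools are simple mocks, and the import paths, model identifier (`openai:gpt-5.1`), and `create_agent` parameter names are assumptions based on LangChain v1's agent API; adjust them to your installed version.

```python
def search_flights(origin: str, destination: str) -> str:
    """Mock flight search tool: returns canned options, no real lookup."""
    return f"Found 3 flights from {origin} to {destination} (mocked)."


def book_hotel(city: str) -> str:
    """Mock hotel booking tool: no real reservation is made."""
    return f"Hotel booked in {city} (mocked)."


def build_agent():
    # Assumed imports and parameter names for LangChain v1's create_agent API;
    # running this requires langchain installed and a model API key.
    from langchain.agents import create_agent
    from langchain.agents.middleware import TodoListMiddleware

    return create_agent(
        model="openai:gpt-5.1",
        tools=[search_flights, book_hotel],
        system_prompt="You are a helpful travel planning assistant.",
        middleware=[TodoListMiddleware()],  # enables to-do list planning
    )
```

The mock tools return fixed strings so the example can run without hitting any real booking service; only the agent itself needs an LLM behind it.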
We pass in a user query and inspect the to-do list from the agent's state:

The use of structured note-taking with to-do lists allows agents to maintain persistent memory outside of the context window. This strategy improves the agent's ability to manage and retain relevant context over time.
The code setup is straightforward: create_agent creates an instance of the LLM agent, where we pass in the system prompt, select the model (GPT-5.1), and connect the tools.
What stands out is TodoListMiddleware(), which is passed to the middleware parameter.
First, what is LangChain's middleware?
As the name suggests, it is a middle layer that allows you to run custom code before and after LLM calls.
Think of middleware as a programmable layer that allows us to inject code to monitor, modify, or extend its behavior.
It gives us control and visibility over agent behavior without changing the agent's core logic. It can be used to modify input and output, manage retries or early exits, and implement safeguards (e.g., guardrails, PII checks).
TodoListMiddleware is a built-in middleware that specifically provides to-do list management capabilities to agents. Next, we look at how TodoListMiddleware works under the hood.
(2) Key Components of the Planning Capability
The planning agent's to-do list management capability boils down to four key components:
- A to-do item
- A to-do list
- A tool to write and update to-dos
- A system prompt update
The TodoListMiddleware combines these elements to enable the agent's to-do list capabilities.
Let's take a closer look at each component and how it is implemented in the middleware code.
(2.1) To-do item
A to-do item is the smallest unit in a to-do list, representing a single task. It is represented by two fields: task description and current status.
In LangChain, this is modeled as a Todo type, defined using TypedDict:
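A sketch of the type, matching the two fields described here (the actual definition lives inside LangChain's middleware module):

```python
from typing import Literal, TypedDict


class Todo(TypedDict):
    """A single task on the agent's to-do list."""

    content: str  # description of the task to perform
    status: Literal["pending", "in_progress", "completed"]  # current state
```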
The content field holds the description of the task the agent needs to perform, while status tracks whether the task has yet to be started (pending), is being worked on (in_progress), or is done (completed).
Here is an example of a to-do item:
{
"content": "Compare flight options from Singapore to Tokyo",
"status": "completed"
}
(2.2) To-do list
Now that we have covered the structure of a Todo object, let's examine how a collection of to-do items is stored and tracked as part of the agent's overall state.
We define a State object (PlanningState) to capture the plan as a list of to-dos, which is updated as tasks progress:
The todos field is optional (NotRequired), meaning it may be absent when the agent first starts (i.e., the agent may not have any tasks in its plan yet).
OmitFromInput indicates that todos is managed internally by the middleware and should not be supplied directly as user input.
State serves as the agent's short-term memory, capturing recent interactions and key information so it can act appropriately based on prior context.
In this case, the to-do list lives within the state so that the agent can refer to and update its tasks consistently throughout the session.
Here is an example of a to-do list:
todos: list[Todo] = [
{
"content": "Research visa requirements for Singapore passport holders visiting Japan",
"status": "completed"
},
{
"content": "Compare flight options from Singapore to Tokyo",
"status": "in_progress"
},
{
"content": "Book flights and hotels once itinerary is finalized",
"status": "pending"
}
]
(2.3) A tool to write and update to-dos
With the basic structures of the to-do item and list in place, we now need a tool the LLM agent can call to create and update the list as tasks are performed.
Here is how our tool (write_todos) is defined:
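The sketch below is a simplified stand-in for LangChain's write_todos tool. The real tool is decorated as a LangChain tool and returns a Command object that updates agent state; here we model the same update with a plain dict so the shape is visible without the dependency.

```python
from typing import Literal, TypedDict


class Todo(TypedDict):
    content: str
    status: Literal["pending", "in_progress", "completed"]


def write_todos(todos: list[Todo]) -> dict:
    """Simplified stand-in for LangChain's write_todos tool.

    The real tool returns a Command instructing the framework to update
    state; here we return a plain dict carrying the same kind of update:
    the new to-do list, plus a message recording the change.
    """
    return {
        "update": {
            "todos": todos,
            "messages": [f"Updated todo list to {todos}"],
        }
    }
```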
The write_todos tool returns a Command instructing the agent to update its to-do list, along with a message recording the change.
While the write_todos structure is straightforward, the magic lies in the tool's description (WRITE_TODOS_TOOL_DESCRIPTION).
When the agent invokes the tool, this description acts as an additional prompt, guiding it on how to use the tool properly and what to watch out for.
Here is LangChain's (very long) description of the tool:
We can see that the description is highly structured and precise, spelling out when and how the tool should be used, task states, and management rules.
It also provides clear guidelines for tackling complex tasks: breaking them down into clear steps and updating them systematically.
Feel free to check out Deepagents' even more detailed to-do tool description here.
(2.4) System prompt update
The last component to set up is updating the agent's system prompt.
This is done by injecting WRITE_TODOS_SYSTEM_PROMPT into the main system prompt, explicitly informing the agent that the write_todos tool is available.
It directs the agent on when and why to use the tool, provides context for complex multi-step tasks, and outlines best practices for maintaining and updating to-do lists.
(3) Integrating it into Middleware
Finally, all four components come together in a single class called TodoListMiddleware, which packages them into a unified agent flow:
- Declares PlanningState, tracking tasks as part of a to-do list
- Registers the write_todos tool to update the list and makes it accessible to the LLM
- Injects WRITE_TODOS_SYSTEM_PROMPT, guiding the agent's planning and reasoning
The WRITE_TODOS_SYSTEM_PROMPT is injected through the middleware's wrap_model_call method, which attaches it to the agent's system message on every model call, as shown below:
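Here is a simplified illustration of the pattern, not LangChain's actual implementation: the hook wraps every model call, appending the planning prompt to the system prompt before delegating to the next handler. The prompt text itself is a placeholder.

```python
# Placeholder text; the real WRITE_TODOS_SYSTEM_PROMPT is far more detailed.
WRITE_TODOS_SYSTEM_PROMPT = (
    "Use the write_todos tool to plan and track multi-step tasks."
)


def wrap_model_call(request: dict, handler):
    """Middleware hook: runs around every model call.

    Appends the to-do planning prompt to the system prompt, then
    delegates to the next handler in the chain.
    """
    base = request.get("system_prompt") or ""
    request["system_prompt"] = (base + "\n\n" + WRITE_TODOS_SYSTEM_PROMPT).strip()
    return handler(request)
```

Because the prompt is re-attached on every call, the agent is reminded of its planning duties at each step rather than only at the start of the session.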
Wrapping it up
Like humans, agents use to-do lists to break down complex problems, stay organized, and adapt in real time, enabling them to solve problems more effectively and accurately.
By walking through LangChain's middleware, we also gain a deeper understanding of how tasks can be planned, tracked, and executed by agents.
Check out this GitHub repo to run the code.



