AI Agents Processing Time Series and Large DataFrames

AI Agents are systems powered by LLMs that can reason about their objectives and take actions to achieve a final goal. They are designed not just to answer questions, but to orchestrate a sequence of operations, including processing data (i.e. dataframes and time series). This ability opens up many real-world applications that democratize access to data analysis, such as automated reporting, no-code queries, and anomaly detection.
Agents can interact with dataframes in two different ways:
- with natural language – the LLM reads the table as a string and tries to make sense of it based on the knowledge acquired during training
- by generating and executing code – the Agent activates Tools to process the dataset as an object.
Therefore, by combining the power of NLP with the precision of code execution, AI Agents empower users to perform complex analyses and access insights interactively.
In this tutorial, I'm going to show how to process dataframes and time series with AI Agents. I will present some Python code that can be easily applied to other similar cases (just copy, paste, run) and walk through every line of it so that you can replicate the example.
Setup
Let's start by setting up Ollama (pip install ollama==0.4.7), a library that allows users to run open-source LLMs locally, without needing cloud-based services, giving more control over data privacy and performance. Since everything runs locally, any conversation data stays on your machine.
First, you need to download Ollama from the website.
Then, from the prompt shell of your laptop, use the command `ollama pull qwen2.5` to download the selected LLM. I'm going with Alibaba's Qwen 2.5, as it's smart and lightweight.
When the download is completed, you can move on to Python and start writing code.
import ollama
llm = "qwen2.5"
Let's test the LLM:
stream = ollama.generate(model=llm, prompt='''what time is it?''', stream=True)
for chunk in stream:
    print(chunk['response'], end='', flush=True)
Time Series
A time series is a sequence of data points measured over time, used for analysis and forecasting. It allows us to see how a variable changes over time, and it's used to identify trends and seasonal patterns.
I will generate a fake time series dataset to use as an example.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
## create data
np.random.seed(1) #<--for reproducibility
length = 30
ts = pd.DataFrame(data=np.random.randint(low=0, high=15, size=length),
                  columns=['y'],
                  index=pd.date_range(start='2023-01-01', freq='MS', periods=length).strftime('%Y-%m'))
## plot
ts.plot(kind="bar", figsize=(10,3), legend=False, color="black").grid(axis='y')
Usually, time series datasets have a really simple structure, with the main variable as a column and the time as the index.
Before transforming it into a string, I want to make sure everything is placed under a column, so that we don't lose any information.
dtf = ts.reset_index().rename(columns={"index":"date"})
dtf.head()
Then, I shall change the data type from dataframe to dictionary.
data = dtf.to_dict(orient='records')
data[0:5]
Finally, from the dictionary to the string.
str_data = "\n".join([str(row) for row in data])
str_data
Now that we have a string, it can be embedded in a prompt that any language model is able to process. When you paste a dataset into a prompt, the LLM reads the data as plain text, but it can still understand the structure and meaning based on patterns seen during training.
prompt = f'''
Analyze this dataset, it contains monthly sales data of an online retail product:
{str_data}
'''
We can easily start a chat with the LLM. Please note that, right now, this is not an Agent, as it has no Tools; we're just using the plain language model. While it doesn't process numbers like a computer, the LLM can recognize column names, time-based patterns, trends, and outliers, especially with small datasets. It can simulate analysis and explain findings, but it won't perform precise calculations independently, as it isn't executing code like an Agent.
messages = [{"role":"system", "content":prompt}]
while True:
    ## User
    q = input('🙂 >')
    if q == "quit":
        break
    messages.append( {"role":"user", "content":q} )

    ## Model
    agent_res = ollama.chat(model=llm, messages=messages)
    res = agent_res["message"]["content"]

    ## Response
    print("👽 >", f"\x1b[1;30m{res}\x1b[0m")
    messages.append( {"role":"assistant", "content":res} )
The LLM recognizes numbers and understands the general context, the same way it might understand a recipe or a line of code.
As you can see, using LLMs to analyze time series is great for quick and conversational insights.
Agent
LLMs are good for brainstorming and lite exploration, while an Agent can run code. Therefore, it can handle more complex tasks like plotting, forecasting, and anomaly detection. So, let’s create the Tools.
Sometimes, it can be more effective to treat the “final answer” as a Tool. For example, if the Agent does multiple actions to generate intermediate results, the final answer can be thought of as the Tool that integrates all of this information into a cohesive response. By designing it this way, you have more customization and control over the results.
def final_answer(text:str) -> str:
    return text
tool_final_answer = {'type':'function', 'function':{
'name': 'final_answer',
'description': 'Returns a natural language response to the user',
'parameters': {'type': 'object',
               'required': ['text'],
               'properties': {
                    'text': {'type':'str', 'description':'natural language response'},
               }}}}
Then, the Tool to execute code.
import io
import contextlib
def code_exec(code:str) -> str:
    output = io.StringIO()
    with contextlib.redirect_stdout(output):
        try:
            exec(code)
        except Exception as e:
            print(f"Error: {e}")
    return output.getvalue()
tool_code_exec = {'type':'function', 'function':{
'name': 'code_exec',
'description': 'Execute python code. Use always the function print() to get the output.',
'parameters': {'type': 'object',
'required': ['code'],
'properties': {
'code': {'type':'str', 'description':'code to execute'},
}}}}
code_exec("from datetime import datetime; print(datetime.now().strftime('%H:%M'))")
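Because exec runs inside a try/except with stdout redirected, broken code comes back as an error string instead of crashing the Agent loop. A quick self-contained check (repeating the function definition above for convenience):

```python
import io
import contextlib

def code_exec(code: str) -> str:
    output = io.StringIO()
    with contextlib.redirect_stdout(output):
        try:
            exec(code)
        except Exception as e:
            # failures are captured and returned as text for the Agent to read
            print(f"Error: {e}")
    return output.getvalue()

print(code_exec("print(1/0)"))  # → Error: division by zero
```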
In addition, I will include a couple of utils functions for using the Tools and running the Agent.
dic_tools = {"final_answer":final_answer, "code_exec":code_exec}
# Utils
def use_tool(agent_res:dict, dic_tools:dict) -> dict:
    ## use tool
    if "tool_calls" in agent_res["message"].keys():
        for tool in agent_res["message"]["tool_calls"]:
            t_name, t_inputs = tool["function"]["name"], tool["function"]["arguments"]
            if f := dic_tools.get(t_name):
                ### calling tool
                print('🔧 >', f"\x1b[1;31m{t_name} -> Inputs: {t_inputs}\x1b[0m")
                ### tool output
                t_output = f(**tool["function"]["arguments"])
                print(t_output)
                ### final res
                res = t_output
            else:
                print('🤬 >', f"\x1b[1;31m{t_name} -> NotFound\x1b[0m")
    ## don't use tool
    if agent_res['message']['content'] != '':
        res = agent_res["message"]["content"]
        t_name, t_inputs = '', ''
    return {'res':res, 'tool_used':t_name, 'inputs_used':t_inputs}
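To see the dispatch logic in isolation, here is a minimal sketch with a mocked response dict; `mock_res` is a hypothetical stand-in mimicking the shape of what ollama.chat returns, not a real API call:

```python
# hypothetical response dict, shaped like ollama.chat's tool-call output
mock_res = {"message": {"content": "",
                        "tool_calls": [{"function": {"name": "final_answer",
                                                     "arguments": {"text": "done"}}}]}}

def final_answer(text: str) -> str:
    return text

dic_tools = {"final_answer": final_answer}

# same dispatch pattern as use_tool: look the tool up by name, unpack its arguments
res = None
for tool in mock_res["message"]["tool_calls"]:
    if f := dic_tools.get(tool["function"]["name"]):
        res = f(**tool["function"]["arguments"])
print(res)  # → done
```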
While the Agent is trying to solve the task, I want to keep track of the Tools that are used, the inputs they receive, and the results they produce. The iteration should stop only when the model is ready to give the final answer.
def run_agent(llm, messages, available_tools):
    tool_used, local_memory = '', ''
    while tool_used != 'final_answer':
        ### use tools
        try:
            agent_res = ollama.chat(model=llm, messages=messages,
                                    tools=[v for v in available_tools.values()])
            dic_res = use_tool(agent_res, dic_tools)
            res, tool_used, inputs_used = dic_res["res"], dic_res["tool_used"], dic_res["inputs_used"]
        ### error
        except Exception as e:
            print("⚠️ >", e)
            res = f"I tried to use {tool_used} but didn't work. I will try something else."
            print("👽 >", f"\x1b[1;30m{res}\x1b[0m")
            messages.append( {"role":"assistant", "content":res} )
        ### update memory
        if tool_used not in ['','final_answer']:
            local_memory += f"\nTool used: {tool_used}.\nInput used: {inputs_used}.\nOutput: {res}"
            messages.append( {"role":"assistant", "content":local_memory} )
            available_tools.pop(tool_used)
            if len(available_tools) == 1:
                messages.append( {"role":"user", "content":"now activate the tool final_answer."} )
        ### tools not used
        if tool_used == '':
            break
    return res
Regarding the coding Tool, I've noticed that Agents tend to recreate the dataframe at every action. So I will use a memory reinforcement to remind the model that the dataset already exists. This trick is commonly used to steer the model toward the desired behavior. Ultimately, memory reinforcement helps you get more consistent and reliable interactions.
# Start chat
messages = [{"role":"system", "content":prompt}]
memory = '''The dataset already exists and it's called 'dtf', don't create a new one.'''
while True:
    ## User
    q = input('🙂 >')
    if q == "quit":
        break
    messages.append( {"role":"user", "content":q} )

    ## Memory
    messages.append( {"role":"user", "content":memory} )

    ## Model
    available_tools = {"final_answer":tool_final_answer, "code_exec":tool_code_exec}
    res = run_agent(llm, messages, available_tools)

    ## Response
    print("👽 >", f"\x1b[1;30m{res}\x1b[0m")
    messages.append( {"role":"assistant", "content":res} )
Creating a plot is something that the LLM alone can’t do. But keep in mind that even if Agents can create images, they can’t see them, because after all, the engine is still a language model. So the user is the only one who visualises the plot.
The Agent is using the library statsmodels to train a model and forecast the time series.
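The exact statsmodels code is written by the Agent at runtime, so it varies between runs. As a simplified, hand-written stand-in, a naive moving-average forecast over the same fake series sketches the idea (this is not the model the Agent picks, just an illustration of extending the index into the future):

```python
import numpy as np
import pandas as pd

# same fake monthly series as before
np.random.seed(1)
ts = pd.Series(np.random.randint(low=0, high=15, size=30),
               index=pd.date_range(start='2023-01-01', freq='MS', periods=30))

# naive forecast: repeat the mean of the last 6 months for the next 3 months
last_mean = ts.tail(6).mean()
future_index = pd.date_range(start=ts.index[-1] + pd.offsets.MonthBegin(),
                             freq='MS', periods=3)
forecast = pd.Series(last_mean, index=future_index)
print(forecast)
```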
Large Dataframes
LLMs have limited memory, which restricts how much information they can process at once, even the most advanced models have token limits (a few hundred pages of text). Additionally, LLMs don’t retain memory across sessions unless a retrieval system is integrated. In practice, to effectively work with large dataframes, developers often use strategies like chunking, RAG, vector databases, and summarizing content before feeding it into the model.
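A minimal sketch of the chunking strategy mentioned above (the dataframe and chunk size are made up for illustration): split the data into prompt-sized pieces and summarize each one, so only short strings ever reach the model.

```python
import numpy as np
import pandas as pd

# hypothetical large dataframe, too big to paste into a single prompt
big = pd.DataFrame({"y": np.random.RandomState(0).randint(0, 100, 1000)})

# chunk it, then reduce each chunk to a one-line summary
chunk_size = 250
chunks = [big.iloc[i:i + chunk_size] for i in range(0, len(big), chunk_size)]
summaries = [f"rows {c.index[0]}-{c.index[-1]}: mean={c['y'].mean():.1f}" for c in chunks]
print(len(summaries))  # → 4
```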
Let’s create a big dataset to play with.
import random
import string
length = 1000
dtf = pd.DataFrame(data={
    'Id': [''.join(random.choices(string.ascii_letters, k=5)) for _ in range(length)],
    'Age': np.random.randint(low=18, high=80, size=length),
    'Score': np.random.uniform(low=50, high=100, size=length).round(1),
    'Status': np.random.choice(['Active','Inactive','Pending'], size=length)
})
dtf.tail()
I will add a web-search Tool, so that, with the ability to execute Python code and search the internet, the general-purpose AI gains access to all the available knowledge and can make data-driven decisions.
In Python, the easiest way to build a web-search Tool is with the famous private browser DuckDuckGo (pip install duckduckgo-search==6.3.5). You can use the original library directly or import the LangChain wrapper (pip install langchain-community==0.3.17).
from langchain_community.tools import DuckDuckGoSearchResults
def search_web(query:str) -> str:
return DuckDuckGoSearchResults(backend="news").run(query)
tool_search_web = {'type':'function', 'function':{
'name': 'search_web',
'description': 'Search the web',
'parameters': {'type': 'object',
'required': ['query'],
'properties': {
'query': {'type':'str', 'description':'the topic or subject to search on the web'},
}}}}
search_web(query="nvidia")
Overall, the agent now has 3 tools.
dic_tools = {'final_answer':final_answer,
'search_web':search_web,
'code_exec':code_exec}
Since I can't fit the complete dataframe into the prompt, I will feed only the first 10 rows, so that the LLM can understand the general context of the dataset. Additionally, I will specify where to find the full data.
str_data = "\n".join([str(row) for row in dtf.head(10).to_dict(orient='records')])
prompt = f'''
You are a Data Analyst, you will be given a task to solve as best you can.
You have access to the following tools:
- tool 'final_answer' to return a text response.
- tool 'code_exec' to execute Python code.
- tool 'search_web' to search for information on the internet.
If you use the 'code_exec' tool, remember to always use the function print() to get the output.
The dataset already exists and it's called 'dtf', don't create a new one.
This dataset contains the credit score of each customer of the bank. Here are the first rows:
{str_data}
'''
Finally, we can run the Agent.
messages = [{"role":"system", "content":prompt}]
memory = '''
The dataset already exists and it's called 'dtf', don't create a new one.
'''
while True:
    ## User
    q = input('🙂 >')
    if q == "quit":
        break
    messages.append( {"role":"user", "content":q} )

    ## Memory
    messages.append( {"role":"user", "content":memory} )

    ## Model
    available_tools = {"final_answer":tool_final_answer, "code_exec":tool_code_exec, "search_web":tool_search_web}
    res = run_agent(llm, messages, available_tools)

    ## Response
    print("👽 >", f"\x1b[1;30m{res}\x1b[0m")
    messages.append( {"role":"assistant", "content":res} )
In this interaction, the Agent used the coding Tool correctly. Now, I want it to use the other Tool as well.
Finally, I ask the Agent to combine all the pieces of information gathered so far in the conversation.
Conclusion
This article has been a tutorial to demonstrate how to build from scratch AI Agents for dataframes and time series. We covered both ways an Agent can interact with the data: with natural language, where the LLM interprets the string based on its knowledge, and by generating and executing code with Tools that query the dataset.
The full code for this article: GitHub
I hope you enjoyed it! Feel free to contact me with questions and feedback, or just to share your interesting projects.
👉 Let's connect 👈



