Machine Learning

From Data to Stories: Code Agents for KPI Narratives

We often need to investigate what is happening with KPIs: whether we're reacting to anomalies on our dashboards or just doing routine number-crunching. Based on my experience as a KPI analyst, I'd estimate that about 80% of such investigations could be handled by following a simple checklist.

Here is a high-level plan for investigating KPI changes (you can find more details in the article "Anomaly Root Cause Analysis 101"):

  • Estimate the top-line metric change to understand the magnitude of the shift.
  • Check data quality to make sure the numbers are accurate and trustworthy.
  • Gather context about internal and external events that might have influenced the change.
  • Slice and dice the metric to identify which segments contribute to the change.
  • Consolidate all findings in an executive summary that includes hypotheses and estimates of their impact on the main KPI.

With a clear plan of action, such tasks can be automated with AI agents. The code agents we discussed recently are a good fit, since their ability to write and execute code helps them analyse data efficiently, with minimal back-and-forth. So, let's try building such an agent using the HuggingFace smolagents framework.

While working on this task, we will discuss the more advanced features of smolagents:

  • Techniques for tweaking all kinds of prompts to ensure the desired behaviour.
  • Building a multi-agent system that can explain KPI changes and link them to root causes.
  • Adding reflection to the flow with planning steps.

MVP for explaining KPI changes

As usual, we will take an iterative approach and start with a simple MVP, focusing on the slicing and dicing step of the analysis. We will analyse the changes of a simple metric (revenue) split by one dimension (country). We will use the dataset from my previous article, "Making Sense of KPI Changes".

Let's load the data.

import pandas as pd

raw_df = pd.read_csv('absolute_metrics_example.csv', sep = '\t')
df = raw_df.groupby('country')[['revenue_before', 'revenue_after_scenario_2']].sum()\
  .sort_values('revenue_before', ascending = False).rename(
    columns = {'revenue_after_scenario_2': 'after', 
      'revenue_before': 'before'})
Image by author
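If you don't have the CSV from the previous article at hand, you can reconstruct the same aggregated dataframe directly from the values that appear in the agent's run log further down (a convenience sketch, not part of the original setup):

```python
import pandas as pd

# Reconstructed from the aggregated values shown in the agent's run log
df = pd.DataFrame(
    {'before': [632767.39, 481409.27, 240704.63, 160469.75, 120352.31, 96281.86],
     'after': [637000.48, 477033.02, 107857.06, 159778.76, 121331.46, 96064.77]},
    index=pd.Index(['other', 'UK', 'France', 'Germany', 'Italy', 'Spain'], name='country'),
)
print(df['before'].sum())  # total revenue before
```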

Next, let's set up the model. I've chosen OpenAI's GPT-4o-mini as my preferred option for simple tasks. However, the smolagents framework supports all kinds of models, so you can use your favourite one. Then, we just need to create an agent and give it the task and the data.

from smolagents import CodeAgent, LiteLLMModel

model = LiteLLMModel(model_id="openai/gpt-4o-mini", 
  api_key=config['OPENAI_API_KEY']) 

agent = CodeAgent(
    model=model, tools=[], max_steps=10,
    additional_authorized_imports=["pandas", "numpy", "matplotlib.*", 
      "plotly.*"], verbosity_level=1 
)

task = """
Here is a dataframe showing revenue by segment, comparing values 
before and after.
Could you please help me understand the changes? Specifically:
1. Estimate how the total revenue and the revenue for each segment 
have changed, both in absolute terms and as a percentage.
2. Calculate the contribution of each segment to the total 
change in revenue.

Please round all floating-point numbers in the output 
to two decimal places.
"""

agent.run(
    task,
    additional_args={"data": df},
)

The agent returned a good result. We got detailed statistics on the metric changes in each segment and their impact on the top-line KPI.

{'total_before': 1731985.21, 'total_after': 
1599065.55, 'total_change': -132919.66, 'segment_changes': 
{'absolute_change': {'other': 4233.09, 'UK': -4376.25, 'France': 
-132847.57, 'Germany': -690.99, 'Italy': 979.15, 'Spain': 
-217.09}, 'percentage_change': {'other': 0.67, 'UK': -0.91, 
'France': -55.19, 'Germany': -0.43, 'Italy': 0.81, 'Spain': 
-0.23}, 'contribution_to_change': {'other': -3.18, 'UK': 3.29, 
'France': 99.95, 'Germany': 0.52, 'Italy': -0.74, 'Spain': 0.16}}}
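As a sanity check, we can reproduce one of the agent's numbers by hand, for example France's contribution to the total change (a quick manual verification, not part of the agent's output):

```python
# France's absolute change divided by the total change, as a percentage
total_change = 1599065.55 - 1731985.21   # -132919.66
france_change = -132847.57
contribution = round(france_change / total_change * 100, 2)
print(contribution)
```

The result matches the 99.95% contribution reported by the agent.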

Let's look at the code generated by the agent. It's fine overall, but there's one potential issue: the LLM reconstructed the dataframe by hard-coding the values from the additional arguments instead of referencing the provided variable. This approach is not ideal (especially when working with large datasets), as it can lead to errors and higher token usage.

import pandas as pd                                                                                                        
 
# Creating the DataFrame from the provided data                 
data = {                                                        
    'before': [632767.39, 481409.27, 240704.63, 160469.75,      
120352.31, 96281.86],                                           
    'after': [637000.48, 477033.02, 107857.06, 159778.76,       
121331.46, 96064.77]                                            
}                                                               
index = ['other', 'UK', 'France', 'Germany', 'Italy', 'Spain']  
df = pd.DataFrame(data, index=index)                            
                                                                
# Calculating total revenue before and after                    
total_before = df['before'].sum()                               
total_after = df['after'].sum()                                 
                                                                
# Calculating absolute and percentage change for each segment   
df['absolute_change'] = df['after'] - df['before']              
df['percentage_change'] = (df['absolute_change'] /              
df['before']) * 100                                             
                                                                
# Calculating total revenue change                              
total_change = total_after - total_before                       
                                                                
# Calculating contribution of each segment to the total change  
df['contribution_to_change'] = (df['absolute_change'] /         
total_change) * 100                                             
                                                                
# Rounding results                                              
df = df.round(2)                                                
                                                                
# Printing the calculated results                               
print("Total revenue before:", total_before)                    
print("Total revenue after:", total_after)                      
print("Total change in revenue:", total_change)                 
print(df)

It's worth addressing this problem before moving forward and building a more complex system.

Tweaking Prompts

Since the LLM simply follows the instructions it is given, we will tackle this issue by tweaking the prompts.

At first, I tried to make the task as explicit as possible, instructing the LLM to use the provided variable.

task = """Here is a dataframe showing revenue by segment, comparing 
values before and after. The data is stored in df variable. 
Please, use it and don't try to parse the data yourself. 

Could you please help me understand the changes?
Specifically:
1. Estimate how the total revenue and the revenue for each segment 
have changed, both in absolute terms and as a percentage.
2. Calculate the contribution of each segment to the total change in revenue.

Please round all floating-point numbers in the output to two decimal places.
"""

It didn't work. So, the next step was to examine the system prompt and understand why the agent behaves this way.

print(agent.prompt_templates['system_prompt'])

#... 
# Here are the rules you should always follow to solve your task:
# 1. Always provide a 'Thought:' sequence, and a 'Code:\n```py' sequence ending with '```' sequence, else you will fail.
# 2. Use only variables that you have defined.
# 3. Always use the right arguments for the tools. DO NOT pass the arguments as a dict as in 'answer = wiki({'query': "What is the place where James Bond lives?"})', but use the arguments directly as in 'answer = wiki(query="What is the place where James Bond lives?")'.
# 4. Take care to not chain too many sequential tool calls in the same code block, especially when the output format is unpredictable. For instance, a call to search has an unpredictable return format, so do not have another tool call that depends on its output in the same block: rather output results with print() to use them in the next block.
# 5. Call a tool only when needed, and never re-do a tool call that you previously did with the exact same parameters.
# 6. Don't name any new variable with the same name as a tool: for instance don't name a variable 'final_answer'.
# 7. Never create any notional variables in our code, as having these in your logs will derail you from the true variables.
# 8. You can use imports in your code, but only from the following list of modules: ['collections', 'datetime', 'itertools', 'math', 'numpy', 'pandas', 'queue', 'random', 're', 'stat', 'statistics', 'time', 'unicodedata']
# 9. The state persists between code executions: so if in one step you've created variables or imported modules, these will all persist.
# 10. Don't give up! You're in charge of solving the task, not providing directions to solve it.
# Now Begin!

At the end of the prompt, we have the instruction "# 2. Use only variables that you have defined!". This can be interpreted as a strict rule against using any other variables. So, I changed it to "# 2. Use only variables that you have defined or ones provided in additional arguments! Never try to copy and parse additional arguments."

modified_system_prompt = agent.prompt_templates['system_prompt'].replace(
    '2. Use only variables that you have defined!', 
    '2. Use only variables that you have defined or ones provided in additional arguments! Never try to copy and parse additional arguments.'
)
agent.prompt_templates['system_prompt'] = modified_system_prompt

This change alone did not help. Next, I checked the task message.

╭─────────────────────────── New run ────────────────────────────╮
│                                                                │
│ Here is a pandas dataframe showing revenue by segment,         │
│ comparing values before and after.                             │
│ Could you please help me understand the changes?               │
│ Specifically:                                                  │
│ 1. Estimate how the total revenue and the revenue for each     │
│ segment have changed, both in absolute terms and as a          │
│ percentage.                                                    │
│ 2. Calculate the contribution of each segment to the total     │
│ change in revenue.                                             │
│                                                                │
│ Please round all floating-point numbers in the output to two   │
│ decimal places.                                                │
│                                                                │
│ You have been provided with these additional arguments, that   │
│ you can access using the keys as variables in your python      │
│ code:                                                          │
│ {'df':             before      after                           │
│ country                                                        │
│ other    632767.39  637000.48                                  │
│ UK       481409.27  477033.02                                  │
│ France   240704.63  107857.06                                  │
│ Germany  160469.75  159778.76                                  │
│ Italy    120352.31  121331.46                                  │
│ Spain     96281.86   96064.77}.                                │
│                                                                │
╰─ LiteLLMModel - openai/gpt-4o-mini ────────────────────────────╯

It contains an instruction related to the use of the additional arguments: "You have been provided with these additional arguments, that you can access using the keys as variables in your python code". We can try to make this instruction more explicit. Unfortunately, this part of the prompt is not exposed as a parameter, so I had to find it in the source code. To locate a Python package's path, we can use the following code.

import smolagents 
print(smolagents.__path__)

Then, I found the agents.py file and changed this string to include a more direct instruction.

self.task += f"""
You have been provided with these additional arguments available as variables 
with names {",".join(additional_args.keys())}. You can access them directly. 
Here is what they contain (just for informational purposes): 
{str(additional_args)}."""

It was a bit of a hack, but that's sometimes what happens with LLM frameworks. Don't forget to reload the package afterwards, and we're good to go. Let's check whether it works now.
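For the reload itself, importlib can do the trick (a sketch; note that reloading a package doesn't always refresh its submodules, so restarting the interpreter is the safer option):

```python
import importlib

# After editing smolagents/agents.py, reload the package so the change takes effect:
#   import smolagents
#   importlib.reload(smolagents)
# importlib.reload refreshes a module in place and returns the same module object;
# demonstrated here with a stdlib module:
import json
refreshed = importlib.reload(json)
print(refreshed is json)  # True
```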

task = """
Here is a pandas dataframe showing revenue by segment, comparing values 
before and after. 

Your task will be to understand the changes to the revenue (after vs before) 
in different segments and provide an executive summary.
Please, follow the following steps:
1. Estimate how the total revenue and the revenue for each segment 
have changed, both in absolute terms and as a percentage.
2. Calculate the contribution of each segment to the total change 
in revenue.

Round all floating-point numbers in the output to two decimal places. 
"""
agent.logger.level = 1 # Lower verbosity level
agent.run(
    task,
    additional_args={"df": df},
)

Hooray! The problem is fixed. The agent no longer copies and parses the data, and references the df variable directly instead. Here is the newly generated code.

import pandas as pd                                             
                                                                  
# Calculate total revenue before and after                      
total_before = df['before'].sum()                               
total_after = df['after'].sum()                                 
total_change = total_after - total_before                       
percentage_change_total = (total_change / total_before * 100) if total_before != 0 else 0
                                                                
# Round values                                                  
total_before = round(total_before, 2)                           
total_after = round(total_after, 2)                             
total_change = round(total_change, 2)                           
percentage_change_total = round(percentage_change_total, 2)     
                                                                
# Display results                                               
print(f"Total Revenue Before: {total_before}")                  
print(f"Total Revenue After: {total_after}")                    
print(f"Total Change: {total_change}")                          
print(f"Percentage Change: {percentage_change_total}%")

Now, we are ready to move on and build a real agent that will solve our task.

AI agent for KPI analysis

Finally, it's time to work on the AI agent that will help us explain KPI changes and build an executive summary.

Our agent will follow this plan for the root-cause analysis:

  • Estimate the top-line KPI change.
  • Slice and dice the metric to understand which segments drive the change.
  • Look up events in the change log to see whether they can explain the metric changes.
  • Consolidate all findings in an executive summary.

After lots of experimentation and several tweaks, I arrived at promising results. Here are the main adjustments I made (we will discuss them in detail later):

  • Switching to a multi-agent setup by adding another team member – the change log agent, which can access the change log and help explain KPI changes.
  • Trying more powerful models, such as gpt-4o and gpt-4.1-mini, since gpt-4o-mini wasn't capable enough. Using more powerful models not only improved the results, but also reduced the number of steps: with gpt-4.1-mini, I got the final result after six steps, compared to 14–16 steps with gpt-4o-mini. This suggests that investing in more expensive models can pay off for agentic workflows.
  • Providing the agent with a complex tool for analysing KPI changes. The tool performs all the calculations, so the LLM can't mess up the numbers. I discussed the approach to analysing KPI changes in detail in my previous article.
  • Reworking the prompt into a clear step-by-step guide to help the agent stay on track.
  • Adding planning steps that encourage the LLM agent to think first and revisit the plan every three iterations.
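The actual calculate_simple_growth_metrics tool isn't listed in this article, but based on the output fields described in the prompt (difference, difference_rate, impact, segment_share_before, impact_norm), a minimal sketch could look like this — treat the exact formulas as my reconstruction, not the author's implementation:

```python
import pandas as pd

def calculate_simple_growth_metrics(df: pd.DataFrame) -> pd.DataFrame:
    """Sketch: per-segment change statistics for a df with 'before'/'after' columns."""
    stats = df[['before', 'after']].copy()
    total_before = stats['before'].sum()
    total_diff = stats['after'].sum() - total_before
    # absolute change of each segment
    stats['difference'] = stats['after'] - stats['before']
    # relative change of each segment, in %
    stats['difference_rate'] = (100 * stats['difference'] / stats['before']).round(2)
    # share of the total KPI change explained by the segment, in %
    stats['impact'] = (100 * stats['difference'] / total_diff).round(2)
    # segment's share of the KPI before the change, in %
    stats['segment_share_before'] = (100 * stats['before'] / total_before).round(2)
    # impact normalised by segment size; |impact_norm| well above 1 flags outsized impact
    stats['impact_norm'] = (stats['impact'] / stats['segment_share_before']).round(2)
    return stats
```

For instance, a segment holding 10% of revenue that explains 20% of the total change would get impact_norm = 2, flagging it as disproportionately affected.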

After all the improvements, I got the following summary from the agent, and it looks pretty good.

Executive Summary:
Between April 2025 and May 2025, total revenue declined sharply by
approximately 36.03%, falling from 1,731,985.21 to 1,107,924.43, a
drop of -624,060.78 in absolute terms.
This decline was primarily driven by significant revenue 
reductions in the 'new' customer segments across multiple 
countries, with declines of approximately 70% in these segments.

The most impacted segments include:
- other_new: before=233,958.42, after=72,666.89, 
abs_change=-161,291.53, rel_change=-68.94%, share_before=13.51%, 
impact=25.85, impact_norm=1.91
- UK_new: before=128,324.22, after=34,838.87, 
abs_change=-93,485.35, rel_change=-72.85%, share_before=7.41%, 
impact=14.98, impact_norm=2.02
- France_new: before=57,901.91, after=17,443.06, 
abs_change=-40,458.85, rel_change=-69.87%, share_before=3.34%, 
impact=6.48, impact_norm=1.94
- Germany_new: before=48,105.83, after=13,678.94, 
abs_change=-34,426.89, rel_change=-71.56%, share_before=2.78%, 
impact=5.52, impact_norm=1.99
- Italy_new: before=36,941.57, after=11,615.29, 
abs_change=-25,326.28, rel_change=-68.56%, share_before=2.13%, 
impact=4.06, impact_norm=1.91
- Spain_new: before=32,394.10, after=7,758.90, 
abs_change=-24,635.20, rel_change=-76.05%, share_before=1.87%, 
impact=3.95, impact_norm=2.11

Based on analysis from the change log, the main causes for this 
trend are:
1. The introduction of new onboarding controls implemented on May 
8, 2025, which reduced new customer acquisition by about 70% to 
prevent fraud.
2. A postal service strike in the UK starting April 5, 2025, 
causing order delivery delays and increased cancellations 
impacting the UK new segment.
3. An increase in VAT by 2% in Spain as of April 22, 2025, 
affecting new customer pricing and causing higher cart 
abandonment.

These factors combined explain the outsized negative impacts 
observed in new customer segments and the overall revenue decline.

The LLM agent also produced a bunch of charts (charting was part of our tool). For example, this one shows the impact of the combined country and maturity segments.

Image by author

The results look really exciting. Now, let's dive deeper into how it actually works under the hood.

Multi-agent setup

We will start with the change log agent. This agent will query the change log and try to identify the most probable root causes of the metric changes. Since it doesn't need to perform complex operations, we can implement it as a ToolCallingAgent. Because this agent will be called by another agent, we need to define its name and description attributes.

from smolagents import tool, ToolCallingAgent

@tool 
def get_change_log(month: str) -> str: 
    """
    Returns the change log (list of internal and external events that might have affected our KPIs) for the given month 

    Args:
        month: month in the format %Y-%m-01, for example, 2025-04-01
    """
    return str(events_df[events_df.month == month].drop('month', axis = 1).to_dict('records'))

model = LiteLLMModel(model_id="openai/gpt-4.1-mini", api_key=config['OPENAI_API_KEY'])
change_log_agent = ToolCallingAgent(
    tools=[get_change_log],
    model=model,
    max_steps=10,
    name="change_log_agent",
    description="Helps you find the relevant information in the change log that can explain changes on metrics. Provide the agent with all the context to receive info",
)
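The get_change_log tool above assumes an events_df with the change log is already loaded. A tiny synthetic version (the column names here are my assumption), based on the events mentioned in the executive summary, could look like this:

```python
import pandas as pd

# Hypothetical change log matching the events from the executive summary
events_df = pd.DataFrame([
    {'month': '2025-05-01', 'event': 'New onboarding controls',
     'description': 'Controls introduced on May 8, 2025 reduced new customer acquisition by ~70% to prevent fraud'},
    {'month': '2025-04-01', 'event': 'Postal service strike in the UK',
     'description': 'Strike starting April 5, 2025 caused delivery delays and increased cancellations'},
    {'month': '2025-04-01', 'event': 'VAT increase in Spain',
     'description': '2% VAT increase as of April 22, 2025 led to higher cart abandonment for new customers'},
])
print(events_df[events_df.month == '2025-04-01'].drop('month', axis=1).to_dict('records'))
```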

Since the manager agent will be calling this agent, we can't control the exact query it receives. Therefore, I decided to modify its system prompt to include additional context.

change_log_system_prompt = '''
You're a master of the change log and you help others to explain 
the changes to metrics. When you receive a request, look up the list of events 
happened by month, then filter the relevant information based 
on provided context and return back. Prioritise the most probable factors 
affecting the KPI and limit your answer only to them.
'''

modified_system_prompt = (change_log_agent.prompt_templates['system_prompt'] 
  + '\n\n\n' + change_log_system_prompt)

change_log_agent.prompt_templates['system_prompt'] = modified_system_prompt

To enable the main agent to delegate tasks to the change log agent, we just need to specify it in the managed_agents field.

agent = CodeAgent(
    model=model,
    tools=[calculate_simple_growth_metrics],
    max_steps=20,
    additional_authorized_imports=["pandas", "numpy", "matplotlib.*", "plotly.*"],
    verbosity_level = 2, 
    planning_interval = 3,
    managed_agents = [change_log_agent]
)

Let's look at how it works. First, we can examine the new system prompt of the main agent. It now includes information about team members and instructions on how to ask them for help.

You can also give tasks to team members.
Calling a team member works the same as for calling a tool: simply, 
the only argument you can give in the call is 'task'.
Given that this team member is a real human, you should be very verbose 
in your task, it should be a long string providing informations 
as detailed as necessary.
Here is a list of the team members that you can call:
```python
def change_log_agent("Your query goes here.") -> str:
    """Helps you find the relevant information in the change log that 
    can explain changes on metrics. Provide the agent with all the context 
    to receive info"""
```

The execution log shows that the main agent successfully delegated the task to the second agent and received the following answer.

<-- Primary agent calling the change log agent -->

─ Executing parsed code: ─────────────────────────────────────── 
  # Query change_log_agent with the detailed task description     
  prepared                                                        
  context_for_change_log = (                                      
      "We analyzed changes in revenue from April 2025 to May      
  2025. We found large decreases "                                
      "mainly in the 'new' maturity segments across countries:    
  Spain_new, UK_new, Germany_new, France_new, Italy_new, and      
  other_new. "                                                    
      "The revenue fell by around 70% in these segments, which    
  have outsized negative impact on total revenue change. "        
      "We want to know the 1-3 most probable causes for this      
  significant drop in revenue in the 'new' customer segments      
  during this period."                                            
  )                                                               
                                                                  
  explanation = change_log_agent(task=context_for_change_log)     
  print("Change log agent explanation:")                          
  print(explanation)                                              
 ──────────────────────────────────────────────────────────────── 

<-- Change log agent execution start -->
╭──────────────────── New run - change_log_agent ─────────────────────╮
│                                                                     │
│ You're a helpful agent named 'change_log_agent'.                    │
│ You have been submitted this task by your manager.                  │
│ ---                                                                 │
│ Task:                                                               │
│ We analyzed changes in revenue from April 2025 to May 2025.         │
│ We found large decreases mainly in the 'new' maturity segments      │
│ across countries: Spain_new, UK_new, Germany_new, France_new,       │
│ Italy_new, and other_new. The revenue fell by around 70% in these   │
│ segments, which have outsized negative impact on total revenue      │
│ change. We want to know the 1-3 most probable causes for this       │
│ significant drop in revenue in the 'new' customer segments during   │
│ this period.                                                        │
│ ---                                                                 │
│ You're helping your manager solve a wider task: so make sure to     │
│ not provide a one-line answer, but give as much information as      │
│ possible to give them a clear understanding of the answer.          │
│                                                                     │
│ Your final_answer WILL HAVE to contain these parts:                 │
│ ### 1. Task outcome (short version):                                │
│ ### 2. Task outcome (extremely detailed version):                   │
│ ### 3. Additional context (if relevant):                            │
│                                                                     │
│ Put all these in your final_answer tool, everything that you do     │
│ not pass as an argument to final_answer will be lost.               │
│ And even if your task resolution is not successful, please return   │
│ as much context as possible, so that your manager can act upon      │
│ this feedback.                                                      │
│                                                                     │
╰─ LiteLLMModel - openai/gpt-4.1-mini ────────────────────────────────╯

With the smolagents framework, we can easily set up a simple multi-agent system, where a manager agent coordinates and delegates tasks to team members with specific skills.

Iterating on the prompt

I started with a high-level prompt describing the goal with little guidance, but unfortunately, it didn't work well. LLMs are not yet smart enough to figure out the approach on their own. So, I wrote a detailed step-by-step prompt that includes guidance on how to interpret the tool's output.

task = """
Here is a pandas dataframe showing the revenue by segment, comparing values 
before (April 2025) and after (May 2025). 

You're a senior and experienced data analyst. Your task will be to understand 
the changes to the revenue (after vs before) in different segments 
and provide executive summary.

## Follow the plan:
1. Start by identifying the list of dimensions (columns in dataframe that 
are not "before" and "after")
2. There might be multiple dimensions in the dataframe. Start high-level 
by looking at each dimension in isolation, combine all results 
together into the list of segments analysed (don't forget to save 
the dimension used for each segment). 
Use the provided tools to analyse the changes of metrics: {tools_description}. 
3. Analyse the results from previous step and keep only segments 
that have outsized impact on the KPI change (absolute of impact_norm 
is above 1.25). 
4. Check what dimensions are present in the list of significant segment, 
if there are multiple ones - execute the tool on their combinations 
and add to the analysed segments. If after adding an additional dimension, 
all subsegments show close difference_rate and impact_norm values, 
then we can exclude this split (even though impact_norm is above 1.25), 
since it doesn't explain anything. 
5. Summarise the significant changes you identified. 
6. Try to explain what is going on with metrics by getting info 
from the change_log_agent. Please, provide the agent the full context 
(what segments have outsized impact, what is the relative change and 
what is the period we're looking at). 
Summarise the information from the changelog and mention 
only 1-3 the most probable causes of the KPI change 
(starting from the most impactful one).
7. Put together 3-5 sentences commentary what happened high-level 
and why (based on the info received from the change log). 
Then follow it up with more detailed summary: 
- Top-line total value of metric before and after in human-readable format, 
absolute and relative change 
- List of segments that meaningfully influenced the metric positively 
or negatively with the following numbers: values before and after, 
absolute and relative change, share of segment before, impact 
and normed impact. Order the segments by absolute value 
of absolute change since it represents the power of impact. 

## Instruction on the calculate_simple_growth_metrics tool:
By default, you should use the tool for the whole dataset not the segment, 
since it will give you the full information about the changes.

Here is the guidance how to interpret the output of the tool
- difference - the absolute difference between after and before values
- difference_rate - the relative difference (if it's close for 
  all segments then the dimension is not informative)
- impact - the share of KPI difference explained by this segment 
- segment_share_before - share of segment before
- impact_norm - impact normed on the share of segments, we're interested 
  in very high or very low numbers since they show outsized impact, 
  rule of thumb - impact_norm between -1.25 and 1.25 is not-informative 

If you're using the tool on the subset of dataframe keep in mind, 
that the results won't be applicable to the full dataset, so avoid using it 
unless you want to explicitly look at subset (i.e. change in France). 
If you decided to use the tool on a particular segment 
and share these results in the executive summary, explicitly outline 
that we're diving deeper into a particular segment.
""".format(tools_description = tools_description)
agent.run(
    task,
    additional_args={"df": df},
)

Spelling everything out in such detail was quite a tedious job, but it's necessary if we want consistent results.

Planning steps

The smolagents framework allows you to add planning steps to your agentic flows. This encourages the agent to start with a plan and to update it after the specified number of steps. From my experience, planning is most helpful for keeping the agent focused on the problem and keeping its actions aligned with the original plan and goal. I would definitely recommend it for situations where complex reasoning is needed.

The setup is as easy as specifying planning_interval = 3 for the code agent.

agent = CodeAgent(
    model=model,
    tools=[calculate_simple_growth_metrics],
    max_steps=20,
    additional_authorized_imports=["pandas", "numpy", "matplotlib.*", "plotly.*"],
    verbosity_level = 2, 
    planning_interval = 3,
    managed_agents = [change_log_agent]
)

That's it. Then, the agent produces an initial plan at the very first step.

────────────────────────── Initial plan ──────────────────────────
Here are the facts I know and the plan of action that I will 
follow to solve the task:
```
## 1. Facts survey

### 1.1. Facts given in the task
- We have a pandas dataframe `df` showing revenue by segment, for 
two time points: before (April 2025) and after (May 2025).
- The dataframe columns include:
  - Dimensions: `country`, `maturity`, `country_maturity`, 
`country_maturity_combined`
  - Metrics: `before` (revenue in April 2025), `after` (revenue in
May 2025)
- The task is to understand the changes in revenue (after vs 
before) across different segments.
- Key instructions and tools provided:
  - Identify all dimensions except before/after for segmentation.
  - Analyze each dimension independently using 
`calculate_simple_growth_metrics`.
  - Filter segments with outsized impact on KPI change (absolute 
normed impact > 1.25).
  - Examine combinations of dimensions if multiple dimensions have
significant segments.
  - Summarize significant changes and engage `change_log_agent` 
for contextual causes.
  - Provide a final executive summary including top-line changes 
and segment-level detailed impacts.
- Dataset snippet shows segments combining countries (`France`, 
`UK`, `Germany`, `Italy`, `Spain`, `other`) and maturity status 
(`new`, `existing`).
- The combined segments are uniquely identified in columns 
`country_maturity` and `country_maturity_combined`.

### 1.2. Facts to look up
- Definitions or descriptions of the segments if unclear (e.g., 
what defines `new` vs `existing` maturity).
  - Likely not mandatory to proceed, but could be requested from 
business documentation or change log.
- More details on the change log (accessible via 
`change_log_agent`) that could provide probable causes for revenue
changes.
- Confirmation on handling combined dimension splits - how exactly
`country_maturity_combined` is formed and should be interpreted in
combined dimension analysis.
- Data dictionary or description of metrics if any additional KPI 
besides revenue is relevant (unlikely given data).
- Dates confirm period of analysis: April 2025 (before) and May 
2025 (after). No need to look these up since given.

### 1.3. Facts to derive
- Identify all dimension columns available for segmentation:
  - By excluding 'before' and 'after', likely candidates are 
`country`, `maturity`, `country_maturity`, and 
`country_maturity_combined`.
- For each dimension, calculate change metrics using the given 
tool:
  - Absolute and relative difference in revenue per segment.
  - Impact, segment share before, and normed impact for each 
segment.
- Identify which segments have outsized impact on KPI change 
(|impact_norm| > 1.25).
- If multiple dimensions have significant segments, combine 
dimensions (e.g., country + maturity) and reanalyze.
- Determine if combined dimension splits provide meaningful 
differentiation or not, based on delta rate and impact_norm 
consistency.
- Summarize direction and magnitude of KPI changes at top-line 
level (aggregate revenue before and after).
- Identify top segments driving positive and negative changes 
based on ordered absolute absolute_change.
- Gather contextual insights from the change log agent regarding 
probable causes tied to significant segments and the May 2025 vs 
April 2025 period.

## 2. Plan

1. Identify all dimension columns present in the dataframe by 
listing columns and excluding 'before' and 'after'.
2. For each dimension identified (`country`, `maturity`, 
`country_maturity`, `country_maturity_combined`):
   - Use `calculate_simple_growth_metrics` on the full dataframe 
grouped by that dimension.
   - Extract segments with calculated metrics including 
impact_norm.
3. Aggregate results from all single-dimension analyses and filter
segments where |impact_norm| > 1.25.
4. Determine which dimensions these significant segments belong 
to.
5. If more than one dimension is represented in these significant 
segments, analyze the combined dimension formed by those 
dimensions (for example, combination of `country` and `maturity` 
or use existing combined dimension columns).
6. Repeat metric calculation using 
`calculate_simple_growth_metrics` on the combined dimension.
7. Examine if the combined dimension splits create meaningful 
differentiation - if all subsegments show close difference_rate 
and impact_norm, exclude the split.
8. Prepare a summary of significant changes:
   - Top-line KPIs before and after (absolute and relative 
changes).
   - List of impactful segments sorted by absolute absolute_change
that influenced overall revenue.
9. Provide the list of segments with details (values before, 
after, absolute and relative change, share before, impact, 
impact_norm).
10. Using this summarized information, query `change_log_agent` 
with full context:
    - Include significant segments, their relative changes, and 
periods (April to May 2025).
11. Process the agent's response to identify 1-3 main probable 
causes of the KPI changes.
12. Draft executive summary commentary:
    - High-level overview of what happened and why, based on log 
info.
    - Detailed summary including top-line changes and 
segment-level metrics impact.
13. Deliver the final answer using `final_answer` tool containing 
the above executive summary and data-driven insights.

```

Then, after three steps, the agent revisits and updates the plan.

────────────────────────── Updated plan ──────────────────────────
I still need to solve the task I was given:
```

Here is a pandas dataframe showing the revenue by segment, 
comparing values before (April 2025) and after (May 2025). 

You're a senior and experienced data analyst. Your task will be 
understand the changes to the revenue (after vs before) in 
different segments 
and provide executive summary.

<... repeating the full initial task ...>
```

Here are the facts I know and my new/updated plan of action to 
solve the task:
```
## 1. Updated facts survey

### 1.1. Facts given in the task
- We have a pandas dataframe with revenue by segment, showing 
values "before" (April 2025) and "after" (May 2025).
- Columns in the dataframe include multiple dimensions and the 
"before" and "after" revenue values.
- The goal is to understand revenue changes by segment and provide
an executive summary.
- Guidance and rules about how to analyze and interpret results 
from the `calculate_simple_growth_metrics` tool are provided.
- The dataframe contains columns: country, maturity, 
country_maturity, country_maturity_combined, before, after.

### 1.2. Facts that we have learned
- The dimensions to analyze are: country, maturity, 
country_maturity, and country_maturity_combined.
- Analyzed revenue changes by dimension.
- Only the "new" maturity segment has significant impact 
(impact_norm=1.96 > 1.25), with a large negative revenue change (~
-70.6%).
- In the combined segment "country_maturity," the "new" segments 
across countries (Spain_new, UK_new, Germany_new, France_new, 
Italy_new, other_new) all have outsized negative impacts with 
impact_norm values all above 1.9.
- The mature/existing segments in those countries have smaller 
normed impacts below 1.25.
- Country-level and maturity-level segment dimension alone are 
less revealing than the combined country+maturity segment 
dimension which highlights the new segments as strongly impactful.
- Total revenue dropped substantially from before to after, mostly
driven by new segments shrinking drastically.

### 1.3. Facts still to look up
- Whether splitting the data by additional dimensions beyond 
country and maturity (e.g., country_maturity_combined) explains 
further heterogeneous impacts or if the pattern is uniform.
- Explanation/context from change log about what caused the major 
drop predominantly in new segments in all countries.
- Confirming whether any country within the new segment behaved 
differently or mitigated losses.

### 1.4. Facts still to derive
- A concise executive summary describing the top-level revenue 
change and identifying which segments explain the declines.
- Explanation involving the change log agent with summary of 
probable reasons for these outsized reductions in revenue in the 
new segments across countries for April-May 2025.

## 2. Plan

### 2.1. Verify if adding the additional dimension 
'country_maturity_combined' splits the impactful "new" segments 
into subsegments with significantly different impacts or if the 
change rates and normed impacts are relatively homogeneous. If 
homogeneous, we do not gain deeper insight and should disregard 
further splitting.

### 2.2. Summarize all significant segments identified with 
outsized impact_norm ≥ 1.25, including their before and after 
values, absolute and relative changes, segment shares before, 
impact, and normalized impact, ordered by absolute value of the 
change.

### 2.3. Query the change_log_agent with the full context: 
significant segments are the new country_maturity segments with 
large negative changes (~ -70%), timeframe April 2025 to May 2025,
and request top 1-3 most probable causes for the KPI revenue drop 
in these segments.

### 2.4. Based on the change log agent's response, synthesize a 
3-5 sentence high-level commentary explaining what happened 
broadly and why.

### 2.5. Draft a detailed executive summary including:
- Total revenue before and after in human-readable format with 
absolute and relative change.
- A list of significant segments driving these changes, in order 
by absolute impact, with detailed numbers (before, after, absolute
and relative change, segment share before, impact, normed impact).

### 2.6. Use the `final_answer` tool to produce the finalized 
executive summary report.

```

I really like how the planning feature makes the agent revisit its initial work and stay focused on the big picture. We could all use this habit in real life, as teams often get caught up in processes and lose sight of why they are doing something in the first place. It's nice to see good management practices built into agentic frameworks.

That's all! We have created a code agent that can analyze simple metric changes, and along the way we have covered all the key nuances of the process.
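As a quick illustration of the core filtering rule the agent applied in its plan (keep only segments with |impact_norm| above the 1.25 threshold), here is a minimal pandas snippet; the data values are made up for illustration:

```python
import pandas as pd

# Toy per-segment metrics, mimicking the output described in the agent logs
metrics = pd.DataFrame({
    'segment': ['France_new', 'France_existing', 'UK_new', 'UK_existing'],
    'impact_norm': [1.96, 0.40, 2.10, -0.15],
})

# Segments with outsized influence on the overall KPI change
significant = metrics[metrics['impact_norm'].abs() > 1.25]
print(significant['segment'].tolist())  # ['France_new', 'UK_new']
```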

You can find the full code and the complete execution logs here.

Summary

We've learned a lot about code agents and are now ready to draw conclusions. For our experiments, we used the HuggingFace smolagents framework for code agents, and it proved to be a very handy tool:

  • Easy integration with different LLMs (from local models via Ollama to commercial providers such as Anthropic or OpenAI),
  • Excellent logging that makes it easy to follow the agent's whole thought process and debug issues,
  • The ability to build complex systems, such as multi-agent setups or flows with planning, without much effort.

While smolagents is currently my favourite agentic framework, it has its limitations:

  • It can lack flexibility at times. For example, I had to modify the source code directly to get the behaviour I wanted.
  • It supports only a hierarchical multi-agent set-up.
  • There is no out-of-the-box support for long-term memory, which means every run starts from scratch.

Thank you so much for reading this article. I hope it was insightful to you.

Reference

This article is inspired by the “Building Code Agents with Hugging Face smolagents” short course by DeepLearning.AI.
