Product health score: How I reduced critical incidents by 35% with integrated monitoring and N8N automation

For SaaS (software as a service) companies, monitoring and managing product data is essential. Those who fail to understand this only see an incident once the damage has already been done. For a company that is already struggling, this can be fatal.
To prevent this, I built an N8N workflow connected to the company's database that analyzes the data every day and flags incidents as they appear. When one is detected, it is logged and a notification is sent so the investigation can start as soon as possible. I also built a dashboard so the team could see the results in real time.
Context
The company is a B2B SaaS platform for data visualization and automated reporting. It serves approximately 4,500 customers across three segments:
- Small Business
- Mid-market
- Enterprise
Weekly product usage exceeds 30,000 active accounts, with strong reliance on real-time data (pipelines, APIs, dashboards, background jobs).
The product organization spans several teams:
- Growth (acquisition, activation, onboarding)
- Revenue (pricing, ARPU, churn)
- SRE / infrastructure (reliability, availability)
- Data engineering (pipelines, data mining)
- Customer support and success

Last year, the company saw a rising number of incidents. Between October and December, the number of incidents rose from 250 to 450, an 80% increase. During this period, there were more than 45 high and critical incidents that affected thousands of users. The most affected metrics were:
- api_error_rate
- checkout_success_rate
- net_mrr_delta
- data_freshness_lag_minutes
- churn_rate

When incidents occur, a company is judged by its customers on how it handles and responds to them. The product team, in turn, is judged on how it resolved the incident and made sure it never happens again.
Having an incident once is forgivable, but having the same incident twice is not.
Business impact
- More volatility in recurring revenue
- Visible decrease in active accounts over several consecutive weeks
- Many Enterprise customers complained about outdated dashboards (up to 45+ minutes late)

In total, between 30,000 and 60,000 users were affected. Customer confidence in product reliability also suffered. Among customers who did not renew, 45% cited it as their main reason.
Why is this problem so important?
As a data platform, the company cannot afford:
- Slow or stale data
- API errors
- Pipeline failures
- Missed or delayed syncs
- Incorrect dashboards
- Churn (downgrades, cancellations)

Internally, incidents were spread across several systems:
- Notion for product tracking
- Slack for alerts
- PostgreSQL for storage
- Even Google Sheets for customer support

There was no single source of truth. The product team had to cross-check every detail by hand, looking for trends and correlations. Each investigation was like solving a puzzle, and it cost them many hours every week.
The solution: rebuilding the incident system from the ground up with N8N and creating a data dashboard, so that incidents are detected, tracked, resolved, and understood.
Why N8N?
There are several automation platforms and solutions available today, but not everyone has the same needs. Choosing the one that matches your requirements is important.
Here, the requirements were: direct access to the database without needing an API (N8N also supports APIs), a visual workflow whose nodes a non-technical person can understand, the option to mix code nodes with no-code nodes, and a cost that stays reasonable at scale. Among existing platforms such as Zapier, Make, or N8N, I chose the last one.
Designing a Product Health Score

First, the key metrics need to be defined.
Impact: a simple function of severity + delta + affected users
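import numpy as np

# severity_weights maps each severity level to a numeric weight (defined elsewhere)
impact_score = (
    severity_weights[severity] * 10   # severity term
    + abs(delta_pct) * 0.8            # size of the deviation, in percent
    + np.log1p(affected_users)        # log-damped number of affected users
)
impact_score = round(float(impact_score), 2)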
Priority: derived from severity + impact
if severity == "critical" or impact_score > 60:
    priority = "P1"
elif severity == "high" or impact_score > 40:
    priority = "P2"
elif severity == "medium":
    priority = "P3"
else:
    priority = "P4"
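To see how the two rules interact, here is a minimal, self-contained sketch that wraps them in one function. The values in `SEVERITY_WEIGHTS` and the function name `score_incident` are assumptions made for illustration (the real weights are not given here), and `math.log1p` stands in for `np.log1p` so the example runs without NumPy.

```python
from math import log1p

# Hypothetical weights, chosen only for illustration.
SEVERITY_WEIGHTS = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def score_incident(severity, delta_pct, affected_users):
    """Return (impact_score, priority) for a single incident."""
    impact_score = (
        SEVERITY_WEIGHTS[severity] * 10
        + abs(delta_pct) * 0.8
        + log1p(affected_users)
    )
    impact_score = round(float(impact_score), 2)

    # Same cascade as above: severity OR impact can escalate the priority.
    if severity == "critical" or impact_score > 60:
        priority = "P1"
    elif severity == "high" or impact_score > 40:
        priority = "P2"
    elif severity == "medium":
        priority = "P3"
    else:
        priority = "P4"
    return impact_score, priority


if __name__ == "__main__":
    # A medium incident with a 30% deviation and 1,200 affected users.
    print(score_incident("medium", 30.0, 1200))
```

With these assumed weights, the medium incident in the example reaches an impact of about 51, so it is escalated to P2 by the impact branch even though its severity alone would only give P3.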
Product health score
from math import log

def compute_product_health_score(incidents, metrics):
    """
    Score = 100 - sum(penalties)
    Production version handles 15+ factors
    """
    # Simplified excerpt: incident_rate, users and the two calculate_*
    # helpers are defined elsewhere in the production version.
    # Key insight: penalties have different max weights
    penalties = {
        'volume': min(40, incident_rate * 13),           # 40% max
        'severity': calculate_severity_sum(incidents),   # 25% max
        'users': min(15, log(users) / log(50000) * 15),  # 15% max
        'trends': calculate_business_trends(metrics)     # 20% max
    }
    score = 100 - sum(penalties.values())
    if score >= 80:
        return score, "🟢 Stable"
    elif score >= 60:
        return score, "🟡 Under watch"
    else:
        return score, "🔴 At risk"
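The excerpt above leaves several pieces undefined, so here is a runnable sketch of the same penalty structure. Everything that fills the gaps is an assumption for illustration only: the `SEVERITY_POINTS` table, the two helper functions, the `mrr_change_pct` and `active_accounts_change_pct` keys, and the 7-day window used for the incident rate. The caps (40, 25, 15, 20) and the status thresholds are the ones from the excerpt.

```python
from math import log

# Hypothetical penalty points per severity level.
SEVERITY_POINTS = {"low": 1, "medium": 3, "high": 6, "critical": 10}


def severity_penalty(incidents, cap=25):
    """Sum assumed per-severity points, capped at 25."""
    return min(cap, sum(SEVERITY_POINTS[i["severity"]] for i in incidents))


def trend_penalty(metrics, cap=20):
    """Penalize week-over-week drops (negative % changes), capped at 20."""
    keys = ("mrr_change_pct", "active_accounts_change_pct")  # assumed keys
    return min(cap, sum(max(0.0, -metrics[k]) for k in keys))


def product_health_score(incidents, metrics, days=7):
    incident_rate = len(incidents) / days  # incidents per day (assumption)
    users = sum(i["affected_users"] for i in incidents)
    penalties = {
        "volume": min(40, incident_rate * 13),
        "severity": severity_penalty(incidents),
        # Guard against log(0) when no user is affected.
        "users": min(15, log(users) / log(50000) * 15) if users > 1 else 0.0,
        "trends": trend_penalty(metrics),
    }
    score = round(100 - sum(penalties.values()), 1)
    if score >= 80:
        return score, "🟢 Stable"
    elif score >= 60:
        return score, "🟡 Under watch"
    return score, "🔴 At risk"


if __name__ == "__main__":
    sample = [
        {"severity": "high", "affected_users": 500},
        {"severity": "medium", "affected_users": 1500},
    ]
    trends = {"mrr_change_pct": -2.0, "active_accounts_change_pct": -1.0}
    print(product_health_score(sample, trends))
```

Because every penalty is capped, the four caps add up to exactly 100, so the score cannot go below 0 in this sketch.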
Designing an automated detection system with N8N

The system is made up of 4 streams:
- Stream 1: Retrieves the latest metrics, detects unusual spikes in churn MRR, and creates incidents when needed.

// N8N Code node: flag an unusual spike in churn MRR using a z-score.
const rows = items.map(item => item.json);

// We need 7 baseline days plus the day under test.
if (rows.length < 8) {
    return [];
}

// Sort oldest first so the last row is the most recent day.
rows.sort((a, b) => new Date(a.date) - new Date(b.date));
const values = rows.map(r => parseFloat(r.churn_mrr || 0));

const lastIndex = rows.length - 1;
const lastRow = rows[lastIndex];
const lastValue = values[lastIndex];

// Baseline = the 7 days immediately before the last one.
const windowSize = 7;
const baselineValues = values.slice(lastIndex - windowSize, lastIndex);

// Mean and (population) standard deviation of the baseline.
const mean = baselineValues.reduce((s, v) => s + v, 0) / baselineValues.length;
const variance = baselineValues
    .map(v => Math.pow(v - mean, 2))
    .reduce((s, v) => s + v, 0) / baselineValues.length;
const std = Math.sqrt(variance);

// A flat baseline has no spread, so a z-score cannot be computed.
if (std === 0) {
    return [];
}

const z = (lastValue - mean) / std;
const deltaPct = mean === 0 ? null : ((lastValue - mean) / mean) * 100;

// Only increases are a problem for churn MRR, hence the one-sided test.
if (z > 2) {
    const anomaly = {
        date: lastRow.date,
        metric_name: 'churn_mrr',
        baseline_value: mean,
        actual_value: lastValue,
        z_score: z,
        delta_pct: deltaPct,
        severity:
            deltaPct !== null && deltaPct > 50 ? 'high'
            : deltaPct !== null && deltaPct > 25 ? 'medium'
            : 'low',
    };
    return [{ json: anomaly }];
}

// No anomaly: emit nothing so downstream nodes do not run.
return [];
- Stream 2: Monitors usage metrics to detect sudden drops in adoption or engagement.
Incidents are logged with severity and context, and the product team is notified.

- Stream 3: For every open incident, collects additional context from the database (e.g. churn by country or plan), uses AI to generate a clear summary with proposed next steps, sends it by email, and updates the incident.

- Stream 4: Every morning, the workflow gathers all the incidents from the previous day, creates a summary page for documentation, and sends a report to the leadership team.

We deployed similar detection logic for 8 different metrics, adjusting the z-score method based on whether increases or decreases were problematic.
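The direction adjustment can be sketched in a few lines of Python. This is an illustrative version only, not the workflow's actual code: the function name `detect_anomaly` and its `bad_direction` parameter are assumptions, while the 7-day window, the population standard deviation, and the threshold of 2 mirror the churn MRR check.

```python
from statistics import mean, pstdev


def detect_anomaly(values, bad_direction="up", window=7, threshold=2.0):
    """Return anomaly details for the last value, or None.

    bad_direction="up"   -> only spikes are flagged (e.g. churn, error rate)
    bad_direction="down" -> only drops are flagged (e.g. active accounts)
    """
    if len(values) < window + 1:
        return None  # not enough history for a baseline

    baseline = values[-(window + 1):-1]  # the `window` points before the last
    last = values[-1]
    mu = mean(baseline)
    sigma = pstdev(baseline)  # population std
    if sigma == 0:
        return None  # flat baseline, z-score undefined

    z = (last - mu) / sigma
    # Flip the sign so "bad" deviations are always positive.
    signed_z = z if bad_direction == "up" else -z
    if signed_z <= threshold:
        return None
    return {"baseline_value": mu, "actual_value": last, "z_score": z}


if __name__ == "__main__":
    churn_mrr = [100, 102, 98, 101, 99, 100, 100, 150]
    active_accounts = [1000, 1010, 990, 1005, 995, 1000, 1000, 800]
    print(detect_anomaly(churn_mrr, bad_direction="up"))
    print(detect_anomaly(active_accounts, bad_direction="down"))
```

Keeping the direction as a parameter means one function can serve all metrics, and a sudden improvement (churn falling, usage jumping) does not open an incident.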
The AI agent gets more context through SQL queries (churn by country, by plan, by segment) to produce accurate root-cause hypotheses. All this data is then collected and sent in a daily email.
The workflow creates daily summary reports that include all incidents by metric and severity, distributed by email and Slack.
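As a rough illustration of the "by metric and severity" grouping, here is a small Python sketch. The function name `summarize_incidents` and the report layout are assumptions; the `metric_name` and `severity` fields match the ones produced by the anomaly check.

```python
from collections import Counter


def summarize_incidents(incidents):
    """Build a plain-text summary grouped by (metric, severity)."""
    counts = Counter((i["metric_name"], i["severity"]) for i in incidents)
    lines = [f"{len(incidents)} incident(s) detected"]
    # Sort for a stable, readable order in the email or Slack message.
    for (metric, severity), n in sorted(counts.items()):
        lines.append(f"- {metric} [{severity}]: {n}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(summarize_incidents([
        {"metric_name": "churn_mrr", "severity": "high"},
        {"metric_name": "churn_mrr", "severity": "high"},
        {"metric_name": "api_error_rate", "severity": "low"},
    ]))
```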
Dashboard
The dashboard consolidates all signals in one place. It shows an automatic product health score on a 0-100 scale, calculated from:
- Incident types
- Severity weight
- Open vs. resolved incidents
- Number of affected users
- Business trends (MRR)
- Usage patterns (active accounts)
Segmentation to identify which customer groups are most affected:

Weekly severity heatmap and trend charts to find repeating patterns:

And a detailed incident view with:
- Business context
- Severity and priority
- Root-cause hypothesis
- Incident type
- An AI summary from the N8N workflow to speed up communication and diagnosis

Diagnosis:
The product health score rated the product at 24/100, with the status "At risk", based on:
- 45 high and critical incidents
- 36 incidents in the last 7 days
- 33,385 users estimated to be impacted, across churn and DAU
- Several spikes in api_error_rate along with drops in checkout_success_rate

The biggest impact by segment:
- Enterprise → critical data freshness issues
- Mid-market → recurring incidents
- Small Business → fluctuations in acquisition

The result
The goal of this dashboard is not only to analyze events and identify patterns but to allow the organization to react quickly with a detailed view.

We saw a 35% reduction in critical incidents after two months. The SRE and Data teams identified a recurring cause behind major bottlenecks, linked to heterogeneous data, and were able to fix it and plan maintenance. Response time also improved, because the AI summary and the consolidated metrics tell the teams where to investigate first.
AI-powered root cause analysis

Using AI can save a lot of time, especially when you need to investigate several sources of information and don't know where to start. Adding an AI agent to the loop helps because of its data processing speed. To get there, a detailed prompt is necessary, because the agent is taking the place of a person. To get the most accurate results, the AI needs to understand the context and be given some guidance. Even so, it can investigate and still draw the wrong conclusions, so make sure you fully understand the cause of the issue yourself.
You are a Product Data & Revenue Analyst.
We detected an incident:
{{ $json.incident }}
Here is churn MRR by country (top offenders first):
{{ $json.churn_by_country }}
Here is churn MRR by plan:
{{ $json.churn_by_plan }}
1. Summarize what happened in simple business language.
2. Identify the most impacted segments (country, plan).
3. Propose 3-5 plausible hypotheses (product issues, price changes, bugs, market events).
4. Propose 3 concrete next steps for the Product team.
It is important to note that once the results are available, a final check is necessary to make sure the analysis is correct. AI is a useful tool, but it is not infallible and it does not replace your own judgment. In this system, the AI suggests the top 3 possible causes for each incident.

Another result was better alignment with the leadership team through data-driven reporting. Decisions were backed by data and in-depth analysis, not by gut feeling or fragmented reports. This led to an improved process.
Conclusion
In conclusion, building a product health dashboard has several benefits:
- See negative trends (MRR, DAU, engagement) earlier
- Reduce critical incidents by identifying root cause patterns
- Understand the real business impact (affected users, churn risk)
- Prioritize the roadmap based on risk and impact
- Organize product, data, SRE, and revenue around a single source of truth
That's exactly what most companies lack: a unified data approach.

Using the N8N workflow helped in two ways: solving problems more quickly and gathering information in one place. The automation reduced the time spent on this task while the business kept running.
Lessons for product teams

- Start simple: An automated system and dashboard need a clearly defined scope. Just as you build a product for customers, here you are building a product for your colleagues. It is important that you understand the needs of each group, as they are your primary users. With that in mind, start with an MVP that answers the most pressing needs first. Then you can improve it by adding features or metrics.
- Unified metrics matter more than sophisticated detection: It is thanks to unified metrics that time is saved and understanding is gained. Good detection is important, but if the metrics are scattered, the time saved will be wasted by teams hunting for them in different places.
- Automation saves 10 hours a week of manual investigation: By automating manual and repetitive tasks, you save hours of investigation. With the incident workflow, we know exactly where to investigate first, the likely cause, and even a specific action to take.
- Write everything down: Proper, detailed documentation is objective and gives everyone involved a clear view of what is happening. Documentation is also part of the data.
Who am I?
I am Yassin, a project manager who expanded into data science to bridge the gap between business systems and technical systems. Learning Python, SQL, and analytics enabled me to design product insights and automated workflows to communicate what teams needed. Let's connect on LinkedIn



