Machine Learning

4 Lines You Should Include in Your Claude Skill

I was asked to do something new at work: Given the data dump of unstructured text data, we have provided a detailed PDF report detailing what customers are saying about our products this quarter.

So I wrote a clear message. It gave Claude a detailed set of instructions. Feed it a data set. It gave me the output. I brought it.

But as the stakeholders and I reviewed the deliverables in depth, we saw things that were increasingly troubling.

Claude was mistaken for confidence.

No which is wrong wrong, like false facts that come out of nowhere. More like… overconfidence which is wrong. It will generate a quarterly insight report and say something like:

“Negative sentiment in the Dress department increased by 23% this quarter, indicating a significant change in customer satisfaction that requires immediate attention from the product team.”

Sounds great. Except that the spike was driven almost entirely by one popular item that launched mid-quarter with a known measurement error. One product. Not every department.

Claude didn't know. And my information doesn't tell you to care.

Image created by the author using Claude

Ability to Quarterly Customer Review Report

I'll walk you through Claude's ability to create a quarterly customer sentiment report from an unstructured product review text, delivered as a PDF to participants.

Obviously, I won't share the actual dataset I analyzed at work. The dataset I am using is Women's E-Commerce Clothing Review Dataset from Kaggle (CC0 license). Contains 23,000 real, anonymous customer reviews across all apparel departments (Tops, Dresses, Bottoms, Jackets, and more) with text, star ratings, and product metadata. References to the company in the reviews have been changed to “seller.”

Ability to:

  • Read an excerpt of the current quarter's updates
  • Collect them by the door
  • Discover trends and concerns
  • Write a professional PDF summary for the product leadership team

Here is the first warning:

She is a data analyst who produces a quarterly customer experience report for an e-commerce retailer of women's clothing. Given this quarter's customer reviews (including review text, star ratings, and department), write a professional stakeholder report that includes:

– An emotional summary of the quarter

– Key themes by department (Hats, Dresses, Jackets, Coats)

– 2-3 outstanding ideas from the review text

– A short recommendation of the product team

Be professional and clear.

When you're done with this task, please create a skill titled analyze-analyze and save your commands there.

What “Incorrect Confidence” Really Looks Like

Here's an example of what Claude produced with the absurd skill above, in a section where the Dress department had a lot of negative reviews:

“Negative sentiment in the Dress department has increased significantly this quarter, as customers often cite fit and fit issues. This suggests that the retailer's sizing standards may differ from customer expectations – a trend that, if not corrected, will erode brand loyalty in this important category.”

The real meaning? One garment (one SKU) was delivered in week 7 with batch quality problem. Updates were almost entirely with that one thing. The entire Dress department was working well.

Claude didn't really invent anything. There was no context as to why the pattern existed. And without that context, it did what LLMs do: fill the gap with a narrative that resonates.

Image created by the author using Claude

Correction: 4 lines MUST be included

Line 1: Tell Claude Which Theme is Missing

You do NOT have access to product launch calendars, inventory records, promotional campaigns, or individual SKU level history. DO NOT attribute department-level trends to product-wide causes. Report the patterns you see in the text; don't explain why they exist unless the review itself makes it clear.

This one commandment eliminates a major phase of self-doubt. Without you, Claude will always reach for a strategic narrative because that's what a good analyst does, and Claude is trying to be a good analyst.

The problem is that an honest critic also knows something he doesn't know. They said “We're seeing higher quality complaints on Hats this quarter. This could be attributed to a later launch but we'll need SKU level data to confirm.” Claude won't say that unless you tell him.

Line 2: Explain What “Important” Really Means

Claude likes the name which is important. It uses all the time. And it almost never explains it.

Only flag a change in sentiment as “significant” if it represents a change of more than 15 percentage points in the positive/negative ratio compared to the previous quarter, OR if the theme appears in more than 20% of reviews in a given department. For small signals, use language like “small rise” or “small growth.” Do not use the word “significant” or “significant” for anything below these standards. Always report the actual amount of the change number and your claim.

You can adjust the 15% and 20% thresholds to whatever makes sense for your data. The point is to ground Claude's language in something real.

Despite this, Claude will call both the 3-point review spike in complaints and the actual drop in sentiment of 30 points “significant”. Your stakeholders will start tuning in. And if something important happens, they won't be able to.

Line 3: Force the Worthy to Confidence in Every Vision

Before each detail, put a confidence label in brackets: [Data-Supported], [Possible]or [Speculative].

Use it [Data-Supported] only if the understanding directly follows the review text provided. Use it [Possible] where the understanding is the common sense from the text. Use it [Speculative] when you make assumptions about causes or context that are not in the reviews themselves.

When I first added this line, I had high expectations [Data-Supported] tags. What I actually got was a mix of all three, which told me very well that Claude had been closing in on enemies in my previous reports without me noticing.

An example of what the output looks like after adding this line:

Photo produced by the author using Claude

Now your participants can see exactly what is solid and what is guesswork. That is a very reliable report.

Line 4: You need Claude to State the Analysis Limits

At the end of the report, include a section called “What This Report Won't Tell You.” List 2-3 things that would be needed to reach firm conclusions, for example, SKU-level review breakdowns, return rates, or repeat purchase data.

This line forces Claude to acknowledge the edges of his analysis. It also gives your stakeholders a clear guide on what questions to investigate further, which is actually the most important thing an analyst can do.

Here is the output:

Image created by the author using Claude

How to Use Claude to Refine the Skill

Writing a skill once is not enough. You need to test it and improve it as you would iterate on the model.

Step 1: Apply the skill to known examples.

Filter the dataset in a time window where you already know what happened. (Quarter with product recall, seasonal promotion, period with unusually high return rates, etc.) See what Claude has to say. Does it use the word “important” correctly? Does it state facts/figures when appropriate?

Step 2: Feed your output to Claude and ask him to check.

Claude is good at holding back his overconfidence when you openly ask him for it.

Here is the quarterly report produced by the AI ​​analyst. Review all the information in this report and mark any of these:

– Make causal claims without direct evidence in the review text

– Use words like “important” or “significant” without justification

– Add individual product stories to broader trends

– Assume that the context is not in the dataset (introduce calendars,

inventory, purchase history)

For each item marked, suggest an updated version that is most appropriately fenced.

Step 3: Add a subsection for each failure you find.

Every time Claude produces a report that is misinformed or overconfident, you feel that you have added a new barrier to your ability. Over time, your skill becomes a record of everything Claude has gone wrong.

A Word of Warning

Adding constraints to your ability can sometimes cause Claude to produce output where each sentence ends with “…although more data would be needed to confirm this.”

That's not helpful either.

The goal is limited confidence where the strength of Claude's language matches the strength of the evidence. If you find Claude to be overly ambitious, you can add a rating limit:

Don't over-qualify every statement. If a pattern appears clearly and consistently across multiple reviews, clearly state and include references to the data behind the pattern. Keep qualified for uncertain or speculative claims.

The conclusion

Claude excels at producing professional-looking reports, which can sometimes be a problem.

Polish hides overconfidence. Your stakeholders see clean formatting and authoritative language, and they think the information is solid even if it isn't.

The four lines I have gone through here do not make Claude weak. They make it more reliable. And in the context of reporting, credibility is more important than impressiveness.

Learn more about what other use cases Claude is good for here, including building dashboards, debugging, and writing scripts:

Claude's 3 Skills Every Data Scientist Needs in 2026

Thanks for Reading

Contact me at LinkedIn

Buy me a coffee support my work!

Source link

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top button