Stop Wasting Tokens: A Smarter Alternative to JSON for LLM pipelines

# Introduction
JSON is ideal for APIs, storage, and application logic. But inside large language model (LLM) pipelines, it carries a lot of overhead tokens that add little value for the model: braces, quotes, commas, and field names repeated on every line. Enter TOON, short for Token-Oriented Object Notation, a new format designed to preserve the JSON data model while using fewer tokens and giving models clearer structural cues. The official TOON documentation describes it as a compact, lossless representation of JSON for LLM input, particularly effective for uniform arrays of objects.
In this article, you will learn what TOON is, when it makes sense to use it, and how to start using it step by step in your LLM workflows. We will also keep the tradeoffs honest, because TOON helps in some cases, not all.
# Why JSON Wastes Tokens in LLM Pipelines
JSON is an expensive prompt format because it repeats its structure over and over again. LLMs don't care that JSON is a standard; they only see tokens.
If you send 100 support tickets, product rows, or user records to the model, the same field names appear in every one of them. TOON removes that duplication by declaring the fields once and then listing the row values in a table-like format. Here is a simple example.
JSON:

```json
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" },
    { "id": 3, "name": "Charlie", "role": "user" }
  ]
}
```
TOON:

```
users[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user
```
Same data, less clutter.
The layout is still clear, but the duplicated keys are gone. This is where TOON gets most of its value.
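The transformation is mechanical enough to sketch in a few lines. The helper below is a minimal illustration of the tabular layout shown above, not the official encoder; it assumes flat, uniform rows whose values contain no commas or newlines (the real format handles quoting and escaping).

```python
def encode_toon_table(key, rows):
    """Encode a uniform list of flat dicts in TOON's tabular form.

    Simplified sketch: assumes every row has the same keys and that
    values contain no commas or newlines.
    """
    fields = list(rows[0].keys())
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    lines = [",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + lines)

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Charlie", "role": "user"},
]
print(encode_toon_table("users", users))
```

Running this reproduces the TOON block above: one header line declaring the shape, then one comma-separated line per record.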
# What Exactly Is TOON and When Should You Use It?
TOON is a serialization format for the JSON data model. That means it can represent objects, arrays, strings, numbers, booleans, and nulls, just in a more compact, model-friendly way. The TOON project introduces it as a lossless alternative to JSON, meaning you can convert JSON to TOON and back without losing information. The important thing to understand is this:
You don't need to replace JSON in your application.
The best approach is to keep JSON in your backend, APIs, and storage, and convert it to TOON only when you are about to send structured data to an LLM.
TOON is most useful when your data contains repeated structured records with the same fields. Good examples include support tickets, catalog rows, analytics records, tool outputs, CRM entries, or memory snapshots in agent systems. However, if your structure is deeply nested, too irregular, or too small, the benefits shrink or disappear.
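As a rough pre-check, you can gate the conversion on whether the data actually fits TOON's sweet spot. The heuristic below is an assumption to tune for your own payloads, not an official rule:

```python
def toon_friendly(rows, min_rows=5):
    """Heuristic: TOON's tabular form pays off for arrays of flat,
    uniform records. min_rows is an assumed threshold to tune."""
    if not isinstance(rows, list) or len(rows) < min_rows:
        return False  # too small: the header overhead may outweigh savings
    if not all(isinstance(r, dict) for r in rows):
        return False
    keys = set(rows[0].keys())
    # every row must share the same flat (non-nested) fields
    return all(
        set(r.keys()) == keys
        and not any(isinstance(v, (dict, list)) for v in r.values())
        for r in rows
    )
```

If the check fails, just send the data as JSON; forcing irregular or nested structures into tabular rows buys you little.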
# Getting Started with TOON
## Step 1: Installing the TOON Command-Line Interface
The easiest way to try TOON is the official command-line interface (CLI) from the TOON project. The TOON site links directly to the CLI, and the main repository presents the format as part of a broader ecosystem of SDKs and tools.
Install the package:

```shell
npm install -g @toon-format/cli
```
## Step 2: Converting a JSON File to TOON
Let's create a folder first:

```shell
mkdir toon-test
cd toon-test
```
Now create a file called users.json and paste this:
```json
[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob", "role": "user" },
  { "id": 3, "name": "Charlie", "role": "user" }
]
```
Now convert it:

```shell
npx @toon-format/cli users.json -o users.toon
```
You should get an output similar to this:
```
[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user
```
This is the basic TOON pattern: declare the shape once, then write the values row by row. That matches the format's core design principle for tabular arrays.
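Because the format is lossless, the tabular TOON shown above can be parsed straight back into structured data. The decoder below is a simplified sketch, not the official parser: it handles one flat table with unquoted, comma-free values, which is enough to round-trip this example.

```python
import re

def decode_toon_table(text):
    """Parse a flat tabular TOON block back into Python objects.

    Simplified sketch: the real format also supports nesting,
    quoting, and alternative delimiters.
    """
    lines = text.strip().splitlines()
    m = re.match(r"(\w*)\[(\d+)\]\{([^}]*)\}:$", lines[0])
    key, count, fields = m.group(1), int(m.group(2)), m.group(3).split(",")
    rows = []
    for line in lines[1:1 + count]:
        values = line.strip().split(",")
        # best-effort typing: keep digit strings as ints, everything else as str
        rows.append({f: int(v) if v.isdigit() else v
                     for f, v in zip(fields, values)})
    return (key, rows) if key else rows

toon = """users[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user"""
key, rows = decode_toon_table(toon)
```

Round-tripping like this is a handy sanity check before you trust the conversion in a pipeline.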
## Step 3: Using TOON as Model Input
The best place to use TOON is on the input side of your pipeline. Instead of attaching a big JSON blob to the prompt, paste the TOON version and keep the instructions simple.
For example:

```
The following data is in TOON format.

users[3]{id,name,role}:
1,Alice,admin
2,Bob,user
3,Charlie,user

Summarize the user roles and point out anything unusual.
```
This works well because TOON is designed to help the model read repetitive structure with minimal overhead. That's also how the official project frames its benchmarks: as comprehension tests across different structured input formats.
## Step 4: Keeping JSON for Output
This is one of the most important practical decisions. TOON is most useful for input, but JSON is often still the better choice for output when another system needs to parse the model's response. That's because JSON has very strong tooling support, and modern LLM APIs can enforce structured JSON output with schemas.
Basically, the safest pattern is:
- JSON inside your application.
- TOON for large, structured model inputs.
- JSON for model outputs that machines need to parse.
This gives you efficiency on the input side and reliability on the output side.
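On the output side of that pattern, it still pays to be defensive when parsing model replies as JSON. The helper below is a sketch under the assumption that a model may wrap JSON in prose or markdown fences; the fallback brace scan is a heuristic, and your provider's native structured-output or JSON mode is preferable when available.

```python
import json

def parse_model_json(reply):
    """Best-effort extraction of a JSON object from a model reply.

    Models sometimes wrap JSON in prose or code fences, so if a
    direct parse fails we fall back to scanning for the outermost
    braces. Heuristic sketch; prefer native JSON mode in production.
    """
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        start, end = reply.find("{"), reply.rfind("}")
        if start != -1 and end > start:
            return json.loads(reply[start:end + 1])
        raise
```

Usage: `parse_model_json('Here you go:\n```json\n{"roles": ["admin"]}\n```')` recovers the embedded object instead of raising.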
## Step 5: Measuring Your Pipeline
Don't switch formats based on hype alone.
Do a little benchmarking in your workflow:
- Count the JSON input tokens.
- Count the TOON input tokens.
- Compare latency.
- Compare the quality of the response.
- Compare the total cost.
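A quick harness gives you a relative feel for the first two bullets. The token counter below is a crude regex proxy, an assumption for illustration only; for real numbers, count with your model's actual tokenizer (e.g. tiktoken for OpenAI models).

```python
import json
import re

def rough_token_count(text):
    """Crude proxy: count word and punctuation chunks.

    Only useful for relative comparisons; use your model's real
    tokenizer for billing-accurate numbers.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))

# Synthetic sample: 100 uniform user records
users = [{"id": i, "name": f"user{i}", "role": "user"} for i in range(100)]

as_json = json.dumps({"users": users}, indent=2)
as_toon = "users[100]{id,name,role}:\n" + "\n".join(
    f"{u['id']},{u['name']},{u['role']}" for u in users
)

print("JSON:", rough_token_count(as_json), "TOON:", rough_token_count(as_toon))
```

Even with this rough proxy, the tabular form comes out far smaller on uniform records; rerun the comparison on your own payloads before deciding.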
The official TOON project lists token savings as one of its main benefits, and third-party write-ups repeat those claims, but public discussion also shows that results depend heavily on the shape of your data. That's why the best question isn't “Is TOON better than JSON?”
A better question is: “Is TOON better for this specific prompt and dataset?”
# Final Thoughts
TOON is not something you need to use everywhere.
It's a targeted fix for one problem: wasted tokens in repetitive JSON structure inside LLM prompts. If your pipeline passes a lot of structured records into a model, TOON is worth checking out. If your payloads are small, irregular, or highly nested, JSON may still be the better choice.
The smartest way to use it is simple: keep JSON where JSON already works fine, use TOON when packing large structured inputs into a prompt, and measure the results in your own pipeline before committing.
Kanwal Mehreen is a machine learning engineer and technical writer with a deep passion for data science and the intersection of AI and medicine. She co-authored the ebook “Maximizing Productivity with ChatGPT”. As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She has also been recognized as a Teradata Diversity in Tech Scholar, a Mitacs Globalink Research Scholar, and a Harvard WeCode Scholar. Kanwal is a passionate advocate for change, having founded FEMCodes to empower women in STEM fields.



