Generative AI

MiniMax Releases M2.1: An Enhanced Version of M2 with Features Like Multi-Coding Language Support, API Integration, and Improved Structured Coding Tools

A few months after releasing the M2—a fast, low-cost model designed for agents and code—MiniMax has introduced an improved version: MiniMax M2.1.

M2 already stood out for its efficiency, running at roughly 8% of the cost of Claude Sonnet while delivering much higher speeds. More importantly, it introduced a different pattern of computation, particularly in how the model builds and executes its reasoning during complex code and tool-driven workflows.

M2.1 builds on this foundation with tangible improvements in all key areas: better code quality, stronger instruction following, cleaner reasoning, and broader performance across programming languages. The upgrade extends M2's original capabilities while staying true to the MiniMax vision of “Intelligence with Everyone.”

M2.1 reinforces the core capabilities of M2, and it isn't just about better coding: it also produces clearer, more organized results across conversations, documents, and writing.

  • Built for real-world coders and AI-native teams: designed to support everything from “vibe coding” to complex, production-grade workflows.
  • Beyond coding: it produces clearer, more organized, and higher-quality results in day-to-day conversations, technical documents, and writing tasks.
  • High-quality multilingual coding performance: it scores 72.5% on SWE-Multilingual, outperforming Claude Sonnet 4.5 and Gemini 3 Pro across multiple programming languages.
  • Strong AppDev and WebDev performance: it scores 88.6% on VIBE-Bench, surpassing Claude Sonnet 4.5 and Gemini 3 Pro, with significant improvements in native Android, iOS, and modern web development.
  • Broad agent and tool compatibility: it delivers consistent, stable performance across leading coding tools and agent frameworks, including Claude Code, Droid (Factory AI), Cline, Kilo Code, Roo Code, BlackBox, and more.
  • Strong context management support: it works reliably with advanced context methods such as Skill.md, Claude.md / agent.md / cursorrule, and Slash Commands, enabling scalable agent workflows.
  • Automatic caching, zero configuration: built-in caching works out of the box to reduce latency, lower costs, and deliver a smoother overall experience.

To get started with MiniMax M2.1, you will need an API key from the MiniMax platform. You can generate one from the MiniMax user console.

Once issued, store the API key securely and avoid disclosing it in code repositories or public areas.
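A common pattern is to read the key from an environment variable and fall back to an interactive prompt, so the key never lands in source control. Here is a minimal sketch; the `MINIMAX_API_KEY` variable name is just a convention chosen for this example, not something the platform mandates:

```python
import os
from getpass import getpass

def load_minimax_key(env_var: str = "MINIMAX_API_KEY") -> str:
    """Read the API key from the environment, prompting only as a fallback."""
    key = os.environ.get(env_var)
    if key:
        return key
    # Interactive fallback for notebooks; input is hidden, not echoed or logged.
    return getpass(f"Enter {env_var}: ")
```

In CI or production, the environment-variable path is the only one exercised; the prompt exists purely for local experimentation.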

Installing and setting up dependencies

MiniMax supports both Anthropic and OpenAI API formats, making it easy to integrate MiniMax models into existing workflows with minimal configuration changes—whether you're using Anthropic-style messaging APIs or an OpenAI-compatible setup.

import os
from getpass import getpass

# MiniMax's Anthropic-compatible endpoint (check the MiniMax platform docs
# for the base URL that matches your account region).
os.environ['ANTHROPIC_BASE_URL'] = 'https://api.minimax.io/anthropic'
os.environ['ANTHROPIC_API_KEY'] = getpass('Enter MiniMax API Key: ')

With this minimal setup, you are ready to start using the model.

Sending Requests to the Model

MiniMax M2.1 returns structured results that separate the internal thinking (thinking) from the final response (text). This allows you to see how the model interprets the intent and organizes its response before generating output for the user.

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="MiniMax-M2.1",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Hi, how are you?"
                }
            ]
        }
    ]
)

for block in message.content:
    if block.type == "thinking":
        print(f"Thinking:\n{block.thinking}\n")
    elif block.type == "text":
        print(f"Text:\n{block.text}\n")
Thinking:
The user is just asking how I am doing. This is a friendly greeting, so I should respond in a warm, conversational way. I'll keep it simple and friendly.

Text:
Hi! I'm doing well, thanks for asking! 😊

I'm ready to help you with whatever you need today. Whether it's coding, answering questions, brainstorming ideas, or just chatting, I'm here for you.

What can I help you with?

What makes MiniMax stand out is the transparency in its thought process. Before generating the final response, the model clearly reasons about the user's intent, tone, and expected style—ensuring that the response is appropriate and context-aware.

By cleanly separating reasoning from responses, the model becomes easier to interpret, debug, and trust, especially in agent-based or multi-step workflows. With M2.1, this clarity is paired with faster responses, shorter reasoning traces, and significantly reduced token usage compared to M2.
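If you consume these structured results programmatically, a small helper can partition a message's content blocks by type. The sketch below uses SimpleNamespace stand-ins for the SDK's block objects, so the helper itself stays SDK-agnostic:

```python
from types import SimpleNamespace

def split_blocks(content):
    """Partition content blocks into reasoning and answer text by block type."""
    thinking = [b.thinking for b in content if b.type == "thinking"]
    text = [b.text for b in content if b.type == "text"]
    return {"thinking": "\n".join(thinking), "text": "\n".join(text)}

# Mock blocks mimicking the shape of an M2.1 response
blocks = [
    SimpleNamespace(type="thinking", thinking="User wants a greeting."),
    SimpleNamespace(type="text", text="Hello!"),
]
parts = split_blocks(blocks)
print(parts["text"])  # Hello!
```

The same helper works on a real `message.content` list, since the SDK's blocks expose the same `type`, `thinking`, and `text` attributes.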

MiniMax M2 already stood out for its handling of interleaved thinking, which lets it plan and adapt across complex code and tool-based workflows. M2.1 extends this ability with improved code quality, more precise instruction following, clearer reasoning, and strong performance across programming languages, especially on complex, constraint-heavy coding prompts.

To test these capabilities in practice, let's run the model on a structured coding prompt that covers many of the constraints and needs of real-world engineering.

import anthropic

client = anthropic.Anthropic()

def run_test(prompt: str, title: str):
    print(f"\n{'='*80}")
    print(f"TEST: {title}")
    print(f"{'='*80}\n")

    message = client.messages.create(
        model="MiniMax-M2.1",
        max_tokens=10000,
        system=(
            "You are a senior software engineer. "
            "Write production-quality code with clear structure, "
            "explicit assumptions, and minimal but sufficient reasoning. "
            "Avoid unnecessary verbosity."
        ),
        messages=[
            {
                "role": "user",
                "content": [{"type": "text", "text": prompt}]
            }
        ]
    )

    for block in message.content:
        if block.type == "thinking":
            print("🧠 Thinking:\n", block.thinking, "\n")
        elif block.type == "text":
            print("📄 Output:\n", block.text, "\n")

PROMPT = """
Design a small Python service that processes user events.

Requirements:
1. Events arrive as dictionaries with keys: user_id, event_type, timestamp.
2. Validate input strictly (types + required keys).
3. Aggregate events per user in memory.
4. Expose two functions:
   - ingest_event(event: dict) -> None
   - get_user_summary(user_id: str) -> dict
5. Code must be:
   - Testable
   - Thread-safe
   - Easily extensible for new event types
6. Do NOT use external libraries.

Provide:
- Code only
- Brief inline comments where needed
"""

run_test(prompt=PROMPT, title="Instruction Following + Architecture")

This test uses a deliberately complex prompt designed to probe more than just code generation. The task requires strict input validation, in-memory state management, thread safety, testability, and extensibility, all without relying on external libraries.

By combining architectural decisions with several non-trivial constraints, the task sits at a medium-to-high level of complexity, making it well suited to testing how effectively MiniMax M2.1 follows instructions, reasons through design trade-offs, and produces production-quality code instead of isolated snippets.

Model Reasoning and Output

The model weighs significant architectural trade-offs before coding, carefully balancing flexibility, memory usage, and scalability. It explores multiple approaches to event aggregation and deliberately chooses raw event storage so that new aggregations can be added later without changing the ingestion interface.

Thread safety is handled explicitly with locking, and strict input validation is implemented to ensure data validity, reflecting a real-world engineering perspective rather than jumping straight into implementation.

This careful reasoning is reflected in the final output, which demonstrates strong code quality: clear structure, meaningful naming, type hints, and a thread-safe design aligned with production standards. The solution follows all of the prompt's constraints precisely, covering validation, in-memory aggregation, extensibility, and the absence of external dependencies, while keeping the logic clean, focused, and easy to maintain.

import threading
from typing import Dict, List, Any

class EventProcessor:
    """
    Thread-safe event processor that aggregates user events in memory.
    Validates input strictly and supports easy extension for new event types.
    """
    
    def __init__(self) -> None:
        # Stores events per user: user_id -> list of event dictionaries
        self._user_events: Dict[str, List[Dict[str, Any]]] = {}
        # Lock for thread-safe access
        self._lock = threading.Lock()
    
    def ingest_event(self, event: dict) -> None:
        """
        Validate and ingest a single event.
        Strictly validates types and presence of required keys.
        """
        # Validate event is a dictionary
        if not isinstance(event, dict):
            raise ValueError("Event must be a dictionary")
        
        # Validate required keys and their types
        required_keys = {
            'user_id': str,
            'event_type': str,
            'timestamp': (str, int, float)  # Accept string or numeric timestamp
        }
        
        for key, expected_type in required_keys.items():
            if key not in event:
                raise ValueError(f"Missing required key: '{key}'")
            if not isinstance(event[key], expected_type):
                # Note: expected_type may be a tuple, so avoid .__name__ here
                raise ValueError(f"Key '{key}' must be of type {expected_type}")
        
        # Thread-safe event storage
        with self._lock:
            user_id = event['user_id']
            if user_id not in self._user_events:
                self._user_events[user_id] = []
            self._user_events[user_id].append(event)
    
    def get_user_summary(self, user_id: str) -> dict:
        """
        Generate summary for a specific user.
        Returns aggregated event counts and timestamps.
        """
        if not isinstance(user_id, str):
            raise ValueError("user_id must be a string")
        
        with self._lock:
            user_events = self._user_events.get(user_id, [])
            
            # Aggregate event counts
            event_counts = {}
            timestamps = []
            
            for event in user_events:
                event_type = event['event_type']
                event_counts[event_type] = event_counts.get(event_type, 0) + 1
                timestamps.append(event['timestamp'])
            
            return {
                'user_id': user_id,
                'total_events': len(user_events),
                'event_counts': event_counts,
                'timestamps': sorted(timestamps) if timestamps else []
            }

**Key Design Choices:**

1. **Thread Safety**: Uses `threading.Lock` to protect shared state during concurrent access
2. **Input Validation**: Strict type checking for required keys with clear error messages
3. **Extensibility**: 
   - New event types automatically handled by dynamic counting
   - Easy to add new aggregations in `get_user_summary`
4. **Testability**: Clear separation of concerns, easy to mock for unit tests
5. **Memory Efficiency**: Stores only essential data (event dictionaries)

**Usage Example:**
processor = EventProcessor()

# Ingest events
processor.ingest_event({
    'user_id': 'user123',
    'event_type': 'login',
    'timestamp': '2023-01-01T10:00:00Z'
})

# Get user summary
summary = processor.get_user_summary('user123')
print(summary)
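The thread-safety claim is easy to verify with a quick stress test. The condensed processor below is a stand-in for the class above (validation omitted for brevity); hammering it from several threads should lose no events, thanks to the lock:

```python
import threading

class EventProcessor:
    """Condensed stand-in for the generated class (validation omitted)."""
    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def ingest_event(self, event):
        # The lock makes the read-modify-write on the shared dict atomic
        with self._lock:
            self._events.setdefault(event["user_id"], []).append(event)

    def get_user_summary(self, user_id):
        with self._lock:
            events = self._events.get(user_id, [])
            counts = {}
            for e in events:
                counts[e["event_type"]] = counts.get(e["event_type"], 0) + 1
            return {"user_id": user_id, "total_events": len(events),
                    "event_counts": counts}

def worker(proc, n):
    for i in range(n):
        proc.ingest_event({"user_id": "u1", "event_type": "click", "timestamp": i})

# 8 threads x 1000 events each: with the lock in place, all 8000 survive
proc = EventProcessor()
threads = [threading.Thread(target=worker, args=(proc, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(proc.get_user_summary("u1")["total_events"])  # 8000
```

Dropping the `with self._lock:` lines and rerunning is a quick way to see why the generated code bothered with locking at all.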

Now let's see MiniMax M2.1's tool-use logic in action. We ask the model to compare two companies based on P/E ratio and sentiment, using two mock tools so we can clearly see how the workflow unfolds.

This example shows how M2.1 interacts with external tools in a controlled, agent-style setup. One tool simulates fetching stock metrics, while the other provides a sentiment score, with both responses generated locally. As the model receives these tool results, it incorporates them into its reasoning and adjusts its final comparison accordingly.

Defining tools

import anthropic
import json

client = anthropic.Anthropic()

def get_stock_metrics(ticker):
    data = {
        "NVDA": {"price": 130, "pe": 75.2},
        "AMD": {"price": 150, "pe": 40.5}
    }
    return json.dumps(data.get(ticker, "Ticker not found"))

def get_sentiment_analysis(company_name):
    sentiments = {"NVIDIA": 0.85, "AMD": 0.42}
    return f"Sentiment score for {company_name}: {sentiments.get(company_name, 0.0)}"

tools = [
    {
        "name": "get_stock_metrics",
        "description": "Get price and P/E ratio.",
        "input_schema": {
            "type": "object",
            "properties": {"ticker": {"type": "string"}},
            "required": ["ticker"]
        }
    },
    {
        "name": "get_sentiment_analysis",
        "description": "Get news sentiment score.",
        "input_schema": {
            "type": "object",
            "properties": {"company_name": {"type": "string"}},
            "required": ["company_name"]
        }
    }
]
messages = [{"role": "user", "content": "Compare NVDA and AMD value based on P/E and sentiment."}]
running = True

print(f"👤 [USER]: {messages[0]['content']}")

while running:
    # Get model response
    response = client.messages.create(
        model="MiniMax-M2.1",
        max_tokens=4096,
        messages=messages,
        tools=tools,
    )

    messages.append({"role": "assistant", "content": response.content})

    tool_results = []
    has_tool_use = False

    for block in response.content:
        if block.type == "thinking":
            print(f"\n💭 [THINKING]:\n{block.thinking}")

        elif block.type == "text":
            print(f"\n💬 [MODEL]: {block.text}")
            if not any(b.type == "tool_use" for b in response.content):
                running = False
        
        elif block.type == "tool_use":
            has_tool_use = True
            print(f"🔧 [TOOL CALL]: {block.name}({block.input})")
            
            # Execute the correct mock function
            if block.name == "get_stock_metrics":
                result = get_stock_metrics(block.input['ticker'])
            elif block.name == "get_sentiment_analysis":
                result = get_sentiment_analysis(block.input['company_name'])
            
            # Add to the results list for this turn
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": result
            })

    if has_tool_use:
        messages.append({"role": "user", "content": tool_results})
    else:
        running = False

print("\n✅ Conversation Complete.")

During execution, the model decides when and which tool to call, receives the corresponding tool results, and then updates its conclusions and final response based on that data. This demonstrates M2.1's ability to interleave reasoning, tool use, and response generation, adapting its output as new information becomes available.
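The if/elif dispatch inside the loop grows awkward as more tools are added; a common refactor is a name-to-handler table. The sketch below reuses compact versions of the two mock tools, keyed by the same names declared in the tool schema:

```python
import json

# Compact versions of the mock tools from the example above
def get_stock_metrics(ticker):
    data = {"NVDA": {"price": 130, "pe": 75.2}, "AMD": {"price": 150, "pe": 40.5}}
    return json.dumps(data.get(ticker, "Ticker not found"))

def get_sentiment_analysis(company_name):
    sentiments = {"NVIDIA": 0.85, "AMD": 0.42}
    return f"Sentiment score for {company_name}: {sentiments.get(company_name, 0.0)}"

# Map tool names (as declared in the schema) to their handlers
TOOL_HANDLERS = {
    "get_stock_metrics": get_stock_metrics,
    "get_sentiment_analysis": get_sentiment_analysis,
}

def execute_tool(name, tool_input):
    """Look up and run a tool handler; unknown names yield an error string."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"Unknown tool: {name}"
    # tool_input keys match the schema's properties, so **-unpacking works
    return handler(**tool_input)

print(execute_tool("get_stock_metrics", {"ticker": "AMD"}))
```

Inside the loop, the entire if/elif chain then collapses to `result = execute_tool(block.name, block.input)`, and adding a tool means one new entry in the table.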

Finally, we compare MiniMax M2.1 with GPT-5.2 on a multilingual instruction-following task. The task requires the model to identify coffee-related terms in a Spanish text, translate only those words into English, remove duplicates, and return the result as a strictly formatted numbered list.

To use this code block, you will need an OpenAI API key, which can be generated from the OpenAI developer dashboard.

import os
from getpass import getpass
os.environ['OPENAI_API_KEY'] = getpass('Enter OpenAI API Key: ')
input_text = """
¡Preparar café Cold Brew es un proceso sencillo y refrescante!
Todo lo que necesitas son granos de café molido grueso y agua fría.
Comienza añadiendo el café molido a un recipiente o jarra grande.
Luego, vierte agua fría, asegurándote de que todos los granos de café
estén completamente sumergidos.
Remueve la mezcla suavemente para garantizar una saturación uniforme.
Cubre el recipiente y déjalo en remojo en el refrigerador durante al
menos 12 a 24 horas, dependiendo de la fuerza deseada.
"""

prompt = f"""
The following text is written in Spanish.

Task:
1. Identify all words in the text that are related to coffee or coffee preparation.
2. Translate ONLY those words into English.
3. Remove duplicates (each word should appear only once).
4. Present the result as a numbered list.

Rules:
- Do NOT include explanations.
- Do NOT include non-coffee-related words.
- Do NOT include Spanish words in the final output.

Text:
<{input_text}>
"""

from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-5.2",
    input=prompt
)

print(response.output_text)

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="MiniMax-M2.1",
    max_tokens=10000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt
                }
            ]
        }
    ]
)

for block in message.content:
    if block.type == "thinking":
        print(f"Thinking:\n{block.thinking}\n")
    elif block.type == "text":
        print(f"Text:\n{block.text}\n")

Comparing outputs, MiniMax M2.1 produces a noticeably broader and more granular set of coffee-related terms than GPT-5.2. M2.1 surfaces not only the obvious nouns such as coffee, beans, and water, but also preparation actions (pour, stir, cover), process-related conditions (soak, submerge), and contextual attributes (cold, strength, hours).

This suggests a deeper semantic pass over the text, with the model reasoning through the entire preparation process instead of extracting only the most obvious keywords.

The difference also shows in the thought process. M2.1 explicitly analyzes the context, resolves edge cases (such as the English loanword Cold Brew), checks for duplicates, and deliberates over whether certain adjectives or verbs qualify as coffee-related before finalizing the list. GPT-5.2, in contrast, delivers concise, highly sequential output focused on high-confidence terms, with shallower reasoning.

Together, this highlights M2.1's strong instruction adherence and semantic understanding, especially in tasks that require careful filtering, interpretation, and strict output control.
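Claims like "no duplicates" and "a strictly formatted numbered list" can be checked mechanically rather than by eye. The validator below is a sketch under one assumption: output lines have the shape "1. word", which is how both models formatted their answers here, though exact formatting can vary run to run:

```python
import re

def parse_numbered_list(output: str) -> list[str]:
    """Extract items from lines like '1. coffee'; raise on malformed lines."""
    items = []
    for line in output.strip().splitlines():
        match = re.fullmatch(r"(\d+)\.\s+(.+)", line.strip())
        if match is None:
            raise ValueError(f"Not a numbered-list line: {line!r}")
        items.append(match.group(2).strip().lower())
    return items

def has_no_duplicates(items: list[str]) -> bool:
    return len(items) == len(set(items))

sample = "1. coffee\n2. beans\n3. water"
terms = parse_numbered_list(sample)
print(terms)                     # ['coffee', 'beans', 'water']
print(has_no_duplicates(terms))  # True
```

Running either model's raw output through this check turns the "strict formatting" and "remove duplicates" rules from prompt wishes into testable assertions.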


Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the power of Artificial Intelligence for the benefit of society. His latest endeavor is the launch of the Artificial Intelligence media platform Marktechpost, which stands out for its extensive coverage of machine learning and deep learning stories that are technically sound yet easily understood by a wide audience. The platform has more than 2 million monthly views, reflecting its popularity among readers.
