The NVIDIA Nemotron 3 Nano 30B MoE model is now available in Amazon SageMaker JumpStart

nimda February 11, 2026

Today we are pleased to announce that the NVIDIA Nemotron 3 Nano 30B model, with 3B active parameters, is now available in the Amazon SageMaker JumpStart model catalog. You can accelerate innovation and deliver tangible business value with Nemotron 3 Nano on Amazon Web Services (AWS) without managing the complexity of model deployment. You can power your generative AI applications with Nemotron capabilities using the managed deployment capabilities provided by SageMaker JumpStart.

Nemotron 3 Nano is a small language mixture of experts (MoE) model that combines high compute efficiency with high accuracy, built to run specialized agentic workloads at scale. The model is fully open, with open weights, datasets, and recipes, so developers can customize, optimize, and deploy it on their own infrastructure to help meet their privacy and security requirements. Nemotron 3 Nano excels at coding and reasoning, and leads on benchmarks such as SWE Bench Verified, GPQA Diamond, AIME 2025, Arena Hard v2, and IFBench.

About the Nemotron 3 Nano 30B

Nemotron 3 Nano is differentiated from other models by its architecture and accuracy, with strong performance across a range of highly technical skills:

Architecture:
- MoE with hybrid Transformer-Mamba architecture
- Supports a token budget, so it can deliver the best accuracy while generating a minimal number of reasoning tokens
Accuracy:
- Leading accuracy in coding, scientific reasoning, math, and instruction following
- Leads on benchmarks such as LiveCodeBench, GPQA Diamond, AIME 2025, BFCL, and IFBench (compared to other open language models under 30B)
Usability:
- 30B parameter model with 3 billion active parameters
- Context window of up to 1 million tokens
- A text-based foundation model, using text for both input and output
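
To make the "30B parameters, 3 billion active" figure concrete, the following is a toy sketch of top-k expert routing, the general mechanism MoE layers use to run only a few experts per token. The expert count, the value of k, the router scores, and the function name are illustrative assumptions. They are not Nemotron 3 Nano's actual configuration, which this post does not describe.

```python
import math


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    # Softmax over the chosen logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


# Hypothetical router scores for one token over 8 experts (made-up numbers).
logits = [0.1, 2.0, -1.0, 1.5, 0.3, -0.2, 0.9, 0.0]
routes = top_k_route(logits, k=2)

# Only the chosen experts run for this token; the others are skipped.
active_experts = [index for index, _ in routes]

# Share of parameters used per token, from the figures quoted in this post.
active_share = 3e9 / 30e9
```

In this toy setup, two of eight experts run for the token. By the same idea, 3 billion active parameters out of 30 billion means roughly a tenth of the weights take part in processing each token, which is where the efficiency of an MoE model comes from.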

Prerequisites

To get started with Nemotron 3 Nano in Amazon SageMaker JumpStart, you must have a provisioned Amazon SageMaker Studio domain.
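
If you are not sure whether your account already has a Studio domain, a short Boto3 check can help. The following is a minimal sketch, assuming your AWS credentials and default Region are already configured; the helper name `find_in_service_domains` is hypothetical. It relies only on the `list_domains` call of the SageMaker client.

```python
def find_in_service_domains(sagemaker_client):
    """Return (DomainId, DomainName) for each Studio domain that is InService."""
    found = []
    kwargs = {}
    while True:
        page = sagemaker_client.list_domains(**kwargs)
        for domain in page.get("Domains", []):
            if domain.get("Status") == "InService":
                found.append((domain["DomainId"], domain["DomainName"]))
        # Follow pagination until no further token is returned.
        token = page.get("NextToken")
        if not token:
            return found
        kwargs["NextToken"] = token


if __name__ == "__main__":
    import boto3  # imported here so the helper itself has no hard dependency

    for domain_id, name in find_in_service_domains(boto3.client("sagemaker")):
        print(domain_id, name)
```

An empty result suggests that a domain still needs to be created in that Region before you continue.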

Get started with NVIDIA Nemotron 3 Nano 30B in SageMaker JumpStart

To test the Nemotron 3 Nano model in SageMaker JumpStart, open SageMaker Studio and choose Models in the navigation pane. Enter NVIDIA in the search bar and choose NVIDIA Nemotron 3 Nano 30B.

On the model details page, choose Deploy and follow the prompts to deploy the model.
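
Deployment takes some time, and the endpoint can only be invoked once its status is InService. The following is a minimal polling sketch built on the `describe_endpoint` call of the Boto3 SageMaker client. The function name, the polling interval, and the retry limit are assumptions chosen for illustration.

```python
import time


def wait_until_in_service(sagemaker_client, endpoint_name,
                          poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Poll the endpoint until it is InService; raise if it fails or times out."""
    for _ in range(max_polls):
        description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            reason = description.get("FailureReason", "no reason reported")
            raise RuntimeError(f"Endpoint '{endpoint_name}' failed: {reason}")
        sleep(poll_seconds)
    raise TimeoutError(f"Endpoint '{endpoint_name}' was not InService in time")


if __name__ == "__main__":
    import boto3  # imported here so the helper itself has no hard dependency

    # "my-nemotron-endpoint" is a placeholder; use the name shown in Studio.
    print(wait_until_in_service(boto3.client("sagemaker"), "my-nemotron-endpoint"))
```

Boto3 also provides a built-in `endpoint_in_service` waiter on the SageMaker client, which may be preferable when you do not need custom handling.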

After the model is deployed to a SageMaker AI endpoint, you can test it. You can access the model using the following AWS Command Line Interface (AWS CLI) example. Use nvidia/nemotron-3-nano as the model ID.

# "stream": false requests a single, non-streaming response.
# "enable_thinking": false selects non-reasoning mode.
cat > input.json << EOF
{
  "model": "${MODEL_ID}",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "What is NVIDIA? Answer in 2-3 sentences."
    }
  ],
  "max_tokens": 512,
  "temperature": 0.2,
  "stream": false,
  "chat_template_kwargs": {"enable_thinking": false}
}
EOF

# invoke-endpoint writes the response body to the file named by the final argument.
aws sagemaker-runtime invoke-endpoint \
  --endpoint-name "${ENDPOINT_NAME}" \
  --region "${AWS_REGION}" \
  --content-type 'application/json' \
  --body fileb://input.json \
  response.json

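The request body for the CLI call can also be generated from Python, which avoids hand-editing JSON. The following sketch mirrors the fields used in the CLI example above. The helper names `build_chat_payload` and `write_payload` are hypothetical, and the field set is taken from this post only, not from a full API reference.

```python
import json


def build_chat_payload(model_id, user_prompt,
                       system_prompt="You are a helpful assistant.",
                       max_tokens=512, temperature=0.2,
                       stream=False, enable_thinking=False):
    """Assemble the chat request body used in the CLI example."""
    return {
        "model": model_id,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": stream,  # False requests a single, non-streaming response
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }


def write_payload(payload, path="input.json"):
    """Write the payload as JSON; Python booleans become true/false."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)


payload = build_chat_payload("nvidia/nemotron-3-nano",
                             "What is NVIDIA? Answer in 2-3 sentences.")
serialized = json.dumps(payload)
```

Calling `write_payload(payload)` produces an `input.json` that can be passed to `--body fileb://input.json`. Setting `enable_thinking=True` would request reasoning mode under the same assumptions.
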
Alternatively, you can access the model using Boto3. The following Python example shows how to send a text prompt to NVIDIA Nemotron 3 Nano 30B through the SageMaker runtime client. For more code examples, see the NVIDIA GitHub repo.

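import json

import boto3


class NemotronClient:
    # Illustrative wrapper: the original snippet shows only the method body.
    def __init__(self, endpoint_name, region):
        self.endpoint_name = endpoint_name
        self.runtime_client = boto3.client('sagemaker-runtime', region_name=region)

    def parse_response(self, raw_response):
        # Placeholder for the custom parser, which is not shown here.
        return raw_response

    def invoke(self, prompt):
        # Chat-style request body with a single user message
        payload = {
            "messages": [
                {"role": "user", "content": prompt}
            ],
            "max_tokens": 1000
        }

        try:
            response = self.runtime_client.invoke_endpoint(
                EndpointName=self.endpoint_name,
                ContentType="application/json",
                Body=json.dumps(payload)
            )

            # The body is a stream of UTF-8 encoded JSON
            response_body = response['Body'].read().decode('utf-8')
            raw_response = json.loads(response_body)

            # Parse the response using our custom parser
            return self.parse_response(raw_response)

        except Exception as e:
            raise Exception(
                f"Failed to invoke endpoint '{self.endpoint_name}': {e}. "
                "Check that the endpoint is InService and that your "
                "least-privilege IAM permissions allow sagemaker:InvokeEndpoint."
            ) from e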
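
The custom parser referenced in the code above is not shown in this post. The following is one possible sketch, assuming the endpoint returns an OpenAI-style chat completion object with the reply at `choices[0].message.content`. That response shape is an assumption, so inspect an actual response from your endpoint before relying on it.

```python
def parse_response(raw_response):
    """Extract the assistant text from an assumed OpenAI-style response dict."""
    choices = raw_response.get("choices") or []
    if not choices:
        raise ValueError("Response contains no 'choices' entries")
    message = choices[0].get("message") or {}
    content = message.get("content")
    if content is None:
        raise ValueError("First choice has no message content")
    return content.strip()


# Hand-written sample in the assumed shape, not real model output.
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": " NVIDIA designs GPUs. "}}
    ]
}
text = parse_response(sample)
```

Raising on a missing field, instead of returning an empty string, makes a mismatch between the assumed and the actual response format visible early.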

Now available

NVIDIA Nemotron 3 Nano is now available, fully managed, in SageMaker JumpStart. For AWS Region availability, refer to the model package. To learn more, check out the Nemotron Nano model page, the NVIDIA GitHub sample notebook for Nemotron 3 Nano 30B, and the Amazon SageMaker JumpStart pricing page.

Try the Nemotron 3 Nano model in Amazon SageMaker JumpStart today and send feedback to AWS re:Post for SageMaker JumpStart or through your usual AWS Support contacts.

About the authors

Dan Ferguson is a Solutions Architect at AWS, based in New York, USA. As a machine learning services specialist, Dan works to support customers on their journey to integrate ML workflows effectively, efficiently, and sustainably.

Pooja Karadgi leads product relations and strategy for Amazon SageMaker JumpStart, the machine learning and generative AI hub within SageMaker. Pooja is dedicated to accelerating customer adoption of AI by simplifying foundation model discovery and deployment, enabling customers to build production-ready AI systems across the model lifecycle, from onboarding and customization to deployment.

Benjamin Crabtree is a Senior Software Engineer on the Amazon SageMaker AI team, focused on delivering “last mile” experiences to customers. He is interested in democratizing the latest breakthroughs in artificial intelligence by providing easy-to-use capabilities. Also, Ben has extensive experience in building machine learning infrastructure at scale.

Timothy Ma is a Principal Expert in generative AI at AWS, where he works with customers to design and deliver state-of-the-art machine learning solutions. He also leads go-to-market strategies for generative AI services, helping organizations harness the power of cutting-edge AI technologies.

Abdullahi Olaoye is a Senior AI Solutions Architect at NVIDIA, specializing in integrating NVIDIA AI libraries, frameworks, and products with cloud AI services and open source tools to improve AI model deployment, inference, and generative AI workflows. He collaborates with AWS to improve AI workloads and drive adoption of NVIDIA-powered AI and generative AI solutions.

Nirmal Kumar Juluru is a product marketing manager at NVIDIA who drives the adoption of AI software, models, and APIs in the NVIDIA NGC catalog and the NVIDIA AI Foundation models and endpoints. He previously worked as a software engineer. Nirmal holds an MBA from Carnegie Mellon University and a computer science degree from BITS Pilani.

Vivian Chen is a Deep Learning Solutions Architect at NVIDIA, helping teams bridge the gap between complex AI research and real-world performance. Specializing in intelligent and cloud-integrated AI solutions, Vivian focuses on turning the heavy lifting of machine learning into fast, scalable applications. Vivian is passionate about helping customers navigate NVIDIA's accelerated computing stack so that their models not only work in the lab, but thrive in production.