
Building a multi-agent voice assistant with Amazon Nova Sonic and Amazon Bedrock AgentCore

Amazon Nova Sonic is a speech-to-speech foundation model that enables natural, real-time voice conversations with AI. It understands spoken input, including tone and speaking style, and responds with expressive, human-like speech, so users can interact naturally by voice.

Multi-agent architecture provides a modular, robust, and flexible design pattern for production voice assistants. This post looks at Amazon Nova Sonic and shows how to combine it with sub-agents built on the Strands Agents framework and hosted on Amazon Bedrock AgentCore to create an effective multi-agent system.

Why multi-agent architecture?

Consider developing a financial assistant application that greets users, collects information, verifies identity, answers account inquiries, handles loan questions, and escalates to human advisors based on predefined scenarios. As business requirements grow, the voice agent must support more and more query types. The system prompt grows larger and the underlying logic becomes more complex, reflecting a persistent challenge in software development: monolithic designs lead to systems that are difficult to maintain and extend.

Think of multi-agent design as building a team of specialized AI assistants rather than relying on a single generalist that does everything. Like a company that divides responsibilities across departments, this approach breaks complex tasks into smaller, manageable pieces. Each AI agent becomes an expert in a specific area, whether that's authentication, data processing, or handling specialized requests. To the user, the experience feels seamless: no awkward pauses, no change in voice, and no visible hand-offs. The system works behind the scenes, bringing in the right expert agent at the right time.

Beyond modularity and robustness, multi-agent systems offer advantages similar to microservices, giving organizations the flexibility to choose the agentic framework and large language model (LLM) best suited to each sub-agent.

Sample application

In this post, we refer to the Amazon Nova Sonic workshop multi-agent lab code, which uses a banking voice assistant as a sample to show how to delegate to specialized agents deployed on Amazon Bedrock AgentCore. It uses Nova Sonic as the core voice interface and orchestrator, routing domain-specific queries to backend sub-agents hosted on AgentCore. You can find the sample source code in the GitHub repo.

In the sample banking voice agent, the conversation flow starts with greeting the user and collecting their name, then prompts for an inquiry related to banking or mortgage. We use three sub-agents implemented on AgentCore to handle specialized logic:

  • Authentication sub-agent: Handles user authentication using the account ID and other information
  • Banking sub-agent: Handles account balance checks, statements, and other banking-related questions
  • Mortgage sub-agent: Handles mortgage-related inquiries, including balances, rates, and payment options

Sub-agents are self-contained and handle their own logic, such as input validation. For example, the authentication sub-agent verifies account credentials and returns errors to Nova Sonic if needed. This keeps the reasoning Nova Sonic must do simple while keeping business logic encapsulated in familiar software patterns.
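The workshop implements its own authentication logic; purely as an illustration, the following minimal sketch shows how such a validation tool could look with the Strands @tool decorator. The tool name, credential fields, and in-memory account store here are hypothetical stand-ins for a real identity backend.

from strands import tool

# Hypothetical in-memory store standing in for a real identity backend
_KNOWN_ACCOUNTS = {"12345": {"name": "Jane Doe", "last4_ssn": "6789"}}

@tool
def authenticate_user(account_id: str, last4_ssn: str) -> dict:
    """Validate the caller's account Id and the last four digits of their SSN.

    Args:
        account_id: Bank account Id provided by the caller
        last4_ssn: Last four digits of the caller's SSN
    """
    # Input validation lives in the sub-agent, not in Nova Sonic
    if not account_id.isdigit():
        return {"authenticated": False, "error": "Account Id must be numeric."}

    record = _KNOWN_ACCOUNTS.get(account_id)
    if record is None or record["last4_ssn"] != last4_ssn:
        # The error text is returned so Nova Sonic can phrase it for the caller
        return {"authenticated": False, "error": "Account not found or credentials do not match."}

    return {"authenticated": True, "customerName": record["name"]}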

Integrating Nova Sonic with AgentCore using tool use events

Amazon Nova Sonic relies on tool use to integrate with agentic workflows. During the Nova Sonic session lifecycle, you provide tool configurations in the promptStart event; each tool is designed to be triggered when Nova Sonic receives certain types of input.

For example, in the following Nova Sonic tool configuration sample, a tool is configured so that banking-related queries trigger a tool use event, separating banking inquiries from the rest of the conversation flow.

[
    {
        "toolSpec": {
            "name": "bankAgent",
            "description": `Use this tool whenever the customer asks about their **bank account balance** or **bank statement**.  
                    It should be triggered for queries such as:  
                    - "What’s my balance?"  
                    - "How much money do I have in my account?"  
                    - "Can I see my latest bank statement?"  
                    - "Show me my account summary."`,
            "inputSchema": {
                "json": JSON.stringify({
                "type": "object",
                "properties": {
                    "accountId": {
                        "type": "string",
                        "description": "This is a user input. It is the bank account Id which is a numeric number."
                    },
                    "query": {
                        "type": "string",
                        "description": "The inquiry to the bank agent such as check account balance, get statement etc."
                    }
                },
                "required": [
                    "accountId", "query"
                ]
                })
            }
        }
    }
]

When a user asks Nova Sonic a question like "What is my account balance?", Nova Sonic sends a toolUse event to the client application with the toolName (for example, bankAgent) defined in the configuration. The application can then invoke a downstream agent hosted on AgentCore to handle the banking logic and return a response to Nova Sonic, which in turn generates the audio response for the user. A sketch of such a handler follows the event example below.

{
  "event": {
    "toolUse": {
      "completionId": "UUID",
      "content": "{"accountId":"one two three four five","query":"check account balance"}",
      "contentId": "UUID",
      "promptName": "UUID",
      "role": "TOOL",
      "sessionId": "UUID",
      "toolName": "bankAgent",
      "toolUseId": "UUID"
    }
  }
}
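
As a minimal sketch (not the workshop's client code), the handler below shows one way the application could parse the toolUse event and forward it to the deployed sub-agent. The agent runtime ARN is a placeholder, and the boto3 bedrock-agentcore client, its invoke_agent_runtime operation, and the streaming response handling are assumptions to verify against the AgentCore documentation for your SDK version.

import json
import uuid
import boto3

# Placeholder ARN of the deployed banking sub-agent; replace with your own
BANK_AGENT_RUNTIME_ARN = "arn:aws:bedrock-agentcore:us-east-1:111122223333:runtime/bankAgent-EXAMPLE"

# AgentCore data plane client (service name assumed to be "bedrock-agentcore")
agentcore_client = boto3.client("bedrock-agentcore")

def handle_tool_use(event: dict) -> str:
    """Route a Nova Sonic toolUse event to the matching AgentCore sub-agent."""
    tool_use = event["event"]["toolUse"]
    tool_name = tool_use["toolName"]
    tool_input = json.loads(tool_use["content"])  # e.g. {"accountId": "...", "query": "..."}

    if tool_name == "bankAgent":
        response = agentcore_client.invoke_agent_runtime(
            agentRuntimeArn=BANK_AGENT_RUNTIME_ARN,
            runtimeSessionId=str(uuid.uuid4()),
            payload=json.dumps(tool_input),
        )
        # Assumes the runtime returns a streaming body; read it into text for the tool result
        return response["response"].read().decode("utf-8")

    return json.dumps({"error": f"Unknown tool: {tool_name}"})

The returned text is then sent back to Nova Sonic as a tool result in the event stream, so the model can speak the answer to the user.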

A sub-agent in AgentCore

The following example shows the banking sub-agent, developed using the Strands Agents framework and prepared for deployment on Bedrock AgentCore. It uses Amazon Nova Lite on Amazon Bedrock as its inference model, which provides strong capability with low latency. The agent implementation includes a system prompt that defines its role as a banking assistant and equips it with two specialized tools: one that answers account balance questions and another that returns bank statements.

from strands import Agent, tool
import json
from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands.models import BedrockModel
import re, argparse

app = BedrockAgentCoreApp()

@tool
def get_account_balance(account_id: str) -> dict:
    """Get account balance for given account Id

    Args:
        account_id: Bank account Id
    """

    # The actual implementation would retrieve information from a database, an API, or another backend service.
    # Mock result for illustration so the sample runs end to end.
    result = {"accountId": account_id, "balance": "2,450.00 USD", "asOf": "2025-08-31"}

    return {"result": result}

@tool
def get_statement(account_id: str, year_and_month: str) -> dict:
    """Get account statement for a given year and month
    Args:
        account_id: Bank account Id
        year_and_month: Year and month of the bank statement. For example: 2025_08 or August 2025
    """
    # The actual implementation would retrieve information from a database, an API, or another backend service.
    # Mock result for illustration so the sample runs end to end.
    result = {
        "accountId": account_id,
        "period": year_and_month,
        "openingBalance": "2,100.00 USD",
        "closingBalance": "2,450.00 USD",
        "transactionCount": 14,
    }

    return {"result": result}


# Specify Bedrock LLM for the Agent
bedrock_model = BedrockModel(
    model_id="amazon.nova-lite-v1:0",
)
# System prompt
system_prompt = '''
You are a banking agent. You will receive requests that include:  
- `account_id`  
- `query` (the inquiry type, such as **balance** or **statement**, plus any additional details like month).  

## Instructions
1. Use the provided `account_id` and `query` to call the tools.  
2. The tool will return a JSON response.  
3. Summarize the result in 2–3 sentences.  
   - For a **balance inquiry**, give the account balance with currency and date.  
   - For a **statement inquiry**, provide opening balance, closing balance, and number of transactions.  
4. Do not return raw JSON. Always respond in natural language.  
'''

# Create an agent with tools, LLM, and system prompt
agent = Agent(
    tools=[ get_account_balance, get_statement], 
    model=bedrock_model,
    system_prompt=system_prompt
)

@app.entrypoint
def banking_agent(payload):
    response = agent(json.dumps(payload))
    return response.message['content'][0]['text']
    
if __name__ == "__main__":
    app.run()
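
Before deploying, you can exercise the entrypoint locally. The sketch below assumes the BedrockAgentCoreApp development server started by the script above listens on port 8080 and accepts POST requests on /invocations, following the AgentCore runtime contract; treat the port and path as assumptions and confirm them in the AgentCore documentation for your SDK version.

import json
import requests  # third-party; pip install requests

# Hypothetical local test payload mirroring what the toolUse handler would send
payload = {"accountId": "12345", "query": "check account balance"}

resp = requests.post(
    "http://localhost:8080/invocations",
    headers={"Content-Type": "application/json"},
    data=json.dumps(payload),
    timeout=60,
)
print(resp.status_code)
print(resp.text)  # Natural-language summary produced by the banking sub-agent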

Best practices for voice-based multi-agent systems

Multi-agent architecture offers flexibility and a modular design approach that lets developers organize voice assistants efficiently and reuse specialized agent functionality. When building a voice-first experience, there are some important practices to consider that address the unique challenges of this modality.

  • Balance flexibility and latency: Invoking sub-agents through Nova Sonic tool use events is powerful, but it can add latency to voice responses. For use cases that require a real-time conversational experience, each sub-agent call is a potential delay point in the interaction flow, so it is important to design with responsiveness in mind.
  • Choose sub-agent models deliberately: Starting with small, efficient models like Nova Lite for sub-agents can significantly reduce latency while still handling specialized tasks well. Reserve larger, more capable models for complex reasoning or where deeper contextual understanding is important.
  • Design responses for voice: Voice assistants do best with short, focused answers that can be followed up with more detail when needed. This approach not only improves perceived latency but also creates a natural conversational flow that aligns with people's expectations for spoken communication.

Consider stateless vs. stateful design

Stateless sub-agents handle each request independently, without keeping memory of previous interactions or session state. They are simple to implement, easy to scale, and work well for discrete, straightforward tasks. However, they cannot provide context-aware answers unless external state management is introduced.

Stateful sub-agents, on the other hand, maintain working memory to support context-aware responses and persisted state. This enables a more personalized and cohesive experience, but it comes with added complexity and resource requirements. They are best suited for scenarios involving multiple interactions with the user or long-running state.
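
As an illustration only (not part of the workshop code), one simple way to make a Strands-based sub-agent stateful is to keep a separate Agent instance per session, keyed by a session ID carried in the payload. The in-memory cache and the session_id key below are assumptions; a production system would use AgentCore memory or another external store instead.

from strands import Agent
from strands.models import BedrockModel

# Hypothetical in-memory cache of per-session agents; process memory is for
# illustration only and does not survive restarts or scale across instances
_session_agents: dict[str, Agent] = {}

def get_session_agent(session_id: str) -> Agent:
    """Return the Agent bound to this session, creating it on first use.

    Because a Strands Agent keeps its own conversation history, reusing the same
    instance across turns gives the sub-agent working memory for the session.
    """
    if session_id not in _session_agents:
        _session_agents[session_id] = Agent(
            model=BedrockModel(model_id="amazon.nova-lite-v1:0"),
            system_prompt="You are a banking agent. Use earlier turns in this session for context.",
        )
    return _session_agents[session_id]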

Conclusion

Multi-agent architectures bring flexibility, robustness, and specialization to complex task-driven workflows. By combining the conversational capabilities of Nova Sonic with the orchestration power of Bedrock AgentCore, you can create intelligent, specialized agents that work together. If you're exploring ways to improve your AI systems, the multi-agent pattern with Nova Sonic and AgentCore is a powerful approach to try.

Learn more about Amazon Nova Sonic by visiting the user guide, building your own app with the sample applications, and checking out the Nova Sonic workshop to get started. You can also refer to the technical report and model card for additional benchmarks.


About the author

Lana Zhang is a Senior Solutions Architect at AWS in the worldwide specialist organization. She specializes in AI/ML, focusing on use cases such as AI voice assistants and multimodal understanding. She works extensively with customers across industries, including media and entertainment, gaming, and sports, helping them transform their business solutions with AI.
