Sakana Fugu: The Multi-Agent AI Redefining Innovation

⚡

Key Takeaways

1Sakana AI innovates with Fugu, a multi-agent AI model that stands out from traditional models.

2Unlike classical approaches, Fugu uses multiple expert agents to coordinate its responses.

3A simple API call to Fugu can trigger direct responses, delegate to specialists, or perform intermediate checks.

💡Why it matters — This approach could transform the way AI systems handle complex tasks, optimizing efficiency and accuracy.

Sakana Fugu: The Multi-Agent AI Redefining Innovation

What is Sakana Fugu?

Sakana Fugu is a managed model API compatible with OpenAI that resembles a unique LLM but operates as an internal multi-agent system. Developers send a prompt to a model identifier, such as fugu or fugu-ultra, while Fugu handles agent selection, role assignment, coordination, verification, and the final response.

Instead of manually building planners, coders, reviewers, researchers, or supervisors with frameworks like LangGraph, AutoGen, or CrewAI, teams benefit from orchestration integrated into the model itself. This reduces the need to manage prompts, routing, retries, memory, state, monitoring, and failure recovery.

Why is the Name Important?

The name "Sakana" means fish in Japanese. The company often frames its research around collective intelligence, similar to how a school of fish can behave like a coordinated system. Fugu follows this idea. Many agents coordinate behind a single interface.

Why is the Multi-Agent System as a Model Important?

Most AI systems in production today fall into one of three patterns:

Single Model Prompts
LLM Applications Enhanced by Tools
Manually Designed Multi-Agent Workflows

Single model prompts are straightforward, but they can fail on complex tasks requiring planning, execution, verification, and iteration.

LLMs enhanced by tools improve utility by connecting models to research, databases, code executions, APIs, or enterprise systems. However, the model typically acts as the central reasoning engine.

Multi-agent workflows go further. They divide the work among specialized agents. For example:

A planner breaks down the task.
A researcher gathers context.
A coder writes the code.
A reviewer checks for correctness.
A verifier tests the response.
A supervisor coordinates the process.

This can improve reliability on challenging tasks, but building it well is complicated. Teams must address many system design questions:

Which agent should handle which task?
How should agents communicate?
When should the system stop?
How should intermediate outputs be verified?
How should costs and latency be controlled?
How should failures be recovered?
How should compliance restrictions be enforced?

Fugu attempts to simplify this by transforming multi-agent orchestration into a model-level capability. The developer does not need to manually design each interaction between agents.

Fugu vs Fugu Ultra

Sakana Fugu comes in two main model options: Fugu and Fugu Ultra.

Fugu is the default model for everyday work. It balances performance and latency. It is suitable for coding support, code review, chatbots, internal assistants, document analysis, and interactive workflows where response time is critical.

A key point is that Fugu can route to the best model based on the task. It also allows users to exclude certain agents from the model pool, which can help with data, privacy, compliance, or organizational requirements.

Fugu Ultra is optimized for maximum response quality. It coordinates a deeper pool of expert agents and is intended for difficult, high-stakes, multi-step problems. According to Sakana, Fugu Ultra can route between one and three agents depending on the problem.

Fugu Ultra is better suited for workloads where accuracy, depth, and persistence matter more than latency. Examples include:

Document reproduction
Kaggle-style data science workflows
Cybersecurity analysis
Literature review
Patent investigation
In-depth technical research
Complex code review
Scientific reasoning

Benchmark Results

Sakana reports benchmark scores for Fugu and Fugu Ultra in coding, reasoning, science, agent tasks, long-term reasoning, and cybersecurity-style evaluation.

Benchmarks are useful, but they should not be considered direct production guarantees. The benchmark profile of Fugu suggests three practical insights:

Fugu is strongest when tasks require orchestration
The strongest use case is not a simple single response. The model is designed for tasks that benefit from decomposition, expert selection, verification, and synthesis.
Ultra is not always automatically better
Fugu Ultra is optimized for response quality, but Fugu can outperform it on certain benchmarks. Developers should evaluate both models on their own workload before standardizing.
A Practical Routing Strategy
Fugu allows for dynamic agent selection, which is crucial for maximizing efficiency and relevance of responses.

Sakana Fugu: The Multi-Agent AI Redefining Innovation

Le brief IA que les pros lisent chaque soir

Sakana Fugu: The Multi-Agent AI Redefining Innovation

What is Sakana Fugu?

Why is the Name Important?

Why is the Multi-Agent System as a Model Important?

Fugu vs Fugu Ultra

Benchmark Results

Brief IA — L'actualité IA en français