Sakana Fugu: The Multi-Agent AI Redefining Innovation

Le brief IA que les pros lisent chaque soir
Les 7 actus IA du jour, décryptées en 5 min. Gratuit.
Inclus dès l'inscription : notre sélection des meilleurs guides & comparatifs IA.
Choisis ton rythme
Gratuit · Pas de spam · Désabonnement en 1 clic
Sakana Fugu: The Multi-Agent AI Redefining Innovation
What is Sakana Fugu?
Sakana Fugu is a managed model API compatible with OpenAI that resembles a unique LLM but operates as an internal multi-agent system. Developers send a prompt to a model identifier, such as fugu or fugu-ultra, while Fugu handles agent selection, role assignment, coordination, verification, and the final response.
Instead of manually building planners, coders, reviewers, researchers, or supervisors with frameworks like LangGraph, AutoGen, or CrewAI, teams benefit from orchestration integrated into the model itself. This reduces the need to manage prompts, routing, retries, memory, state, monitoring, and failure recovery.
Why is the Name Important?
The name "Sakana" means fish in Japanese. The company often frames its research around collective intelligence, similar to how a school of fish can behave like a coordinated system. Fugu follows this idea. Many agents coordinate behind a single interface.
Why is the Multi-Agent System as a Model Important?
Most AI systems in production today fall into one of three patterns:
- Single Model Prompts
- LLM Applications Enhanced by Tools
- Manually Designed Multi-Agent Workflows
Single model prompts are straightforward, but they can fail on complex tasks requiring planning, execution, verification, and iteration.
LLMs enhanced by tools improve utility by connecting models to research, databases, code executions, APIs, or enterprise systems. However, the model typically acts as the central reasoning engine.
Multi-agent workflows go further. They divide the work among specialized agents. For example:
- A planner breaks down the task.
- A researcher gathers context.
- A coder writes the code.
- A reviewer checks for correctness.
- A verifier tests the response.
- A supervisor coordinates the process.
This can improve reliability on challenging tasks, but building it well is complicated. Teams must address many system design questions:
- Which agent should handle which task?
- How should agents communicate?
- When should the system stop?
- How should intermediate outputs be verified?
- How should costs and latency be controlled?
- How should failures be recovered?
- How should compliance restrictions be enforced?
Fugu attempts to simplify this by transforming multi-agent orchestration into a model-level capability. The developer does not need to manually design each interaction between agents.
Fugu vs Fugu Ultra
Sakana Fugu comes in two main model options: Fugu and Fugu Ultra.
Fugu is the default model for everyday work. It balances performance and latency. It is suitable for coding support, code review, chatbots, internal assistants, document analysis, and interactive workflows where response time is critical.
A key point is that Fugu can route to the best model based on the task. It also allows users to exclude certain agents from the model pool, which can help with data, privacy, compliance, or organizational requirements.
Fugu Ultra is optimized for maximum response quality. It coordinates a deeper pool of expert agents and is intended for difficult, high-stakes, multi-step problems. According to Sakana, Fugu Ultra can route between one and three agents depending on the problem.
Fugu Ultra is better suited for workloads where accuracy, depth, and persistence matter more than latency. Examples include:
- Document reproduction
- Kaggle-style data science workflows
- Cybersecurity analysis
- Literature review
- Patent investigation
- In-depth technical research
- Complex code review
- Scientific reasoning
Benchmark Results
Sakana reports benchmark scores for Fugu and Fugu Ultra in coding, reasoning, science, agent tasks, long-term reasoning, and cybersecurity-style evaluation.
Benchmarks are useful, but they should not be considered direct production guarantees. The benchmark profile of Fugu suggests three practical insights:
-
Fugu is strongest when tasks require orchestration
The strongest use case is not a simple single response. The model is designed for tasks that benefit from decomposition, expert selection, verification, and synthesis. -
Ultra is not always automatically better
Fugu Ultra is optimized for response quality, but Fugu can outperform it on certain benchmarks. Developers should evaluate both models on their own workload before standardizing. -
A Practical Routing Strategy
Fugu allows for dynamic agent selection, which is crucial for maximizing efficiency and relevance of responses.
Brief IA — L'actualité IA en français
L'essentiel de l'actualité de l'intelligence artificielle, décrypté et expliqué chaque jour.