Gemma 4 and OpenCode: Create Your Local AI Agent in 5 Steps

⚡

Key Takeaways

1Gemma 4, a model from Google, allows you to create a local AI agent for coding.

2Ollama serves as a platform to run Gemma 4 on your personal machine.

3OpenCode provides an interface to interact with the AI model without going through the cloud.

💡Why it matters — This setup ensures data privacy and reduces costs associated with cloud services.

Large Language Model

The use of language models via the cloud has become a common practice due to its convenience and access to powerful models. However, for those looking to reduce costs, protect their data privacy, or gain a better understanding of the internal workings of agents, a local setup can be advantageous.

This article details how to create a local coding agent using three key components:

Ollama to host the model;
Gemma 4 as the local language model;
OpenCode for the agent interface.

By the end of this process, OpenCode will be connected to a local language model, providing a robust and cloud-independent solution.

1. Install Ollama

The first step is to install Ollama, which will serve as the server for the Gemma 4 model on your local machine.

Ollama is a runtime environment that allows you to download, run, and serve language models directly from your computer. Once configured, Ollama creates a local API endpoint, facilitating communication with other tools like OpenCode.

For Windows users, installation can be done via the official installer available here: Download Ollama.
Alternatively, installation can be performed via PowerShell with the following command:
```
winget install Ollama.Ollama
```

After installation, Ollama should appear in the Windows Start menu. You can launch it like any other application, and an icon in the taskbar will indicate that the local service is active.

To check the availability of the Ollama CLI, open a new PowerShell window and run:

ollama --version

On a Linux machine, Ollama can be installed with:

curl ‒fsSL https://ollama.com/install.sh | sh

Then verify the installation with:

ollama --version

Once Ollama is installed, it runs a local server on your machine. OpenCode will use this local server to interact with the model, eliminating the need to rely on a cloud model provider.

2. Download Gemma 4

The next step is to prepare the local language model, Gemma 4.

Gemma 4 is an innovative open-source model released by Google on April 2, 2026. It is designed for various tasks such as reasoning, coding, multimodal understanding, and agent workflows.

This model is available in several sizes, with device-optimized variants and larger ones for workstations. For this article, we will focus on the device-optimized variants, namely the E2B (gemma4:e2b) and E4B (gemma4:e4b) models.

In Ollama's nomenclature, the E stands for "efficient parameters." For this tutorial, we will use the E4B model as it offers enhanced capabilities. In PowerShell, run:

ollama pull gemma4:e4b

On Linux, use the same command:

ollama pull gemma4:e4b

You can verify the downloaded model with the following command:

gemma4:e4b    9.6 GB

For reference, the computer used for this tutorial is equipped with an Intel i7-13800H processor, 32 GB of RAM, and an NVIDIA RTX 2000 Ada GPU with approximately 8 GB of VRAM. If the E4B model seems too slow, you can opt for gemma4:e2b.

A few technical notes: the version of gemma4:e4b downloaded is a 4-bit quantized model, with GGUF as the local model format used by Ollama. On this machine, Ollama indicates that gemma4:e4b supports a context length of 128K.

Before moving to the next step, you can perform a quick test:

ollama run gemma4:e4b "What is the capital of France?"

If you get "Paris" in return, it means that Gemma 4 is now operational on your local machine via Ollama.

Note that the first call may be slow as Ollama needs to load the model. Once the model is loaded, subsequent requests should be faster.

3. Install OpenCode

We now need an agent interface, and OpenCode is the ideal tool for this.

If you have previously used tools like Claude Code or Codex, OpenCode belongs to the same category. It is an agent runtime environment capable of operating within a local repository, inspecting files, executing commands, and performing various tasks.

A key difference is that OpenCode is open-source and independent of LLM providers. You can connect it to cloud models (like Claude/GPT/Gemini) or to a local model served by Ollama.

For Windows users, start by installing Node.js with the following command:

winget install OpenJS.NodeJS.LTS

On Linux, run:

sudo apt install -y nodejs npm

After installation, open a new PowerShell window and check if node and npm are available:

node --version
npm --version

We can now install OpenCode:

npm install -g opencode-ai

Then verify the installation with:

opencode --version

At this point, OpenCode is installed. You can launch the interactive user interface of OpenCode (TUI) from any project folder by running:

opencode

4. Connect OpenCode to Gemma 4

By default, OpenCode does not know which model to use. Therefore, we need to specify the Gemma 4 model served by Ollama.

Start by creating an Ollama model tag with the full context window (128K) enabled. This ensures that the agent can operate correctly without being truncated in its context.

To do this, create an Ollama Modelfile named gemma4-e4b-128k.Modelfile in the folder or repository you wish to work with:

PARAMETER num_ctx 131072

Next, in the command line, create a new Ollama tag with:

ollama create gemma4:e4b-128k -f gemma4-e4b-128k.Modelfile

Note: this will not trigger a new model download! It simply creates an Ollama profile that uses the same Gemma 4 E4B model but explicitly sets the execution context window to 128K.

We can now connect OpenCode to the Gemma 4 model. To do this, create an opencode.json file in the project folder:

{
  "$schema": "https://opencode.ai/config.json",
  "npm": "@ai-sdk/[openai](/dossier/openai)-compatible",
  "name": "Ollama (local)",
  "baseURL": "http://localhost:11434/v1",
  "gemma4:e4b-128k": {
    "name": "Gemma 4 E4B 128K",
    "model": "ollama/gemma4:e4b-128k"
  }
}

Two important elements here:

First, OpenCode communicates with Ollama via Ollama's local OpenAI-compatible endpoint: http://localhost:11434/v1.
Next, note that we have defined the model name following the provider/model format of OpenCode: ollama/gemma4:e4b-128k.

You are using our newly created model tag above.

Now, if you launch OpenCode from the same project folder via:

opencode

You should see gemma4:e4b-128k listed.

5. What can you do with this setup?

With the OpenCode TUI launched, you can test your setup by asking the agent to perform a few tasks. For example, you can ask the agent to write a README file, explain specific functions, create test scripts, etc.

In fact, beyond coding, you can also ask the agent to perform many office tasks, such as file manipulations, content extractions, and more.

OpenCode also gives you the ability to expand the setup. You can connect tools to the agent, install agent skills with SKILL.md, and define specialized agents with AGENTS.md.

Additionally, you can run tasks from the command line with:

opencode run "Summarize this repository."

For more programmatic use, OpenCode can also operate as a server, so the TUI is not the only interface.

And here’s the most important part: all your data remains completely local.

You can find the relevant OpenCode documentation here: