GLM-5.2: the AI Model Disrupting Opus and GPT-5.5

⚡

Key Takeaways

1Claire tested GLM-5.2, an AI model from Z.ai, to replace Opus in Claude Code.

2GLM-5.2 offers flexibility and cost advantages through its self-hosting and open weights.

3Gusto developed a product in 10 weeks with Claude Code, without traditional processes.

💡Why it matters — Self-hosting and open-weight AI models redistribute power from providers, reducing costs and increasing developers' autonomy.

GLM-5.2: A Serious Alternative to Opus in Claude Code

Claire, an experienced developer, recently put the GLM-5.2 model, a creation of Z.ai, to the test as part of her work on ChatPRD. This open-weight model was evaluated in various contexts, including code audits, user interface redesigns, and a 45-minute autonomous bug-hunting task in the Cursor and Claude Code environments. Claire analyzed the model's performance, its surprises, and its challenges to assess whether GLM-5.2 could replace Opus in certain workflows.

Open-Weight Models: A Production-Ready Solution

Open-weight models, once considered curiosities for hobbyists, are now viable solutions for production. GLM-5.2, developed by Z.ai in Beijing, is positioned close to Claude Opus 4.8 and surpasses GPT-5.5 according to the SWE Bench Pro benchmark. With a processing capacity of one million tokens, it supports reasoning, function calling, structured output, and context caching. The question is no longer whether these models are capable, but rather to consider the aspects of cost, control, and dependency on providers.

Self-Hosting: A Paradigm Shift

Self-hosting AI models significantly alters the power dynamics between providers and users. With the public availability of model weights, teams can perform inferences on their own hardware, fine-tune the model with proprietary data, and avoid the constraints of single-provider APIs. This means that when labs change their policies or pricing, users of open-weight models can switch providers without altering their code.

Setting Up GLM-5.2: A Quick Process

Integrating GLM-5.2 into Cursor took Claire only 30 minutes. She documented the steps not covered by the official documentation, including passing the API key via Open Router, replacing the OpenAI base URL with openrouter.ai/api/v1/cursor in the Cursor settings, and adding z-ai/glm-5.2 as a custom model. For Claude Code, it was sufficient to modify two environment variables and the claude/settings.json file. In less than an hour, the model was operational.

A Revealing Autonomous Task

During a 45-minute autonomous task, Claire asked GLM-5.2 to retrieve errors from Sentry and logs from Vercel from the last 72 hours, then propose a bug-fixing plan. The model successfully authenticated with external services and produced an engineering canvas with 20 Sentry errors, five Vercel log signals, and 14 proposed fixes. It even identified two critical errors that Claire had not noticed before.

Challenges Encountered with React

GLM-5.2 faced difficulties with TypeScript compilation errors during the task but ultimately produced clean React output. Claire noted that while HTML and CSS generation is reliable, using React under agentic and multi-step pressure remains unstable. For teams whose code is primarily based on React, it is crucial to test this point before fully committing to the model.

Competitive Cost

The cost of using GLM-5.2 is notably low: $3.36 for 6 million tokens, including the 45-minute session. With a caching rate of 72%, even at full price, open-weight inference via Open Router is much cheaper than Opus or GPT-5.5 rates for equivalent coding capacity. For agents that accumulate long context windows, open-weight models offer a different cost structure.

Claire's Recommendation

Claire recommends integrating GLM-5.2 into workflows but not completely replacing closed models. She uses it in Cursor for frontend and design tasks, and in Claude Code for extended tasks, alongside closed models. She is monitoring the model's ability to handle workloads in React, which could strengthen the case for GLM-5.2.

Gusto: A New Product Line with Claude Code

Eddie Kim, co-founder and CTO of Gusto, shared how a team of five used Claude Code to develop Gusto Cofounder in just 10 weeks. Without resorting to traditional processes such as PM, Figma, or Jira, the team managed to create a top-tier product.

AI at the Heart of Engineering

A small team without complex processes can outperform a large team if AI manages the engineering. Eddie Kim demonstrated that the constraint of not having a traditional process was not an obstacle but an opportunity for design. AI took over coordination, allowing the team to focus on what matters.

Rapid Development Without Initial Code

The team reached a production milestone without any pre-existing code, challenging the notion that teams must spend months on infrastructure before delivering a product. With Claude Code, the initial development became a matter of direction and judgment, reducing the time between idea validation and user contact.

Simplifying Processes

The team did not use meetings, Jira, or text threads to coordinate the project. A shared context maintained by AI replaced these elements. When the model carries the state and the team is aligned, the burden of human coordination becomes unnecessary.

A Minimalist Infrastructure

The tech stack used for the production AI agent was minimalist, running on Cloudflare Workers with the Vercel AI SDK. No proprietary orchestration layer or third-party agent framework. This approach proves that infrastructural minimalism can accelerate development.

Redefining Agent Complexity

Building AI agents is not as complex as it may seem. An agent is simply an AI SDK running in the cloud, capable of consulting files and calling tools. Concerns about state management and orchestration can be resolved with the same judgments as for any backend system.

The "Permanent Zoom" Development Model

The development model with Claude Code in a persistent loop allows AI to have continuous access to the current state of the codebase. This is akin to an engineer who never shuts down their computer, providing constant and up-to-date availability.

A New Approach for Founding Teams

The lesson for founding teams is not just to use Claude Code but to design their process by integrating AI as a team member. Rather than tacking AI tools onto a traditional workflow, Eddie's team treated AI as a primary contributor, thereby accelerating the workflow as AI improves.