VISIVE.AI

AI Agents Evolve to Boost Coding Skills

New research demonstrates how AI coding agents use evolutionary algorithms to improve their own abilities, potentially revolutionizing software development.

Jun 26, 2025 | Source: Visive.ai

In April, Microsoft’s CEO revealed that AI now writes close to a third of the company’s code. Last October, Google’s CEO reported that AI generates around a quarter of new Google code. Other tech companies are likely not far behind. Against this backdrop, researchers are developing coding agents that can recursively improve themselves using evolutionary techniques.

Researchers have long aimed to create coding agents that can enhance their own capabilities. A recent study, described in a preprint on arXiv, introduces the Darwin Gödel Machine (DGM), a system that combines large language models (LLMs) with evolutionary algorithms to pursue this goal. A DGM starts with a coding agent that can read, write, and execute code, using an LLM for the reading and writing tasks.

The process involves creating variations of the coding agent, testing their performance, and selecting the best performers for the next iteration. DGMs maintain a population of agents, allowing for open-ended exploration and the potential for initially suboptimal changes to lead to breakthroughs in later iterations.
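The loop described above can be illustrated with a toy sketch. This is not the researchers' implementation: the `Agent` class, the single `skill` number standing in for an agent's code, the `mutate` and `evaluate` functions, and the score-weighted parent sampling are all assumptions made for illustration. In a real DGM, variation would come from an LLM rewriting the agent's code and evaluation from a coding benchmark.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical stand-in for a coding agent: one 'skill' number."""
    skill: float
    generation: int = 0


def evaluate(agent: Agent) -> float:
    """Toy benchmark (assumption): the score is the skill clamped to [0, 1]."""
    return max(0.0, min(1.0, agent.skill))


def mutate(parent: Agent, rng: random.Random) -> Agent:
    """Toy variation step (assumption): nudge the skill up or down at random."""
    return Agent(
        skill=parent.skill + rng.uniform(-0.05, 0.08),
        generation=parent.generation + 1,
    )


def evolve(iterations: int = 80, seed: int = 0) -> list[Agent]:
    """Grow a population by repeatedly picking a parent and adding a variant."""
    rng = random.Random(seed)
    archive = [Agent(skill=0.2)]  # the population starts with a single agent
    for _ in range(iterations):
        # Favor high scorers, but keep a small floor so that weaker agents
        # can still be chosen as parents.
        weights = [evaluate(a) + 0.05 for a in archive]
        parent = rng.choices(archive, weights=weights, k=1)[0]
        # Keep every variant in the population, not only the current best.
        archive.append(mutate(parent, rng))
    return archive


archive = evolve()
best = max(archive, key=evaluate)
print(f"archive size: {len(archive)}, best score: {evaluate(best):.2f}")
```

Keeping every variant in the population, rather than discarding all but the top scorer, is what lets a change that looks worse at first serve as a stepping stone for a later improvement.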

Jenny Zhang, a computer scientist at the University of British Columbia and lead author of the study, noted that the agents could write complex code by themselves, including editing multiple files, creating new files, and building intricate systems. The researchers ran DGMs for 80 iterations using two coding benchmarks: SWE-bench and Polyglot. The agents' performance improved significantly, with scores rising from 20 percent to 50 percent on SWE-bench and from 14 percent to 31 percent on Polyglot.

DGMs outperformed an alternative method that used a fixed external system for improving agents. The best SWE-bench agent was not as good as the best human-designed agent, but it was generated automatically, and with more time and computational resources such agents could potentially surpass human expertise. Zhengyao Jiang, a cofounder of Weco AI, sees this as a significant proof of concept for recursive self-improvement and suggests further progress could be made by modifying the underlying LLM or even the chip architecture.

The potential of DGMs extends beyond coding benchmarks. They could be applied to domains such as drug design, where the agents would improve at designing better drugs. However, the safety of self-improving systems is a concern. Zhang and her team added guardrails: they kept the DGMs in sandboxes without internet access and logged all code changes. They also explored ways to reward AI for making itself more interpretable and aligned with human directives.

The risks associated with recursive self-improvement, such as the potential for AI to become uninterpretable or misaligned, are a topic of ongoing discussion. In 2017, experts at the Asilomar conference signed the Asilomar AI Principles, calling for restrictions on AI systems designed to recursively self-improve. Despite these concerns, experts like Jürgen Schmidhuber and Zhengyao Jiang remain optimistic about the future of AI, emphasizing the importance of responsible development and human oversight.

Frequently Asked Questions

What are Darwin Gödel Machines (DGMs)?

DGMs are AI systems that use large language models and evolutionary algorithms to create coding agents that can improve their own coding abilities.

How do DGMs improve coding agents?

DGMs create variations of coding agents, test their performance, and select the best performers for the next iteration, allowing for open-ended exploration and potential breakthroughs.

What were the results of the DGM study?

The study showed significant improvements in coding performance, with scores rising from 20 percent to 50 percent on the SWE-bench benchmark and from 14 percent to 31 percent on the Polyglot benchmark.

What are the potential applications of DGMs beyond coding?

DGMs could be applied to domains such as drug design, where the agents would improve at designing better drugs and other complex systems.

What safety concerns are associated with DGMs?

Safety concerns include the potential for AI to become uninterpretable or misaligned with human directives. The researchers kept the DGMs in sandboxes without internet access, logged all code changes, and explored ways to reward AI for making itself more interpretable and aligned.
