Cursor Is Lying To Developers…

Basic Dev
26 Jan 2026, 10:00

Summary

TL;DR: A group of AI agents, backed by $2.3 billion in funding, was tasked with building a web browser from scratch. Despite running for a week, the project quickly fell apart due to poor coordination and a lack of human oversight. The agents struggled with basic tasks, failed to produce functional code, and relied heavily on pre-existing libraries. Ultimately, the project was revealed to be a marketing stunt aimed at justifying a $23 billion valuation, showing that AI, while powerful, still needs human guidance to succeed at complex tasks like building a browser.

Takeaways

  • 😀 AI agents, backed by $2.3 billion in VC funding, attempted to build a browser from scratch in a week, but coordination problems meant they made little progress.
  • 😀 The initial flat structure, with no hierarchy among the agents, let them avoid difficult tasks, so the project saw no substantial progress.
  • 😀 The Cursor team then introduced a hierarchy, splitting the AI agents into 'planners' (who created tasks) and 'workers' (who executed them), mirroring a corporate environment.
  • 😀 Despite the large scale of the project (1 million lines of code), the result was a broken browser with many issues, including compilation errors and lack of functionality.
  • 😀 The AI agents never performed essential steps like running 'cargo build' or 'cargo check,' leading to errors that users faced when trying to compile the browser.
  • 😀 The claims that the AI agents built core web features like an HTML parser and JavaScript engine from scratch were misleading, as they reused existing libraries and code from other open-source projects.
  • 😀 A web standards test revealed that the resulting browser was severely flawed, unable to execute key tests like those for JavaScript, leading to widespread skepticism about the project's viability.
  • 😀 Developers found numerous overlaps between the Cursor browser's code and the Servo project, raising questions about the originality of the AI's work.
  • 😀 The project's developer tried to fix the issues, but inconsistencies in the Git logs pointed to human intervention, casting doubt on how much of the work the AI actually did.
  • 😀 Gregory Tzen, a browser expert, criticized the project for its poor design and lack of understanding of web standards, arguing that AI should assist, not replace, human expertise in complex tasks like building browsers.
  • 😀 Ultimately, the project was seen as a marketing stunt to justify the Cursor team's massive $23 billion valuation, with many questioning whether AI can truly replace skilled developers.

Q & A

  • What was the main goal of the AI agents' experiment conducted by the Cursor team?

    -The main goal of the experiment was to have AI agents build a fully functioning web browser from scratch, aiming to demonstrate that software engineers may not be needed in the future.

  • What challenges did the Cursor team's AI agents face during the experiment?

    -The AI agents faced several challenges, including inefficient coordination, file-locking problems, and risk-averse behavior. These led to churn and slow progress, with some agents failing to release file locks or trying to lock files they were already working on.

  • How did the experiment's structure change after the initial failure?

    -The experiment's structure was changed to create two types of agents: planners and workers. Planners assigned tasks, while workers focused on completing tasks without coordinating with others. A judge agent evaluated the work and decided whether to continue or restart the process.
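
    To make that structure concrete, here is a minimal, hypothetical sketch in Rust (the language the browser itself was written in) of a planner/worker/judge loop. The type and function names are illustrative assumptions, not the actual Cursor harness, which is not shown in the video.

        // Hypothetical sketch of the planner / worker / judge structure described
        // above. Everything here is invented for illustration; it is not Cursor's
        // actual orchestration code.

        #[derive(Debug, Clone)]
        struct Task {
            id: usize,
            description: String,
        }

        #[derive(Debug)]
        enum Verdict {
            Continue,
            Restart,
        }

        /// "Planner" agents break the overall goal into tasks (here: a canned list).
        fn plan(goal: &str) -> Vec<Task> {
            ["parse HTML", "build the DOM", "lay out the page", "run JavaScript"]
                .iter()
                .enumerate()
                .map(|(id, step)| Task {
                    id,
                    description: format!("{goal}: {step}"),
                })
                .collect()
        }

        /// "Worker" agents execute tasks independently, without coordinating.
        fn work(task: &Task) -> String {
            format!("stub output for task {} ({})", task.id, task.description)
        }

        /// A "judge" agent inspects the combined output and decides whether to
        /// keep going or throw the round away and restart.
        fn judge(outputs: &[String]) -> Verdict {
            if outputs.iter().all(|o| !o.is_empty()) {
                Verdict::Continue
            } else {
                Verdict::Restart
            }
        }

        fn main() {
            for round in 1..=3 {
                let tasks = plan("build a browser");
                let outputs: Vec<String> = tasks.iter().map(work).collect();
                match judge(&outputs) {
                    Verdict::Continue => {
                        println!("round {round}: judge accepted {} task outputs", outputs.len());
                        break;
                    }
                    Verdict::Restart => println!("round {round}: judge restarted the process"),
                }
            }
        }

    The judge step matters because something has to decide whether the accumulated work is worth keeping; in the experiment that role was also handed to a model rather than a person.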

  • What was the result of the experiment in terms of code production?

    -The AI agents produced 1 million lines of code spread across 1,000 files in one week, using trillions of GPT-5.2 tokens. Despite this, the code was not fully functional, with many errors and warnings preventing it from compiling.
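
    One of the takeaways above is that the agents never ran 'cargo build' or 'cargo check' on their own output, which is why users hit compilation errors immediately. A compile gate is cheap to add to such a harness; the sketch below is a hypothetical illustration in Rust (the function name and workflow are assumptions, not Cursor's actual setup).

        // Hypothetical compile gate: run `cargo check` after each batch of agent
        // edits and refuse to accept the changes unless it exits cleanly.
        use std::path::Path;
        use std::process::Command;

        /// Run `cargo check` in the given project directory and report whether
        /// the crate still compiles.
        fn compiles(project_dir: &Path) -> std::io::Result<bool> {
            let status = Command::new("cargo")
                .arg("check")
                .arg("--all-targets")
                .current_dir(project_dir)
                .status()?; // inherits stdout/stderr, so compiler errors stay visible
            Ok(status.success())
        }

        fn main() -> std::io::Result<()> {
            let project = Path::new(".");
            if compiles(project)? {
                println!("accepting agent changes: the crate compiles");
            } else {
                println!("rejecting agent changes: `cargo check` failed");
            }
            Ok(())
        }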

  • What was the quality of the browser that the AI agents created?

    -The browser built by the AI agents was highly problematic. Although it looked like a browser, it failed to compile and was riddled with errors. It could not run basic web tests, and the code showed no coherent design and did not adhere to web standards.

  • Did the Cursor team use any existing code to build the browser?

    -Yes, the Cursor team reused existing code by importing libraries from other projects, such as the Servo browser for the HTML and CSS parsers, and QuickJS for the JavaScript engine. The team initially claimed they built these components from scratch, but this was later revealed to be false.
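
    For a sense of what that reuse looks like in practice, the sketch below parses HTML with html5ever, the Servo project's HTML parsing crate, rather than implementing a tokenizer and tree builder from scratch. It assumes dependencies on the html5ever and markup5ever_rcdom crates and is purely illustrative; it is not code from the Cursor browser, and a JavaScript engine such as QuickJS would typically be pulled in the same way, via existing bindings.

        // Hypothetical example of leaning on the Servo ecosystem: html5ever does
        // all of the HTML tokenizing and tree building here.
        use html5ever::parse_document;
        use html5ever::tendril::TendrilSink;
        use markup5ever_rcdom::{Handle, NodeData, RcDom};

        /// Recursively print element names from the parsed DOM tree.
        fn walk(node: &Handle, depth: usize) {
            if let NodeData::Element { ref name, .. } = node.data {
                println!("{}<{}>", "  ".repeat(depth), name.local);
            }
            for child in node.children.borrow().iter() {
                walk(child, depth + 1);
            }
        }

        fn main() {
            let html = "<html><body><h1>hello</h1><p>from html5ever</p></body></html>";

            // The parser and tree builder come entirely from the library.
            let dom = parse_document(RcDom::default(), Default::default())
                .from_utf8()
                .read_from(&mut html.as_bytes())
                .expect("reading from an in-memory byte slice should not fail");

            walk(&dom.document, 0);
        }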

  • How did the web community react to the browser's performance?

    -The web community reacted critically to the browser, with experts pointing out major flaws in the design and implementation. Gregory Tzen, a contributor to Servo, criticized the code for being a 'tangle of spaghetti' and non-compliant with web standards, ultimately claiming that the AI agents did not build a functional browser.

  • What was the main issue with the AI agents' approach to building the browser?

    -The main issue was that the AI agents did not follow web standards or conventions. They created a browser based on their own hallucinated designs, leading to a dysfunctional, non-standard implementation that experts described as a 'monstrosity.'

  • How did the Cursor team defend their project against criticisms?

    -The Cursor team pushed back on some criticisms, arguing that not all dependencies from Servo were used and that some code was altered. However, these claims were met with skepticism, as many developers noticed strong overlaps between their code and Servo's, including similar comments in the code.

  • What lesson can be drawn from the Cursor team's experiment with AI agents?

    -The experiment highlights that AI, while a powerful tool, needs human oversight and guidance to produce functional, quality code. Autonomous agents alone, without proper human intervention and expertise, are unlikely to build reliable and standards-compliant software.
