Mistral's Devstral: NEW Opensource Coding LLM! 1# On SWE Bench! (Fully Tested)
Summary
TLDRMistrol AI has launched Devstral, a cutting-edge open-source coding model designed specifically for software engineering tasks. Outperforming prior open-source models and some closed models like GPT-4.1 Mini, Devstral excels at debugging, navigating large codebases, and resolving multi-file issues. This 24-billion parameter model can be run locally on consumer hardware and integrated into VS Code through the Continue extension. While primarily focused on backend tasks, it can also handle simple front-end design tasks. With future plans for larger versions, Devstral promises to push the boundaries of open-source AI in software development.
Takeaways
- π Mistrol AI introduced a new open-source coding model called Devstral, designed for real software engineering tasks.
- π Devstral is released under the Apache 2.0 license and outperforms all previous open-source models, achieving a score of 46.8% on the sway bench verified benchmark.
- π The model outperforms closed models like GPT-4.1 Mini by over 20%, making it one of the top choices for real-world coding challenges.
- π Unlike typical large language models, Devstral is specifically built to tackle complex software engineering problems such as debugging, navigating large codebases, and resolving issues across multiple files.
- π Devstral works using agent scaffolds like open hands or suite agent, simulating real developer approaches to problem-solving.
- π The model can be run locally on systems such as an RTX 4090 or a Mac with 32GB RAM, with instructions available for setting it up with LM Studio or Continue.
- π Devstral is flexible, fast, and enterprise-ready, outperforming models like Claw 3.5 and GPT-4.1 Mini in sway bench tests.
- π You can access the Devstral model through the Continue extension in VS Code, which allows you to integrate it for executing complex coding tasks.
- π Devstral is available through Open Router and Mistrol's API, with a pricing structure of 10 cents for 1 million input tokens and 30 cents for 1 million output tokens.
- π Despite being open-source and not specifically designed for frontend development, Devstral can generate simple yet decent front-end designs like a SAS landing page.
- π The model excels at debugging, having identified and resolved several issues in a codebase, demonstrating its utility for navigating large codebases and maintaining code integrity.
Q & A
What is Devstral and what is its primary purpose?
-Devstral is an open-source coding model developed by Mistrol AI, designed specifically for software engineering tasks. It aims to handle real-world coding challenges like navigating large codebases, debugging errors across multiple files, and understanding project structures.
How does Devstral perform compared to previous open-source models?
-Devstral significantly outperforms previous open-source models, scoring 46.8% on the Sway Benchmarks, which is over 6 percentage points higher than the previous best. It even surpasses some large closed models like GPT-4.1 Mini by more than 20%.
What makes Devstral different from typical large language models in coding tasks?
-Unlike typical large language models that struggle with coding challenges, Devstral is specifically built to tackle complex engineering tasks such as debugging and managing large codebases, using frameworks like Open Hands to simulate real developer behavior.
What hardware is required to run Devstral locally?
-Devstral can be run locally on an **RTX 4090** or a **Mac with 32GB RAM**. This makes it accessible to developers with high-end hardware.
Can Devstral be run on systems with less powerful hardware?
-Yes, OpenRouter provides a free, smaller version of Devstral that can be accessed through an API, though it has limitations like rate limits and lower performance compared to the full version.
How does Devstral integrate with development tools like VS Code?
-Devstral integrates with development tools like **VS Code** through the **Continue** extension. Developers can easily set it up by installing the extension and running the necessary commands to access Devstral's capabilities within their development environment.
What is the licensing for Devstral?
-Devstral is released under the **Apache 2.0** license, making it open-source and freely available for modification and distribution.
What are some of the use cases for Devstral?
-Devstral can be used for a variety of software engineering tasks, including generating basic front-end components (e.g., SAS landing pages) and debugging large codebases by identifying issues across multiple files and resolving them.
How does Devstral handle debugging in large codebases?
-Devstral autonomously navigates and edits large codebases, identifying errors across multiple files and ensuring that any changes made do not negatively affect other parts of the code. It can fix issues within individual files and make updates across the entire repository.
Is Devstral suitable for front-end development tasks?
-While Devstral is primarily focused on complex back-end and software engineering tasks, it can generate simple front-end components, like a basic SAS landing page. However, its true strength lies in debugging and solving real-world software engineering challenges.
Outlines

This section is available to paid users only. Please upgrade to access this part.
Upgrade NowMindmap

This section is available to paid users only. Please upgrade to access this part.
Upgrade NowKeywords

This section is available to paid users only. Please upgrade to access this part.
Upgrade NowHighlights

This section is available to paid users only. Please upgrade to access this part.
Upgrade NowTranscripts

This section is available to paid users only. Please upgrade to access this part.
Upgrade NowBrowse More Related Video

OpenThinker (Fully Tested): This NEW REASONING MODEL is QUITE CRAZY!

BREAKING: LLaMA 405b is here! Open-source is now FRONTIER!

Claude has taken control of my computer...

Microsoft AutoDev is Here! Fully Autonomous SOFTWARE DEVELOPERS

Should you still learn to code? (ft. Devin)

This free Chinese AI just crushed OpenAI's $200 o1 model...
5.0 / 5 (0 votes)