Codestral (25.01)

A cutting-edge model specifically designed for code generation, including fill-in-the-middle and code completion.

Overview

Codestral (25.01) is explicitly designed for code generation tasks. It helps developers write and interact with code through a shared instruction and completion API endpoint. Because it masters code and can also converse in a variety of natural languages, it can be used to build advanced AI applications for software developers. Codestral (25.01) features a more efficient architecture and an improved tokenizer compared with the original Codestral, generating and completing code about twice as fast.

  • A model fluent in 80+ programming languages, including Python, Java, C, C++, JavaScript, and Bash. It also performs well on more specialized ones such as Swift and Fortran.
  • Improves developer productivity and reduces errors: it can complete coding functions, write tests, and fill in any partial code using a fill-in-the-middle mechanism.
  • Sets a new standard in the performance/latency space with a 128k context window.
| Model | Preferred use-cases |
|---|---|
| Mistral Small 3.1 (25.03) | Fast and versatile multimodal tasks with image inputs |
| Codestral (25.01) | Code-specific tasks to enhance developer productivity (e.g. autocompletion, automated code review, test suite generation) |
| Mistral Large (24.11) | Complex tasks requiring advanced reasoning abilities or a high level of specialization (e.g. creative writing, agentic workflows, code generation) |
| Mistral Nemo | Streamlined tasks that one can do in bulk (e.g. classification, customer support, text generation) |

Use cases

  • Code generation: code completion, suggestions, translation
  • Code understanding and documentation: code summarization and explanation
  • Code quality: code review, refactoring, bug fixing and test case generation
  • Code fill-in-the-middle: users can define the starting point of the code using a prompt, and the ending point of the code using an optional suffix and an optional stop. The Codestral model will then generate the code that fits in between, making it ideal for tasks that require a specific piece of code to be generated.

Documentation

Getting started

Before you begin

Enable the Vertex AI API.

Authenticate with one of the standard mechanisms documented here.

Vertex AI API - cURL

Execute the following commands/script in Cloud Shell or a local terminal window with the gcloud CLI installed. Authenticate and replace PROJECT_ID with your Google Cloud project ID. You can find the supported regions here.

Instruct example

You can send a POST request to the specified API endpoint to get a response from the Mistral model. Mistral's API documentation, linked here, has more info on parameter settings such as temperature, top_p, and max_tokens.

Please be aware that safe_prompt is currently the only Mistral API parameter that is not supported; support is planned. Once available, setting safe_prompt: true will enable the optional system prompt that enforces guardrails on top of Mistral models for chat completion.
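The instruct request can be sketched as follows. The rawPredict endpoint pattern and model ID shown here follow the usual Vertex AI partner-model convention, but verify both against the current documentation; the project ID, region, and prompt are placeholders.

```shell
# Placeholders -- replace with your own project and a supported region.
PROJECT_ID="${PROJECT_ID:-your-project-id}"
REGION="${REGION:-us-central1}"
ENDPOINT="https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/mistralai/models/codestral-2501:rawPredict"

# Chat-completion payload; temperature, top_p, and max_tokens are standard
# Mistral API parameters.
cat > request.json <<'EOF'
{
  "model": "codestral-2501",
  "messages": [
    {"role": "user", "content": "Write a function that reverses a string in Python."}
  ],
  "temperature": 0.2,
  "max_tokens": 256,
  "stream": false
}
EOF

# Uncomment to send the request (requires an authenticated gcloud CLI):
# curl -X POST "${ENDPOINT}" \
#   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#   -H "Content-Type: application/json" \
#   -d @request.json
```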

Fill-in-the-middle example
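A fill-in-the-middle request goes to the same endpoint but replaces the messages array with prompt and suffix fields, following Mistral's FIM API. This is a sketch with placeholder values; verify the endpoint against the current documentation.

```shell
# Placeholders -- replace with your own project and a supported region.
PROJECT_ID="${PROJECT_ID:-your-project-id}"
REGION="${REGION:-us-central1}"
ENDPOINT="https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/mistralai/models/codestral-2501:rawPredict"

# FIM payload: the model generates the code that fits between "prompt" and "suffix".
cat > fim_request.json <<'EOF'
{
  "model": "codestral-2501",
  "prompt": "def is_odd(n):\n    return n % 2 == 1\n\ndef test_is_odd():",
  "suffix": "\n\nif __name__ == \"__main__\":\n    test_is_odd()",
  "max_tokens": 64,
  "temperature": 0
}
EOF

# Uncomment to send the request (requires an authenticated gcloud CLI):
# curl -X POST "${ENDPOINT}" \
#   -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#   -H "Content-Type: application/json" \
#   -d @fim_request.json
```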

Python code sample

You will need to install the httpx and google-auth packages in your virtual environment. You will also need to set the GOOGLE_REGION and GOOGLE_PROJECT_ID environment variables to the target region and project ID. You can find the supported regions here.

Instruct example
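A minimal sketch of the instruct call in Python, assuming the rawPredict endpoint pattern for Mistral partner models; the prompt and parameter values are illustrative.

```python
import os


def build_request(region: str, project_id: str):
    """Return the endpoint URL and an instruct (chat-completion) payload."""
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{region}/publishers/mistralai/models/codestral-2501:rawPredict"
    )
    data = {
        "model": "codestral-2501",
        "messages": [
            {"role": "user", "content": "Write a function that reverses a string in Python."}
        ],
        "temperature": 0.2,
        "max_tokens": 256,
        "stream": False,
    }
    return url, data


def send_request():
    # Requires `pip install httpx google-auth` and application-default credentials.
    import httpx
    import google.auth
    import google.auth.transport.requests

    creds, _ = google.auth.default()
    creds.refresh(google.auth.transport.requests.Request())
    url, data = build_request(os.environ["GOOGLE_REGION"], os.environ["GOOGLE_PROJECT_ID"])
    resp = httpx.post(
        url,
        json=data,
        headers={"Authorization": f"Bearer {creds.token}"},
        timeout=60.0,
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__" and os.environ.get("GOOGLE_PROJECT_ID"):
    print(send_request())
```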
Fill-in-the-middle example: replace the data dict from the instruct example with
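A fill-in-the-middle payload replaces messages with prompt and suffix, following Mistral's FIM API; the code fragments below are illustrative.

```python
# FIM payload sketch: the model generates the code between `prompt` and `suffix`.
data = {
    "model": "codestral-2501",
    "prompt": "def fibonacci(n: int):",
    "suffix": "n = int(input())\nprint(fibonacci(n))",
    "max_tokens": 64,
    "temperature": 0,
}
```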

Evaluation Metrics

Overview

| Model | Context length | HumanEval | MBPP | CruxEval | LiveCodeBench | RepoBench | Spider | CanItEdit | HumanEval (average) | HumanEvalFIM (average) |
|---|---|---|---|---|---|---|---|---|---|---|
| Codestral-2501 | 256k | 86.6% | 80.2% | 55.5% | 37.9% | 38.0% | 66.5% | 50.5% | 71.4% | 85.9% |
| Codestral-2405 22B | 32k | 81.1% | 78.2% | 51.3% | 31.5% | 34.0% | 63.5% | 50.5% | 65.6% | 82.1% |
| Codellama 70B instruct | 4k | 67.1% | 70.8% | 47.3% | 20.0% | 11.4% | 37.0% | 29.5% | 55.3% | - |
| DeepSeek Coder 33B instruct | 16k | 77.4% | 80.2% | 49.5% | 27.0% | 28.4% | 60.0% | 47.6% | 65.1% | 85.3% |
| DeepSeek Coder V2 lite | 128k | 83.5% | 83.2% | 49.7% | 28.1% | 20.0% | 72.0% | 41.0% | 65.9% | 84.1% |

Per-language

| Model | HumanEval Python | HumanEval C++ | HumanEval Java | HumanEval JavaScript | HumanEval Bash | HumanEval TypeScript | HumanEval C# | HumanEval (average) |
|---|---|---|---|---|---|---|---|---|
| Codestral-2501 | 86.6% | 78.9% | 72.8% | 82.6% | 43.0% | 82.4% | 53.2% | 71.4% |
| Codestral-2405 22B | 81.1% | 68.9% | 78.5% | 71.4% | 40.5% | 74.8% | 43.7% | 65.6% |
| Codellama 70B instruct | 67.1% | 56.5% | 60.8% | 62.7% | 32.3% | 61.0% | 46.8% | 55.3% |
| DeepSeek Coder 33B instruct | 77.4% | 65.8% | 73.4% | 73.3% | 39.2% | 77.4% | 49.4% | 65.1% |
| DeepSeek Coder V2 lite | 83.5% | 68.3% | 65.2% | 80.8% | 34.2% | 82.4% | 46.8% | 65.9% |

FIM (single line exact match)

| Model | HumanEvalFIM Python | HumanEvalFIM Java | HumanEvalFIM JS | HumanEvalFIM (average) |
|---|---|---|---|---|
| Codestral-2501 | 80.2% | 89.6% | 87.96% | 85.89% |
| Codestral-2405 22B | 77.0% | 83.2% | 86.08% | 82.07% |
| OpenAI FIM API* | 80.0% | 84.8% | 86.5% | 83.7% |
| DeepSeek Chat API | 78.8% | 89.2% | 85.78% | 84.63% |
| DeepSeek Coder V2 lite | 78.7% | 87.8% | 85.90% | 84.13% |
| DeepSeek Coder 33B instruct | 80.1% | 89.0% | 86.80% | 85.3% |

FIM pass@1

| Model | HumanEvalFIM Python | HumanEvalFIM Java | HumanEvalFIM JS | HumanEvalFIM (average) |
|---|---|---|---|---|
| Codestral-2501 | 92.5% | 97.1% | 96.1% | 95.3% |
| Codestral-2405 22B | 90.2% | 90.1% | 95.0% | 91.8% |
| OpenAI FIM API* | 91.1% | 91.8% | 95.2% | 92.7% |
| DeepSeek Chat API | 91.7% | 96.1% | 95.3% | 94.4% |

Model input and output

Instruct example

Sample Input (POST request payload)
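The original sample did not survive extraction; a representative instruct payload, with an illustrative prompt, might look like:

```json
{
  "model": "codestral-2501",
  "messages": [
    {"role": "user", "content": "Write a function for fibonacci"}
  ],
  "max_tokens": 250,
  "stream": false
}
```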

Sample Output
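The response follows Mistral's chat-completion schema; all values below are illustrative, not real output.

```json
{
  "id": "cmpl-0000000000",
  "object": "chat.completion",
  "created": 1736805000,
  "model": "codestral-2501",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "def fibonacci(n):\n    if n <= 1:\n        return n\n    return fibonacci(n - 1) + fibonacci(n - 2)"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 52,
    "total_tokens": 61
  }
}
```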

Fill-in-the-middle example

Sample Input (POST request payload)
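A representative fill-in-the-middle payload, with illustrative code fragments; the model generates the code between prompt and suffix.

```json
{
  "model": "codestral-2501",
  "prompt": "def fibonacci(n: int):",
  "suffix": "n = int(input())\nprint(fibonacci(n))",
  "max_tokens": 64,
  "temperature": 0
}
```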

Sample Output
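The FIM response uses the same completion schema, with the generated middle section in the message content; all values below are illustrative.

```json
{
  "id": "cmpl-0000000001",
  "object": "chat.completion",
  "created": 1736805100,
  "model": "codestral-2501",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "\n    if n <= 1:\n        return n\n    return fibonacci(n - 1) + fibonacci(n - 2)\n\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 11,
    "completion_tokens": 34,
    "total_tokens": 45
  }
}
```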

Best practices and limitations

  • Fill-in-the-middle mode fills the gap between the prompt and the suffix provided by the user. It can also be used in a "completion" mode by setting the suffix to an empty string (see example here).

  • To keep the model from being too verbose, or to prevent infinite-generation issues, you can specify an optional stop token by adding the stop value to the payload (see example here).
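Both practices can be combined in one payload: an empty suffix switches FIM into plain completion mode, and a stop value truncates generation. A sketch, with illustrative values and field names following Mistral's FIM API:

```json
{
  "model": "codestral-2501",
  "prompt": "def say_hello(name: str) -> str:",
  "suffix": "",
  "stop": ["\n\n"],
  "max_tokens": 64
}
```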

Versions

| Resource ID | Release date | Release stage | Description |
|---|---|---|---|
| codestral-2501 | 2025-01-13 | General Availability | |

Links

  • Codestral blog post and Model Card

  • Mistral’s documentation: Visit Mistral's documentation for a wealth of resources on model capabilities, prompting techniques, use case guidelines, and more.

    • Code Generation
    • Chat Completion API doc for instruct
    • Fill in the middle API doc
  • Mistral Cookbook repository: Check out example code for a variety of tasks.

Model ID

publishers/mistralai/models/codestral-2501

Version name

mistralai/codestral-2501@001

Labels

Task

Generation

Language

English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, Polish, other languages

Skill level

Beginner, Intermediate, Advanced

Provisioning the model can take 10 to 15 minutes; refresh this page after that period to enable the code samples.