Phi-4
Explore and build with Phi-4 models on Vertex AI. The Phi-4 model was proposed in the Phi-4 Technical Report. Microsoft released Phi-4, a 14B-parameter, dense decoder-only Transformer model. It was trained on an extension of the Phi-3 datasets that includes both synthetic data and filtered publicly available web data, with a focus on high-quality, reasoning-dense content. After initial training, the model underwent a post-training process involving supervised fine-tuning and direct preference optimization to enhance its ability to follow instructions and adhere to safety measures. This model card covers the Phi-4 model.
The model is intended for broad commercial and research use in English. It is suited for general-purpose AI systems and applications that require:
- Memory/compute constrained environments
- Latency-bound scenarios
- Strong reasoning (especially code, math, and logic)
The model is designed to accelerate research on language and multimodal models, for use as a building block for generative AI powered features.
This model can be used in a notebook. Click Open notebook to deploy and run inference on the model in Colab Enterprise.
Deploying a model consists of three steps: creating an endpoint resource, uploading the model, and deploying the model to the endpoint. You also need a service account with the Vertex AI User role (roles/aiplatform.user) to deploy models to Vertex AI endpoints.
Example deployment (Python)
The code sample below creates a new endpoint, uploads the model, and deploys the model to the endpoint (refer to the vLLM API documentation).
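The original code sample is not reproduced on this page, so the following is a minimal sketch using the google-cloud-aiplatform SDK. The project ID, service account, container image URI, vLLM arguments, routes, and machine configuration are illustrative assumptions rather than values confirmed by this model card; the Model Garden notebook contains the exact values.

```python
from google.cloud import aiplatform

# Illustrative placeholders; replace with your own values.
PROJECT_ID = "your-project-id"
REGION = "us-central1"
SERVICE_ACCOUNT = "your-sa@your-project-id.iam.gserviceaccount.com"

# Hypothetical vLLM serving container URI; Model Garden publishes the exact one.
VLLM_DOCKER_URI = "us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve"

aiplatform.init(project=PROJECT_ID, location=REGION)

# Step 1: create the endpoint resource.
endpoint = aiplatform.Endpoint.create(display_name="phi-4-endpoint")

# Step 2: upload the model with a vLLM serving container.
# The args and routes below follow the vLLM API server convention and are assumptions.
model = aiplatform.Model.upload(
    display_name="phi-4",
    serving_container_image_uri=VLLM_DOCKER_URI,
    serving_container_args=[
        "python", "-m", "vllm.entrypoints.api_server",
        "--host=0.0.0.0",
        "--port=7080",
        "--model=microsoft/phi-4",
        "--tensor-parallel-size=1",
    ],
    serving_container_ports=[7080],
    serving_container_predict_route="/generate",
    serving_container_health_route="/health",
)

# Step 3: deploy the model to the endpoint on GPU hardware (machine shape is illustrative).
model.deploy(
    endpoint=endpoint,
    machine_type="g2-standard-24",
    accelerator_type="NVIDIA_L4",
    accelerator_count=2,
    service_account=SERVICE_ACCOUNT,
    deploy_request_timeout=1800,
)
```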
Example inference (Python)
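Again a sketch, assuming the `endpoint` object from the deployment step above; the instance fields (`prompt`, `max_tokens`, `temperature`, `top_p`) follow the vLLM serving convention and may differ for other serving containers.

```python
# Assumes `endpoint` is the aiplatform.Endpoint deployed above.
instances = [
    {
        "prompt": "What is machine learning?",
        "max_tokens": 256,
        "temperature": 0.7,
        "top_p": 0.9,
    },
]

# Send an online prediction request and print each returned generation.
response = endpoint.predict(instances=instances)
for prediction in response.predictions:
    print(prediction)
```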
Example inference (Cloud console)
If you've deployed this model to Vertex AI, you can perform inference in the Cloud console as a JSON request:
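A hypothetical request body in that format; the supported fields depend on the serving container (vLLM-style fields are assumed here):

```json
{
  "instances": [
    {
      "prompt": "What is machine learning?",
      "max_tokens": 128,
      "temperature": 0.7
    }
  ]
}
```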
Example model response (Cloud console)
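The exact response shape depends on the serving container; with a vLLM-based container, a response might look roughly like the following (the prediction text and IDs are invented for illustration):

```json
{
  "predictions": [
    "Machine learning is a branch of artificial intelligence in which systems learn patterns from data rather than following explicitly programmed rules."
  ],
  "deployedModelId": "1234567890",
  "model": "projects/your-project/locations/us-central1/models/phi-4",
  "modelDisplayName": "phi-4"
}
```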
Model
The family of Phi-4 models are dense decoder-only Transformer models, best suited for prompts using chat format (an example of the format follows below).
Inputs: Text, ideally in chat format. Outputs: Generated text in response to the input.
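For reference, the chat format reported on the microsoft/Phi-4 model card on Hugging Face uses ChatML-style tags; the prompt below is an illustrative example, not taken from this page:

```
<|im_start|>system<|im_sep|>
You are a helpful assistant.<|im_end|>
<|im_start|>user<|im_sep|>
What is machine learning?<|im_end|>
<|im_start|>assistant<|im_sep|>
```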
Model | Params | Context Length | Training Data | Training Time | Knowledge Cutoff Date |
---|---|---|---|---|---|
Phi-4 | 14B | 16K tokens | 9.8T tokens | 21 days | October 2024 to November 2024 |
Datasets
The training data includes a wide variety of sources and is a combination of:
- Publicly available documents filtered rigorously for quality, selected high-quality educational data, and code
- Newly created synthetic, "textbook-like" data for the purpose of teaching math, coding, common-sense reasoning, and general knowledge of the world (science, daily activities, theory of mind, etc.)
- Acquired academic books and Q&A datasets
- High-quality chat-format supervised data covering various topics to reflect human preferences on aspects such as instruction-following, truthfulness, honesty, and helpfulness
Category | Benchmark | Phi-4 |
---|---|---|
Popular aggregated benchmark | MMLU | 84.8 |
Reasoning | DROP | 75.5 |
Factual Knowledge | SimpleQA | 3.0 |
Math | MGSM | 80.6 |
Math | MATH | 80.4 |
Code Generation | HumanEval | 82.6 |
Science | GPQA | 56.1 |
Benchmark results are taken from the microsoft/Phi-4 model card on Hugging Face.
Resource ID | Release Date | Release Stage | Description |
---|---|---|---|
microsoft/Phi-4 | 1/28/2025 | GA | Serving for text generation |
The model is licensed under the MIT license.