Microsoft Phi-3

Microsoft's New Phi-3 Model Additions

This week at the Microsoft Build 2024 conference, the tech giant announced an exciting set of updates to its Phi-3 family of small, open models. The news includes the introduction of Phi-3-vision, a multimodal model that combines language and vision capabilities, providing developers with powerful tools for generative AI applications.

New Phi-3 Models

The Phi-3 family now includes four models, each designed to meet different needs and computational constraints:

  • Phi-3-vision: A 4.2 billion parameter model that integrates language and vision capabilities, optimized for tasks such as OCR and understanding charts and diagrams.
  • Phi-3-mini: A 3.8 billion parameter language model, available in two context lengths (128K and 4K).
  • Phi-3-small: A 7 billion parameter language model, available in two context lengths (128K and 8K).
  • Phi-3-medium: A 14 billion parameter language model, available in two context lengths (128K and 4K).

These models are now available on Microsoft Azure, enabling developers to leverage their strong reasoning capabilities, optimized performance, and cost-effectiveness for various AI applications.
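For concreteness, the instruction-tuned Phi-3 models consume a simple chat markup built from `<|system|>`, `<|user|>`, `<|assistant|>`, and `<|end|>` tokens, per Microsoft's published model cards. The sketch below hand-assembles that prompt string for illustration; in real code you would typically let a tokenizer's built-in chat template (e.g. `apply_chat_template` in Hugging Face Transformers) do this for you.

```python
# Sketch: assemble a Phi-3-style chat prompt from a list of messages.
# The special tokens follow Microsoft's published Phi-3 chat format;
# in practice, prefer the tokenizer's chat template over hand-rolling.

def build_phi3_prompt(messages):
    """messages: list of {"role": "system"|"user"|"assistant", "content": str}"""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}<|end|>\n")
    parts.append("<|assistant|>\n")  # cue the model to generate its reply
    return "".join(parts)

prompt = build_phi3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this chart for me."},
])
```

The same prompt string works whether the model is served from Azure, ONNX Runtime, or a local runtime, which is part of what makes these small models easy to move between deployment targets.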

Phi-3-vision is the first multimodal model in the Phi-3 family, excelling at tasks that require integrating textual and visual information. Examples include understanding charts and diagrams, performing OCR, and generating insights from complex visual data. With its 4.2 billion parameters, Phi-3-vision delivers what Microsoft characterizes as “groundbreaking performance” on visual reasoning tasks, surpassing larger models such as Claude-3 Haiku and Gemini 1.0 Pro V.
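Phi-3-vision's model card describes referencing attached images in the user turn via numbered placeholder tokens (`<|image_1|>`, `<|image_2|>`, …), with the images themselves passed separately to the model's processor. A hedged sketch of building such a prompt, illustration only:

```python
# Sketch: a Phi-3-vision-style prompt that references attached images by
# numbered placeholders. Assumes the model-card convention <|image_N|>;
# the actual image data is supplied to the processor, not embedded here.

def build_vision_prompt(question, num_images=1):
    placeholders = "".join(f"<|image_{i}|>\n" for i in range(1, num_images + 1))
    return f"<|user|>\n{placeholders}{question}<|end|>\n<|assistant|>\n"

p = build_vision_prompt("What trend does this chart show?")
```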

Phi-3 models are optimized to run across diverse hardware platforms, supporting a wide range of devices and deployment scenarios. Optimized variants are available with ONNX Runtime and DirectML, ensuring compatibility with mobile and web deployments. Additionally, Phi-3 models are accessible as NVIDIA NIM inference microservices for deployment on NVIDIA GPUs, and they have also been optimized for Intel accelerators.

Safety and Responsible AI

Microsoft developed the Phi-3 models in accordance with its Responsible AI Standard. The models undergo extensive safety evaluation, including reinforcement learning from human feedback (RLHF) and automated testing across harm categories, and developers can use Azure AI tools to build safer, more trustworthy applications on top of them.

Analysis

As the industry finds use for generative AI across a broad range of environments, we’re seeing that not every generative AI application requires a large language model. Indeed, small language models (SLMs) are having a moment. In addition to Microsoft’s Phi-3 family, OpenAI offers its GPT-2, Hugging Face its DistilBERT, and Google its ALBERT.

Small language models offer significant value, providing powerful AI capabilities while being more cost-effective and computationally efficient than their larger counterparts. They excel at tasks requiring reasoning, language understanding, and generation, making them well suited to applications where quick response times and limited computing power are critical.

SLMs are easier to deploy across various devices, including mobile and web platforms, and can be fine-tuned to meet specific needs. Thus, they democratize access to advanced AI and enable a wide range of practical applications in diverse fields such as healthcare, education, and customer service.

Microsoft’s Phi-3 family offers versatile, efficient, and cost-effective solutions for various applications. With the addition of Phi-3-vision and the availability of other Phi-3 models on Azure, developers are well-equipped to create innovative AI applications that enhance digital experiences and drive operational efficiency.

Disclosure: The author is an industry analyst, and NAND Research is an industry analyst firm; both engage in, or have engaged in, research, analysis, and advisory services with many technology companies, which may include those mentioned in this article. The author does not hold any equity positions in any company mentioned in this article.