Earlier this week, Microsoft introduced its new artificial intelligence model, Phi-3 Mini. It is the first of three small models the company plans to release. Phi-3 Mini has 3.8 billion parameters and was trained on a smaller dataset than those used for large language models.
The company has made Phi-3 Mini available on Azure, Hugging Face and Ollama. It also plans to launch Phi-3 Small (7 billion parameters) and Phi-3 Medium (14 billion parameters). Parameters are the numerical weights a model learns during training; the count is a rough measure of model size, and more parameters generally let a model handle more complex instructions.
Microsoft Phi-3 Mini: Small size, big functionality
Eric Boyd, Microsoft's Corporate Vice President of AI Platform, stated that Phi-3 Mini is just as capable as large language models (LLMs) such as GPT-3.5 and has “the same capability in a smaller form factor.” Small models are often cheaper to run than their larger counterparts and perform better on personal devices such as phones and laptops.
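To give a rough sense of why parameter count matters on personal devices, the sketch below estimates how much memory the model weights alone would occupy. The parameter counts are the ones cited above; the numeric precisions (16-bit and 4-bit) and the function name are illustrative assumptions, not figures from Microsoft, and real memory use would be higher once activations and runtime overhead are included.

```python
# Back-of-the-envelope estimate: weight memory = parameters x bytes per parameter.
# Parameter counts come from the article; the precisions below are assumptions.
PARAMS = {
    "Phi-3 Mini": 3.8e9,
    "Phi-3 Small": 7e9,
    "Phi-3 Medium": 14e9,
}

# Assumed storage formats: 16-bit floats (2 bytes) and 4-bit quantization (0.5 bytes).
BYTES_PER_PARAM = {"fp16": 2.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, count in PARAMS.items():
        sizes = ", ".join(
            f"{fmt}: {weight_memory_gb(count, size):.1f} GB"
            for fmt, size in BYTES_PER_PARAM.items()
        )
        print(f"{name} ({count / 1e9:.1f}B parameters) -> {sizes}")
```

Under these assumptions, only the weights are counted, so the figures should be read as lower bounds rather than measured requirements.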
Earlier this year, The Information reported that Microsoft was creating a dedicated team to focus on lightweight AI models. In addition to the Phi series, the company has also built Orca-Math, a model designed to solve math problems.
Rival companies are also developing their own small artificial intelligence models. For example, Google's Gemma 2B and 7B models are suitable for simple chatbots and language-related work. Anthropic's Claude 3 Haiku model can quickly read and summarize dense research papers, while Meta's newly released Llama 3 8B model can be used for some chatbots and coding assistants.
Boyd stated that the developers trained Phi-3 with a “curriculum.” They were inspired by how children learn from bedtime stories and from books that use simpler words and sentence structures to talk about larger topics. Because there were not enough children's books available, Boyd said, the team took a list of more than 3,000 words and asked an LLM to write “children's books” to teach Phi.
Phi-3 builds on what the previous versions learned. While Phi-1 concentrated on coding, Phi-2 began to learn to reason, and Phi-3 is better at both coding and reasoning. However, although Phi-3 has some general knowledge, it cannot match the breadth of answers given by a large model trained on much of the internet, such as GPT-4.