Microsoft shrinks AI down to pocket size with Phi-3 Mini

Language model focused on reasoning fits on a smartphone and runs offline

Microsoft claims the latest incarnation of its lightweight Phi-3 Mini AI model rivals competitors such as GPT-3.5 while being small enough to be deployed on a phone.

Phi-3 Mini is a 3.8 billion-parameter language model trained on 3.3 trillion tokens. That parameter count is up from the 2.7 billion of Phi-2, which Microsoft introduced in December 2023.

Rather than shoveling as much data as possible into training, the focus was on reasoning. Microsoft said: "As an example, the result of a game in Premier League in a particular day might be good training data for frontier models, but we need to remove such information to leave more model capacity for 'reasoning' for the mini size models."

The targeted approach means that while Phi-3 might not have the sheer breadth of knowledge of its competitors, it is at least as good, if not better, when it comes to reasoning, or so claims Microsoft. In a research paper [PDF], Microsoft notes that this allowed its small language model "to reach the level of highly capable models such as GPT-3.5 or Mixtral with only 3.8B total parameters (while Mixtral has 45B total parameters for example)."

The research also notes that the training data used consisted of "heavily filtered web data ... from various open internet sources" and LLM-generated data. The data sources used to train LLMs are the subject of several lawsuits.

The small size of Phi-3 Mini means it can run offline on a smartphone, we're told. Researchers said it could be made to occupy approximately 1.8 GB of memory, and ran it natively and fully offline on an iPhone 14 with an A16 Bionic chip. In the paper, researchers show screenshots of Phi-3 Mini writing a poem and suggesting things to do in Houston.
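That 1.8 GB figure lines up with a back-of-the-envelope calculation, assuming the weights are quantized to roughly 4 bits each (the exact quantization scheme is an assumption here, not stated in the article):

```python
# Back-of-the-envelope memory estimate for an LLM's weights.
# Ignores activation memory and KV-cache overhead, which add to the total.

def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed to hold the model weights, in gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9

phi3_mini_params = 3.8e9  # 3.8 billion parameters

print(f"fp16:  {weight_memory_gb(phi3_mini_params, 16):.1f} GB")  # ~7.6 GB
print(f"4-bit: {weight_memory_gb(phi3_mini_params, 4):.1f} GB")   # ~1.9 GB
```

At 16-bit precision the same model would need roughly four times as much memory, which is why aggressive quantization is what makes phone deployment plausible.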

The researchers also highlight the downsides inherent in focusing on language understanding and reasoning. "The model simply does not have the capacity to store too much 'factual knowledge,'" something that can be mitigated to a certain extent by augmenting it with a search engine. However, that would defeat the point of being able to run it offline.
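The mitigation the researchers describe is the standard retrieval-augmentation pattern: fetch facts from an external source at query time and prepend them to the prompt, so the model does not need to have memorized them. A minimal sketch, where `web_search` and `run_model` are hypothetical stand-ins rather than any real API:

```python
# Minimal retrieval-augmentation sketch. Both functions below are
# placeholders invented for illustration, not part of any real library.

def web_search(query: str) -> list[str]:
    # A real implementation would call a search engine over the network --
    # which is exactly what offline, on-device use rules out.
    return ["Houston's Space Center is a major visitor attraction."]

def run_model(prompt: str) -> str:
    # Placeholder for a call into the on-device language model.
    return f"(model response to: {prompt!r})"

def answer_with_retrieval(question: str) -> str:
    # Prepend retrieved snippets so the model can ground its answer in them
    # instead of relying on stored factual knowledge.
    context = "\n".join(web_search(question))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return run_model(prompt)

print(answer_with_retrieval("What should I do in Houston?"))
```

The trade-off the article points out is visible in the sketch: the retrieval step needs connectivity, so the augmented system is no longer the self-contained offline model that is Phi-3 Mini's selling point.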

The model is mostly restricted to English at present, and the problems inherent in most LLMs – hallucinations, bias amplification, and the generation of inappropriate content – can also be found in Phi-3 Mini.

Researchers say in the paper: "There is significant work ahead to fully address these challenges."

Larger models – relatively speaking – have also been announced in the form of Phi-3 Small and Phi-3 Medium with 7 and 14 billion parameters respectively.

Victor Botev, CTO and co-founder at Iris.ai, told us: "Microsoft's announcement of the Phi-3 model represents a continuing trend in AI development. Rather than chasing ever-larger models, Microsoft is developing tools with more carefully curated data and specialized training. This allows for improved performance and reasoning abilities without the massive computational costs of models with trillions of parameters. Fulfilling this promise would mean tearing down a huge adoption barrier for businesses looking for AI solutions.

"Microsoft is wisely looking beyond the 'bigger is better' mindset. For widespread business and consumer AI applications, feasibility and specificity are more important than massive parameter counts. Models like Phi-3 clearly demonstrate that with the right data and training approach, advanced AI capabilities need not require building ever-larger models – a deciding factor for businesses where cost-to-quality ratio is critical." ®
