Meta AI recently released MobileLLM-R1, a family of lightweight edge-reasoning models now available on Hugging Face. The series spans 140M to 950M parameters and targets efficient mathematical, coding, and scientific reasoning, delivering strong performance with fewer than 1 billion parameters.


The largest model in the series, MobileLLM-R1-950M, incorporates several architectural optimizations: a 22-layer Transformer with 24 attention heads and 6 grouped KV heads, an embedding dimension of 1536, and a hidden (FFN) dimension of 6144. It uses grouped query attention (GQA) to reduce compute and memory requirements, block-wise weight sharing to cut the parameter count without significantly increasing latency, and the SwiGLU activation function to strengthen the representational capacity of small models. The base model supports a 4K context length, while the post-trained model extends this to 32K.
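To make the grouped-query-attention figures concrete, the following minimal PyTorch sketch shows how 24 query heads can share 6 KV heads at an embedding dimension of 1536. The head dimension of 64 (1536 / 24) and the grouping logic are illustrative assumptions, not Meta's actual implementation.

```python
# Minimal grouped-query attention sketch using the published MobileLLM-R1-950M
# dimensions (embedding dim 1536, 24 query heads, 6 KV heads). Illustration of
# the technique only; head_dim = 64 is derived as 1536 / 24.
import torch
import torch.nn.functional as F

embed_dim, n_heads, n_kv_heads = 1536, 24, 6
head_dim = embed_dim // n_heads          # 64
group_size = n_heads // n_kv_heads       # 4 query heads share each KV head

q_proj = torch.nn.Linear(embed_dim, n_heads * head_dim, bias=False)
k_proj = torch.nn.Linear(embed_dim, n_kv_heads * head_dim, bias=False)  # smaller
v_proj = torch.nn.Linear(embed_dim, n_kv_heads * head_dim, bias=False)  # KV projections

x = torch.randn(1, 16, embed_dim)        # (batch, seq_len, embed_dim)
B, T, _ = x.shape

q = q_proj(x).view(B, T, n_heads, head_dim).transpose(1, 2)      # (B, 24, T, 64)
k = k_proj(x).view(B, T, n_kv_heads, head_dim).transpose(1, 2)   # (B, 6, T, 64)
v = v_proj(x).view(B, T, n_kv_heads, head_dim).transpose(1, 2)

# Expand each KV head across its group of 4 query heads, then run standard attention.
k = k.repeat_interleave(group_size, dim=1)                        # (B, 24, T, 64)
v = v.repeat_interleave(group_size, dim=1)
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)     # (B, 24, T, 64)
out = out.transpose(1, 2).reshape(B, T, embed_dim)
```

Because the KV projections are only a quarter of the size of the query projection, GQA shrinks both the projection weights and the KV cache that must be kept in memory during decoding.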

MobileLLM-R1 is also notable for its training efficiency. It was pre-trained on approximately 4.2 trillion tokens, only about 11.7% of the 36 trillion tokens used for Qwen3-0.6B, yet it matches or exceeds Qwen3's accuracy. The model was then supervised fine-tuned on mathematical, coding, and reasoning datasets, keeping overall training cost and resource requirements low.

Across benchmarks, MobileLLM-R1-950M performs strongly: on MATH500, its accuracy is roughly five times that of OLMo-1.24B and about twice that of SmolLM2-1.7B. On reasoning and coding benchmarks such as GSM8K, AIME, and LiveCodeBench, MobileLLM-R1-950M matches or surpasses Qwen3-0.6B despite being trained on far fewer tokens.

However, MobileLLM-R1's specialization also brings limitations. While it performs strongly on mathematics, coding, and structured reasoning, it lags behind larger models in general conversation, common-sense reasoning, and creative tasks. In addition, production use is restricted by the FAIR NC (non-commercial) license, and the longer 32K context increases the key-value (KV) cache size and memory requirements during inference.
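The KV-cache pressure at 32K context can be roughly estimated from the published architecture figures. The back-of-the-envelope sketch below assumes an fp16 cache (2 bytes per value) and a head dimension of 64 (1536 / 24); actual memory use depends on the inference runtime.

```python
# Rough KV-cache size estimate for MobileLLM-R1-950M at 32K context.
# Assumptions (not from the article): fp16 cache (2 bytes/value) and
# head_dim = 1536 / 24 = 64; real memory use also depends on the runtime.
n_layers, n_kv_heads, head_dim = 22, 6, 64
seq_len, bytes_per_value = 32_768, 2

# Keys and values are each cached per layer and per KV head.
kv_cache_bytes = 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value
print(f"~{kv_cache_bytes / 2**30:.2f} GiB per sequence at 32K tokens")  # ~1.03 GiB
```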

Overall, Meta's MobileLLM-R1 illustrates a trend toward smaller, more specialized models that achieve competitive reasoning capabilities without large-scale training budgets. The model is particularly well suited to mathematical, coding, and scientific applications, setting a new reference point for deploying language models on edge devices.

Project: https://huggingface.co/facebook/MobileLLM-R1-950M
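For readers who want to try the checkpoint, here is a minimal usage sketch via the standard Hugging Face transformers interface. It assumes the repository is compatible with AutoModelForCausalLM and that a sufficiently recent transformers version is installed; the prompt and generation settings are arbitrary examples.

```python
# Minimal sketch for loading the released checkpoint with Hugging Face
# transformers. Assumes the repo works with AutoModelForCausalLM; check the
# model card for the exact supported versions and chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/MobileLLM-R1-950M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Compute 37 * 43 and explain each step."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```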

Key Points:   

🧩 **New Model Release**: Meta AI launched the MobileLLM-R1 series of lightweight edge inference models with 140M to 950M parameters.

📊 **Training Efficiency**: MobileLLM-R1 delivers strong results using only about 11.7% of the training data of Qwen3-0.6B, significantly reducing training costs and resource requirements.

💡 **Performance Advantages**: Across multiple benchmarks, MobileLLM-R1-950M outperformed several larger open-source models, especially on mathematical and coding tasks.