Meta's New AI Models: Llama 4 Explained
On April 5, 2025, Meta introduced the Llama 4 family: Scout and Maverick, released that day, and Behemoth, previewed while still in training. Let's take a quick look at what each of them can do.
Scout: Efficient and Capable
Scout has 17 billion active parameters (16 experts, 109 billion total) and, when quantized, fits on a single NVIDIA H100 GPU. Its standout feature is a context window of 10 million tokens, which lets it digest very long documents, summarize large datasets, and reason over entire codebases.
Meta reports that it beats comparable models such as Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a wide range of benchmarks.
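Here is a rough sketch of what a long-document task looks like in practice, using the Hugging Face transformers library. The checkpoint id is an assumption based on Meta's naming, and the real hardware requirements for a 109-billion-parameter model are substantial:

```python
# A rough sketch of long-document summarization with Scout via Hugging Face
# transformers. The checkpoint id is an assumption based on Meta's naming;
# real hardware requirements for a 109B-parameter model are substantial.
from transformers import AutoTokenizer, pipeline

MODEL_ID = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
generator = pipeline("text-generation", model=MODEL_ID, device_map="auto")

with open("long_report.txt") as f:
    document = f.read()

# With a 10M-token window, even book-length inputs can fit in one prompt.
print(f"Document is {len(tokenizer.encode(document)):,} tokens")

prompt = f"Summarize the key findings of this report:\n\n{document}\n\nSummary:"
result = generator(prompt, max_new_tokens=512)
print(result[0]["generated_text"])
```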
Maverick: Versatile and Strong
Maverick also has 17 billion active parameters, but it spreads them across 128 experts (about 400 billion parameters in total). It handles text and images together, and Meta reports that it beats GPT-4o and Gemini 2.0 Flash on coding, multilingual, and long-context tasks. Its Mixture of Experts (MoE) design activates only a small slice of the model per token, which is what keeps the compute cost down.
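To make that compute saving concrete, here is a toy top-1 MoE layer in PyTorch. It illustrates the general routing idea, not Meta's actual implementation: a router scores every expert for each token, and only the winning expert's weights actually run.

```python
# A toy top-1 Mixture-of-Experts layer (the general idea, not Meta's code).
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    def __init__(self, dim: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Pick the single best expert for each token.
        gate = self.router(x).softmax(dim=-1)  # (tokens, num_experts)
        top_w, top_idx = gate.max(dim=-1)      # winning weight and expert index
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i
            if mask.any():  # run an expert only on the tokens routed to it
                out[mask] = top_w[mask].unsqueeze(1) * expert(x[mask])
        return out

layer = MoELayer(dim=64, num_experts=8)   # small numbers for the demo
print(layer(torch.randn(10, 64)).shape)   # torch.Size([10, 64])
```

Because each token touches only one expert's feed-forward weights, the per-token compute stays close to that of a much smaller dense model.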
Behemoth: The Next Big Thing
Behemoth is still in training. It has 288 billion active parameters and nearly 2 trillion in total. Meta says it outperforms GPT-4.5 and other frontier models on math and science benchmarks, and the company is already using it as a teacher model to distill its smaller siblings.
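As a sketch of how that teacher role typically works, here is a standard knowledge-distillation loss in PyTorch. The temperature and mixing weight are illustrative choices, not Meta's training recipe:

```python
# A standard knowledge-distillation loss: the student is trained to match
# the teacher's output distribution. T and alpha are illustrative choices.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: push the student toward the teacher's softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale to compensate for the temperature
    # Hard targets: the usual cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```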
Multimodal Features
All three models are natively multimodal: they were trained on text, images, and video, and they use early fusion to combine these inputs in a single backbone. That breadth can help in healthcare, content creation, and more.
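Early fusion means the modalities are merged before any transformer layer runs, rather than being handled by a separate bolted-on vision module. A toy PyTorch sketch, with dimensions that are illustrative rather than Meta's:

```python
# A toy sketch of early fusion: image patches are projected into the text
# embedding space and concatenated with text tokens up front.
import torch
import torch.nn as nn

vocab_size, d_model, patch_dim = 32_000, 512, 768

embed_text = nn.Embedding(vocab_size, d_model)
project_image = nn.Linear(patch_dim, d_model)  # vision features -> d_model

text_ids = torch.randint(0, vocab_size, (1, 20))  # 20 text tokens
image_patches = torch.randn(1, 16, patch_dim)     # 16 image patches

fused = torch.cat([project_image(image_patches), embed_text(text_ids)], dim=1)
print(fused.shape)  # torch.Size([1, 36, 512]) -- one unified sequence
# From here a single transformer backbone attends over both modalities jointly.
```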
Open Source Questions
Meta released these models as “open,” but with strings attached: companies with more than 700 million monthly active users need Meta's permission to use them. Critics argue this restriction means the models are not truly open source.
RAG and Context Windows
Scout’s 10-million-token context window may shrink the need for Retrieval-Augmented Generation (RAG) in some workflows, since entire corpora can now fit directly in the prompt. RAG still earns its keep, though, when the model needs fresh or domain-specific information, or when the corpus outgrows even that window.
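One way to picture the trade-off is a prompt builder that only falls back to retrieval when the corpus outgrows the window. The `embed` function and `SimpleStore` class below are toy stand-ins for a real embedding model and vector index, just to keep the sketch self-contained:

```python
# A toy sketch of "long context vs. RAG": stuff everything into the prompt
# when it fits, retrieve only relevant chunks when it does not.
CONTEXT_WINDOW = 10_000_000  # Scout's advertised window, in tokens

def embed(text: str) -> set[str]:
    # Toy "embedding": a bag of lowercase words (stand-in for a real model).
    return set(text.lower().split())

class SimpleStore:
    def __init__(self, chunks: list[str]):
        self.chunks = chunks

    def search(self, query_vec: set[str], top_k: int = 5) -> list[str]:
        # Rank chunks by word overlap with the query (stand-in for a real index).
        ranked = sorted(self.chunks, key=lambda c: len(embed(c) & query_vec),
                        reverse=True)
        return ranked[:top_k]

def build_prompt(question: str, corpus: str, n_tokens: int, store: SimpleStore) -> str:
    if n_tokens < CONTEXT_WINDOW:
        # Long-context path: skip retrieval, the model sees the whole corpus.
        return f"{corpus}\n\nQuestion: {question}"
    # RAG path: the corpus is too big, so retrieve only the relevant chunks.
    context = "\n\n".join(store.search(embed(question), top_k=5))
    return f"{context}\n\nQuestion: {question}"
```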
Key Points
- Scout is efficient with a 10 million token context window.
- Maverick is powerful and handles both text and images well.
- Behemoth is huge and still in development.
- All models use multimodal input and a Mixture of Experts setup.
- There are questions about how “open” these models really are.