Cohere Releases Command A+ as Open Source: 218B Parameter Enterprise Model Runs on Two H100 GPUs

Cohere has released Command A+, a 218-billion-parameter mixture-of-experts model with 25 billion active parameters, under the permissive Apache 2.0 open-source license. In its 4-bit W4A4 variant, the model runs on as few as two H100 GPUs while delivering frontier-level reasoning, native citations, multimodal document processing, and support for 48 languages.

What Is Command A+?

Command A+ is Cohere’s most powerful model. It unifies the capabilities of four prior models (Command A, Command A Reasoning, Command A Vision, and Command A Translate) in a single architecture. It uses a Sparse Mixture-of-Experts Transformer with 128 experts, 8 of which are active per token, for a total of 218 billion parameters with only 25 billion active during inference. The small active parameter count keeps per-token compute low, and with 4-bit (W4A4) quantization the weights fit on as few as two H100 GPUs or a single B200 GPU.
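To make "128 experts, 8 active per token" concrete, here is a minimal sketch of generic top-k softmax gating in plain Python. It is an illustration of the general technique only: the source does not describe Cohere's actual router, and the function name route_token and the random logits are assumptions made for the example.

```python
import math
import random

NUM_EXPERTS = 128  # expert count, from the article
TOP_K = 8          # experts active per token, from the article


def route_token(router_logits, top_k=TOP_K):
    """Return {expert_index: weight} for the top_k highest-scoring experts.

    Generic top-k softmax gating (hypothetical helper); the real
    Command A+ router may work differently.
    """
    # Indices of the top_k largest router scores.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:top_k]
    # Numerically stable softmax over the selected scores only.
    peak = max(router_logits[i] for i in top)
    exps = {i: math.exp(router_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in scores
weights = route_token(logits)

print(len(weights), "of", NUM_EXPERTS, "experts selected")
print(f"share of experts active:    {TOP_K / NUM_EXPERTS:.2%}")
print(f"share of parameters active: {25 / 218:.1%}")
```

Only 6.25% of the experts run for a given token, yet about 11.5% of the parameters are active. That gap is consistent with some parameters (for example attention layers and embeddings) being used for every token, though the source does not give a breakdown.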

Why Is the Apache 2.0 License a Big Deal?

This is Cohere’s first model released entirely under Apache 2.0. Previous Cohere models used CC-BY-NC 4.0, which prohibited commercial use without an enterprise license. Apache 2.0 is a true open-source license: anyone can use, modify, distribute, and commercialize the model without licensing fees, provided they keep the license and attribution notices. This positions Command A+ as a direct competitor to DeepSeek V4 and Llama 4 for sovereign AI deployments.

What Benchmarks Does It Set?

Command A+ improved sharply over its predecessor: τ²-Bench Telecom rose from 37% to 85% (48 percentage points) and Terminal-Bench Hard from 3% to 25% (22 percentage points). On the Artificial Analysis Intelligence Index, it scores just under 37 points, on par with Claude Haiku 4.5. The model also introduces native citation generation, linking every factual claim directly to its source document.

What Hardware Does It Need?

Three precision variants are available: BF16 (16-bit, unquantized) needs 4x B200 or 8x H100 GPUs; FP8 (8-bit) needs 2x B200 or 4x H100; W4A4 (4-bit weights and activations) runs on a single B200 or 2x H100 GPUs. Quality differences among the three are reported as negligible. Cohere recommends W4A4 for most deployments; that variant uses NVFP4 quantization with Quantization-Aware Distillation.
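The GPU counts above follow roughly from weight-storage arithmetic. The sketch below multiplies the 218B parameter count by the bytes per weight of each variant and compares the result with total GPU memory. The per-GPU capacities (80 GB for H100, 192 GB for B200) are assumptions not stated in the article, and the estimate ignores KV cache, activations, and the small overhead of quantization scale factors, so real headroom is smaller.

```python
TOTAL_PARAMS = 218e9  # total parameters, from the article

# Bytes per stored weight for each variant named in the article.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "W4A4": 0.5}

# Assumed per-GPU memory in GB (not stated in the article).
GPU_MEMORY_GB = {"H100": 80, "B200": 192}

# GPU counts quoted in the article.
QUOTED_CONFIGS = {
    "BF16": {"B200": 4, "H100": 8},
    "FP8": {"B200": 2, "H100": 4},
    "W4A4": {"B200": 1, "H100": 2},
}


def weight_footprint_gb(variant):
    """Approximate weight storage in GB, ignoring scale-factor overhead."""
    return TOTAL_PARAMS * BYTES_PER_PARAM[variant] / 1e9


def headroom_gb(variant, gpu, count):
    """Memory left for KV cache and activations after loading the weights."""
    return GPU_MEMORY_GB[gpu] * count - weight_footprint_gb(variant)


for variant, configs in QUOTED_CONFIGS.items():
    for gpu, count in configs.items():
        print(f"{variant:5s} {count}x {gpu}: "
              f"weights ~{weight_footprint_gb(variant):.0f} GB, "
              f"headroom ~{headroom_gb(variant, gpu, count):.0f} GB")
```

Under these assumptions the weights come to roughly 436 GB in BF16, 218 GB in FP8, and 109 GB in W4A4, and every quoted configuration leaves positive headroom. Memory scales with the total parameter count, not the 25B active count, which is why quantization rather than sparsity determines how few GPUs are needed.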

Key Takeaways

218B total / 25B active parameters in a sparse MoE architecture
Released under fully permissive Apache 2.0 open-source license
Runs on as few as 2x H100 GPUs (W4A4 quantization)
Native citation generation with explicit grounding spans
Supports 48 languages with multimodal document processing
128K-token context window with up to 63% faster inference than its predecessor
Cohere recently merged with German AI startup Aleph Alpha

Frequently Asked Questions

Where can I download Command A+? Model weights are available on Hugging Face in multiple quantizations, with day-one support for vLLM and Hugging Face inference frameworks.

Is Command A+ better than DeepSeek V4? Command A+ competes favorably on reasoning and math benchmarks despite its smaller active parameter count, but trails DeepSeek V4 on pure coding benchmarks.

What does “sovereign AI” mean? Sovereign AI refers to the ability of governments and enterprises to run, control, and adapt AI within their own secure environments without sending data to external providers.