NVIDIA Expands DGX Spark to Support 4 Nodes for 700B Parameter AI Models

NVIDIA has expanded its DGX Spark platform to support clusters of up to four nodes, pooling 512 GB of memory and enabling local inference for models with up to 700 billion parameters. With tensor parallelism across the cluster, token generation throughput rises from 18,400 to 74,600 tokens per second, and time per output token drops from 269 ms to 72 ms.
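As a quick sanity check on the quoted figures, the improvement factors can be computed directly. The throughput and latency numbers come from the article; the resulting multipliers are our own arithmetic, not claims from NVIDIA.

```python
# Back-of-envelope check of the scaling figures quoted above.
# Input numbers are from the article; the factors are derived here.

throughput_gain = 74_600 / 18_400   # tokens/s: single node -> four nodes
latency_gain = 269 / 72             # ms per output token: before -> after

print(f"Throughput improves ~{throughput_gain:.1f}x")   # ~4.1x
print(f"Per-token latency improves ~{latency_gain:.1f}x")  # ~3.7x
```

Note that throughput scales slightly super-linearly relative to the latency gain here, which is expected when batching across nodes adds concurrency on top of the per-token speedup.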

The Grace Blackwell Superchip in each node lets the platform run multiple subagents concurrently. The available configurations cover a range of model sizes, including models such as Qwen3.5 and GLM 5. In addition, the Tile IR kernel portability layer lets code developed on DGX Spark be deployed to data center GPUs with minimal changes, smoothing the transition from local to cloud infrastructure.
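A rough memory-fit calculation suggests why four nodes are needed for models at the ~700B-parameter scale. The sketch below assumes 4-bit quantized weights and an even 128 GB share per node (512 GB / 4); neither assumption is stated in the article, and activation and KV-cache overhead are ignored.

```python
# Rough memory-fit sketch. Assumptions (not from the article):
# 4-bit weights, even memory split across nodes, no activation/KV overhead.

PARAMS = 700e9          # parameter count quoted in the article
BYTES_PER_PARAM = 0.5   # assumed 4-bit quantization
CLUSTER_MEM_GB = 512    # pooled memory quoted in the article
NODES = 4

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9   # ~350 GB of weights
per_node_gb = CLUSTER_MEM_GB / NODES          # 128 GB per node (assumed)

print(f"Weights: ~{weights_gb:.0f} GB")                     # ~350 GB
print(f"Fits in 4-node cluster: {weights_gb < CLUSTER_MEM_GB}")  # True
print(f"Fits on one node: {weights_gb < per_node_gb}")           # False
```

Under these assumptions the weights alone overflow any single node, which is what makes tensor parallelism across the cluster necessary rather than optional.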

Mar 17, 2026, 9:45 PM