NVIDIA Expands DGX Spark to Support 4 Nodes for 700B Parameter AI Models
NVIDIA has upgraded its DGX Spark platform to support up to four nodes, enabling local inference of AI models with up to 700 billion parameters. The expansion raises pooled memory to 512 GB and substantially improves token generation throughput. Built around the Grace Blackwell Superchip, the platform is suited to running autonomous agents, handling complex requests and multi-agent systems. The new capabilities give enterprises a viable way to balance local and cloud AI infrastructure.

NVIDIA has expanded its DGX Spark platform to support up to four nodes, increasing memory to 512 GB and enabling local inference of models with up to 700 billion parameters. With tensor parallelism across the cluster, token generation throughput rises from 18,400 to 74,600 tokens per second, roughly a fourfold gain, while output token processing time drops from 269 ms to 72 ms.
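The throughput gain comes from tensor parallelism, which shards each layer's weight matrices across devices so every node computes a slice of the output in parallel. A minimal NumPy sketch of the idea (the dimensions and four-way column split are illustrative, not DGX Spark specifics):

```python
import numpy as np

# Illustrative sizes; real model layers are far larger.
d_model, d_ff, n_nodes = 1024, 4096, 4

x = np.random.randn(8, d_model)      # a batch of activations
W = np.random.randn(d_model, d_ff)   # a full layer weight matrix

# Column-parallel split: each node holds d_ff / n_nodes columns of W.
shards = np.split(W, n_nodes, axis=1)

# Each node multiplies the same input by its own shard...
partials = [x @ shard for shard in shards]

# ...and the slices are concatenated (an all-gather in a real system).
y = np.concatenate(partials, axis=1)

assert np.allclose(y, x @ W)  # same result, 1/4 of the weights per node
```

Each node stores and computes only a quarter of the layer, which is why adding nodes both fits larger models and raises aggregate throughput, at the cost of inter-node communication for the gather step.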
The Grace Blackwell Superchip enables efficient concurrent processing of multiple subagents. The platform's configurations cover a range of model sizes, including models such as Qwen3.5 and GLM 5. In addition, the Tile IR kernel portability layer lets code developed on DGX Spark be deployed on data center GPUs with minimal changes, smoothing the transition from local to cloud infrastructure.
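A rough back-of-envelope check shows why 512 GB of pooled memory is the threshold for models at this scale. The calculation below assumes 4-bit quantized weights; the precision is an assumption for illustration, not a figure from the article:

```python
# Back-of-envelope memory footprint for a 700B-parameter model.
# Assumes 4-bit (0.5 byte) weights; the article does not state precision.
params = 700e9            # 700 billion parameters
bytes_per_param = 0.5     # 4-bit quantized weights
weights_gb = params * bytes_per_param / 1e9

print(f"~{weights_gb:.0f} GB of weights")  # ~350 GB
# Fits within 512 GB of pooled memory, leaving headroom
# for KV cache and activations during inference.
```

At higher precisions (8-bit or 16-bit weights) the same model would need roughly 700 GB or 1.4 TB, which is why low-precision inference is what makes this class of model viable on a four-node cluster.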
