Hardware
AMD unveils MI400 inference chip to challenge Nvidia H200
A new data-center GPU with 288GB of HBM3e and the ROCm 7 software stack targets LLM inference workloads at hyperscale.
AMD's MI400 accelerator pairs 288GB of HBM3e with a peak FP8 throughput of 5.3 petaflops, directly targeting Nvidia's H200 in inference-heavy deployments. ROCm 7 ships alongside the chip, with improved PyTorch and vLLM compatibility. Microsoft and Oracle have committed to MI400 deployments in H2 2026.
Source
AMD · amd.com