The correct answer is A: memory-transfer bottlenecks are addressed by GPU memory-system features such as high-bandwidth memory (HBM), shared memory, caches, and high-bandwidth interconnects or buses. NVIDIA’s Blackwell tuning guide describes the GPU memory system and states that the NVIDIA B200 GPU supports HBM3 and HBM3e memory with capacity up to 180 GB. NVIDIA’s CUDA tuning guides also describe shared memory as an architectural resource available per streaming multiprocessor (SM); staging reused data there reduces traffic to slower off-chip memory.
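To make the shared-memory point concrete, here is a rough counting sketch rather than a measurement. It assumes a square n x n matrix multiply, a tile size that divides n evenly, and no help from hardware caches; the function names are hypothetical and chosen only for illustration. It compares how many elements a naive kernel would read from global memory with how many a tiled kernel would read if each thread block first staged tiles in shared memory.

```python
def global_loads_naive(n: int) -> int:
    """Global-memory element loads for a naive n x n matrix multiply.

    Each of the n*n outputs reads one full row of A and one full
    column of B (2*n elements), with no reuse between threads.
    """
    return n * n * 2 * n


def global_loads_tiled(n: int, tile: int) -> int:
    """Global-memory element loads when tiles are staged in shared memory.

    Simplified model (an assumption, not NVIDIA's own accounting):
    each thread block computes one tile x tile output block and, at each
    of n // tile steps, loads one tile of A and one tile of B exactly once.
    """
    if n % tile != 0:
        raise ValueError("n must be a multiple of tile in this simplified model")
    blocks = (n // tile) ** 2          # output tiles, one per thread block
    steps = n // tile                  # tile pairs each block walks through
    loads_per_step = 2 * tile * tile   # one tile of A plus one tile of B
    return blocks * steps * loads_per_step


n, tile = 1024, 32
naive = global_loads_naive(n)
tiled = global_loads_tiled(n, tile)
print(f"naive loads: {naive}")
print(f"tiled loads: {tiled}")
print(f"reduction factor: {naive // tiled}")
```

Under these assumptions the reduction factor equals the tile width, which is why reusing data from on-chip shared memory eases pressure on the slower memory path. Real kernels will differ because caches, coalescing, and occupancy also matter.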
Why the other options are incorrect: GPUs do not normally act as main disk controllers. Increasing the number of generic PCIe I/O ports does not directly relieve memory-transfer bottlenecks that arise inside GPU execution of a simulation. Dedicated inference ASICs are not the general NVIDIA GPU architectural feature used to address memory-transfer performance in simulation workloads.
[Reference: NVIDIA CUDA Blackwell Tuning Guide; NVIDIA CUDA Ada GPU Architecture Tuning Guide.]