NVIDIA AI Infrastructure NCP-AII Question # 26 Topic 3 Discussion
A system administrator needs to install a GPU/DPU in a server. The server has a free PCI-e slot, there are enough free PCI-e lanes, and there is enough room for the card. Which procedure should be followed?
A.
Ensure the server has enough power. Verify compatibility of cables with server's platform. Make sure the server is down to remove cables safely. Do not wear an ESD bracelet.
B.
Ensure the server has enough power. Make sure the server is down to remove cables safely. Wear an ESD bracelet.
C.
Ensure the server has enough power. Make sure the server is up and running with attached cables. Wear an ESD bracelet.
D.
Ensure the server has enough power. Verify compatibility of cables with server's platform. Make sure the server is down to remove cables safely. Wear an ESD bracelet.
The physical installation of high-performance NVIDIA components, such as H100 PCIe GPUs or BlueField DPUs, requires adherence to data center safety and hardware-protection practices. Option D is the correct answer because it is the only choice that covers all four checks: power, cable compatibility, a powered-down server, and ESD protection. First, high-end GPUs can draw roughly 300 W to 450 W each, so verifying the capacity of the server's PDU feed and internal PSUs is essential to prevent over-current shutdowns. Second, verifying cable compatibility (such as 12VHPWR or platform-specific PCIe 8-pin power layouts) is vital to avoid electrical damage. Third, "cold service" (powering the server down and disconnecting its cables before opening it) is the standard for PCIe components that are not hot-pluggable, and it prevents short circuits. Finally, wearing an ESD (electrostatic discharge) bracelet is essential when handling this hardware, because static discharge can damage the HBM (High Bandwidth Memory) or the GPU die itself. Option A skips ESD protection, Option B omits the cable compatibility check, and Option C performs the installation while the system is "up and running"; each leaves out a step that protects the hardware.
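The power check described above comes down to simple arithmetic. The following is a minimal sketch in Python, not a vendor tool: the function name, the 20% safety margin, and all wattage figures are assumptions chosen for illustration, and real values should come from the server and card documentation.

```python
def has_power_headroom(psu_capacity_w, current_load_w, card_power_w, safety_margin=0.2):
    """Return True if adding the card keeps total draw within the derated PSU capacity.

    safety_margin is a hypothetical derating factor (20% by default),
    not a figure taken from any NVIDIA or server-vendor specification.
    """
    usable_w = psu_capacity_w * (1.0 - safety_margin)
    return current_load_w + card_power_w <= usable_w


# Illustrative numbers only: a 1600 W PSU at 900 W load, adding a 350 W card.
print(has_power_headroom(1600, 900, 350))
# The same card in a 1200 W server already drawing 800 W.
print(has_power_headroom(1200, 800, 350))
```

With these example figures the first server passes the check and the second does not, which is the situation where the installation should stop before any cables are touched.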