
Pass the NVIDIA-Certified Professional NCP-AIO Questions and answers with ValidTests


Viewing page 1 out of 2 pages
Viewing questions 1-10 out of questions
Questions # 1:

A system administrator wants to run these two commands in Base Command Manager.

main showprofile

device status apc01

What command should the system administrator use from the management node system shell?

Options:

A.

cmsh -c "main showprofile; device status apc01"

B.

cmsh -p "main showprofile; device status apc01"

C.

system -c "main showprofile; device status apc01"

D.

cmsh-system -c "main showprofile; device status apc01"

Questions # 2:

You are an administrator managing a large-scale Kubernetes-based GPU cluster using Run:AI.

To automate repetitive administrative tasks and efficiently manage resources across multiple nodes, which of the following is essential when using the Run:AI Administrator CLI for environments where automation or scripting is required?

Options:

A.

Use the runai-adm command to directly update Kubernetes nodes without requiring kubectl.

B.

Use the CLI to manually allocate specific GPUs to individual jobs for better resource management.

C.

Ensure that the Kubernetes configuration file is set up with cluster administrative rights before using the CLI.

D.

Install the CLI on Windows machines to take advantage of its scripting capabilities.

Questions # 3:

What must be done before installing new versions of DOCA drivers on a BlueField DPU?

Options:

A.

Uninstall any previous versions of DOCA drivers.

B.

Re-flash the firmware every time.

C.

Disable network interfaces during installation.

D.

Reboot the host system.

Questions # 4:

You have successfully pulled a TensorFlow container from NGC and now need to run it on your stand-alone GPU-enabled server.

Which command should you use to ensure that the container has access to all available GPUs?

Options:

A.

kubectl create pod --gpu=all nvcr.io/nvidia/tensorflow:

B.

docker run nvcr.io/nvidia/tensorflow:

C.

docker start nvcr.io/nvidia/tensorflow:

D.

docker run --gpus all nvcr.io/nvidia/tensorflow:

Questions # 5:

An instance of NVIDIA Fabric Manager service is running on an HGX system with KVM. A System Administrator is troubleshooting NVLink partitioning.

By default, what interval is the GPU polling subsystem set to?

Options:

A.

Every 1 second

B.

Every 30 seconds

C.

Every 60 seconds

D.

Every 10 seconds

Questions # 6:

An administrator is troubleshooting a bottleneck in a deep learning run time and needs consistent data feed rates to GPUs.

Which storage metric should be used?

Options:

A.

Disk I/O operations per second (IOPS)

B.

Disk free space

C.

Sequential read speed

D.

Disk utilization in performance manager

Questions # 7:

When troubleshooting Slurm job scheduling issues, a common source of problems is jobs getting stuck in a pending state indefinitely.

Which Slurm command can be used to view detailed information about all pending jobs and identify the cause of the delay?

Options:

A.

scontrol

B.

sacct

C.

sinfo

Questions # 8:

You are a Solutions Architect designing a data center infrastructure for a cloud-based AI application that requires high-performance networking, storage, and security. You need to choose a software framework to program the NVIDIA BlueField DPUs that will be used in the infrastructure. The framework must support the development of custom applications and services, as well as enable tailored solutions for specific workloads. Additionally, the framework should allow for the integration of storage services such as NVMe over Fabrics (NVMe-oF) and elastic block storage.

Which framework should you choose?

Options:

A.

NVIDIA TensorRT

B.

NVIDIA CUDA

C.

NVIDIA Nsight

D.

NVIDIA DOCA

Questions # 9:

You need to perform maintenance on a node. What should you do first?

Options:

A.

Drain the compute node using scontrol update.

B.

Set the node state to down in Slurm before completing maintenance.

C.

Set the node state to down in Slurm before completing maintenance.

D.

Disable job scheduling on all compute nodes in Slurm before completing maintenance.

Questions # 10:

A system administrator of a high-performance computing (HPC) cluster that uses an InfiniBand fabric for high-speed interconnects between nodes received reports from researchers that they are experiencing unusually slow data transfer rates between two specific compute nodes. The system administrator needs to ensure the path between these two nodes is optimal.

What command should be used?

Options:

A.

ibtracert

B.

ibstatus

C.

ibping

D.

ibnetdiscover
