NCP-ADS Exam Free Practice Questions: "NVIDIA-Certified-Professional Accelerated Data Science" Certification

A machine learning engineer is working with a financial dataset that contains multiple numerical features, including income, loan amount, and transaction frequency. Some features are normally distributed, while others have a highly skewed distribution with extreme outliers.
Which of the following approaches best ensures uniformity across features before training a model?
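To make the scenario concrete, the sketch below compares two common scaling schemes on a skewed feature with one extreme outlier. The income values are hypothetical, and the helper names `min_max_scale` and `robust_scale` are illustrative only, not part of any library. It is not meant to settle which answer option is intended, only to show how an outlier affects each scheme.

```python
import statistics

def min_max_scale(values):
    """Scale values to [0, 1] using the observed minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    median = statistics.median(values)
    iqr = q3 - q1
    return [(v - median) / iqr for v in values]

# Hypothetical incomes (in thousands) with one extreme outlier.
incomes = [30, 35, 40, 45, 50, 55, 60, 65, 5000]

min_max = min_max_scale(incomes)
robust = robust_scale(incomes)

# With min-max scaling the outlier pins the top of the range, so every
# typical value is squeezed into a tiny interval near zero.
print("min-max, typical values:", [round(x, 4) for x in min_max[:-1]])

# With median/IQR scaling the typical values keep a usable spread,
# while the outlier remains visibly extreme.
print("robust, typical values: ", [round(x, 2) for x in robust[:-1]])
print("robust, outlier:        ", round(robust[-1], 2))
```

The min-max output shows the typical incomes compressed below 0.01, whereas the median/IQR version spreads them across roughly one unit.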

You are tasked with optimizing a data science workflow to scale across multiple GPUs using Dask.
Which of the following approaches would be most effective for implementing data parallelism in this scenario? (Select two)

You are deploying an NVIDIA GPU-accelerated machine learning model in a Docker container and want to ensure that your application can leverage the GPU efficiently.
What is the best way to manage CUDA dependencies and avoid compatibility issues inside your Docker container?

You are working with a large dataset containing millions of rows, and you need to store it efficiently for fast read and write operations while maintaining compatibility with cuDF and pandas.
Which of the following file formats is the best choice for efficient columnar storage and GPU-accelerated processing?

You are working with a large dataset containing millions of high-resolution images for a deep learning project. The dataset needs to be processed efficiently on a GPU before training a model.
Which NVIDIA technology is best suited for preprocessing, augmenting, and efficiently loading the dataset into memory?

Which of the following NVIDIA technologies is primarily used to benchmark and optimize GPU-accelerated deep learning workflows, with a focus on model training performance?

You are working with a large dataset that contains missing values in multiple columns. Your goal is to prepare this dataset for training a machine learning model on an NVIDIA GPU using RAPIDS.
Which of the following approaches is the most efficient method to handle missing values in this scenario?

You are analyzing a transportation network where airports represent nodes and flight routes represent edges. You need to determine the most critical airports in the network based on how many shortest paths pass through them.
Which cuGraph centrality algorithm should you use for this task?
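The quantity described here, the number of shortest paths that pass through a node, is commonly called betweenness centrality. As a minimal CPU-only sketch of the idea (not cuGraph code), the following pure-Python function uses Brandes' algorithm on a small undirected, unweighted graph. The airport names and routes are hypothetical.

```python
from collections import deque

def betweenness_centrality(graph):
    """Unnormalized betweenness for an undirected, unweighted graph.

    graph maps each node to a list of its neighbors (Brandes' algorithm).
    """
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Phase 1: BFS from source, counting shortest paths (sigma).
        order = []                                # nodes by non-decreasing distance
        predecessors = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:                   # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:        # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Phase 2: accumulate dependencies from the farthest nodes back.
        delta = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2.0 for v, c in centrality.items()}

# Hypothetical route map: HUB connects A, B and C; D is reachable only via C.
routes = {
    "A": ["HUB"],
    "B": ["HUB"],
    "HUB": ["A", "B", "C"],
    "C": ["HUB", "D"],
    "D": ["C"],
}

scores = betweenness_centrality(routes)
for airport, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(airport, score)
```

In this toy network HUB lies on the shortest path of five airport pairs and C on three, so they rank as the most critical nodes.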

You are training a deep learning model on a large dataset. Initially, you train the model on a single GPU and achieve a training time of 10 hours. To speed up training, you switch to a multi-GPU setup with four GPUs. However, after testing, you notice that the training time is only reduced to 3.5 hours instead of the expected 2.5 hours (a linear speedup).
What is the most likely reason for this sublinear speedup?
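The arithmetic behind the scenario can be checked with a short sketch. It uses only the figures given in the question; the function name `speedup_and_efficiency` is illustrative.

```python
def speedup_and_efficiency(t_single_hours, t_multi_hours, n_gpus):
    """Return (speedup, parallel efficiency) for a multi-GPU run."""
    speedup = t_single_hours / t_multi_hours
    efficiency = speedup / n_gpus
    return speedup, efficiency

# Figures from the question: 10 h on one GPU, 3.5 h on four GPUs.
speedup, efficiency = speedup_and_efficiency(10.0, 3.5, 4)

ideal_time = 10.0 / 4               # 2.5 h under perfectly linear scaling
overhead_hours = 3.5 - ideal_time   # time not explained by ideal scaling

print(f"speedup: {speedup:.2f}x of an ideal 4x")
print(f"parallel efficiency: {efficiency:.1%}")
print(f"time beyond the ideal: {overhead_hours:.1f} h")
```

The observed run reaches about 2.86x rather than 4x, which is roughly 71% parallel efficiency, with one hour spent beyond the ideal 2.5 hours.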

You are working with large datasets in cuDF and have noticed significant performance bottlenecks due to repeated computation and excessive shuffling in your workflow. You want to use data caching to optimize the execution plan and reduce redundant operations.
Which of the following is the best way to implement data caching in cuDF to avoid repeated recomputation and excessive shuffling?

You are implementing a Dask-based solution for distributed data parallelism across a multi-GPU system.
Which configuration steps would ensure effective use of GPUs for parallel computation? (Select two)

You are processing a multi-terabyte dataset in cuDF and want to optimize query performance and storage efficiency.
Which approach should you follow to ensure that the dataset remains efficiently partitioned and easily accessible?