[Q41-Q57] MLA-C01 Certification Exam Question Set with Answers [February 2025]

Updated MLA-C01 Exam Practice Test Questions

Question # 41
A company's ML engineer has deployed an ML model for sentiment analysis to an Amazon SageMaker endpoint. The ML engineer needs to explain to company stakeholders how the model makes predictions.
Which solution will provide an explanation for the model's predictions?

  • A. Show the distribution of inferences from A/B testing in Amazon CloudWatch.
  • B. Use SageMaker Clarify on the deployed model.
  • C. Add a shadow endpoint. Analyze prediction differences on samples.
  • D. Use SageMaker Model Monitor on the deployed model.

Correct answer: B

Explanation:
SageMaker Clarify is designed to provide explainability for ML models. It can analyze feature importance and explain how input features influence the model's predictions. By using Clarify with the deployed SageMaker model, the ML engineer can generate insights and present them to stakeholders to explain the sentiment analysis predictions effectively.
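
Clarify's feature-attribution reports are based on SHAP (Shapley values). The sketch below is not the Clarify API; it is a small, exact Shapley computation on a hypothetical three-feature sentiment scorer (`toy_sentiment_score`, the feature names, and the all-zero baseline are assumptions made for illustration) to show what "how input features influence the prediction" means in practice.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Averages each feature's marginal contribution over every feature
    ordering, so it is only feasible for a handful of features.
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)        # start from the baseline input
        previous = predict(current)
        for i in order:
            current[i] = instance[i]    # reveal feature i
            score = predict(current)
            totals[i] += score - previous
            previous = score
    return [t / len(orderings) for t in totals]


# Hypothetical scorer over [positive_words, negative_words, exclamation_marks].
def toy_sentiment_score(x):
    return 0.5 * x[0] - 0.8 * x[1] + 0.1 * x[2]


review = [4, 1, 2]
baseline = [0, 0, 0]
attributions = shapley_values(toy_sentiment_score, review, baseline)
print(attributions)
```

The attributions sum to the difference between the prediction for the review and the prediction for the baseline, which is the property that makes this kind of report easy to present to stakeholders.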


Question # 42
An ML engineer needs to implement a solution to host a trained ML model. The rate of requests to the model will be inconsistent throughout the day.
The ML engineer needs a scalable solution that minimizes costs when the model is not in use. The solution also must maintain the model's capacity to respond to requests during times of peak usage.
Which solution will meet these requirements?

  • A. Deploy the model to an Amazon SageMaker endpoint. Deploy multiple copies of the model to the endpoint. Create an Application Load Balancer to route traffic between the different copies of the model at the endpoint.
  • B. Deploy the model on an Amazon Elastic Container Service (Amazon ECS) cluster that uses AWS Fargate. Set a static number of tasks to handle requests during times of peak usage.
  • C. Create AWS Lambda functions that have fixed concurrency to host the model. Configure the Lambda functions to automatically scale based on the number of requests to the model.
  • D. Deploy the model to an Amazon SageMaker endpoint. Create SageMaker endpoint auto scaling policies that are based on Amazon CloudWatch metrics to adjust the number of instances dynamically.

Correct answer: D
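
SageMaker endpoint auto scaling is configured through Application Auto Scaling. As a minimal sketch, the helper below only builds the two request payloads for a target-tracking policy; the endpoint name, variant name, capacity limits, target value, and cooldowns are placeholder assumptions, and no AWS call is made.

```python
def build_autoscaling_requests(endpoint_name, variant_name,
                               min_instances, max_instances,
                               invocations_per_instance):
    """Build payloads for register_scalable_target and put_scaling_policy."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    register_target = {
        **common,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }
    scaling_policy = {
        **common,
        "PolicyName": f"{endpoint_name}-invocations-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleInCooldown": 300,   # assumed values, tune per workload
            "ScaleOutCooldown": 60,
        },
    }
    return register_target, scaling_policy


# Placeholder names; with boto3 these would be passed as
#   client = boto3.client("application-autoscaling")
#   client.register_scalable_target(**register_target)
#   client.put_scaling_policy(**scaling_policy)
register_target, scaling_policy = build_autoscaling_requests(
    "demo-endpoint", "AllTraffic", 1, 4, 100
)
```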


Question # 43
A company is using ML to predict the presence of a specific weed in a farmer's field. The company is using the Amazon SageMaker linear learner built-in algorithm with a value of multiclass_classifier for the predictor_type hyperparameter.
What should the company do to MINIMIZE false positives?

  • A. Increase the number of training epochs.
  • B. Increase the value of the target_precision hyperparameter.
  • C. Set the value of the weight decay hyperparameter to zero.
  • D. Change the value of the predictor_type hyperparameter to regressor.

Correct answer: B

Explanation:
The target_precision hyperparameter in the Amazon SageMaker linear learner controls the trade-off between precision and recall. It takes effect when binary_classifier_model_selection_criteria is set to recall_at_target_precision: the algorithm then maximizes recall while holding precision at the target value. Increasing target_precision therefore reduces false positives by making the model more cautious about predicting the positive class. This approach suits use cases where false positives are more costly than false negatives.


Question # 44
A company uses Amazon Athena to query a dataset in Amazon S3. The dataset has a target variable that the company wants to predict.
The company needs to use the dataset in a solution to determine if a model can predict the target variable.
Which solution will provide this information with the LEAST development effort?

  • A. Configure Amazon Macie to analyze the dataset and to create a model. Report the model's achieved performance.
  • B. Implement custom scripts to perform data pre-processing, multiple linear regression, and performance evaluation. Run the scripts on Amazon EC2 instances.
  • C. Select a model from Amazon Bedrock. Tune the model with the data. Report the model's achieved performance.
  • D. Create a new model by using Amazon SageMaker Autopilot. Report the model's achieved performance.

Correct answer: D

Explanation:
Amazon SageMaker Autopilot automates the process of building, training, and tuning machine learning models. It provides insights into whether the target variable can be effectively predicted by evaluating the model's performance metrics. This solution requires minimal development effort as SageMaker Autopilot handles data preprocessing, algorithm selection, and hyperparameter optimization automatically, making it the most efficient choice for this scenario.


Question # 45
A credit card company has a fraud detection model in production on an Amazon SageMaker endpoint. The company develops a new version of the model. The company needs to assess the new model's performance by using live data and without affecting production end users.
Which solution will meet these requirements?

  • A. Set up blue/green deployments with canary traffic shifting.
  • B. Set up SageMaker Debugger and create a custom rule.
  • C. Set up shadow testing with a shadow variant of the new model.
  • D. Set up blue/green deployments with all-at-once traffic shifting.

Correct answer: C

Explanation:
Shadow testing allows you to send a copy of live production traffic to a shadow variant of the new model while keeping the existing production model unaffected. This enables you to evaluate the performance of the new model in real-time with live data without impacting end users. SageMaker endpoints support this setup by allowing traffic mirroring to the shadow variant, making it an ideal solution for assessing the new model's performance.
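
As a rough sketch of what such a setup looks like, the helper below builds a CreateEndpointConfig payload with one production variant and one shadow variant. The model names, config name, and instance type are placeholder assumptions, and the payload is only constructed here, not sent.

```python
def build_shadow_endpoint_config(config_name, production_model, shadow_model,
                                 instance_type="ml.m5.xlarge"):
    """Build a CreateEndpointConfig payload with a shadow variant."""
    def variant(name, model_name):
        return {
            "VariantName": name,
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1.0,
        }

    return {
        "EndpointConfigName": config_name,
        # Responses to callers come only from the production variant.
        "ProductionVariants": [variant("production", production_model)],
        # The shadow variant receives a copy of the requests; its responses
        # are not returned to callers.
        "ShadowProductionVariants": [variant("shadow", shadow_model)],
    }


# Placeholder names; with boto3 the payload would be passed as
#   boto3.client("sagemaker").create_endpoint_config(**endpoint_config)
endpoint_config = build_shadow_endpoint_config(
    "fraud-shadow-config", "fraud-model-v1", "fraud-model-v2"
)
```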


Question # 46
A company is building a deep learning model on Amazon SageMaker. The company uses a large amount of data as the training dataset. The company needs to optimize the model's hyperparameters to minimize the loss function on the validation dataset.
Which hyperparameter tuning strategy will accomplish this goal with the LEAST computation time?

  • A. Random search
  • B. Hyperband
  • C. Grid search
  • D. Bayesian optimization

Correct answer: B

Explanation:
Hyperband is a hyperparameter tuning strategy designed to minimize computation time by adaptively allocating resources to promising configurations and terminating underperforming ones early. It efficiently balances exploration and exploitation, making it ideal for large datasets and deep learning models where training can be computationally expensive.
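
The early-termination idea can be illustrated with successive halving, the subroutine that Hyperband runs repeatedly with different starting budgets. This is a simplified sketch rather than SageMaker's implementation, and `toy_validation_loss` is a hypothetical stand-in for training a model for a given budget.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations each round and multiply
    the per-configuration budget by eta.

    Returns (best_config, total_budget_spent).
    """
    survivors = list(configs)
    budget = min_budget
    spent = 0
    while len(survivors) > 1:
        scored = [(evaluate(c, budget), c) for c in survivors]
        spent += budget * len(survivors)
        scored.sort(key=lambda pair: pair[0])      # lower loss is better
        keep = max(1, len(survivors) // eta)
        survivors = [c for _, c in scored[:keep]]
        budget *= eta
    return survivors[0], spent


# Hypothetical loss: best near a learning rate of 0.1, improves with budget.
def toy_validation_loss(learning_rate, budget):
    return (learning_rate - 0.1) ** 2 + 1.0 / budget


learning_rates = [0.001, 0.01, 0.05, 0.1, 0.2, 0.3, 0.5, 0.7, 1.0]
best_lr, budget_spent = successive_halving(learning_rates, toy_validation_loss)
print(best_lr, budget_spent)
```

In this toy run, nine configurations are narrowed to one after 18 budget units, whereas training all nine at the larger budget of 3 would cost 27.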


Question # 47
An ML engineer is working on an ML model to predict the prices of similarly sized homes. The model will base predictions on several features. The ML engineer will use the following feature engineering techniques to estimate the prices of the homes:
* Feature splitting
* Logarithmic transformation
* One-hot encoding
* Standardized distribution
Select the correct feature engineering techniques for the following list of features. Each feature engineering technique should be selected one time or not at all. (Select three.)

Correct answer:
* City (name): One-hot encoding
* Type_year (type of home and year the home was built): Feature splitting
* Size of the building (square feet or square meters): Standardized distribution

Explanation:
* City (name): One-hot encoding
* Why? "City" is a categorical (non-numeric) feature, so one-hot encoding is used to transform it into a numeric format. The encoding creates one binary column for each unique category (e.g., cities like "New York" or "Los Angeles"), which the model can interpret.
* Type_year (type of home and year the home was built): Feature splitting
* Why? "Type_year" combines two pieces of information in one column, which could confuse the model. Feature splitting separates this column into two distinct features, "Type of home" and "Year built," so that the model can process each feature independently.
* Size of the building (square feet or square meters): Standardized distribution
* Why? Size is a continuous numerical variable. Standardization (scaling the feature to a mean of 0 and a standard deviation of 1) keeps it on a scale comparable to the other features, which avoids bias from differences in feature scale.
By applying these feature engineering techniques, the ML engineer can ensure that the input data is correctly formatted and optimized for the model to make accurate predictions.
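
The three techniques can be sketched with the standard library alone. The rows, column names, and the `type_year` format ("type_year" joined by an underscore) are made-up assumptions for illustration.

```python
from statistics import mean, pstdev

# Made-up rows; the column names and type_year format are assumptions.
homes = [
    {"city": "Austin", "type_year": "condo_1998", "size_sqft": 900.0},
    {"city": "Boston", "type_year": "house_2005", "size_sqft": 1500.0},
    {"city": "Austin", "type_year": "house_2012", "size_sqft": 2100.0},
]


def engineer_features(rows):
    cities = sorted({r["city"] for r in rows})
    sizes = [r["size_sqft"] for r in rows]
    mu, sigma = mean(sizes), pstdev(sizes)
    engineered = []
    for r in rows:
        # One-hot encoding: one binary column per city.
        features = {f"city_{c}": int(r["city"] == c) for c in cities}
        # Feature splitting: "condo_1998" -> "condo" and 1998.
        home_type, year = r["type_year"].rsplit("_", 1)
        features["home_type"] = home_type
        features["year_built"] = int(year)
        # Standardization: zero mean and unit standard deviation.
        features["size_std"] = (r["size_sqft"] - mu) / sigma
        engineered.append(features)
    return engineered


engineered = engineer_features(homes)
print(engineered[0])
```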


Question # 48
An ML engineer needs to use data with Amazon SageMaker Canvas to train an ML model. The data is stored in Amazon S3 and is complex in structure. The ML engineer must use a file format that minimizes processing time for the data.
Which file format will meet these requirements?

  • A. Apache Parquet files
  • B. JSON files compressed with gzip
  • C. JSON objects in JSONL format
  • D. CSV files compressed with Snappy

Correct answer: A

Explanation:
Apache Parquet is a columnar storage file format optimized for complex and large datasets. It provides efficient reading and processing by accessing only the required columns, which reduces I/O and speeds up data handling. This makes it ideal for use with Amazon SageMaker Canvas, where minimizing processing time is important for training ML models. Parquet is also compatible with S3 and widely supported in data analytics and ML workflows.


Question # 49
A company has an ML model that generates text descriptions based on images that customers upload to the company's website. The images can be up to 50 MB in total size.
An ML engineer decides to store the images in an Amazon S3 bucket. The ML engineer must implement a processing solution that can scale to accommodate changes in demand.
Which solution will meet these requirements with the LEAST operational overhead?

  • A. Create an Amazon SageMaker Asynchronous Inference endpoint and a scaling policy. Run a script to make an inference request for each image.
  • B. Create an Amazon Elastic Kubernetes Service (Amazon EKS) cluster that uses Karpenter for auto scaling. Host the model on the EKS cluster. Run a script to make an inference request for each image.
  • C. Create an AWS Batch job that uses an Amazon Elastic Container Service (Amazon ECS) cluster. Specify a list of images to process for each AWS Batch job.
  • D. Create an Amazon SageMaker batch transform job to process all the images in the S3 bucket.

Correct answer: A

Explanation:
SageMaker Asynchronous Inference is designed for processing large payloads, such as images up to 50 MB, and can handle requests that do not require an immediate response.
It scales automatically based on the demand, minimizing operational overhead while ensuring cost-efficiency.
A script can be used to send inference requests for each image, and the results can be retrieved asynchronously. This approach is ideal for accommodating varying levels of traffic with minimal manual intervention.
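
A minimal sketch of the "one request per image" script: asynchronous inference requests point at an object in S3 instead of carrying the payload inline. The endpoint name, bucket, and object keys below are placeholders, and the helper only builds the request parameters.

```python
def build_async_requests(endpoint_name, bucket, image_keys,
                         content_type="image/jpeg"):
    """Build one InvokeEndpointAsync parameter set per uploaded image."""
    return [
        {
            "EndpointName": endpoint_name,
            "InputLocation": f"s3://{bucket}/{key}",   # payload stays in S3
            "ContentType": content_type,
        }
        for key in image_keys
    ]


# Placeholder names; with boto3 each request would be sent as
#   runtime = boto3.client("sagemaker-runtime")
#   runtime.invoke_endpoint_async(**request)
requests = build_async_requests(
    "image-caption-async",
    "example-upload-bucket",
    ["uploads/cat.jpg", "uploads/dog.jpg"],
)
```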


Question # 50
A company that has hundreds of data scientists is using Amazon SageMaker to create ML models. The models are in model groups in the SageMaker Model Registry.
The data scientists are grouped into three categories: computer vision, natural language processing (NLP), and speech recognition. An ML engineer needs to implement a solution to organize the existing models into these groups to improve model discoverability at scale. The solution must not affect the integrity of the model artifacts and their existing groupings.
Which solution will meet these requirements?

  • A. Create a custom tag for each of the three categories. Add the tags to the model packages in the SageMaker Model Registry.
  • B. Create a model group for each category. Move the existing models into these category model groups.
  • C. Create a Model Registry collection for each of the three categories. Move the existing model groups into the collections.
  • D. Use SageMaker ML Lineage Tracking to automatically identify and tag which model groups should contain the models.

Correct answer: C

Explanation:
Model Registry Collections are designed to group existing model groups into categories, optionally in hierarchies, to improve model discoverability at scale. Adding a model group to a collection does not modify the model group or its underlying artifacts, so the existing groupings and the integrity of the model artifacts are preserved. Custom tags on model packages (option A) can help with filtering, but they do not organize the model groups into categories. Creating new model groups and moving the models into them (option B) would change the existing groupings.


Question # 51
A company is using an Amazon Redshift database as its single data source. Some of the data is sensitive.
A data scientist needs to use some of the sensitive data from the database. An ML engineer must give the data scientist access to the data without transforming the source data and without storing anonymized data in the database.
Which solution will meet these requirements with the LEAST implementation effort?

  • A. Create a materialized view with masking logic on top of the database. Grant the necessary read permissions to the data scientist.
  • B. Unload the Amazon Redshift data to Amazon S3. Create an AWS Glue job to anonymize the data. Share the dataset with the data scientist.
  • C. Unload the Amazon Redshift data to Amazon S3. Use Amazon Athena to create schema-on-read with masking logic. Share the view with the data scientist.
  • D. Configure dynamic data masking policies to control how sensitive data is shared with the data scientist at query time.

Correct answer: D

Explanation:
Dynamic data masking allows you to control how sensitive data is presented to users at query time, without modifying or storing transformed versions of the source data. Amazon Redshift supports dynamic data masking, which can be implemented with minimal effort. This solution ensures that the data scientist can access the required information while sensitive data remains protected, meeting the requirements with the least implementation effort.


Question # 52
An ML engineer needs to use an Amazon EMR cluster to process large volumes of data in batches. Any data loss is unacceptable.
Which instance purchasing option will meet these requirements MOST cost-effectively?

  • A. Run the primary node on an On-Demand Instance. Run the core nodes and task nodes on Spot Instances.
  • B. Run the primary node, core nodes, and task nodes on On-Demand Instances.
  • C. Run the primary node, core nodes, and task nodes on Spot Instances.
  • D. Run the primary node and core nodes on On-Demand Instances. Run the task nodes on Spot Instances.

Correct answer: D

Explanation:
In Amazon EMR, the primary node coordinates the cluster, and the core nodes store data in HDFS and run tasks. Running these nodes on On-Demand Instances prevents data loss, because Spot Instances can be interrupted. The task nodes, which provide additional processing capacity but do not store data, can use Spot Instances to reduce costs without compromising the cluster's resilience or data integrity. This configuration balances cost-effectiveness and reliability.
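
As an illustrative sketch, this purchasing split maps onto the instance group definitions that an EMR cluster request takes. The instance type and node counts are placeholder assumptions, and only the payload fragment is built.

```python
def build_instance_groups(core_count, task_count, instance_type="m5.xlarge"):
    """Instance groups with On-Demand primary/core nodes and Spot task nodes."""
    def group(name, role, market, count):
        return {
            "Name": name,
            "InstanceRole": role,     # the API uses MASTER for the primary node
            "Market": market,
            "InstanceType": instance_type,
            "InstanceCount": count,
        }

    return [
        group("primary", "MASTER", "ON_DEMAND", 1),
        group("core", "CORE", "ON_DEMAND", core_count),   # holds HDFS data
        group("task", "TASK", "SPOT", task_count),        # compute only
    ]


# With boto3, this list would be passed as Instances={"InstanceGroups": ...}
# to boto3.client("emr").run_job_flow(...), along with the other cluster settings.
instance_groups = build_instance_groups(core_count=2, task_count=4)
```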


Question # 53
A company has implemented a data ingestion pipeline for sales transactions from its ecommerce website. The company uses Amazon Data Firehose to ingest data into Amazon OpenSearch Service. The buffer interval of the Firehose stream is set for 60 seconds. An OpenSearch linear model generates real-time sales forecasts based on the data and presents the data in an OpenSearch dashboard.
The company needs to optimize the data ingestion pipeline to support sub-second latency for the real-time dashboard.
Which change to the architecture will meet these requirements?

  • A. Replace the Firehose stream with an AWS DataSync task. Configure the task with enhanced fan-out consumers.
  • B. Replace the Firehose stream with an Amazon Simple Queue Service (Amazon SQS) queue.
  • C. Increase the buffer interval of the Firehose stream from 60 seconds to 120 seconds.
  • D. Use zero buffering in the Firehose stream. Tune the batch size that is used in the PutRecordBatch operation.

Correct answer: D

Explanation:
Amazon Data Firehose (formerly Amazon Kinesis Data Firehose) delivers streaming data in near real time. Setting the buffering hints to zero or a very small value minimizes the buffering delay, so that records are delivered to the destination (Amazon OpenSearch Service) as quickly as possible. Tuning the batch size in the PutRecordBatch operation can further reduce ingestion latency. This approach minimizes latency while keeping the operational simplicity of Firehose.
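
On the producer side, batch-size tuning comes down to how records are grouped per PutRecordBatch call. The sketch below groups made-up sales events under the documented per-call limits of 500 records and 4 MiB; the event fields and the smaller batch size of 50 are illustrative assumptions, and nothing is sent to Firehose.

```python
import json

MAX_RECORDS_PER_CALL = 500               # PutRecordBatch record limit
MAX_BYTES_PER_CALL = 4 * 1024 * 1024     # PutRecordBatch size limit (4 MiB)


def batch_records(events, max_records=MAX_RECORDS_PER_CALL,
                  max_bytes=MAX_BYTES_PER_CALL):
    """Group events into lists of {"Data": bytes} records, one per call."""
    batches, current, current_bytes = [], [], 0
    for event in events:
        data = (json.dumps(event) + "\n").encode("utf-8")
        too_many = len(current) >= max_records
        too_big = current_bytes + len(data) > max_bytes
        if current and (too_many or too_big):
            batches.append(current)
            current, current_bytes = [], 0
        current.append({"Data": data})
        current_bytes += len(data)
    if current:
        batches.append(current)
    return batches


# Made-up sales events.
sales = [{"order_id": i, "amount": 10.0 + i} for i in range(1200)]
batches = batch_records(sales)
small_batches = batch_records(sales, max_records=50)
print([len(b) for b in batches], len(small_batches))
```

Smaller batches mean each record waits less time on the producer before it is sent, at the cost of more API calls.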


Question # 54
An ML engineer needs to use Amazon SageMaker Feature Store to create and manage features to train a model.
Select and order the steps from the following list to create and use the features in Feature Store. Each step should be selected one time. (Select and order three.)
* Access the store to build datasets for training.
* Create a feature group.
* Ingest the records.

Correct answer:
Step 1: Create a feature group.
Step 2: Ingest the records.
Step 3: Access the store to build datasets for training.

Explanation:
* Step 1: Create a Feature Group
* Why? A feature group is the foundational unit in SageMaker Feature Store, where features are defined, stored, and organized. Creating a feature group specifies the schema (name, data type) for the features and the primary keys for data identification.
* How? Use the SageMaker Python SDK or AWS CLI to define the feature group by specifying its name, schema, and S3 storage location for offline access.
* Step 2: Ingest the Records
* Why? After creating the feature group, the raw data must be ingested into the Feature Store. This step populates the feature group with data, making it available for both real-time and offline use.
* How? Use the SageMaker SDK or AWS CLI to batch-ingest historical data or stream new records into the feature group. Ensure the records conform to the feature group schema.
* Step 3: Access the Store to Build Datasets for Training
* Why? Once the features are stored, they can be accessed to create training datasets. These datasets combine relevant features into a single format for machine learning model training.
* How? Use the SageMaker Python SDK to query the offline store or retrieve real-time features using the online store API. The offline store is typically used for batch training, while the online store is used for inference.
Order Summary:
* Create a feature group.
* Ingest the records.
* Access the store to build datasets for training.
This process ensures the features are properly managed, ingested, and accessible for model training using Amazon SageMaker Feature Store.
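
The three steps can be sketched as the request payloads each one would send. The feature group name, feature names, S3 URI, and role ARN are hypothetical placeholders, and the helpers only build dictionaries rather than calling AWS.

```python
def build_create_feature_group_request(name, s3_uri, role_arn):
    """Step 1: schema, record identifier, event time, and store settings."""
    return {
        "FeatureGroupName": name,
        "RecordIdentifierFeatureName": "customer_id",
        "EventTimeFeatureName": "event_time",
        "FeatureDefinitions": [
            {"FeatureName": "customer_id", "FeatureType": "String"},
            {"FeatureName": "event_time", "FeatureType": "String"},
            {"FeatureName": "avg_order_value", "FeatureType": "Fractional"},
        ],
        "OnlineStoreConfig": {"EnableOnlineStore": True},
        "OfflineStoreConfig": {"S3StorageConfig": {"S3Uri": s3_uri}},
        "RoleArn": role_arn,
    }


def build_put_record_request(name, record):
    """Step 2: one record, with every value serialized as a string."""
    return {
        "FeatureGroupName": name,
        "Record": [
            {"FeatureName": key, "ValueAsString": str(value)}
            for key, value in record.items()
        ],
    }


def build_get_record_request(name, record_id):
    """Step 3 (online store): look up the latest record by identifier."""
    return {
        "FeatureGroupName": name,
        "RecordIdentifierValueAsString": str(record_id),
    }


# Hypothetical values. With boto3 these map to
#   boto3.client("sagemaker").create_feature_group(**create_request)
#   boto3.client("sagemaker-featurestore-runtime").put_record(**put_request)
#   boto3.client("sagemaker-featurestore-runtime").get_record(**get_request)
create_request = build_create_feature_group_request(
    "customers-demo",
    "s3://example-feature-bucket/offline",
    "arn:aws:iam::123456789012:role/ExampleFeatureStoreRole",
)
put_request = build_put_record_request(
    "customers-demo",
    {"customer_id": "c-001", "event_time": "2025-02-01T00:00:00Z",
     "avg_order_value": 42.5},
)
get_request = build_get_record_request("customers-demo", "c-001")
```

For training datasets, the offline store in S3 is the usual source; the online lookup shown in step 3 is the low-latency path used at inference time.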


Question # 55
An ML engineer has developed a binary classification model outside of Amazon SageMaker. The ML engineer needs to make the model accessible to a SageMaker Canvas user for additional tuning.
The model artifacts are stored in an Amazon S3 bucket. The ML engineer and the Canvas user are part of the same SageMaker domain.
Which combination of requirements must be met so that the ML engineer can share the model with the Canvas user? (Choose two.)

  • A. The ML engineer must host the model on AWS Marketplace.
  • B. The Canvas user must have permissions to access the S3 bucket where the model artifacts are stored.
  • C. The model must be registered in the SageMaker Model Registry.
  • D. The ML engineer and the Canvas user must be in separate SageMaker domains.
  • E. The ML engineer must deploy the model to a SageMaker endpoint.

Correct answer: B, C

Explanation:
The SageMaker Canvas user needs permissions to access the Amazon S3 bucket where the model artifacts are stored to retrieve the model for use in Canvas.
Registering the model in the SageMaker Model Registry allows the model to be tracked and managed within the SageMaker ecosystem. This makes it accessible for tuning and deployment through SageMaker Canvas.
This combination ensures proper access control and integration within SageMaker, enabling the Canvas user to work with the model.


Question # 56
A company wants to improve the sustainability of its ML operations.
Which actions will reduce the energy usage and computational resources that are associated with the company's training jobs? (Choose two.)

  • A. Deploy models by using AWS Lambda functions.
  • B. Use AWS Trainium instances for training.
  • C. Use Amazon SageMaker Ground Truth for data labeling.
  • D. Use Amazon SageMaker Debugger to stop training jobs when non-converging conditions are detected.
  • E. Use PyTorch or TensorFlow with the distributed training option.

Correct answer: B, D

Explanation:
SageMaker Debugger can identify when a training job is not converging or is stuck in a non-productive state.
By stopping these jobs early, unnecessary energy and computational resources are conserved, improving sustainability.
AWS Trainium instances are purpose-built for ML training and are optimized for energy efficiency and cost-effectiveness. They use less energy per training task compared to general-purpose instances, making them a sustainable choice.
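
The kind of condition that such a rule checks can be sketched as a simple plateau detector. This is not Debugger's implementation; the `patience` and `min_improvement` parameters and the loss values are assumptions made for illustration.

```python
def first_non_converging_step(losses, patience=3, min_improvement=0.01):
    """Return the step at which training would be stopped because the loss
    has not improved by min_improvement for `patience` consecutive steps,
    or None if that never happens."""
    best = float("inf")
    stale_steps = 0
    for step, loss in enumerate(losses):
        if loss < best - min_improvement:
            best = loss
            stale_steps = 0
        else:
            stale_steps += 1
            if stale_steps >= patience:
                return step
    return None


# Made-up loss curves: one plateaus, one keeps improving.
plateauing = [2.0, 1.5, 1.2, 1.198, 1.2, 1.21, 1.2, 1.19]
converging = [1.0, 0.8, 0.6, 0.4]

stop_step = first_non_converging_step(plateauing)
print(stop_step, first_non_converging_step(converging))
```

Every step after the stop point is compute, and therefore energy, that the job no longer consumes.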


Question # 57
......

Updated question set with verified MLA-C01 questions and answers and a 100% first-attempt pass guarantee: https://drive.google.com/open?id=1fYRwAugX617m_0vGNJx1njk_EO5dVIj_

There are 85 questions for the AWS Certified Associate MLA-C01 exam: https://www.goshiken.com/Amazon/MLA-C01-mondaishu.html