AWS Certified Machine Learning Engineer - Associate (MLA-C01) Practice Questions and Answers with CertsForce

Question # 1:

An ML engineer needs to create data ingestion pipelines and ML model deployment pipelines on AWS. All the raw data is stored in Amazon S3 buckets.

Which solution will meet these requirements?

Options:

A.

Use Amazon Data Firehose to create the data ingestion pipelines. Use Amazon SageMaker Studio Classic to create the model deployment pipelines.


B.

Use AWS Glue to create the data ingestion pipelines. Use Amazon SageMaker Studio Classic to create the model deployment pipelines.


C.

Use Amazon Redshift ML to create the data ingestion pipelines. Use Amazon SageMaker Studio Classic to create the model deployment pipelines.


D.

Use Amazon Athena to create the data ingestion pipelines. Use an Amazon SageMaker notebook to create the model deployment pipelines.
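
For orientation on the tooling in these options: both AWS Glue jobs and SageMaker pipelines can be created programmatically. Below is a minimal, illustrative boto3 sketch of defining and starting a Glue ingestion job that reads raw data from S3; the job name, role ARN, and script location are placeholders, not values from the question.

```python
import boto3

glue = boto3.client("glue")

# Define a Glue ETL job whose script reads the raw data from S3.
glue.create_job(
    Name="raw-data-ingestion",                       # hypothetical job name
    Role="arn:aws:iam::123456789012:role/GlueRole",  # placeholder role ARN
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://my-bucket/scripts/ingest.py",  # placeholder
        "PythonVersion": "3",
    },
    GlueVersion="4.0",
)

# Start a run of the job once it is defined.
glue.start_job_run(JobName="raw-data-ingestion")
```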


Question # 2:

An ML engineer needs to use Amazon SageMaker Feature Store to create and manage features to train a model.

Select and order the steps from the following list to create and use the features in Feature Store. Each step should be selected one time. (Select and order three.)

• Access the store to build datasets for training.

• Create a feature group.

• Ingest the records.
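
As a point of reference, the three steps map directly onto the SageMaker Python SDK. The sketch below shows them in the order they are normally performed; the feature group name, role ARN, and toy DataFrame are placeholders.

```python
import time

import pandas as pd
import sagemaker
from sagemaker.feature_store.feature_group import FeatureGroup

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder

df = pd.DataFrame(
    {"record_id": [1, 2], "feature_a": [0.5, 0.7], "event_time": [1.7e9, 1.7e9]}
)

# Step 1: create a feature group.
fg = FeatureGroup(name="demo-features", sagemaker_session=session)
fg.load_feature_definitions(data_frame=df)
fg.create(
    s3_uri=f"s3://{session.default_bucket()}/feature-store",
    record_identifier_name="record_id",
    event_time_feature_name="event_time",
    role_arn=role,
    enable_online_store=True,
)
while fg.describe().get("FeatureGroupStatus") == "Creating":
    time.sleep(5)  # wait until the feature group is active

# Step 2: ingest the records.
fg.ingest(data_frame=df, max_workers=1, wait=True)

# Step 3: access the offline store to build a training dataset.
query = fg.athena_query()
query.run(
    query_string=f'SELECT * FROM "{query.table_name}"',
    output_location=f"s3://{session.default_bucket()}/query-results",
)
query.wait()
training_df = query.as_dataframe()
```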



Question # 3:

A company needs to give its ML engineers appropriate access to training data. The ML engineers must access training data from only their own business group. The ML engineers must not be allowed to access training data from other business groups.

The company uses a single AWS account and stores all the training data in Amazon S3 buckets. All ML model training occurs in Amazon SageMaker.

Which solution will provide the ML engineers with the appropriate access?

Options:

A.

Enable S3 bucket versioning.


B.

Configure S3 Object Lock settings for each user.


C.

Add cross-origin resource sharing (CORS) policies to the S3 buckets.


D.

Create IAM policies. Attach the policies to IAM users or IAM roles.
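
For illustration, access of this kind is usually expressed as an IAM policy that scopes a business group to its own S3 prefix. The sketch below creates such a policy with boto3; the bucket name, prefix, and policy name are hypothetical.

```python
import json

import boto3

iam = boto3.client("iam")

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {   # Let the group list only its own prefix.
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::training-data",        # placeholder bucket
            "Condition": {"StringLike": {"s3:prefix": "group-a/*"}},
        },
        {   # Let the group read objects only under its prefix.
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::training-data/group-a/*",
        },
    ],
}

iam.create_policy(
    PolicyName="GroupATrainingDataAccess",   # hypothetical name
    PolicyDocument=json.dumps(policy_document),
)
# The policy would then be attached, e.g. with iam.attach_role_policy(...),
# to the role that the group's SageMaker jobs assume.
```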


Question # 4:

A company wants to improve the sustainability of its ML operations.

Which actions will reduce the energy usage and computational resources that are associated with the company's training jobs? (Choose two.)

Options:

A.

Use Amazon SageMaker Debugger to stop training jobs when non-converging conditions are detected.


B.

Use Amazon SageMaker Ground Truth for data labeling.


C.

Deploy models by using AWS Lambda functions.


D.

Use AWS Trainium instances for training.


E.

Use PyTorch or TensorFlow with the distributed training option.
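
As background on option A: SageMaker Debugger ships built-in rules and a StopTraining action that ends a job when a rule such as loss_not_decreasing fires. A minimal sketch with the SageMaker Python SDK follows; the entry point, role ARN, and framework versions are placeholders, and the Trainium instance type is shown only to echo option D.

```python
from sagemaker.debugger import Rule, rule_configs
from sagemaker.pytorch import PyTorch

# Stop the job automatically when the loss stops decreasing.
actions = rule_configs.ActionList(rule_configs.StopTraining())
rules = [Rule.sagemaker(rule_configs.loss_not_decreasing(), actions=actions)]

estimator = PyTorch(
    entry_point="train.py",                               # placeholder script
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    instance_count=1,
    instance_type="ml.trn1.2xlarge",                      # Trainium-based
    framework_version="1.13",
    py_version="py39",
    rules=rules,
)
# estimator.fit({"training": "s3://my-bucket/train"})     # placeholder input
```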


Question # 5:

A company wants to develop an ML model by using tabular data from its customers. The data contains meaningful ordered features with sensitive information that should not be discarded. An ML engineer must ensure that the sensitive data is masked before another team starts to build the model.

Which solution will meet these requirements?

Options:

A.

Use Amazon Macie to categorize the sensitive data.


B.

Prepare the data by using AWS Glue DataBrew.


C.

Run an AWS Batch job to change the sensitive data to random values.


D.

Run an Amazon EMR job to change the sensitive data to random values.
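
For context on option B: DataBrew applies masking through recipe steps. The boto3 sketch below is an assumption-laden illustration; the MASK_CUSTOM operation name and its parameters should be verified against the DataBrew recipe-action reference, and the column name is hypothetical.

```python
import boto3

databrew = boto3.client("databrew")

# Assumed recipe step that replaces a sensitive column's characters with a
# mask symbol; check the operation and parameter names against the current
# DataBrew documentation before relying on them.
databrew.create_recipe(
    Name="mask-sensitive-columns",             # hypothetical name
    Steps=[
        {
            "Action": {
                "Operation": "MASK_CUSTOM",    # assumed operation name
                "Parameters": {
                    "sourceColumn": "ssn",     # placeholder column
                    "maskSymbol": "#",
                },
            }
        }
    ],
)
```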


Question # 6:

A company is using Amazon SageMaker and millions of files to train an ML model. Each file is several megabytes in size. The files are stored in an Amazon S3 bucket. The company needs to improve training performance.

Which solution will meet these requirements in the LEAST amount of time?

Options:

A.

Transfer the data to a new S3 bucket that provides S3 Express One Zone storage. Adjust the training job to use the new S3 bucket.


B.

Create an Amazon FSx for Lustre file system. Link the file system to the existing S3 bucket. Adjust the training job to read from the file system.


C.

Create an Amazon Elastic File System (Amazon EFS) file system. Transfer the existing data to the file system. Adjust the training job to read from the file system.


D.

Create an Amazon ElastiCache (Redis OSS) cluster. Link the Redis OSS cluster to the existing S3 bucket. Stream the data from the Redis OSS cluster directly to the training job.
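
For reference on option B: a SageMaker training job can read from FSx for Lustre through a FileSystemInput, which requires the job to run inside the file system's VPC. A minimal sketch, with the file system ID, image URI, role ARN, and network IDs as placeholders:

```python
from sagemaker.estimator import Estimator
from sagemaker.inputs import FileSystemInput

# Read training data from an FSx for Lustre file system that is linked
# to the existing S3 bucket.
train_input = FileSystemInput(
    file_system_id="fs-0123456789abcdef0",    # placeholder
    file_system_type="FSxLustre",
    directory_path="/fsxlustre/train",        # placeholder mount path
    file_system_access_mode="ro",
)

estimator = Estimator(
    image_uri="<training-image-uri>",                     # placeholder
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    subnets=["subnet-0123456789abcdef0"],                 # FSx lives in a VPC
    security_group_ids=["sg-0123456789abcdef0"],
)
# estimator.fit({"training": train_input})
```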


Question # 7:

An ML engineer needs to process thousands of existing CSV objects and new CSV objects that are uploaded. The CSV objects are stored in a central Amazon S3 bucket and have the same number of columns. One of the columns is a transaction date. The ML engineer must query the data based on the transaction date.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A.

Use an Amazon Athena CREATE TABLE AS SELECT (CTAS) statement to create a table based on the transaction date from data in the central S3 bucket. Query the objects from the table.


B.

Create a new S3 bucket for processed data. Set up S3 replication from the central S3 bucket to the new S3 bucket. Use S3 Object Lambda to query the objects based on transaction date.


C.

Create a new S3 bucket for processed data. Use AWS Glue for Apache Spark to create a job to query the CSV objects based on transaction date. Configure the job to store the results in the new S3 bucket. Query the objects from the new S3 bucket.


D.

Create a new S3 bucket for processed data. Use Amazon Data Firehose to transfer the data from the central S3 bucket to the new S3 bucket. Configure Firehose to run an AWS Lambda function to query the data based on transaction date.
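
To illustrate option A: an Athena CTAS statement can write the CSV data out once, partitioned by the transaction date, so later queries prune by partition. The sketch below submits such a statement with boto3; the database, table, column, and bucket names are placeholders.

```python
import boto3

athena = boto3.client("athena")

# CTAS: rewrite the raw CSV table as Parquet, partitioned by date.
# The partition column must be last in the SELECT list.
ctas = """
CREATE TABLE analytics.transactions_by_date
WITH (
    format = 'PARQUET',
    external_location = 's3://processed-bucket/transactions/',
    partitioned_by = ARRAY['transaction_date']
) AS
SELECT col_a, col_b, transaction_date
FROM raw.transactions_csv
"""

athena.start_query_execution(
    QueryString=ctas,
    ResultConfiguration={"OutputLocation": "s3://query-results-bucket/"},
)
```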


Question # 8:

A company has deployed an ML model that detects fraudulent credit card transactions in real time in a banking application. The model uses Amazon SageMaker Asynchronous Inference. Consumers are reporting delays in receiving the inference results.

An ML engineer needs to implement a solution to improve the inference performance. The solution also must provide a notification when a deviation in model quality occurs.

Which solution will meet these requirements?

Options:

A.

Use SageMaker real-time inference for inference. Use SageMaker Model Monitor for notifications about model quality.


B.

Use SageMaker batch transform for inference. Use SageMaker Model Monitor for notifications about model quality.


C.

Use SageMaker Serverless Inference for inference. Use SageMaker Inference Recommender for notifications about model quality.


D.

Keep using SageMaker Asynchronous Inference for inference. Use SageMaker Inference Recommender for notifications about model quality.
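
For background on the Model Monitor half of these options: model-quality monitoring attaches a schedule to an endpoint, compares predictions against ground truth, and publishes metrics that can drive notifications. A hedged sketch with the SageMaker Python SDK; the endpoint name, S3 URIs, and the inference_attribute layout are assumptions.

```python
from sagemaker.model_monitor import (
    CronExpressionGenerator,
    EndpointInput,
    ModelQualityMonitor,
)

role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder

monitor = ModelQualityMonitor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

monitor.create_monitoring_schedule(
    monitor_schedule_name="fraud-model-quality",        # hypothetical name
    endpoint_input=EndpointInput(
        endpoint_name="fraud-endpoint",                 # placeholder endpoint
        destination="/opt/ml/processing/input",
        inference_attribute="0",                        # assumed output layout
    ),
    ground_truth_input="s3://my-bucket/ground-truth/",  # placeholder
    problem_type="BinaryClassification",
    output_s3_uri="s3://my-bucket/monitoring/",         # placeholder
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)
# A CloudWatch alarm on the emitted metrics can deliver the notification.
```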


Question # 9:

Case study

An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.

The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.

Before the ML engineer trains the model, the ML engineer must resolve the issue of the imbalanced data.

Which solution will meet this requirement with the LEAST operational effort?

Options:

A.

Use Amazon Athena to identify patterns that contribute to the imbalance. Adjust the dataset accordingly.


B.

Use Amazon SageMaker Studio Classic built-in algorithms to process the imbalanced dataset.


C.

Use AWS Glue DataBrew built-in features to oversample the minority class.


D.

Use the Amazon SageMaker Data Wrangler balance data operation to oversample the minority class.
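
For intuition about what "oversample the minority class" means in options C and D: the balancing operations in DataBrew and Data Wrangler are configured in their UIs, but random oversampling is conceptually just duplicating minority rows, as this standalone pandas sketch with toy data shows.

```python
import pandas as pd

# Toy, deliberately imbalanced dataset.
df = pd.DataFrame(
    {"amount": [10, 25, 40, 12, 900], "is_fraud": [0, 0, 0, 0, 1]}
)

majority = df[df["is_fraud"] == 0]
minority = df[df["is_fraud"] == 1]

# Duplicate minority rows (sampling with replacement) until the classes
# are the same size, then shuffle.
oversampled = minority.sample(n=len(majority), replace=True, random_state=42)
balanced = pd.concat([majority, oversampled]).sample(frac=1, random_state=42)

print(balanced["is_fraud"].value_counts())
```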


Question # 10:

An ML engineer trained an ML model on Amazon SageMaker to detect automobile accidents from closed-circuit TV footage. The ML engineer used SageMaker Data Wrangler to create a training dataset of images of accidents and non-accidents.

The model performed well during training and validation. However, the model is underperforming in production because of variations in the quality of the images from various cameras.

Which solution will improve the model's accuracy in the LEAST amount of time?

Options:

A.

Collect more images from all the cameras. Use Data Wrangler to prepare a new training dataset.


B.

Recreate the training dataset by using the Data Wrangler corrupt image transform. Specify the impulse noise option.


C.

Recreate the training dataset by using the Data Wrangler enhance image contrast transform. Specify the Gamma contrast option.


D.

Recreate the training dataset by using the Data Wrangler resize image transform. Crop all images to the same size.
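
To make the contrast option concrete: gamma adjustment remaps pixel intensities nonlinearly, which is the kind of normalization the Data Wrangler enhance-image-contrast transform applies across cameras of varying quality. A standalone NumPy/Pillow sketch; the file names are placeholders.

```python
import numpy as np
from PIL import Image

def gamma_adjust(image: Image.Image, gamma: float) -> Image.Image:
    """Apply gamma correction: gamma < 1 brightens, gamma > 1 darkens."""
    arr = np.asarray(image, dtype=np.float32) / 255.0
    adjusted = np.power(arr, gamma)
    return Image.fromarray((adjusted * 255).astype(np.uint8))

img = Image.open("frame.png")                    # placeholder input frame
gamma_adjust(img, gamma=0.8).save("frame_adjusted.png")
```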

