
AWS Certified Machine Learning Engineer - Associate (MLA-C01) Questions and Answers from CertsForce

Question #1:

A company is developing an ML model to forecast future values based on time series data. The dataset includes historical measurements collected at regular intervals, along with categorical features. The model needs to predict future values based on past patterns and trends.

Which algorithm and hyperparameters should the company use to develop the model?

Options:

A. Use the Amazon SageMaker AI XGBoost algorithm. Set the scale_pos_weight hyperparameter to adjust for class imbalance.
B. Use k-means clustering with k to specify the number of clusters.
C. Use the Amazon SageMaker AI DeepAR algorithm with matching context length and prediction length hyperparameters.
D. Use the Amazon SageMaker AI Random Cut Forest (RCF) algorithm with contamination to set the expected proportion of anomalies.
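
For context, a minimal sketch of the DeepAR approach described in option C, using the SageMaker Python SDK. The role ARN, bucket paths, data frequency, and hyperparameter values are illustrative assumptions:

```python
# Hypothetical sketch: training the built-in DeepAR forecasting algorithm.
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()

# Retrieve the built-in DeepAR container image for the current region.
image_uri = image_uris.retrieve("forecasting-deepar", session.boto_region_name)

estimator = Estimator(
    image_uri=image_uri,
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # assumed role ARN
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/deepar-output/",     # assumed bucket
    sagemaker_session=session,
)

# context_length controls how much history the model conditions on;
# prediction_length is the forecast horizon. Matching them, as the option
# describes, is a common starting point.
estimator.set_hyperparameters(
    time_freq="H",          # hourly intervals (assumed)
    context_length=24,
    prediction_length=24,
    epochs=100,
)

# Channels point at JSON Lines time series files (assumed locations).
estimator.fit({
    "train": "s3://example-bucket/deepar/train/",
    "test": "s3://example-bucket/deepar/test/",
})
```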


Question #2:

An ML engineer wants to run a training job on Amazon SageMaker AI. The training job will train a neural network by using multiple GPUs. The training dataset is stored in Parquet format.

The ML engineer discovered that the Parquet dataset contains files too large to fit into the memory of the SageMaker AI training instances.

Which solution will fix the memory problem?

Options:

A. Attach an Amazon Elastic Block Store (Amazon EBS) Provisioned IOPS SSD volume to the instance. Store the files in the EBS volume.
B. Repartition the Parquet files by using Apache Spark on Amazon EMR. Use the repartitioned files for the training job.
C. Change the instance type to Memory Optimized instances with sufficient memory for the training job.
D. Use the SageMaker AI distributed data parallelism (SMDDP) library with multiple instances to split the memory usage.
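
For context, a minimal sketch of the repartitioning approach in option B, as an Apache Spark job (for example, on Amazon EMR). The S3 paths and partition count are illustrative assumptions:

```python
# Hypothetical sketch: splitting oversized Parquet files with Apache Spark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("repartition-parquet").getOrCreate()

df = spark.read.parquet("s3://example-bucket/training-data/")  # assumed path

# Raise the partition count so each output file is small enough to stream
# into training-instance memory; 200 is an illustrative value to tune.
df.repartition(200).write.mode("overwrite").parquet(
    "s3://example-bucket/training-data-repartitioned/"
)
```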


Question #3:

A company is using an Amazon Redshift database as its single data source. Some of the data is sensitive.

A data scientist needs to use some of the sensitive data from the database. An ML engineer must give the data scientist access to the data without transforming the source data and without storing anonymized data in the database.

Which solution will meet these requirements with the LEAST implementation effort?

Options:

A. Configure dynamic data masking policies to control how sensitive data is shared with the data scientist at query time.
B. Create a materialized view with masking logic on top of the database. Grant the necessary read permissions to the data scientist.
C. Unload the Amazon Redshift data to Amazon S3. Use Amazon Athena to create schema-on-read with masking logic. Share the view with the data scientist.
D. Unload the Amazon Redshift data to Amazon S3. Create an AWS Glue job to anonymize the data. Share the dataset with the data scientist.
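
For context, a minimal sketch of the dynamic data masking approach in option A, submitted here through the Redshift Data API. The cluster, database, table, column, and role names are illustrative assumptions:

```python
# Hypothetical sketch: attaching a dynamic data masking policy in Redshift.
import boto3

client = boto3.client("redshift-data")

statements = [
    # Define how the sensitive column is rendered at query time.
    """
    CREATE MASKING POLICY mask_card_number
    WITH (card_number VARCHAR(16))
    USING ('XXXXXXXXXXXXXXXX'::TEXT);
    """,
    # Apply the policy for the data scientist's role; the source data
    # is never transformed or copied.
    """
    ATTACH MASKING POLICY mask_card_number
    ON transactions(card_number)
    TO ROLE data_scientist_role;
    """,
]

for sql in statements:
    client.execute_statement(
        ClusterIdentifier="example-cluster",  # assumed cluster
        Database="dev",                       # assumed database
        DbUser="admin",                       # assumed admin user
        Sql=sql,
    )
```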


Question #4:

An ML engineer needs to deploy ML models to get inferences from large datasets in an asynchronous manner. The ML engineer also needs to implement scheduled monitoring of the data quality of the models. The ML engineer must receive alerts when changes in data quality occur.

Which solution will meet these requirements?

Options:

A. Deploy the models by using scheduled AWS Glue jobs. Use Amazon CloudWatch alarms to monitor the data quality and to send alerts.
B. Deploy the models by using scheduled AWS Batch jobs. Use AWS CloudTrail to monitor the data quality and to send alerts.
C. Deploy the models by using Amazon Elastic Container Service (Amazon ECS) on AWS Fargate. Use Amazon EventBridge to monitor the data quality and to send alerts.
D. Deploy the models by using Amazon SageMaker AI batch transform. Use SageMaker Model Monitor to monitor the data quality and to send alerts.
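
For context, a minimal sketch of the batch transform plus Model Monitor approach in option D, using the SageMaker Python SDK. The model name, role ARN, bucket paths, and daily schedule are illustrative assumptions:

```python
# Hypothetical sketch: asynchronous batch inference plus scheduled
# data-quality monitoring.
from sagemaker.model_monitor import (
    BatchTransformInput,
    CronExpressionGenerator,
    DefaultModelMonitor,
)
from sagemaker.model_monitor.dataset_format import (
    DatasetFormat,
    MonitoringDatasetFormat,
)
from sagemaker.transformer import Transformer

# Asynchronous inference over a large dataset with batch transform.
transformer = Transformer(
    model_name="example-model",                        # assumed model name
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/batch-output/",   # assumed bucket
)
transformer.transform(
    data="s3://example-bucket/batch-input/",
    content_type="text/csv",
    split_type="Line",
)

# Baseline the training data, then schedule recurring data-quality checks.
monitor = DefaultModelMonitor(
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # assumed role
    instance_count=1,
    instance_type="ml.m5.xlarge",
)
monitor.suggest_baseline(
    baseline_dataset="s3://example-bucket/baseline/train.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://example-bucket/baseline-results/",
)
monitor.create_monitoring_schedule(
    monitor_schedule_name="data-quality-daily",
    batch_transform_input=BatchTransformInput(
        data_captured_destination_s3_uri="s3://example-bucket/data-capture/",
        destination="/opt/ml/processing/input",
        dataset_format=MonitoringDatasetFormat.csv(header=True),
    ),
    output_s3_uri="s3://example-bucket/monitoring-results/",
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.daily(),
)
# Violations land in S3 and CloudWatch, where an alarm can send the alerts.
```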


Question #5:

An ML engineer at a credit card company built and deployed an ML model by using Amazon SageMaker AI. The model was trained on transaction data that contained very few fraudulent transactions. After deployment, the model is underperforming.

What should the ML engineer do to improve the model’s performance?

Options:

A. Retrain the model with a different SageMaker built-in algorithm.
B. Use random undersampling to reduce the majority class and retrain the model.
C. Use Synthetic Minority Oversampling Technique (SMOTE) to generate synthetic minority samples and retrain the model.
D. Use random oversampling to duplicate minority samples and retrain the model.
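
For context, a minimal sketch of the SMOTE approach in option C, using the imbalanced-learn library on synthetic data:

```python
# Hypothetical sketch: rebalancing an imbalanced fraud dataset with SMOTE.
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Simulate a heavily imbalanced transaction dataset (about 1% "fraud").
X, y = make_classification(
    n_samples=10_000, n_features=20, weights=[0.99, 0.01], random_state=42
)
print("before:", Counter(y))

# SMOTE synthesizes new minority-class samples by interpolating between
# existing minority samples and their nearest neighbors.
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print("after:", Counter(y_res))  # classes are now roughly balanced

# X_res / y_res would then feed the retraining job.
```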


Question #6:

A company is building a near real-time data analytics application to detect anomalies and failures for industrial equipment. The company has thousands of IoT sensors that send data every 60 seconds. When new versions of the application are released, the company wants to ensure that application code bugs do not prevent the application from running.

Which solution will meet these requirements?

Options:

A. Use Amazon Managed Service for Apache Flink with the system rollback capability enabled to build the data analytics application.
B. Use Amazon Managed Service for Apache Flink with manual rollback when an error occurs to build the data analytics application.
C. Use Amazon Data Firehose to deliver real-time streaming data programmatically for the data analytics application. Pause the stream when a new version of the application is released and resume the stream after the application is deployed.
D. Use Amazon Data Firehose to deliver data to Amazon EC2 instances across two Availability Zones for the data analytics application.
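
For context, option A's system rollback capability automatically reverts a Managed Service for Apache Flink application to the last healthy version when an update fails. The sketch below enables it through the kinesisanalyticsv2 API; the application name is an assumption, and the exact rollback field names are my assumption and should be verified against current documentation:

```python
# Hypothetical sketch: enabling automatic system rollback on a Managed
# Service for Apache Flink application.
import boto3

client = boto3.client("kinesisanalyticsv2")

# Look up the current application version (application name is assumed).
app = client.describe_application(ApplicationName="anomaly-detector")

client.update_application(
    ApplicationName="anomaly-detector",
    CurrentApplicationVersionId=app["ApplicationDetail"]["ApplicationVersionId"],
    ApplicationConfigurationUpdate={
        # Assumed field names for the system-rollback setting.
        "ApplicationSystemRollbackConfigurationUpdate": {
            "RollbackEnabledUpdate": True
        }
    },
)
```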


Question #7:

An ML engineer has an Amazon Comprehend custom model in Account A in the us-east-1 Region. The ML engineer needs to copy the model to Account B in the same Region.

Which solution will meet this requirement with the LEAST development effort?

Options:

A. Use Amazon S3 to make a copy of the model. Transfer the copy to Account B.
B. Create a resource-based IAM policy. Use the Amazon Comprehend ImportModel API operation to copy the model to Account B.
C. Use AWS DataSync to replicate the model from Account A to Account B.
D. Create an AWS Site-to-Site VPN connection between Account A and Account B to transfer the model.
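
For context, a minimal sketch of the cross-account copy in option B: a resource-based policy in Account A, then the ImportModel API call in Account B. All ARNs, account IDs, and names are illustrative assumptions:

```python
# Hypothetical sketch: sharing a Comprehend custom model across accounts.
import json

import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

MODEL_ARN = (
    "arn:aws:comprehend:us-east-1:111111111111:"
    "document-classifier/example-model"  # assumed model ARN in Account A
)

# Step 1 (run in Account A): allow Account B to import the model.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::222222222222:root"},  # Account B
        "Action": "comprehend:ImportModel",
        "Resource": MODEL_ARN,
    }],
}
comprehend.put_resource_policy(
    ResourceArn=MODEL_ARN,
    ResourcePolicy=json.dumps(policy),
)

# Step 2 (run in Account B): import the shared model into this account.
comprehend.import_model(
    SourceModelArn=MODEL_ARN,
    ModelName="imported-example-model",  # assumed name in Account B
)
```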


Question #8:

A company uses Amazon Athena to query a dataset in Amazon S3. The dataset has a target variable that the company wants to predict.

The company needs to use the dataset in a solution to determine if a model can predict the target variable.

Which solution will provide this information with the LEAST development effort?

Options:

A. Create a new model by using Amazon SageMaker Autopilot. Report the model's achieved performance.
B. Implement custom scripts to perform data pre-processing, multiple linear regression, and performance evaluation. Run the scripts on Amazon EC2 instances.
C. Configure Amazon Macie to analyze the dataset and to create a model. Report the model's achieved performance.
D. Select a model from Amazon Bedrock. Tune the model with the data. Report the model's achieved performance.
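
For context, a minimal sketch of the Autopilot approach in option A, using the SageMaker Python SDK. The role ARN, dataset location, and target column name are illustrative assumptions:

```python
# Hypothetical sketch: letting Autopilot test whether the target is learnable.
import sagemaker
from sagemaker.automl.automl import AutoML

automl = AutoML(
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # assumed role
    target_attribute_name="target",                       # assumed column
    max_candidates=10,
    sagemaker_session=sagemaker.Session(),
)

# Autopilot handles preprocessing, algorithm selection, training, and tuning.
automl.fit(inputs="s3://example-bucket/dataset/data.csv", wait=True)

# The best candidate's objective metric indicates how well the target
# variable can be predicted from the dataset.
best = automl.best_candidate()
print(best["FinalAutoMLJobObjectiveMetric"])
```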


Question #9:

An ML engineer needs to organize a large set of text documents into topics. The ML engineer will not know what the topics are in advance. The ML engineer wants to use built-in algorithms or pre-trained models available through Amazon SageMaker AI to process the documents.

Which solution will meet these requirements?

Options:

A. Use the BlazingText algorithm to identify the relevant text and to create a set of topics based on the documents.
B. Use the Sequence-to-Sequence algorithm to summarize the text and to create a set of topics based on the documents.
C. Use the Object2Vec algorithm to create embeddings and to create a set of topics based on the embeddings.
D. Use the Latent Dirichlet Allocation (LDA) algorithm to process the documents and to create a set of topics based on the documents.
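
For context, a minimal sketch of training SageMaker's built-in LDA algorithm from option D. The documents must first be converted to bag-of-words vectors; the paths, role ARN, and hyperparameter values are illustrative assumptions:

```python
# Hypothetical sketch: topic modeling with the built-in LDA algorithm.
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()
image_uri = image_uris.retrieve("lda", session.boto_region_name)

lda = Estimator(
    image_uri=image_uri,
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # assumed role
    instance_count=1,                  # LDA trains on a single instance
    instance_type="ml.c5.2xlarge",
    output_path="s3://example-bucket/lda-output/",        # assumed bucket
    sagemaker_session=session,
)

lda.set_hyperparameters(
    num_topics=20,       # topics are unknown in advance, so this is tuned
    feature_dim=5000,    # vocabulary size of the bag-of-words vectors
    mini_batch_size=500,
)

# The train channel expects the vectorized documents (assumed location).
lda.fit({"train": "s3://example-bucket/lda-input/"})
```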


Question #10:

An ML engineer wants to re-train an XGBoost model at the end of each month. A data team prepares the training data. The training dataset is a few hundred megabytes in size. When the data is ready, the data team stores the data as a new file in an Amazon S3 bucket.

The ML engineer needs a solution to automate this pipeline. The solution must register the new model version in Amazon SageMaker Model Registry within 24 hours.

Which solution will meet these requirements?

Options:

A. Create an AWS Lambda function that runs one time each week to poll the S3 bucket for new files. Invoke the Lambda function asynchronously. Configure the Lambda function to start the pipeline if the function detects new data.
B. Create an Amazon CloudWatch rule that runs on a schedule to start the pipeline every 30 days.
C. Create an S3 Lifecycle rule to start the pipeline every time a new object is uploaded to the S3 bucket.
D. Create an Amazon EventBridge rule to start an AWS Step Functions TrainingStep every time a new object is uploaded to the S3 bucket.
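
For context, a sketch of the event-driven trigger pattern behind option D. This variant starts a SageMaker pipeline directly as the EventBridge target (a target type EventBridge supports) rather than a Step Functions workflow; the rule name, bucket, ARNs, and the assumption that the bucket publishes events to EventBridge are illustrative:

```python
# Hypothetical sketch: start a retraining pipeline when new data lands in S3.
import json

import boto3

events = boto3.client("events")

# Match "Object Created" events from the training-data bucket (assumed name;
# the bucket must have EventBridge notifications enabled).
events.put_rule(
    Name="start-retraining-on-new-data",
    EventPattern=json.dumps({
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": ["example-training-data-bucket"]}},
    }),
    State="ENABLED",
)

# Point the rule at a SageMaker pipeline (assumed pipeline and role ARNs).
events.put_targets(
    Rule="start-retraining-on-new-data",
    Targets=[{
        "Id": "retraining-pipeline",
        "Arn": "arn:aws:sagemaker:us-east-1:123456789012:pipeline/retraining-pipeline",
        "RoleArn": "arn:aws:iam::123456789012:role/EventBridgeStartPipelineRole",
        "SageMakerPipelineParameters": {"PipelineParameterList": []},
    }],
)
```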

