
Amazon Web Services (AWS) Certified Machine Learning Engineer - Associate (MLA-C01) Questions and Answers with CertsForce

Viewing page 5 of 7 (questions 41-50)
Question # 41:

A company ingests sales transaction data using Amazon Data Firehose into Amazon OpenSearch Service. The Firehose buffer interval is set to 60 seconds.

The company needs sub-second latency for a real-time OpenSearch dashboard.

Which architectural change will meet this requirement?

Options:

A. Use zero buffering in the Firehose stream and tune the PutRecordBatch batch size.

B. Replace Firehose with AWS DataSync and enhanced fan-out consumers.

C. Increase the Firehose buffer interval to 120 seconds.

D. Replace Firehose with Amazon SQS.


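For background on the zero-buffering approach named in option A: a minimal boto3 sketch, assuming hypothetical stream, role, domain, and bucket names, of creating a Firehose stream that delivers to OpenSearch with the buffering interval set to 0 seconds instead of the default 60.

```python
import boto3

firehose = boto3.client("firehose")

# All names and ARNs below are hypothetical placeholders.
firehose.create_delivery_stream(
    DeliveryStreamName="sales-to-opensearch",
    DeliveryStreamType="DirectPut",
    AmazonopensearchserviceDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
        "DomainARN": "arn:aws:es:us-east-1:123456789012:domain/sales-dashboard",
        "IndexName": "transactions",
        # Zero buffering: flush records as they arrive instead of every 60 s.
        "BufferingHints": {"IntervalInSeconds": 0},
        "S3BackupMode": "FailedDocumentsOnly",
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
            "BucketARN": "arn:aws:s3:::firehose-backup-bucket",
        },
    },
)
```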
Question # 42:

A construction company is using Amazon SageMaker AI to train specialized custom object detection models to identify road damage. The company uses images from multiple cameras. The images are stored as JPEG objects in an Amazon S3 bucket.

The images must be pre-processed by using computationally intensive computer vision techniques before they can be used in the training job. The company needs to optimize data loading and pre-processing in the training job. The solution must not affect model performance or increase compute or storage resources.

Which solution will meet these requirements?

Options:

A. Use SageMaker AI file mode to load and process the images in batches.

B. Reduce the batch size of the model and increase the number of pre-processing threads.

C. Reduce the quality of the training images in the S3 bucket.

D. Convert the images into RecordIO format and use the lazy loading pattern.


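For background on streaming training data instead of downloading it all up front: a minimal SageMaker Python SDK sketch, assuming a hypothetical training image, role, and S3 path, of a training job that reads RecordIO-packed images in Pipe mode.

```python
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

# Hypothetical image URI, role ARN, and S3 locations for illustration.
estimator = Estimator(
    image_uri="<training-image-uri>",
    role="arn:aws:iam::123456789012:role/sagemaker-execution-role",
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    input_mode="Pipe",  # stream records to the container instead of copying files
)

# RecordIO-wrapped JPEGs streamed lazily from S3 during training.
train_input = TrainingInput(
    s3_data="s3://example-bucket/road-damage/train.rec",
    content_type="application/x-recordio",
)
estimator.fit({"train": train_input})
```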
Question # 43:

A company is creating an ML model to identify defects in a product. The company has gathered a dataset and has stored the dataset in TIFF format in Amazon S3. The dataset contains 200 images in which the most common defects are visible. The dataset also contains 1,800 images in which no defect is visible.

An ML engineer trains the model and notices poor performance in some classes. The ML engineer identifies a class imbalance problem in the dataset.

What should the ML engineer do to solve this problem?

Options:

A. Use a few hundred images and Amazon Rekognition Custom Labels to train a new model.

B. Undersample the 200 images in which the most common defects are visible.

C. Oversample the 200 images in which the most common defects are visible.

D. Use all 2,000 images and Amazon Rekognition Custom Labels to train a new model.


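For background on oversampling a minority class: a minimal scikit-learn sketch using toy index arrays that mirror the 200/1,800 split from the question.

```python
import numpy as np
from sklearn.utils import resample

# Toy stand-ins for the two classes: 200 defect images, 1,800 no-defect images.
defect = np.arange(200)
no_defect = np.arange(1800)

# Oversample the minority (defect) class with replacement to match the majority.
defect_oversampled = resample(
    defect, replace=True, n_samples=len(no_defect), random_state=42
)

balanced = np.concatenate([no_defect, defect_oversampled])
print(len(balanced))  # 3600 samples, now a 50/50 class split
```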
Question # 44:

An ML engineer is setting up a CI/CD pipeline for an ML workflow in Amazon SageMaker AI. The pipeline must automatically retrain, test, and deploy a model whenever new data is uploaded to an Amazon S3 bucket. New data files are approximately 10 GB in size. The ML engineer also needs to track model versions for auditing.

Which solution will meet these requirements?

Options:

A. Use AWS CodePipeline, Amazon S3, and AWS CodeBuild to retrain and deploy the model automatically and track model versions.

B. Use SageMaker Pipelines with the SageMaker Model Registry to orchestrate model training and version tracking.

C. Use AWS Lambda and Amazon EventBridge to retrain and deploy the model and track versions via logs.

D. Manually retrain and deploy the model using SageMaker notebook instances and track versions with AWS CloudTrail.


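For background on version tracking with SageMaker Pipelines and the Model Registry: a minimal sketch, assuming a hypothetical training image, role, bucket, and model package group, of a pipeline in which every run registers a new, auditable model package version.

```python
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.step_collections import RegisterModel
from sagemaker.workflow.steps import TrainingStep

# Hypothetical image URI, role ARN, and bucket for illustration.
estimator = Estimator(
    image_uri="<training-image-uri>",
    role="arn:aws:iam::123456789012:role/sagemaker-execution-role",
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

train_step = TrainingStep(
    name="Train",
    estimator=estimator,
    inputs={"train": TrainingInput("s3://example-bucket/new-data/")},
)

# Each pipeline execution adds a versioned entry to the Model Registry.
register_step = RegisterModel(
    name="Register",
    estimator=estimator,
    model_data=train_step.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name="example-model-group",
)

pipeline = Pipeline(name="retrain-on-new-data", steps=[train_step, register_step])
```

An Amazon EventBridge rule on the S3 bucket is one common way to start such a pipeline automatically when new data files arrive.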
Question # 45:

A financial company receives a high volume of real-time market data streams from an external provider. The streams consist of thousands of JSON records every second.

The company needs to implement a scalable solution on AWS to identify anomalous data points.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A. Ingest real-time data into Amazon Kinesis Data Streams. Use the built-in RANDOM_CUT_FOREST function in Amazon Managed Service for Apache Flink to process the data streams and to detect data anomalies.

B. Ingest real-time data into Amazon Kinesis Data Streams. Deploy an Amazon SageMaker AI endpoint for real-time outlier detection. Create an AWS Lambda function to detect anomalies. Use the data streams to invoke the Lambda function.

C. Ingest real-time data into Apache Kafka on Amazon EC2 instances. Deploy an Amazon SageMaker AI endpoint for real-time outlier detection. Create an AWS Lambda function to detect anomalies. Use the data streams to invoke the Lambda function.

D. Send real-time data to an Amazon Simple Queue Service (Amazon SQS) FIFO queue. Create an AWS Lambda function to consume the queue messages. Program the Lambda function to start an AWS Glue extract, transform, and load (ETL) job for batch processing and anomaly detection.


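For background on the ingestion side of these options: a minimal boto3 sketch, with a hypothetical stream name and toy JSON records, that writes market data to a Kinesis data stream. Anomaly scoring would then run downstream, for example in a Managed Service for Apache Flink application.

```python
import json
import random
import time

import boto3

kinesis = boto3.client("kinesis")

# Hypothetical stream name and synthetic records for illustration only.
for _ in range(1000):
    record = {"symbol": "EXMPL", "price": random.gauss(100.0, 2.0), "ts": time.time()}
    kinesis.put_record(
        StreamName="market-data",
        Data=json.dumps(record).encode("utf-8"),
        PartitionKey=record["symbol"],  # distributes records across shards
    )
```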
Question # 46:

A company plans to use Amazon SageMaker AI to build image classification models. The company has 6 TB of training data stored on Amazon FSx for NetApp ONTAP. The file system is in the same VPC as SageMaker AI.

An ML engineer must make the training data accessible to SageMaker AI training jobs.

Which solution will meet these requirements?

Options:

A. Mount the FSx for ONTAP file system as a volume to the SageMaker AI instance.

B. Create an Amazon S3 bucket and use Mountpoint for Amazon S3 to link the bucket to FSx for ONTAP.

C. Create a catalog connection from SageMaker Data Wrangler to the FSx for ONTAP file system.

D. Create a direct connection from SageMaker Data Wrangler to the FSx for ONTAP file system.


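For background on what mounting the file system looks like: FSx for ONTAP exposes NFS endpoints, so a sketch of option A could be an NFS mount run on the SageMaker instance. The SVM DNS name and junction path below are hypothetical placeholders.

```python
import subprocess

# Hypothetical FSx for ONTAP SVM endpoint and junction path; requires an
# NFS client and network access to the file system in the same VPC.
subprocess.run(
    [
        "sudo", "mount", "-t", "nfs",
        "svm-0123456789abcdef0.fs-0123456789abcdef0.fsx.us-east-1.amazonaws.com:/training",
        "/mnt/training-data",
    ],
    check=True,
)
```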
Question # 47:

An ML engineer wants to run a training job on Amazon SageMaker AI by using multiple GPUs. The training dataset is stored in Apache Parquet format.

The Parquet files are too large to fit into the memory of the SageMaker AI training instances.

Which solution will fix the memory problem?

Options:

A. Attach an Amazon EBS Provisioned IOPS SSD volume and store the files on the EBS volume.

B. Repartition the Parquet files by using Apache Spark on Amazon EMR and use the repartitioned files for training.

C. Change to memory-optimized instance types with sufficient memory.

D. Use SageMaker distributed data parallelism (SMDDP) to split memory usage.


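For background on repartitioning: a minimal PySpark sketch, assuming hypothetical S3 paths and an arbitrary partition count of 512, that rewrites oversized Parquet files as many smaller ones so each training worker reads chunks that fit in memory.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("repartition-parquet").getOrCreate()

# Hypothetical S3 paths; tune the partition count to the target file size.
df = spark.read.parquet("s3://example-bucket/train-parquet/")
(
    df.repartition(512)
    .write.mode("overwrite")
    .parquet("s3://example-bucket/train-parquet-repartitioned/")
)
```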
Question # 48:

A company is building an Amazon SageMaker AI pipeline for an ML model. The pipeline uses distributed processing and training.

An ML engineer needs to encrypt network communication between instances that run distributed jobs. The ML engineer configures the distributed jobs to run in a private VPC.

What should the ML engineer do to meet the encryption requirement?

Options:

A. Enable network isolation.

B. Configure traffic encryption by using security groups.

C. Enable inter-container traffic encryption.

D. Enable VPC flow logs.


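For background on how inter-container traffic encryption is switched on: a minimal SageMaker Python SDK sketch, assuming hypothetical image, role, subnet, and security group values, of a distributed training job in a private VPC with the encryption flag enabled.

```python
from sagemaker.estimator import Estimator

# Hypothetical image URI, role ARN, subnet, and security group.
estimator = Estimator(
    image_uri="<training-image-uri>",
    role="arn:aws:iam::123456789012:role/sagemaker-execution-role",
    instance_count=4,
    instance_type="ml.p3.8xlarge",
    subnets=["subnet-0123456789abcdef0"],
    security_group_ids=["sg-0123456789abcdef0"],
    # Encrypts network traffic between the instances of a distributed job.
    encrypt_inter_container_traffic=True,
)
```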
Question # 49:

A company needs to combine data from multiple sources. The company must use Amazon Redshift Serverless to query an AWS Glue Data Catalog database and underlying data that is stored in an Amazon S3 bucket.

Select and order the correct steps from the following list to meet these requirements. Select each step one time or not at all. (Select and order three.)

• Attach the IAM role to the Redshift cluster.

• Attach the IAM role to the Redshift namespace.

• Create an external database in Amazon Redshift to point to the Data Catalog schema.

• Create an external schema in Amazon Redshift to point to the Data Catalog database.

• Create an IAM role for Amazon Redshift to use to access only the S3 bucket that contains underlying data.

• Create an IAM role for Amazon Redshift to use to access the Data Catalog and the S3 bucket that contains underlying data.



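For background on the external-schema step listed above: a minimal boto3 sketch, assuming a hypothetical Redshift Serverless workgroup, Data Catalog database, and IAM role ARN, that runs a CREATE EXTERNAL SCHEMA statement through the Redshift Data API.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Hypothetical workgroup, database names, and IAM role ARN for illustration.
sql = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS sales_catalog
FROM DATA CATALOG
DATABASE 'glue_sales_db'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-catalog-s3-role';
"""

redshift_data.execute_statement(
    WorkgroupName="example-workgroup",  # Redshift Serverless workgroup
    Database="dev",
    Sql=sql,
)
```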
Question # 50:

An ML engineer wants to deploy a workflow that processes streaming IoT sensor data and periodically retrains ML models. The most recent model versions must be deployed to production.

Which service will meet these requirements?

Options:

A. Amazon SageMaker Pipelines

B. Amazon Managed Workflows for Apache Airflow (MWAA)

C. AWS Lambda

D. Apache Spark


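For background on how "the most recent model versions" can be resolved at deploy time: a minimal boto3 sketch, assuming a hypothetical model package group name, that fetches the newest approved model package from the SageMaker Model Registry.

```python
import boto3

sm = boto3.client("sagemaker")

# Hypothetical model package group; fetch the newest approved version so the
# deployment step always promotes the most recent model.
resp = sm.list_model_packages(
    ModelPackageGroupName="iot-sensor-model",
    ModelApprovalStatus="Approved",
    SortBy="CreationTime",
    SortOrder="Descending",
    MaxResults=1,
)
latest_arn = resp["ModelPackageSummaryList"][0]["ModelPackageArn"]
print(latest_arn)
```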