Amazon Web Services AWS Certified Machine Learning - Specialty MLS-C01 Question # 84 Topic 9 Discussion

Amazon Web Services AWS Certified Machine Learning - Specialty MLS-C01 Question # 84 Topic 9 Discussion

MLS-C01 Exam Topic 9 Question 84 Discussion:
Question #: 84
Topic #: 9

A data scientist receives a new dataset in .csv format and stores the dataset in Amazon S3. The data scientist will use this dataset to train a machine learning (ML) model.

The data scientist first needs to identify any potential data quality issues in the dataset. The data scientist must identify values that are missing or values that are not valid. The data scientist must also identify the number of outliers in the dataset.

Which solution will meet these requirements with the LEAST operational effort?)


A.

Create an AWS Glue job to transform the data from .csv format to Apache Parquet format. Use an AWS Glue crawler and Amazon Athena with appropriate SQL queries to retrieve the required information.


B.

Leave the dataset in .csv format. Use an AWS Glue crawler and Amazon Athena with appropriate SQL queries to retrieve the required information.


C.

Create an AWS Glue job to transform the data from .csv format to Apache Parquet format. Import the data into Amazon SageMaker Data Wrangler. Use the Data Quality and Insights Report to retrieve the required information.


D.

Leave the dataset in .csv format. Import the data into Amazon SageMaker Data Wrangler. Use the Data Quality and Insights Report to retrieve the required information.


Get Premium MLS-C01 Questions

Contribute your Thoughts:


Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.