
Pass the Amazon Web Services AWS Certified Data Engineer Data-Engineer-Associate Questions and answers with CertsForce

Viewing page 4 out of 9 pages
Viewing questions 31-40
Question #31:

A company is developing a log streaming pipeline that uses Amazon Data Firehose. The pipeline streams Amazon CloudWatch Logs data to an Amazon S3 bucket. The company's analytics team needs to use the data in audits. The pipeline must deliver only the relevant logs to the S3 bucket in a compatible format for the team's analysis.

Which solution will meet these requirements and maintain reliable performance?

Options:

A.

Set the S3 bucket rules to allow logs from only specific timestamp ranges. Create an AWS Lambda function that converts the log files to the desired format. Use an S3 trigger to invoke the Lambda function.


B.

Create a subscription filter in the CloudWatch Logs log group that uses the Firehose delivery stream as the destination. Create an AWS Lambda function that converts the log files to the desired format. Configure Firehose to invoke the Lambda function.


C.

Create a subscription filter in the CloudWatch Logs log group. Configure the filter to monitor the Firehose stream. Create an AWS Lambda function to convert the log files to the desired format. Configure Firehose to invoke the Lambda function.


D.

Tag the CloudWatch Logs log groups that the analytics team needs. Configure Firehose to ingest only the tagged log groups. Configure Firehose to write the output in the desired format.


Expert Solution
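To make option B concrete, the sketch below builds the request that a CloudWatch Logs subscription filter with a Firehose destination would need. This is a minimal illustration, not the official answer's implementation; the log group name, ARNs, and filter pattern are all hypothetical placeholders.

```python
def build_subscription_filter_params(log_group, firehose_arn, role_arn, pattern):
    """Build the request body for CloudWatch Logs put_subscription_filter.

    The filter pattern keeps only the relevant log events, and the Firehose
    delivery stream is the destination, matching the setup in option B."""
    return {
        "logGroupName": log_group,
        "filterName": "audit-logs-to-firehose",   # hypothetical name
        "filterPattern": pattern,                 # e.g. only audit-relevant events
        "destinationArn": firehose_arn,           # the Firehose delivery stream
        "roleArn": role_arn,                      # role CloudWatch Logs assumes
    }

# All identifiers below are made-up examples.
params = build_subscription_filter_params(
    "/aws/app/audit",
    "arn:aws:firehose:us-east-1:123456789012:deliverystream/audit-stream",
    "arn:aws:iam::123456789012:role/cwl-to-firehose",
    "?ERROR ?AUDIT",
)
# To apply: boto3.client("logs").put_subscription_filter(**params)
```

A Lambda function attached to the Firehose stream as a transformation would then handle the format conversion before delivery to S3.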
Question #32:

An ecommerce company wants to use AWS to migrate data pipelines from an on-premises environment into the AWS Cloud. The company currently uses a third-party tool in the on-premises environment to orchestrate data ingestion processes.

The company wants a migration solution that does not require the company to manage servers. The solution must be able to orchestrate Python and Bash scripts. The solution must not require the company to refactor any code.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A.

AWS Lambda


B.

Amazon Managed Workflows for Apache Airflow (Amazon MWAA)


C.

AWS Step Functions


D.

AWS Glue


Expert Solution
Question #33:

A data engineer maintains custom Python scripts that perform a data formatting process that many AWS Lambda functions use. When the data engineer needs to modify the Python scripts, the data engineer must manually update all the Lambda functions.

The data engineer requires a less manual way to update the Lambda functions.

Which solution will meet this requirement?

Options:

A.

Store a pointer to the custom Python scripts in the execution context object in a shared Amazon S3 bucket.


B.

Package the custom Python scripts into Lambda layers. Apply the Lambda layers to the Lambda functions.


C.

Store a pointer to the custom Python scripts in environment variables in a shared Amazon S3 bucket.


D.

Assign the same alias to each Lambda function. Call each Lambda function by specifying the function's alias.


Expert Solution
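The Lambda-layers approach in option B can be sketched as two request payloads: one to publish the shared scripts as a layer version, and one to attach that version to a function. This is an illustrative sketch only; the layer name, runtime list, and function name are assumptions.

```python
def build_layer_publish_params(layer_name, zip_bytes):
    """Request body for lambda.publish_layer_version: the shared Python
    scripts are packaged and versioned once as a layer."""
    return {
        "LayerName": layer_name,
        "Content": {"ZipFile": zip_bytes},
        "CompatibleRuntimes": ["python3.12"],  # assumed runtime
    }

def build_function_update_params(function_name, layer_version_arn):
    """Request body for lambda.update_function_configuration: attaching the
    new layer version updates the shared code for a function without
    redeploying that function's own deployment package."""
    return {
        "FunctionName": function_name,
        "Layers": [layer_version_arn],
    }

publish = build_layer_publish_params("shared-formatting", b"<zip bytes>")
update = build_function_update_params(
    "ingest-handler",  # hypothetical function name
    "arn:aws:lambda:us-east-1:123456789012:layer:shared-formatting:2",
)
# To apply: boto3.client("lambda").publish_layer_version(**publish), etc.
```

Updating the scripts then becomes one publish call plus one configuration update per function, rather than repackaging every function.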
Question #34:

A company stores employee data in Amazon Redshift. A table named Employee uses columns named Region ID, Department ID, and Role ID as a compound sort key. Which queries will MOST increase the speed of a query by using the compound sort key of the table? (Select TWO.)

Options:

A.

Select * from Employee where Region ID='North America';


B.

Select * from Employee where Region ID='North America' and Department ID=20;


C.

Select * from Employee where Department ID=20 and Region ID='North America';


D.

Select * from Employee where Role ID=50;


E.

Select * from Employee where Region ID='North America' and Role ID=50;


Expert Solution
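The intuition behind this question is that a compound sort key helps most when a query's equality predicates cover a leading prefix of the key, regardless of the order the columns appear in the WHERE clause. A small sketch of that prefix rule, with the helper function being an illustration rather than anything Redshift exposes:

```python
# The Employee table's compound sort key, in declared order.
SORT_KEY = ["Region ID", "Department ID", "Role ID"]

def uses_sort_key_prefix(filter_columns, sort_key=SORT_KEY):
    """Return True if the equality-filtered columns form a leading prefix
    of the compound sort key. Column order inside the WHERE clause does
    not matter; skipping a leading column does."""
    cols = set(filter_columns)
    prefix_len = 0
    for col in sort_key:
        if col in cols:
            prefix_len += 1
        else:
            break
    return prefix_len == len(cols) and prefix_len > 0

# Filtering on Region ID and Department ID covers a two-column prefix,
# while filtering on Role ID alone skips the leading columns entirely.
```

Predicates that skip Region ID (the leading column) cannot use the compound sort order to prune blocks, which is why queries restricted to later key columns gain little.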
Question #35:

A data engineer is using an Apache Iceberg framework to build a data lake that contains 100 TB of data. The data engineer wants to run AWS Glue Apache Spark jobs that use the Iceberg framework.

Which combination of steps will meet these requirements? (Select TWO.)

Options:

A.

Create a key named --conf for an AWS Glue job. Set Iceberg as a value for the --datalake-formats job parameter.


B.

Specify the path to a specific version of Iceberg by using the --extra-jars job parameter. Set Iceberg as a value for the --datalake-formats job parameter.


C.

Set Iceberg as a value for the --datalake-formats job parameter.


D.

Set the --enable-auto-scaling parameter to true.


E.

Add the --job-bookmark-option job-bookmark-enable parameter to an AWS Glue job.


Expert Solution
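The job parameters in these options map directly onto the DefaultArguments of a Glue job definition. The helper below sketches how those arguments could be assembled; the JAR path is a hypothetical example, and this is an illustration rather than a complete create_job call.

```python
def build_iceberg_job_args(iceberg_jar_path=None):
    """DefaultArguments for an AWS Glue Spark job that uses Iceberg.

    --datalake-formats iceberg enables Glue's bundled Iceberg support;
    shipping JARs via --extra-jars additionally lets the job pin a
    specific Iceberg version instead of the bundled one."""
    args = {"--datalake-formats": "iceberg"}
    if iceberg_jar_path:
        args["--extra-jars"] = iceberg_jar_path  # e.g. an S3 path to the JARs
    return args

default_args = build_iceberg_job_args(
    "s3://example-bucket/jars/iceberg-spark-runtime.jar"  # hypothetical path
)
# To apply: boto3.client("glue").create_job(..., DefaultArguments=default_args)
```

Auto scaling and job bookmarks (options D and E) tune job behavior but do not enable the Iceberg framework itself.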
Question #36:

A company has used an Amazon Redshift table that is named Orders for 6 months. The company performs weekly updates and deletes on the table. The table has an interleaved sort key on a column that contains AWS Regions.

The company wants to reclaim disk space so that the company will not run out of storage space. The company also wants to analyze the sort key column.

Which Amazon Redshift command will meet these requirements?

Options:

A.

VACUUM FULL Orders


B.

VACUUM DELETE ONLY Orders


C.

VACUUM REINDEX Orders


D.

VACUUM SORT ONLY Orders


Expert Solution
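The VACUUM variants in these options differ in what they do for an interleaved sort key: only REINDEX re-analyzes the distribution of the interleaved sort key columns in addition to reclaiming space. A small sketch that builds the SQL text, for example to submit through the Redshift Data API; the mode validation is an illustrative convenience, not Redshift behavior:

```python
def build_vacuum_statement(table, mode="REINDEX"):
    """Build the VACUUM SQL text for a table, e.g. for
    redshift-data execute_statement. REINDEX analyzes the interleaved
    sort key distribution and then performs a full vacuum, which the
    SORT ONLY and DELETE ONLY variants do not."""
    allowed = {"FULL", "SORT ONLY", "DELETE ONLY", "REINDEX"}
    if mode not in allowed:
        raise ValueError(f"unknown VACUUM mode: {mode}")
    return f"VACUUM {mode} {table};"

statement = build_vacuum_statement("Orders")
# To run: boto3.client("redshift-data").execute_statement(
#     ClusterIdentifier=..., Database=..., Sql=statement)
```

DELETE ONLY reclaims space without re-sorting, and SORT ONLY re-sorts without reclaiming space, so neither satisfies both requirements on its own.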
Question #37:

A company is creating a new data pipeline to populate a data lake. A data analyst needs to prepare and standardize the data before a data engineering team can perform advanced data transformations. The data analyst needs a solution to process the data that does not require writing new code.

Which solution will meet these requirements with the LEAST operational effort?

Options:

A.

Use Python and Pandas in an AWS Glue Studio notebook. Ensure that the data engineers add additional transformations to complete the pipeline.


B.

Use Amazon SageMaker Canvas and SageMaker Data Wrangler to write to a new dataset. Ensure that the data engineers add additional transformations to complete the pipeline by using AWS Glue.


C.

Use AWS Glue Studio with data preparation recipe transformations. Ensure that the data engineers add additional transformations to complete the pipeline.


D.

Create a document that includes the data preparation rules. Ensure that the data engineers implement the rules in AWS Glue.


Expert Solution
Question #38:

A company manages an Amazon Redshift data warehouse. The data warehouse is in a public subnet inside a custom VPC. A security group allows only traffic from within itself. A network ACL is open to all traffic.

The company wants to generate several visualizations in Amazon QuickSight for an upcoming sales event. The company will run QuickSight Enterprise edition in a second AWS account inside a public subnet within a second custom VPC. The new public subnet has a security group that allows outbound traffic to the existing Redshift cluster.

A data engineer needs to establish connections between Amazon Redshift and QuickSight. QuickSight must refresh dashboards by querying the Redshift cluster.

Which solution will meet these requirements?

Options:

A.

Configure the Redshift security group to allow inbound traffic on the Redshift port from the QuickSight security group.


B.

Assign Elastic IP addresses to the QuickSight visualizations. Configure the QuickSight security group to allow inbound traffic on the Redshift port from the Elastic IP addresses.


C.

Confirm that the CIDR ranges of the Redshift VPC and the QuickSight VPC are the same. If CIDR ranges are different, reconfigure one CIDR range to match the other. Establish network peering between the VPCs.


D.

Create a QuickSight gateway endpoint in the Redshift VPC. Attach an endpoint policy to the gateway endpoint to ensure only specific QuickSight accounts can use the endpoint.


Expert Solution
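Option A amounts to one inbound rule on the Redshift cluster's security group. The sketch below builds that rule as an authorize_security_group_ingress payload; the security group ID is a made-up placeholder, and cross-VPC security-group references additionally assume connectivity (such as VPC peering) is in place.

```python
def build_redshift_ingress_rule(quicksight_sg_id, port=5439):
    """Request body for ec2.authorize_security_group_ingress on the
    Redshift cluster's security group: allow the QuickSight subnet's
    security group to reach the cluster on the Redshift port
    (5439 is the Redshift default)."""
    return {
        "IpPermissions": [{
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "UserIdGroupPairs": [{"GroupId": quicksight_sg_id}],
        }]
    }

rule = build_redshift_ingress_rule("sg-0123456789abcdef0")  # hypothetical ID
# To apply: boto3.client("ec2").authorize_security_group_ingress(
#     GroupId="<redshift security group ID>", **rule)
```

Referencing the source security group rather than IP addresses keeps the rule valid even as QuickSight's underlying addresses change.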
Question #39:

A company stores historical customer data in an Amazon Redshift table. A column named Email contains null entries and values that are not email addresses. The quality of the Email column is critical for multiple downstream processes. A data engineer must create an AWS Glue Data Quality rule that fails when the percentage of valid email addresses in the Email column is less than 90%.

Which component of an AWS Glue Data Quality rule will meet these requirements?

Options:

A.

Uniqueness "Email" matches with a threshold set to > 0.9


B.

ColumnValues "Email" matches with a threshold set to > 0.1


C.

ColumnValues "Email" matches with a threshold set to > 0.9


D.

UniqueValueRatio "Email" matches with a threshold set to > 0.1


Expert Solution
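In Glue Data Quality, rules like these are written in DQDL. The snippet below sketches what a ColumnValues rule with a 90% threshold could look like as a ruleset string; the email regex is illustrative only, not the single correct pattern.

```python
# A minimal DQDL ruleset string: the rule passes only when more than 90%
# of Email values match the (assumed, simplified) email pattern.
EMAIL_RULESET = (
    'Rules = ['
    '    ColumnValues "Email" matches "[^@]+@[^@]+\\.[^@]+"'
    '    with threshold > 0.9'
    ']'
)
# This string would be supplied as the ruleset when creating a Glue
# Data Quality ruleset or an EvaluateDataQuality transform.
```

Uniqueness and UniqueValueRatio (options A and D) measure duplication, not validity, so they cannot express the 90%-valid requirement.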
Question #40:

A healthcare company stores patient records in an on-premises MySQL database. The company creates an application to access the MySQL database. The company must enforce security protocols to protect the patient records. The company currently rotates database credentials every 30 days to minimize the risk of unauthorized access.

The company wants a solution that does not require the company to modify the application code for each credential rotation.

Which solution will meet this requirement with the LEAST operational overhead?

Options:

A.

Assign an IAM role that has access permissions to the database. Configure the application to obtain temporary credentials through the IAM role.


B.

Use AWS Key Management Service (AWS KMS) to generate encryption keys. Configure automatic key rotation. Store the encrypted credentials in an Amazon DynamoDB table.


C.

Use AWS Secrets Manager to automatically rotate credentials. Allow the application to retrieve the credentials by using API calls.


D.

Store credentials in an encrypted Amazon S3 bucket. Rotate the credentials every month by using an S3 Lifecycle policy. Use bucket policies to control access.


Expert Solution
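The reason option C needs no code changes per rotation is that the application fetches the current secret at connect time instead of embedding credentials. A sketch of the application side, assuming the secret is stored as the JSON shape Secrets Manager typically uses for database secrets (the field names and sample values here are illustrative):

```python
import json

def parse_db_secret(secret_string):
    """Turn the JSON SecretString returned by
    secretsmanager.get_secret_value into connection keywords. Because the
    application resolves the secret on every connect, automatic rotation
    requires no application code changes."""
    secret = json.loads(secret_string)
    return {
        "host": secret["host"],
        "user": secret["username"],
        "password": secret["password"],
    }

# Example payload shaped like a Secrets Manager database secret:
sample = '{"host": "db.example.com", "username": "app", "password": "s3cret"}'
creds = parse_db_secret(sample)
# To fetch live: boto3.client("secretsmanager").get_secret_value(
#     SecretId="prod/mysql/patient-db")["SecretString"]  # hypothetical ID
```

Secrets Manager's managed rotation then replaces the stored credentials on schedule, and the next connection simply picks up the new values.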