Spring Sale Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: simple70

Amazon Web Services AWS Certified Data Engineer - Associate (DEA-C01) Data-Engineer-Associate Question # 53 Topic 6 Discussion

Amazon Web Services AWS Certified Data Engineer - Associate (DEA-C01) Data-Engineer-Associate Question # 53 Topic 6 Discussion

Data-Engineer-Associate Exam Topic 6 Question 53 Discussion:
Question #: 53
Topic #: 6

A company stores Apache Parquet files in an Amazon S3 data lake. The data lake receives thousands of files from multiple sources every hour. The files range in size from 50 KB to 100 KB.

The company is evaluating the implementation of Apache Iceberg tables for the data lake. The company is using AWS Glue Data Catalog as part of the evaluation. The company needs a solution to optimize query performance in Iceberg. The solution must ensure that Iceberg table performance does not degrade when more files are added over time.

Which solution will meet these requirements?


A.

Use an AWS Glue job to compact the files into a standard size of 512 MB at the end of each day. Run an AWS Glue crawler to update the Data Catalog.


B.

Configure the Data Catalog to automatically compact the files every minute.


C.

Configure Iceberg table properties to enable automatic compaction based on thresholds for file size and the number of files.


D.

Implement a partition strategy in Amazon S3. Run an AWS Glue crawler to update the Data Catalog every 5 minutes.


Get Premium Data-Engineer-Associate Questions

Contribute your Thoughts:


Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.