A company runs Apache Hadoop and Apache Spark on premises. The infrastructure is complex and does not scale. The company wants to reduce operational complexity while keeping data processing on premises. Which solution meets these requirements?
Options:
A. Use AWS Site-to-Site VPN to access the on-premises HDFS. Use Amazon EMR to process the data.
B. Use AWS DataSync to connect to the on-premises HDFS. Use Amazon EMR to process the data.
C. Migrate to Amazon EMR on AWS Outposts.
D. Use AWS Snowball to migrate the data to Amazon S3. Use Amazon EMR to process the data.
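Correct answer: C.
AWS Outposts runs AWS-managed infrastructure and native AWS services, including Amazon EMR, in the company's own data center, which suits workloads where data residency or latency constraints require local processing.
The company gets a managed EMR service, which reduces operational complexity, while data processing stays on premises. Options A, B, and D all run Amazon EMR in an AWS Region, so processing would leave the data center.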
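As a rough illustration of option C, the sketch below builds the parameters for boto3's EMR `run_job_flow` call, with the cluster placed in a subnet that belongs to the Outpost. The helper name `build_emr_outposts_request`, the cluster name, the subnet ID, the release label, and the instance type are all assumptions made for this example; the instance type in particular must be one that is actually provisioned on the Outpost, and the default EMR roles are assumed to exist in the account.

```python
def build_emr_outposts_request(cluster_name, outpost_subnet_id,
                               release_label="emr-6.15.0",
                               instance_type="m5.xlarge",
                               core_count=2):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    Hypothetical helper: placing the cluster in a subnet created on the
    Outpost is what keeps the EMR nodes, and the processing, on premises.
    """
    return {
        "Name": cluster_name,
        "ReleaseLabel": release_label,  # assumed release; pick one supported on Outposts
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": instance_type, "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": instance_type, "InstanceCount": core_count},
            ],
            "Ec2SubnetId": outpost_subnet_id,  # subnet associated with the Outpost
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default instance profile
        "ServiceRole": "EMR_DefaultRole",      # assumed default service role
    }


# Placeholder values for illustration only.
request = build_emr_outposts_request("onprem-spark", "subnet-0123456789abcdef0")

# To launch for real (requires boto3, credentials, and an Outpost subnet):
#   import boto3
#   boto3.client("emr").run_job_flow(**request)
```

Keeping the request builder separate from the API call lets the configuration be inspected or reviewed before anything is launched.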
Reference: Amazon EMR on AWS Outposts