Question #985
A company needs to process genomic data stored in an Amazon S3 bucket. Each job analyzes 15-20 GB of data and stores results in a separate S3 bucket. The jobs are not time-sensitive and can tolerate interruptions. Which solution is MOST cost-effective?
A. Use AWS Lambda with provisioned concurrency to process data, triggered by S3 event notifications.
B. Create an AWS Batch compute environment using Amazon EC2 Spot Instances with the SPOT_CAPACITY_OPTIMIZED allocation strategy.
C. Configure AWS Batch with a mix of Amazon EC2 On-Demand and Spot Instances, using the BEST_FIT_PROGRESSIVE allocation strategy for Spot Instances.
D. Use Amazon Elastic Container Service (Amazon ECS) with Fargate Spot instances to run the processing jobs.
Explanation
Answer B is correct because:
- EC2 Spot Instances cost up to 90% less than On-Demand Instances, which suits jobs that are not time-sensitive and can tolerate interruptions.
- The SPOT_CAPACITY_OPTIMIZED allocation strategy lowers the likelihood of interruption by launching instances from the Spot pools with the most available capacity.
- AWS Batch handles job queuing and scheduling and scales the Spot compute environment up and down with demand.
Other options are less optimal:
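- A: Lambda is capped at 15 minutes of runtime and 10 GB of memory, a poor fit for jobs that analyze 15-20 GB of data, and provisioned concurrency adds unnecessary cost.
- C: On-Demand capacity raises cost with no benefit for jobs that tolerate interruptions, and BEST_FIT_PROGRESSIVE selects instance types by cost per vCPU rather than by available Spot capacity, so interruptions can be more frequent.
- D: Fargate Spot generally costs more than EC2 Spot for large batch workloads.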
Key Points: Use Spot Instances for interruption-tolerant workloads; AWS Batch with SPOT_CAPACITY_OPTIMIZED balances cost and reliability.
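As an illustration of option B, the following is a minimal sketch of the request that a managed Spot compute environment might use. The parameter keys follow the AWS Batch CreateComputeEnvironment API; the helper name `build_spot_compute_environment` and every resource identifier (environment name, subnet, security group, instance profile) are hypothetical placeholders, and the `maxvCpus` default is an arbitrary assumption.

```python
import json


def build_spot_compute_environment(name, subnet_ids, security_group_ids,
                                   instance_profile, max_vcpus=256):
    """Build keyword arguments for the AWS Batch CreateComputeEnvironment call.

    Hypothetical helper for illustration; it only assembles a dict and
    makes no AWS calls.
    """
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",   # AWS Batch manages the instances
        "state": "ENABLED",
        "computeResources": {
            "type": "SPOT",  # Spot Instances only, no On-Demand capacity
            "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
            "minvCpus": 0,   # scale to zero when the job queue is empty
            "maxvCpus": max_vcpus,          # assumed ceiling, tune per workload
            "instanceTypes": ["optimal"],   # let Batch choose instance sizes
            "subnets": list(subnet_ids),
            "securityGroupIds": list(security_group_ids),
            "instanceRole": instance_profile,
        },
    }


if __name__ == "__main__":
    # All identifiers below are placeholders, not real resources.
    params = build_spot_compute_environment(
        name="genomics-spot",
        subnet_ids=["subnet-aaaa1111"],
        security_group_ids=["sg-bbbb2222"],
        instance_profile="ecsInstanceRole",
    )
    print(json.dumps(params, indent=2))
    # With boto3 installed and credentials configured, the request could be sent as:
    #   import boto3
    #   boto3.client("batch").create_compute_environment(**params)
```

Setting `minvCpus` to 0 is a design choice that keeps idle cost at zero between jobs, at the price of a short start-up delay, which is acceptable here because the jobs are not time-sensitive.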
Answer
The correct answer is: B