Question #597
A company uses a Python application on an Amazon EC2 instance to handle data transformation tasks. The application checks an Amazon S3 bucket every 15 minutes and processes files, with each file taking roughly 7 minutes to complete. The application skips files that have already been processed. CloudWatch metrics reveal the EC2 instance is idle 50% of the time due to processing delays. The company aims to improve scalability, ensure high availability, and minimize operational management.
Which solution MOST cost-effectively meets these requirements?
A. Convert the data transformation application into an AWS Lambda function. Configure S3 event notifications to trigger the Lambda function automatically when new files are uploaded to the S3 bucket.
B. Set up an Amazon Simple Queue Service (SQS) queue. Enable S3 event notifications to send messages to the queue. Deploy an EC2 Auto Scaling group starting with one instance. Modify the application to poll the SQS queue and process files based on received messages.
C. Containerize the data transformation application and deploy it on the existing EC2 instance. Configure the container to continuously poll the S3 bucket for new files and process them as they arrive.
D. Migrate the application to a container running on Amazon Elastic Container Service (ECS) with AWS Fargate. Use an Amazon EventBridge rule to invoke the Fargate task via the RunTask API whenever a new file is uploaded to the S3 bucket.
Explanation
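Answer A is correct because:
- Fit: Each file takes roughly 7 minutes to process, which is within Lambda's 15-minute maximum execution time.
- Scalability: Lambda scales automatically with the number of file uploads, running one invocation per event and processing files concurrently without manual intervention.
- Cost-Effectiveness: Lambda charges per request and per unit of execution time, so nothing is billed while no files are being processed, unlike an idle EC2 instance.
- High Availability: Lambda runs functions across multiple Availability Zones within a Region by default.
- Operational Simplicity: There are no EC2 instances or containers to provision, patch, or scale.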
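The S3-triggered function described in option A could look roughly like the sketch below. It is a minimal illustration, not the company's actual code: `transform` is a hypothetical placeholder for the existing transformation logic, and the handler only shows how bucket and key are read from the S3 event. Object keys arrive URL-encoded in S3 event notifications, so the sketch decodes them before use.

```python
from urllib.parse import unquote_plus


def transform(bucket: str, key: str) -> str:
    """Hypothetical stand-in for the existing data transformation logic."""
    return f"processed s3://{bucket}/{key}"


def lambda_handler(event, context):
    """Process every object referenced in an S3 event notification."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys in S3 events are URL-encoded (spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        results.append(transform(bucket, key))
    return {"processed": results}
```

Because each upload produces its own event, the 15-minute polling loop and the "already processed" check from the original application are no longer needed in this design, assuming each file is uploaded once.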
Other options:
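- B: Decouples uploads from processing, but the Auto Scaling group keeps at least one EC2 instance running, so idle cost remains and the instances still have to be managed and patched.
- C: Stays on the single existing EC2 instance, so it adds neither scalability nor high availability, and EC2 management remains.
- D: Also event-driven and serverless, but it requires containerizing the application and maintaining an image, task definition, and cluster. For a job that fits within Lambda's limits, this adds setup effort and is generally not cheaper than Lambda.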
Key Points: Use serverless (Lambda) for event-driven, scalable workloads; S3 event triggers reduce delays and idle time.
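As a rough sketch of how the S3 event trigger might be wired up, the helper below builds the notification configuration and passes it to an S3 client. The function names, the ARN, and the optional prefix filter are assumptions made for illustration; `s3_client` is expected to be a boto3 S3 client, whose `put_bucket_notification_configuration` call replaces any notification configuration already on the bucket.

```python
def build_notification_config(function_arn: str, prefix: str = "") -> dict:
    """Build an S3 notification configuration that invokes a Lambda function."""
    config = {
        "LambdaFunctionArn": function_arn,
        "Events": ["s3:ObjectCreated:*"],  # fire on every new object
    }
    if prefix:
        # Optional: only react to keys under a given prefix.
        config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"LambdaFunctionConfigurations": [config]}


def apply_notification(s3_client, bucket: str, function_arn: str, prefix: str = "") -> None:
    """Attach the configuration to the bucket (overwrites the existing one)."""
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(function_arn, prefix),
    )
```

In practice the function also needs a resource-based permission that allows S3 to invoke it; that step is omitted here.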
Answer
The correct answer is: A