AWS Certified Solutions Architect - Associate

Question #1982

A solutions architect is designing a cloud architecture for a new data processing pipeline deployed on AWS. The pipeline uses an Amazon Machine Image (AMI) and launch template to handle incoming data batches. The system must process data concurrently, automatically scale EC2 instances based on workload, maintain loose coupling between components, and ensure data batches are stored durably until processed. Which solution meets these requirements?

A

Use an Amazon Simple Notification Service (Amazon SNS) topic to distribute data batches. Configure an Auto Scaling group with the launch template, scaling instances based on CPU utilization.

B

Use an Amazon Simple Queue Service (Amazon SQS) queue to store data batches. Configure an Auto Scaling group with the launch template, scaling instances based on memory utilization.

C

Use an Amazon Simple Queue Service (Amazon SQS) queue to store data batches. Configure an Auto Scaling group with the launch template, scaling instances based on the ApproximateNumberOfMessagesVisible metric in the SQS queue.

D

Use an Amazon Simple Notification Service (Amazon SNS) topic to distribute data batches. Configure an Auto Scaling group with the launch template, scaling instances based on the NumberOfMessagesPublished metric in the SNS topic.

Explanation

Option C meets all requirements:
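1. Durable Storage: Amazon SQS stores messages redundantly and retains them until a consumer deletes them or the retention period expires (4 days by default, configurable up to 14 days), so batches are not lost while they wait to be processed.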
2. Loose Coupling: SQS decouples data producers from consumers, allowing independent scaling and fault tolerance.
3. Automatic Scaling: The ApproximateNumberOfMessagesVisible metric reports how many messages are available for retrieval from the queue, so scaling on it adds EC2 instances as the backlog grows and removes them as it drains. The metric tracks the pending workload directly.
4. Concurrent Processing: Multiple EC2 instances can process messages from the SQS queue concurrently.
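
The scaling logic in point 3 can be sketched as a small calculation. This is a minimal illustration, not an AWS API call: the function name `desired_capacity` and the parameter `target_backlog_per_instance` are hypothetical, and the sketch assumes each instance can comfortably work through a fixed number of queued batches. In a real deployment the Auto Scaling group would apply a scaling policy driven by the CloudWatch metric rather than run code like this.

```python
import math


def desired_capacity(visible_messages, target_backlog_per_instance,
                     min_size, max_size):
    """Return an instance count sized to the queue backlog.

    visible_messages: value of ApproximateNumberOfMessagesVisible.
    target_backlog_per_instance: hypothetical number of queued batches
        one instance is expected to handle (an assumption of this sketch).
    min_size / max_size: bounds of the Auto Scaling group.
    """
    if target_backlog_per_instance <= 0:
        raise ValueError("target_backlog_per_instance must be positive")
    # Round up so a partial share of the backlog still gets an instance.
    needed = math.ceil(visible_messages / target_backlog_per_instance)
    # Clamp to the group's configured bounds.
    return max(min_size, min(max_size, needed))


if __name__ == "__main__":
    for backlog in (0, 250, 5000):
        print(backlog, desired_capacity(backlog, 100, 1, 10))
```

With a target of 100 batches per instance and group bounds of 1 to 10, an empty queue keeps the minimum of 1 instance, a backlog of 250 asks for 3, and a backlog of 5000 is capped at 10. CPU or memory utilization would not produce this relationship, because neither changes when messages are merely waiting in the queue.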

Why other options fail:
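- A/D: Amazon SNS is a push-based publish/subscribe service. It delivers each message to its subscribers and does not hold batches in a queue for consumers to retrieve later, so batches are not stored durably until processed. Neither CPU utilization (A) nor NumberOfMessagesPublished (D) measures the backlog of unprocessed batches.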
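- B: SQS satisfies the storage and decoupling requirements, but memory utilization does not reflect the number of pending messages, so the group would not scale with the backlog. Memory utilization is also not published by EC2 to CloudWatch by default; it requires the CloudWatch agent on the instances.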
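
The difference between pushing batches through a topic and storing them in a queue can be illustrated with a toy in-memory model. The classes `ToyTopic` and `ToyQueue` below are hypothetical and are not the SNS or SQS APIs; the sketch ignores SNS delivery retries and SQS visibility timeouts, and only shows why a pull-based queue keeps work available for consumers that start later.

```python
from collections import deque


class ToyTopic:
    """Push model: a message reaches only the subscribers registered now."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for callback in self.subscribers:
            callback(message)
        # Nothing is kept after the delivery attempts.


class ToyQueue:
    """Pull model: messages stay queued until a consumer deletes them."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Peek at the oldest message without removing it.
        return self._messages[0] if self._messages else None

    def delete(self):
        # The consumer deletes a message only after processing it.
        self._messages.popleft()

    def backlog(self):
        return len(self._messages)


def demo():
    topic, queue = ToyTopic(), ToyQueue()
    # Batches arrive before any worker instance is running.
    for batch in ("batch-1", "batch-2"):
        topic.publish(batch)
        queue.send(batch)

    # A worker starts later.
    received_from_topic = []
    topic.subscribe(received_from_topic.append)

    processed = []
    while queue.backlog():
        processed.append(queue.receive())
        queue.delete()
    return received_from_topic, processed


if __name__ == "__main__":
    print(demo())
```

In this model the late subscriber receives nothing from the topic, while the queue still hands over both batches in order. The queue's `backlog()` is also the quantity that a backlog-based scaling metric would observe.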

Answer

The correct answer is: C