Question #1130
A company recently migrated a stream processing system to AWS. The system uses Apache Kafka running on Amazon EC2 instances for ingesting data streams. A consumer application running on EC2 processes the streams and stores results in a PostgreSQL database on EC2. The company wants the system to be highly available with minimal operational overhead.
Which architecture provides the HIGHEST availability?
A. Deploy a second Kafka broker in another Availability Zone. Launch additional consumer EC2 instances in another Availability Zone. Configure PostgreSQL replication to another Availability Zone.
B. Use Amazon MSK (Amazon Managed Streaming for Apache Kafka) with brokers distributed across two Availability Zones. Launch additional consumer EC2 instances in another Availability Zone. Configure PostgreSQL replication to another Availability Zone.
C. Use Amazon MSK with brokers distributed across two Availability Zones. Deploy consumer EC2 instances in an Auto Scaling group across two Availability Zones. Configure PostgreSQL replication to another Availability Zone.
D. Use Amazon MSK with brokers distributed across two Availability Zones. Deploy consumer EC2 instances in an Auto Scaling group across two Availability Zones. Use Amazon RDS for PostgreSQL with Multi-AZ deployment.
Explanation
Option D provides the highest availability because:
1. Amazon MSK: A managed Kafka service that places brokers across the selected AZs and replaces failed brokers automatically, so Kafka no longer has to be operated on EC2.
2. Auto Scaling Group for Consumers: Spreads EC2 instances across two AZs and replaces unhealthy or terminated instances automatically, so the consumer application survives an instance or AZ failure.
3. Amazon RDS Multi-AZ: Synchronously replicates PostgreSQL to a standby instance in another AZ and fails over to it automatically, reducing database downtime.
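The three components above can be sketched as request parameters for the corresponding boto3 calls. This is a minimal illustration, not a production configuration: it only builds the keyword-argument dictionaries and never contacts AWS. All resource names, subnet IDs, instance types, sizes, and the Kafka version are hypothetical placeholders that do not come from the question.

```python
from typing import Any


def msk_cluster_params(subnet_ids: list[str], brokers_per_az: int = 1) -> dict[str, Any]:
    """Arguments intended for boto3.client("kafka").create_cluster(**params)."""
    return {
        "ClusterName": "stream-ingest",  # hypothetical name
        "KafkaVersion": "3.6.0",  # placeholder; use a version MSK currently supports
        # MSK expects the broker count to be a multiple of the number of subnets.
        "NumberOfBrokerNodes": brokers_per_az * len(subnet_ids),
        "BrokerNodeGroupInfo": {
            "InstanceType": "kafka.m5.large",  # placeholder size
            "ClientSubnets": subnet_ids,  # one subnet per Availability Zone
        },
    }


def consumer_asg_params(subnet_ids: list[str], min_size: int = 2) -> dict[str, Any]:
    """Arguments intended for boto3.client("autoscaling").create_auto_scaling_group(**params)."""
    return {
        "AutoScalingGroupName": "stream-consumers",  # hypothetical name
        "LaunchTemplate": {
            "LaunchTemplateName": "consumer-template",  # hypothetical template
            "Version": "$Latest",
        },
        "MinSize": min_size,
        "MaxSize": min_size * 2,
        # Comma-separated subnet IDs spread the group across both AZs.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


def rds_postgres_params() -> dict[str, Any]:
    """Arguments intended for boto3.client("rds").create_db_instance(**params)."""
    return {
        "DBInstanceIdentifier": "stream-results",  # hypothetical name
        "Engine": "postgres",
        "DBInstanceClass": "db.m5.large",  # placeholder size
        "AllocatedStorage": 100,  # GiB, placeholder
        "MasterUsername": "appadmin",  # hypothetical user
        "ManageMasterUserPassword": True,  # let RDS keep the password in Secrets Manager
        "MultiAZ": True,  # standby instance in a second AZ with automatic failover
    }


# Hypothetical subnet IDs, one in each of two Availability Zones.
SUBNETS = ["subnet-aaa", "subnet-bbb"]
msk = msk_cluster_params(SUBNETS)
asg = consumer_asg_params(SUBNETS)
rds = rds_postgres_params()
```

Passing the same two subnets to MSK and to the Auto Scaling group keeps the brokers and the consumers in the same pair of AZs, while `MultiAZ` covers the database tier.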
Other options fall short:
- A: Keeps Kafka and PostgreSQL self-managed, which increases operational effort, and provides no Auto Scaling for consumers.
- B: Uses MSK but lacks Auto Scaling for consumers and relies on self-managed PostgreSQL replication.
- C: Uses MSK and Auto Scaling for consumers but retains self-managed PostgreSQL replication, which has no built-in automated failover.
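The elimination above can be expressed as a small comparison. This is a toy model written for this explanation, not an AWS availability metric: each tier is marked as spanning two AZs and as recovering automatically or not, and an option's score is the number of tiers that do both. The `Tier` and `Option` classes and the scoring rule are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    multi_az: bool       # tier has capacity in more than one Availability Zone
    auto_recovery: bool  # failures are handled without operator action


@dataclass(frozen=True)
class Option:
    kafka: Tier
    consumers: Tier
    database: Tier

    def resilient_tiers(self) -> int:
        """Count the tiers that are both multi-AZ and self-recovering."""
        tiers = (self.kafka, self.consumers, self.database)
        return sum(t.multi_az and t.auto_recovery for t in tiers)


# Every option spans two AZs; only managed or Auto Scaling tiers recover on their own.
SELF_MANAGED = Tier(multi_az=True, auto_recovery=False)
MANAGED = Tier(multi_az=True, auto_recovery=True)

OPTIONS = {
    "A": Option(kafka=SELF_MANAGED, consumers=SELF_MANAGED, database=SELF_MANAGED),
    "B": Option(kafka=MANAGED, consumers=SELF_MANAGED, database=SELF_MANAGED),
    "C": Option(kafka=MANAGED, consumers=MANAGED, database=SELF_MANAGED),
    "D": Option(kafka=MANAGED, consumers=MANAGED, database=MANAGED),
}


def best_option(options: dict[str, Option]) -> str:
    """Return the letter of the option with the most resilient tiers."""
    return max(options, key=lambda letter: options[letter].resilient_tiers())


scores = {letter: option.resilient_tiers() for letter, option in OPTIONS.items()}
winner = best_option(OPTIONS)
```

Under these assumptions the scores rise by one tier from A through D, and only D has automatic recovery at the Kafka, consumer, and database tiers.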
Key Points: Use managed services (MSK and RDS) together with Auto Scaling to maximize availability while minimizing operational overhead.
Answer
The correct answer is: D