Question #1019
A company is developing a real-time monitoring system for a distributed application. The system must ingest a continuous stream of performance metrics and allow multiple processing units to analyze the data simultaneously. Each processing unit must be able to recover from failures without missing any metrics, and the solution should avoid duplicating data across processors. The architect also plans to add more processing units later with minimal changes. Which solution meets these requirements?
A. Publish the data to Amazon Simple Queue Service (Amazon SQS).
B. Publish the data to Amazon Data Firehose.
C. Publish the data to Amazon EventBridge.
D. Publish the data to Amazon Kinesis Data Streams.
Explanation
Amazon Kinesis Data Streams (D) meets all requirements:
1. Real-Time Ingestion: Kinesis is designed for continuous, high-throughput data streams.
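2. Simultaneous Processing: Multiple consumer applications (e.g., separate KCL-based applications) can read the same stream independently, each with its own shard iterators, so the data does not need to be copied into a separate stream or queue for each processor.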
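3. Fault Tolerance: Each processing unit records its position in the stream as a checkpoint (the KCL stores checkpoints in a DynamoDB table). Because records remain in the stream for the retention period (24 hours by default), a restarted processor resumes from its last checkpoint without missing data.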
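4. Scalability: Adding a processing unit means starting another consumer application against the same stream; producers and existing consumers are unchanged. Stream capacity scales automatically in on-demand mode, or by adjusting the shard count in provisioned mode.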
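The checkpoint-and-resume behavior in points 2 and 3 can be sketched with a small in-memory model. This is a toy illustration, not the Kinesis or KCL API: `ToyStream`, `CheckpointingConsumer`, and the plain dict used as a checkpoint store are hypothetical stand-ins for a single shard, a consumer application, and a durable checkpoint table.

```python
class ToyStream:
    """Hypothetical stand-in for one shard: an append-only, ordered log."""

    def __init__(self):
        self._records = []

    def put(self, data):
        # Append the record and return its position (a toy sequence number).
        self._records.append(data)
        return len(self._records) - 1

    def read_from(self, position, limit):
        # Return (sequence_number, data) pairs starting at `position`.
        chunk = self._records[position:position + limit]
        return list(enumerate(chunk, start=position))


class CheckpointingConsumer:
    """Hypothetical consumer that saves its position after each record."""

    def __init__(self, app_name, stream, checkpoint_store):
        self.app_name = app_name
        self.stream = stream
        # Assumption: a dict stands in for a durable checkpoint table.
        self.checkpoint_store = checkpoint_store

    def poll(self, limit):
        # Resume from the last saved checkpoint, or from the start if none.
        start = self.checkpoint_store.get(self.app_name, 0)
        processed = []
        for seq, data in self.stream.read_from(start, limit):
            processed.append(data)
            # Checkpoint the position of the next unread record.
            self.checkpoint_store[self.app_name] = seq + 1
        return processed


def run_demo():
    stream = ToyStream()
    for value in ["cpu=41", "cpu=57", "cpu=93", "cpu=62", "cpu=38"]:
        stream.put(value)

    checkpoints = {}

    # The "alerts" application reads two records, then fails.
    alerts = CheckpointingConsumer("alerts", stream, checkpoints)
    first = alerts.poll(limit=2)

    # A replacement instance with the same name resumes at the checkpoint.
    alerts_restarted = CheckpointingConsumer("alerts", stream, checkpoints)
    resumed = alerts_restarted.poll(limit=10)

    # A second application added later reads the same stream independently.
    dashboard = CheckpointingConsumer("dashboard", stream, checkpoints)
    full = dashboard.poll(limit=10)

    return first, resumed, full


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the restarted `alerts` consumer picks up at the third record, and the later-added `dashboard` consumer reads all five records without any change to the stream or to `alerts`. Checkpointing after each record gives at-least-once processing; a real consumer would typically checkpoint less often and tolerate reprocessing a few records after a failure.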
Other options fail because:
- A (SQS): Consumers of a queue compete for messages; each message is processed by one consumer and then deleted, so multiple processors cannot each analyze the full set of metrics.
- B (Data Firehose): Focuses on data delivery (e.g., to S3), not real-time multi-consumer processing.
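- C (EventBridge): Routes events to targets by rule rather than exposing an ordered, retained stream that consumers read at their own pace. Consumers cannot checkpoint a position in the stream, and each additional processor needs its own target, which receives its own copy of every matching event.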
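The difference between queue delivery (option A) and stream delivery (option D) can be illustrated with a short sketch. The functions `queue_delivery` and `stream_delivery` are hypothetical, and the round-robin hand-off is an assumption made only to keep the example deterministic; a real queue does not promise any particular distribution among competing consumers.

```python
from collections import deque


def queue_delivery(messages, consumers):
    """Competing consumers: each message goes to exactly one consumer."""
    pending = deque(messages)
    received = {name: [] for name in consumers}
    turn = 0
    while pending:
        # Assumption: round-robin hand-off, for a deterministic illustration.
        name = consumers[turn % len(consumers)]
        # Removing the message models deletion after processing.
        received[name].append(pending.popleft())
        turn += 1
    return received


def stream_delivery(records, consumers):
    """Independent readers: every consumer sees the full ordered log."""
    return {name: list(records) for name in consumers}


if __name__ == "__main__":
    metrics = ["m1", "m2", "m3", "m4"]
    print(queue_delivery(metrics, ["alerts", "dashboard"]))
    print(stream_delivery(metrics, ["alerts", "dashboard"]))
```

With a queue, `alerts` and `dashboard` each receive only part of the metrics, whereas with a stream both receive all four in order, which is what the question's "multiple processing units analyze the data simultaneously" requirement calls for.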
Key Points: Kinesis supports real-time streams, multi-consumer processing, fault tolerance via checkpoints, and scalability.
Answer
The correct answer is: D