Question #1055
A company is developing an application to collect and transmit sensor data from a factory. The application uses AWS IoT Core to send data from thousands of devices to an Amazon S3 data lake. The company must process and transform the data before storing it in Amazon S3. The sensor data is transmitted every 10 seconds, and the processed data must be available in Amazon S3 within 20 minutes of collection. No additional applications are handling the sensor data from AWS IoT Core. Which solution MOST cost-effectively meets these requirements?
A. Create an AWS IoT Core topic to ingest the sensor data. Configure an AWS IoT rule to trigger an AWS Lambda function for data transformation. Use the Lambda function to write the processed data directly to Amazon S3.
B. Use AWS IoT Core Basic Ingest to collect the sensor data. Configure an AWS IoT rule to route the data to Amazon Kinesis Data Firehose. Set Kinesis Data Firehose's buffering interval to 600 seconds. Use a Lambda function integrated with Kinesis Data Firehose to process the data, then configure Kinesis Data Firehose to deliver it to Amazon S3.
C. Create an AWS IoT Core topic to ingest the sensor data. Configure an AWS IoT rule to send the data to Amazon DynamoDB. Create a Lambda function to read and transform the data from DynamoDB, then write the processed data to Amazon S3.
D. Use AWS IoT Core Basic Ingest to collect the sensor data. Configure an AWS IoT rule to write the data to Amazon Kinesis Data Streams. Develop a Lambda function to process the data from Kinesis Data Streams and use the S3 PutObject API to write the data to Amazon S3.
Explanation
The correct answer is B. Here's why:
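- Cost Efficiency: Kinesis Data Firehose batches data, minimizing both S3 PUT requests (costly at scale) and Lambda invocations. Basic Ingest lets devices publish directly to the rule and bypasses the publish/subscribe message broker, so messaging charges are avoided. This is acceptable here because no other application consumes the sensor data.
- Buffering: Firehose's 600-second (10-minute) buffering interval means each batch is transformed and delivered to S3 roughly 10 minutes after collection, comfortably within the 20-minute requirement, while keeping batches large.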
- Managed Service: Firehose handles retries, batching, and delivery to S3, reducing operational overhead.
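To make the "Lambda function integrated with Kinesis Data Firehose" in option B concrete, here is a minimal sketch of a Firehose data-transformation handler. The record contract (base64-encoded `data`, and a `recordId`, `result`, and `data` field on each returned record) is the one Firehose expects. The `transform` function and the `temperature` field are hypothetical placeholders, since the question does not say what the transformation is.

```python
import base64
import json


def transform(payload: dict) -> dict:
    """Hypothetical transformation: round the reading and tag the record."""
    out = dict(payload)
    if "temperature" in out:  # assumed field name, not from the question
        out["temperature"] = round(float(out["temperature"]), 1)
    out["processed"] = True
    return out


def lambda_handler(event, context):
    """Return exactly one output record for every buffered input record."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            # Newline-delimited JSON keeps the S3 objects easy to query later
            data = json.dumps(transform(payload)) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(data.encode("utf-8")).decode("utf-8"),
            })
        except (ValueError, TypeError):
            # Hand unparseable records back unchanged, marked as failed
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Because Firehose invokes the function once per buffered batch rather than once per message, the number of invocations scales with the buffer settings, not with the number of devices.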
Why other options are incorrect:
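- A: Invoking Lambda and writing an S3 object for every single message incurs high Lambda and S3 request costs and risks throttling. Publishing to a standard topic also incurs messaging charges that Basic Ingest avoids.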
- C: DynamoDB adds unnecessary storage costs and complexity for transient data.
- D: Kinesis Data Streams requires shard management and more Lambda invocations, increasing costs.
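The difference in scale between per-message and batched processing can be sketched with some quick arithmetic. The device count of 1,000 is an assumption (the question only says "thousands"), and the flush count is idealized: it counts time-triggered flushes only, whereas Firehose also flushes when its buffer size limit is reached, which would raise the number.

```python
DEVICES = 1_000          # assumption: the question only says "thousands"
SEND_INTERVAL_S = 10     # from the question
BUFFER_INTERVAL_S = 600  # from option B
SECONDS_PER_DAY = 86_400


def messages_per_day(devices: int, send_interval_s: int) -> int:
    """Messages sent per day, which is also the number of Lambda
    invocations and S3 PUTs per day under option A."""
    return devices * SECONDS_PER_DAY // send_interval_s


def time_based_flushes_per_day(buffer_interval_s: int) -> int:
    """Idealized count of Firehose flushes per day (time-triggered only)."""
    return SECONDS_PER_DAY // buffer_interval_s


per_message = messages_per_day(DEVICES, SEND_INTERVAL_S)
flushes = time_based_flushes_per_day(BUFFER_INTERVAL_S)

print(per_message)             # 8640000
print(flushes)                 # 144
print(per_message // flushes)  # 60000 messages per flush
```

Under these assumptions, option A performs millions of invocations and writes per day, while option B's time-based flushes number in the hundreds.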
Key Points:
- Use Kinesis Data Firehose for batched, serverless data transformation and delivery.
- Basic Ingest reduces costs for high-volume IoT data ingestion.
- Buffering balances latency and cost efficiency.
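As an illustration of how Basic Ingest is addressed, devices publish to a reserved topic that begins with `$aws/rules/` followed by the rule name, optionally followed by further topic levels. The helper below is a small sketch of building such a topic; the rule name and device identifiers are hypothetical.

```python
def basic_ingest_topic(rule_name: str, *suffix_levels: str) -> str:
    """Build a Basic Ingest topic: $aws/rules/<rule_name>[/<level>...]."""
    if not rule_name or "/" in rule_name:
        raise ValueError("rule_name must be a single non-empty topic level")
    return "/".join(("$aws/rules", rule_name) + suffix_levels)


# Hypothetical rule name and device path, for illustration only
print(basic_ingest_topic("SensorToFirehose", "factory-1", "device-0042"))
# $aws/rules/SensorToFirehose/factory-1/device-0042
```

Messages published this way go straight to the named rule's actions instead of passing through the message broker, which is why no other subscriber can receive them.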
Answer
The correct answer is: B