AWS Certified Solutions Architect - Associate / Question #1662 of 1019

Question #1662

A company operates several mobile applications for its different services. Each application generates tens of gigabytes of usage logs daily. A solutions architect must design a scalable solution to enable the company's data analysts to perform ad-hoc analysis of usage trends across all applications. This analysis occurs on a weekly basis over several months and must support standard SQL queries.

Which solution meets these requirements MOST cost-effectively?

A

Store the logs in Amazon S3. Use Amazon Athena for analysis.

B

Store the logs in Amazon RDS. Use a database client for analysis.

C

Store the logs in Amazon OpenSearch Service. Use OpenSearch for analysis.

D

Store the logs in an Amazon EMR cluster. Use a supported SQL framework for analysis.

Explanation

The correct answer is A because:

- Amazon S3 is optimized for storing large volumes of data (tens of GB daily) at low cost, with high durability and scalability.
- Amazon Athena is a serverless query service that runs standard SQL directly against data in S3. It requires no infrastructure setup, scales automatically, and bills per query based on the amount of data scanned, so the company pays nothing for compute between the weekly ad-hoc analysis sessions.
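
To make the pay-per-query point concrete, here is a rough back-of-the-envelope sketch in Python. Every figure in it is an illustrative assumption rather than a value from the question: the number of applications, the exact daily volume, the per-terabyte rate, and the treatment of 1 TB as 1024 GB. Check current Athena pricing before relying on any of it.

```python
# All figures below are illustrative assumptions, not values from the question.
APPS = 5                    # hypothetical number of mobile applications
GB_PER_APP_PER_DAY = 30     # stands in for "tens of gigabytes" per application
PRICE_PER_TB_SCANNED = 5.0  # assumed on-demand rate in USD per TB scanned


def weekly_scan_gb(apps: int, gb_per_app_per_day: float, days: int = 7) -> float:
    """Uncompressed data volume that a query covering `days` days would scan."""
    return apps * gb_per_app_per_day * days


def query_cost_usd(scanned_gb: float, price_per_tb: float = PRICE_PER_TB_SCANNED) -> float:
    """Cost of one query scanning `scanned_gb` gigabytes (assumes 1 TB = 1024 GB)."""
    return scanned_gb / 1024 * price_per_tb


if __name__ == "__main__":
    gb = weekly_scan_gb(APPS, GB_PER_APP_PER_DAY)
    print(f"{gb:.0f} GB scanned, about ${query_cost_usd(gb):.2f} per full-week query")
```

Under these assumptions a query over a full week of raw logs costs a few dollars, and nothing is billed for compute while no queries run. Compressing the logs or storing them in a columnar format would reduce the scanned volume further.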

Why other options are incorrect:
- B (RDS): RDS bills continuously for provisioned instances and storage, which is expensive for a dataset that grows by tens of gigabytes per application every day, and capacity must be resized as the data grows. RDS is built for transactional workloads, not log analytics.
- C (OpenSearch): OpenSearch is designed for real-time search and log analytics, not for periodic analysis with standard SQL. Its storage and compute typically cost more than S3 with Athena for this workload.
- D (EMR): EMR requires cluster management, and because the logs are stored in the cluster, it must keep running and incurring instance charges between analyses. It is overkill for ad-hoc SQL queries on static log data.

Key Points:
- Use S3 for cost-effective log storage.
- Use Athena for serverless, on-demand SQL analysis.
- Avoid always-on provisioned services, whether databases (RDS, OpenSearch) or clusters (EMR), for intermittent workloads.
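
One way the S3 and Athena pattern could look in practice is sketched below in Python. The key layout (`logs/app=.../dt=...`), the table name, and the column names `app` and `dt` are hypothetical choices made for illustration; the question does not specify them. The `submit` helper uses boto3's Athena client and is shown only as an assumption of how a query might be sent; it needs boto3 and AWS credentials, so the other two functions are written to work without it.

```python
from datetime import date, timedelta


def log_key(app: str, day: date, filename: str) -> str:
    """S3 object key using Hive-style partitions (hypothetical layout)."""
    return f"logs/app={app}/dt={day.isoformat()}/{filename}"


def weekly_usage_query(table: str, week_start: date) -> str:
    """Standard SQL counting events per application for one week.

    Filtering on the `dt` partition column lets the query engine skip
    objects that fall outside the requested week.
    """
    week_end = week_start + timedelta(days=6)
    return (
        f"SELECT app, COUNT(*) AS events\n"
        f"FROM {table}\n"
        f"WHERE dt BETWEEN '{week_start.isoformat()}' AND '{week_end.isoformat()}'\n"
        f"GROUP BY app\n"
        f"ORDER BY events DESC"
    )


def submit(query: str, database: str, output_location: str) -> str:
    """Send the query to Athena and return its execution ID."""
    import boto3  # imported lazily so the helpers above run without it

    response = boto3.client("athena").start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    print(log_key("shop", date(2024, 1, 1), "events.json.gz"))
    print(weekly_usage_query("usage_logs", date(2024, 1, 1)))
```

Partitioning by application and date keeps each weekly query limited to the objects it actually needs, which matters because the amount of data scanned drives the per-query cost.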

Answer

The correct answer is: A