AWS Certified Solutions Architect - Associate / Question #1849 of 1019

A social media platform manages 8 TB of datasets, including 2 million user profiles and 20 million follower relationships. The relationships are many-to-many. The platform requires a highly performant method to identify mutual followers up to six degrees of separation. Which solution best meets these requirements?

A

Store the datasets in Amazon S3 and use Amazon Redshift to execute complex JOIN queries for finding connections.

B

Utilize Amazon Neptune to model the data as a graph with nodes and edges, then traverse the graph to find mutual followers.

C

Store the datasets in Amazon DynamoDB tables and use batch operations to recursively search for connections.

D

Use Amazon OpenSearch Service to index the datasets and perform search queries to discover connections.

Explanation

The correct answer is B because:

- Amazon Neptune is a managed graph database service designed for highly connected datasets. It models data as nodes (users) and edges (follower relationships), so connections can be traversed with graph query languages (e.g., Gremlin or openCypher for property graphs, SPARQL for RDF). This matters for identifying mutual followers up to six degrees of separation: each hop follows edges directly from the current node instead of re-joining entire tables.

- Why other options are incorrect:
  - A (Redshift): Redshift is a data warehouse optimized for analytical queries, not low-latency graph traversals. Reaching six degrees would require repeated self-joins over the 20 million relationship rows, one per additional degree, which is slow and costly.
  - C (DynamoDB): DynamoDB has no native graph traversal. A recursive search needs at least one round of batch reads per degree, issued from the application, which adds latency and scales poorly as the set of users to expand grows.
  - D (OpenSearch): OpenSearch is designed for text search and log analytics, not relationship traversal. It cannot efficiently answer multi-hop connection queries.
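
To make the traversal idea concrete, here is a minimal in-memory sketch in Python of a breadth-first search bounded at six hops. It is not Neptune code and does not use any AWS API. The function names (`mutual_graph`, `within_degrees`) and the sample users are hypothetical, and it assumes "mutual followers" means pairs of users who follow each other.

```python
from collections import deque


def mutual_graph(follows):
    """Build an adjacency map that keeps only reciprocal follow edges.

    `follows` is an iterable of (follower, followed) pairs. An edge a -> b
    is kept only if b -> a also exists (assumed meaning of "mutual").
    """
    edges = set(follows)
    graph = {}
    for a, b in edges:
        if (b, a) in edges:
            graph.setdefault(a, set()).add(b)
    return graph


def within_degrees(graph, start, max_depth=6):
    """Return {user: degree} for every user reachable within max_depth hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        depth = seen[user]
        if depth == max_depth:
            continue  # do not expand beyond the requested degree
        for neighbour in graph.get(user, ()):
            if neighbour not in seen:
                seen[neighbour] = depth + 1
                queue.append(neighbour)
    del seen[start]  # the starting user is not their own connection
    return seen


# Hypothetical sample data: (follower, followed) pairs.
follows = [
    ("ana", "bo"), ("bo", "ana"),
    ("bo", "cy"), ("cy", "bo"),
    ("cy", "dee"),  # one-way follow, so not a mutual edge
]
graph = mutual_graph(follows)
result = within_degrees(graph, "ana", max_depth=6)
```

Each hop only touches the neighbours of users already reached, which is the access pattern a graph database serves natively. Expressing the same search as SQL self-joins or as per-degree batch reads from a key-value store moves that loop into much heavier operations.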

Key Points:
- Use graph databases (Neptune) for relationship-heavy, multi-level traversal requirements.
- Avoid data warehouses (Redshift) and key-value stores (DynamoDB) for multi-hop graph traversal problems.
- OpenSearch is unsuitable for graph traversal despite its search capabilities.

Answer

The correct answer is: B