AWS Certified Developer – Associate / Question #568 of 557

Question #568

A company is migrating its legacy document management system to AWS. The new system must store document metadata (e.g., title, author, creation date) and the actual document files (e.g., PDFs, Word documents) for efficient retrieval via AWS APIs. Which solution enables scalable storage of the documents and efficient querying of metadata?

A

Encode the document files using Base64 and store them alongside metadata in an Amazon DynamoDB table with a composite primary key.

B

Store document metadata in an Amazon DynamoDB table and the document files in Amazon S3, referencing the S3 object keys in the DynamoDB entries.

C

Use Amazon Aurora Serverless to store both metadata and document files in a fully managed relational database with auto-scaling capabilities.

D

Store document metadata in Amazon ElastiCache for Redis and the document files in Amazon FSx for Lustre to optimize low-latency access.

Explanation

Answer B is correct because:
1. Scalability: Amazon S3 is object storage built for files such as PDFs and Word documents, with no practical limit on the total amount of data stored.
2. Efficient Metadata Querying: DynamoDB serves low-latency lookups by primary key, and secondary indexes support queries on other attributes such as author or creation date. Both services are accessed through AWS APIs and SDKs.
3. Fit and Cost: Keeping the files in S3 sidesteps DynamoDB's 400 KB item size limit (Option A) and avoids paying relational database storage rates for large binary objects (Option C).
4. Durability: S3 is designed for 99.999999999% (11 nines) object durability, and DynamoDB replicates table data across multiple Availability Zones within a Region.
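
The pattern in Answer B can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `store_document`, the `documents/` key prefix, and the attribute names (`doc_id`, `s3_key`, and so on) are assumptions made for the example. The two client objects are passed in, so the sketch only relies on the boto3 calls `put_object` (S3 client) and `put_item` (DynamoDB Table resource).

```python
from datetime import datetime, timezone


def store_document(s3_client, table, bucket, doc_id, body, title, author):
    """Upload the file to S3, then record its metadata and S3 key in DynamoDB.

    s3_client is expected to behave like boto3.client("s3") and table like
    boto3.resource("dynamodb").Table(...); both are injected by the caller.
    """
    # Hypothetical key layout: one object per document under a fixed prefix.
    s3_key = f"documents/{doc_id}"

    # The document bytes go to S3, never into the DynamoDB item.
    s3_client.put_object(Bucket=bucket, Key=s3_key, Body=body)

    # The DynamoDB item holds only small, queryable attributes plus the
    # pointer (bucket and key) needed to fetch the file later.
    item = {
        "doc_id": doc_id,  # assumed partition key of the table
        "title": title,
        "author": author,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "s3_bucket": bucket,
        "s3_key": s3_key,
        "size_bytes": len(body),
    }
    table.put_item(Item=item)
    return item
```

With boto3 installed and credentials configured, the arguments would typically be `boto3.client("s3")` and `boto3.resource("dynamodb").Table("Documents")`, where the table name `Documents` is again hypothetical. Retrieval reverses the steps: look up the item in DynamoDB, then fetch the object from S3 using the stored bucket and key.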

Other options fail because:
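- A: DynamoDB items are limited to 400 KB, and Base64 encoding inflates each file by about a third, so most documents would not fit in a single item.
- C: Aurora Serverless can hold binary data in BLOB columns, but storing large files in a relational database is costly and inefficient compared with object storage.
- D: ElastiCache for Redis is an in-memory store suited to caching rather than serving as the durable system of record for metadata, and FSx for Lustre is a high-performance file system aimed at compute-intensive workloads, which is overkill for general document storage.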
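
The size problem with Option A can be checked with a little arithmetic. The sketch below assumes the 400 KB limit means 400 × 1024 bytes and ignores the bytes taken up by attribute names; the helper names `base64_size` and `fits_in_item` are made up for this example.

```python
import base64

# Assumption: DynamoDB's 400 KB item limit taken as 400 * 1024 bytes.
DYNAMODB_ITEM_LIMIT = 400 * 1024


def base64_size(raw_bytes: int) -> int:
    """Length of padded Base64 output for raw_bytes bytes of input.

    Base64 emits 4 characters for every 3 input bytes, rounding up.
    """
    return 4 * ((raw_bytes + 2) // 3)


def fits_in_item(raw_bytes: int, metadata_bytes: int = 0) -> bool:
    """True if the encoded file plus metadata stays within the item limit."""
    return base64_size(raw_bytes) + metadata_bytes <= DYNAMODB_ITEM_LIMIT


# Sanity check of the formula against the standard library encoder.
assert base64_size(1000) == len(base64.b64encode(b"x" * 1000))
```

Under these assumptions, a file of roughly 300 KB is the largest that could fit once encoded, and that is before any metadata is added. A 1 MB PDF fails the check outright.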

Key Takeaway: Use S3 for large objects and DynamoDB for metadata to balance scalability, cost, and performance.

Answer

The correct answer is: B