Question #878
A team of data scientists is using Amazon SageMaker instances and SageMaker APIs to train machine learning (ML) models.
The SageMaker instances are deployed in a VPC that does not have access to or from the internet. Datasets for ML model training are stored in an Amazon S3 bucket. Interface VPC endpoints provide access to Amazon S3 and the SageMaker APIs.
Occasionally, the data scientists require access to the npm registry to update JavaScript packages that they use as part of their workflow. A solutions architect must provide access to the npm registry while ensuring that the SageMaker instances remain isolated from the internet.
Which solution will meet these requirements?
A. Create an AWS CodeCommit repository for each package that the data scientists need to access. Configure code synchronization between the npm registry and the CodeCommit repository. Create a VPC endpoint for CodeCommit.
B. Create a NAT gateway in the VPC. Configure VPC routes to allow access to the internet with a network ACL that allows access to only the npm registry endpoint.
C. Create a NAT instance in the VPC. Configure VPC routes to allow access to the internet. Configure SageMaker notebook instance firewall rules that allow access to only the npm registry endpoint.
D. Create an AWS CodeArtifact domain and repository. Add an external connection for public:npm to the CodeArtifact repository. Configure the npm client to use the CodeArtifact repository. Create a VPC endpoint for CodeArtifact.
Explanation
Answer D is correct because:
1. CodeArtifact Integration: An AWS CodeArtifact repository with an external connection to the public npm registry (named public:npmjs in the CodeArtifact API) fetches and caches packages on request. CodeArtifact contacts the registry on the clients' behalf, so the SageMaker instances never need a route to the internet.
2. VPC Endpoint: Interface VPC endpoints for CodeArtifact (AWS PrivateLink) keep the npm client's traffic to the repository on the AWS network instead of the public internet.
3. Isolation Compliance: The VPC still has no internet gateway or NAT device, so the isolation requirement holds while npm packages remain available.
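The setup described in answer D can be sketched with the AWS SDK for Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the `client` argument is assumed to be a boto3 CodeArtifact client (for example `boto3.client("codeartifact")`), the domain and repository names are whatever the caller chooses, and the helper function names are hypothetical. The `.npmrc` lines assume the auth token is exported in an environment variable named `CODEARTIFACT_AUTH_TOKEN`.

```python
def create_npm_proxy_repository(client, domain, repository):
    """Create a CodeArtifact domain and repository backed by the public npm registry.

    `client` is assumed to be a boto3 "codeartifact" client.
    """
    client.create_domain(domain=domain)
    client.create_repository(domain=domain, repository=repository)
    # The CodeArtifact API names the public npm registry connection "public:npmjs".
    client.associate_external_connection(
        domain=domain,
        repository=repository,
        externalConnection="public:npmjs",
    )


def npm_registry_url(client, domain, repository):
    """Look up the repository's npm endpoint URL."""
    response = client.get_repository_endpoint(
        domain=domain, repository=repository, format="npm"
    )
    return response["repositoryEndpoint"]


def npmrc_lines(registry_url):
    """Build .npmrc lines that point the npm client at the CodeArtifact repository."""
    # npm keys the token on the URL without its scheme, e.g. "//host/npm/repo/".
    scheme_less = registry_url.split(":", 1)[1]
    return [
        f"registry={registry_url}",
        # Assumption: the token is supplied through this environment variable.
        f"{scheme_less}:_authToken=${{CODEARTIFACT_AUTH_TOKEN}}",
    ]
```

The creation calls would typically run once from an administrative environment. The data scientists only need the resulting `.npmrc` lines and a current authorization token.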
Why other options are incorrect:
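- A: CodeCommit is a source control service, not a package registry. It has no built-in synchronization with the npm registry, so maintaining one repository per package would require custom tooling and ongoing upkeep.
- B/C: A NAT gateway or NAT instance gives the instances an outbound route to the internet, which violates the isolation requirement. The proposed filters do not close that gap: network ACLs and security groups match IP ranges and ports rather than hostnames, so they cannot reliably restrict traffic to only the npm registry.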
Key Points:
- Use AWS-managed services (CodeArtifact) for package management.
- VPC endpoints enable private connectivity to AWS services.
- Avoid NAT/internet gateways when isolation is required.
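As an illustration of the VPC endpoint point, CodeArtifact exposes two interface endpoint services per Region, one for the CodeArtifact API and one for repository traffic. The sketch below only builds the request parameters. The function name is hypothetical, the VPC, subnet, and security group IDs are placeholders supplied by the caller, and options such as private DNS are deliberately left out.

```python
def codeartifact_endpoint_requests(region, vpc_id, subnet_ids, security_group_ids):
    """Build parameters for the two interface endpoints that CodeArtifact uses."""
    services = [
        f"com.amazonaws.{region}.codeartifact.api",           # CodeArtifact API calls
        f"com.amazonaws.{region}.codeartifact.repositories",  # package manager traffic
    ]
    return [
        {
            "VpcEndpointType": "Interface",
            "VpcId": vpc_id,
            "ServiceName": service,
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        }
        for service in services
    ]
```

Each dictionary could then be passed to a boto3 EC2 client, for example `ec2.create_vpc_endpoint(**params)`, assuming `ec2 = boto3.client("ec2")`.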
Answer
The correct answer is: D