Pre-Winter Sale Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: pass65

Amazon Web Services AWS Certified Generative AI Developer-Professional AIP-C01 Question # 8 Topic 1 Discussion

Amazon Web Services AWS Certified Generative AI Developer-Professional AIP-C01 Question # 8 Topic 1 Discussion

AIP-C01 Exam Topic 1 Question 8 Discussion:
Question #: 8
Topic #: 1

A company is using Amazon Bedrock and Anthropic Claude 3 Haiku to develop an AI assistant. The AI assistant normally processes 10,000 requests each hour but experiences surges of up to 30,000 requests each hour during peak usage periods. The AI assistant must respond within 2 seconds while operating across multiple AWS Regions.

The company observes that during peak usage periods, the AI assistant experiences throughput bottlenecks that cause increased latency and occasional request timeouts. The company must resolve the performance issues.

Which solution will meet this requirement?


A.

Purchase provisioned throughput and sufficient model units (MUs) in a single Region. Configure the application to retry failed requests with exponential backoff.


B.

Implement token batching to reduce API overhead. Use cross-Region inference profiles to automatically distribute traffic across available Regions.


C.

Set up auto scaling AWS Lambda functions in each Region. Implement client-side round-robin request distribution. Purchase one model unit (MU) of provisioned throughput as a backup.


D.

Implement batch inference for all requests by using Amazon S3 buckets across multiple Regions. Use Amazon SQS to set up an asynchronous retrieval process.


Get Premium AIP-C01 Questions

Contribute your Thoughts:


Chosen Answer:
This is a voting comment (?). It is better to Upvote an existing comment if you don't have anything to add.