Microsoft AI Transformation Leader AB-731 Question # 21 Topic 3 Discussion
Your company plans to implement a proof of concept (PoC) agent that uses Azure OpenAI. The solution must start small and provide flexibility to scale usage as demand grows. Which pricing model should you use?
For a proof of concept, the key requirements are low commitment, a quick start, and the ability to scale up or down as you learn what real usage looks like. Azure OpenAI Standard On-Demand pricing is designed for exactly that: you pay per token consumed (input and output) on a pay-as-you-go basis, which makes it ideal when demand is uncertain or variable, as is typical in early pilots and PoCs.
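To make the pay-per-token idea concrete, here is a minimal sketch of how an on-demand bill scales with usage. The function name and the per-1,000-token prices are hypothetical placeholders chosen for illustration, not actual Azure OpenAI rates; check the current pricing page for real numbers.

```python
def on_demand_cost(input_tokens, output_tokens,
                   input_price_per_1k, output_price_per_1k):
    """Estimate a pay-as-you-go charge from token counts.

    Prices are per 1,000 tokens and are supplied by the caller;
    input and output tokens are billed at separate rates.
    """
    input_cost = input_tokens / 1000 * input_price_per_1k
    output_cost = output_tokens / 1000 * output_price_per_1k
    return input_cost + output_cost


# Hypothetical rates, for illustration only (not real Azure prices).
INPUT_PRICE = 0.005
OUTPUT_PRICE = 0.015

# A small pilot month versus a busier month: cost tracks usage directly.
pilot = on_demand_cost(200_000, 50_000, INPUT_PRICE, OUTPUT_PRICE)
busy = on_demand_cost(2_000_000, 500_000, INPUT_PRICE, OUTPUT_PRICE)
print(f"pilot month: {pilot:.2f}, busy month: {busy:.2f}")
```

With no usage the estimate is zero, which is the property that makes this model a low-commitment fit for a PoC.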
By contrast, Provisioned (PTUs, or provisioned throughput units) is best when you have well-defined, predictable throughput and latency requirements, which usually means a more mature production workload. PTUs involve reserving model processing capacity to achieve consistent performance and more predictable costs, which is usually premature for a PoC where actual traffic patterns are not yet known.
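The cost side of that trade-off can be sketched as a simple break-even check between a fixed reservation and per-token billing. Everything here is an assumption for illustration: the function names, the flat monthly reservation cost, and the single blended per-1,000-token price are hypothetical, and the sketch ignores the latency and throughput guarantees that are often the real reason to choose PTUs.

```python
def break_even_tokens(reserved_monthly_cost, blended_price_per_1k):
    """Monthly token volume at which on-demand spend equals the reservation."""
    return reserved_monthly_cost / blended_price_per_1k * 1000


def cheaper_option(monthly_tokens, reserved_monthly_cost, blended_price_per_1k):
    """Return which model costs less for a given monthly token volume."""
    on_demand = monthly_tokens / 1000 * blended_price_per_1k
    return "on-demand" if on_demand <= reserved_monthly_cost else "provisioned"


# Hypothetical figures, for illustration only (not real Azure prices).
RESERVED_MONTHLY_COST = 5000.0
BLENDED_PRICE_PER_1K = 0.01

print(cheaper_option(1_000_000, RESERVED_MONTHLY_COST, BLENDED_PRICE_PER_1K))
print(cheaper_option(1_000_000_000, RESERVED_MONTHLY_COST, BLENDED_PRICE_PER_1K))
```

Under these made-up numbers, a PoC-sized workload sits far below the break-even volume, so reserving capacity would mean paying for throughput that goes unused.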
Batch API is optimized for asynchronous high-volume jobs with a target turnaround (for example, up to 24 hours) and discounted pricing. That’s great for offline processing, but it does not match an interactive “agent” PoC that typically needs near-real-time responses and iterative testing.
Microsoft 365 Copilot is a separate SaaS licensing model and is not the Azure OpenAI pricing model for building your own agent solution.