Section 1.6.3 – Reinforcement Learning of the ISTQB CT-AI syllabus states that reinforcement learning (RL) is based on an agent interacting with an environment, performing actions, and receiving rewards or penalties. The core concept is the reward function, which guides the agent’s learning process. The syllabus emphasizes that training in RL is driven by rewards, and the agent aims to maximize cumulative reward over time. Therefore, Option C directly reflects the correct description: the agent learns by being rewarded for successful actions.
Option A is incorrect because RL does not use labeled data; that applies to supervised learning. Option B contradicts the syllabus definition: RL fundamentally requires interaction with the environment. Option D is incorrect because the reward function is defined by humans, not learned by the agent; the agent learns a policy, not the reward function itself.
Thus, Option C is the only statement consistent with RL as defined in the syllabus.
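
To make the distinction between the reward function and the learned policy concrete, here is a minimal Python sketch. It is not taken from the syllabus: the two-action environment, the names `reward_function` and `train`, and the epsilon-greedy strategy are all illustrative assumptions. The reward function is written by a human and never changes; the agent only learns which action to prefer by interacting and observing rewards.

```python
import random


def reward_function(action: int) -> float:
    """Human-defined reward (an assumption for this sketch):
    action 1 counts as 'successful', action 0 does not."""
    return 1.0 if action == 1 else 0.0


def train(steps: int = 500, epsilon: float = 0.1, seed: int = 0):
    """Epsilon-greedy agent on a hypothetical two-action environment.

    The agent never sees labels and never modifies reward_function;
    it only keeps a running average of the reward seen per action.
    """
    rng = random.Random(seed)
    values = [0.0, 0.0]   # estimated average reward per action
    counts = [0, 0]       # how often each action was tried
    total_reward = 0.0

    for _ in range(steps):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: values[a])

        # Interact with the environment and receive a reward.
        r = reward_function(action)

        # Incremental mean update of the estimate for the chosen action.
        counts[action] += 1
        values[action] += (r - values[action]) / counts[action]
        total_reward += r

    # The learned policy: pick the action with the highest estimated reward.
    greedy_action = max(range(2), key=lambda a: values[a])
    return greedy_action, values, total_reward


if __name__ == "__main__":
    policy, estimates, total = train()
    print("learned policy picks action:", policy)
    print("value estimates:", estimates)
    print("cumulative reward:", total)
```

In this sketch the agent initially prefers action 0 (ties go to the first action) and only discovers that action 1 is rewarded through exploration, which illustrates why interaction with the environment is required (Option B) and why the agent learns a policy rather than the reward function (Option D).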