Salaam
Does the word "reward" in the following text necessarily have a positive meaning, or could it simply mean "feedback"? I ask because the article discusses reinforcement learning through punishment and reward. It then mentions a "real reward", which could logically be interpreted as "feedback", since the real reward isn't always favorable.
"Drawing on the work of physiologist Ivan Pavlov, who famously used dogs to show how animals learn through punishments and rewards, Minsky created a computer that could continuously learn through similar reinforcement to solve a virtual maze... At a high level, reinforcement learning follows the insight derived from Pavlov’s dogs: it’s possible to teach an agent to master complex, novel tasks through only positive and negative feedback. An algorithm begins learning an assigned task by randomly predicting which action might earn it a reward. It then takes the action, observes the real reward, and adjusts its prediction based on the margin of error. Over millions or even billions of trials, the algorithm’s prediction errors converge to zero, at which point it knows precisely which actions to take to maximize its reward and so complete its task."
Source: http://technologyreview.com/s/615054/
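As a rough illustration of the loop the quoted passage describes (predict, act, observe, adjust by the error), here is a minimal Python sketch. It is not taken from the article: the action names, the reward values, and the `learn_rewards` helper are all hypothetical. It also assumes that a penalty is represented as a negative number and that each action always yields the same result.

```python
import random


def learn_rewards(true_rewards, trials=5000, step=0.1, seed=0):
    """Estimate the reward of each action by trial and error (illustrative only)."""
    rng = random.Random(seed)
    # Begin with random predictions, as the quoted passage describes.
    predictions = {action: rng.uniform(-1.0, 1.0) for action in true_rewards}
    actions = list(true_rewards)
    for _ in range(trials):
        action = rng.choice(actions)            # take an action
        observed = true_rewards[action]         # observe the "real reward"
        error = observed - predictions[action]  # margin of error
        predictions[action] += step * error     # adjust the prediction
    return predictions


# Hypothetical maze moves; a negative value stands for a penalty (assumption).
true_rewards = {"left": -1.0, "right": 1.0, "forward": 0.0}
predictions = learn_rewards(true_rewards)
best = max(predictions, key=predictions.get)
print(best)
```

Run as written, this prints `right`, and the prediction for `left` settles very close to -1.0. In this sketch, then, the value observed as the "real reward" for that move is a penalty.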
According to the normal meaning of the word, a "reward" is always positive/favourable. The fact that the so-called "real reward" may apparently not always be favourable is loose wording rather than an actual variant meaning of "reward".
An algorithm begins learning an assigned task by randomly predicting which action might earn it a reward. It then takes the action, observes the actual outcome [in place of "real reward"], and adjusts its prediction based on the margin of error.
If the result could be the equivalent of either an electric shock (a penalty) or a morsel of meat