---
product_id: 6268580
title: "Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning) First Edition"
brand: "Richard S. Sutton, Andrew G. Barto"
price: "S/.399"
currency: PEN
in_stock: true
reviews_count: 8
url: https://www.desertcart.pe/products/6268580-reinforcement-learning-an-introduction-adaptive-computation-and-machine-learning-first
store_origin: PE
region: Peru
---

# Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning) First Edition

**Brand:** Richard S. Sutton, Andrew G. Barto
**Price:** S/.399
**Availability:** ✅ In Stock

## Quick Answers

- **What is this?** Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning) First Edition by Richard S. Sutton and Andrew G. Barto
- **How much does it cost?** S/.399 with free shipping
- **Is it available?** Yes, in stock and ready to ship
- **Where can I buy it?** [www.desertcart.pe](https://www.desertcart.pe/products/6268580-reinforcement-learning-an-introduction-adaptive-computation-and-machine-learning-first)

## Best For

- Richard S. Sutton and Andrew G. Barto enthusiasts

## Why This Product

- Trusted Richard S. Sutton and Andrew G. Barto brand quality
- Free international shipping included
- Worldwide delivery with tracking
- 15-day hassle-free returns

## Description

Full description not available

## Images

![Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning) First Edition - Image 1](https://m.media-amazon.com/images/I/91+bG6uIDkL.jpg)
![Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning) First Edition - Image 2](https://m.media-amazon.com/images/I/91Mtoocca5L.jpg)

## Customer Reviews

### ⭐⭐⭐⭐⭐ An excellent introduction
*by D***N on November 5, 2004*

As a subfield of artificial intelligence, reinforcement learning has shown great success from both a theoretical and practical viewpoint. Taking the form of numerous applications in finance, network engineering, robot toys, and games, it is clear that this learning paradigm shows even greater promise for future developments. The authors summarize the foundations of reinforcement learning, some of this coming from their own work over the last decade.

The authors define reinforcement learning as learning how to map situations to actions so as to maximize a numerical reward. The machine that is indulging in reinforcement learning discovers on its own which actions will optimize the reward by trying out these actions. It is the ability of such a machine to learn from experience that distinguishes it from one that is indulging in supervised learning, for in the latter, examples are needed to guide the machine to the proper concept or knowledge. The authors emphasize the "exploration-exploitation" tradeoffs that reinforcement-learning machines have to deal with as they interact with the environment.

For the authors, a reinforcement learning system consists of a 'policy', a 'reward function', a 'value function', and a 'model' of the environment. A policy is a mapping from the states of the environment that are perceived by the machine to the actions that are to be taken by the machine when in those states. The reward function maps each perceived state of the environment to a number (the reward). A value function specifies what is good for the machine over the long run. A model, as the name implies, is a representation of the behavior of the environment.

The authors emphasize that all of the reinforcement learning methods that are discussed in the book are concerned with the estimation of value functions, but they point out that other techniques are available for solving reinforcement learning problems, such as genetic algorithms and simulated annealing.

The authors use dynamic programming, Monte Carlo simulation, and temporal-difference learning to solve the reinforcement learning problem, but they emphasize that none of these methods offers a free lunch. An entire chapter is devoted to each of these methods, however, giving the reader a good overview of the weaknesses and strengths of each of these approaches. The differences between them usually boil down to issues of performance rather than accuracy in the generated solutions. Temporal-difference learning in fact is viewed in the book as a combination of Monte Carlo and dynamic programming techniques, and in the opinion of this reviewer, has resulted in some of the most impressive successes for applications based on reinforcement learning. One of these is TD-Gammon, developed to play backgammon, and which is also discussed in the book.

The authors emphasize that these three main strategies for solving reinforcement learning problems are not mutually exclusive. Instead each of them could be used simultaneously with the others, and they devote a few chapters in the book to illustrating how this "unified" approach can be advantageous for reinforcement learning problems. They do this by using explicit algorithms and not just philosophical discussion. These discussions are very interesting and illustrate beautifully the idea that there is no "free lunch" in any of the different algorithms involved in reinforcement learning.

In the last chapter of the book the authors overview some of the more successful applications of reinforcement learning, one of them already mentioned.
Another one discussed is the 'acrobot', which is a two-link, underactuated robot, which models to some extent the motion of a gymnast on a high bar. The motion of the acrobot is to be controlled by swinging its tip above the first joint, with appropriate rewards given until this goal is reached. The authors use the 'Sarsa' learning algorithm, developed earlier in the book, for solving this reinforcement learning problem. The acrobot is an example of the current intense interest in machine learning of physical motion and intelligent control theory.

Another example discussed in this chapter deals with the problem of elevator dispatching, which the authors include as an example of a problem that cannot be dealt with efficiently by dynamic programming. This problem is studied with Q-learning and via the use of a neural network trained by back propagation.

The authors also treat a problem of great importance in the cellular phone industry, namely that of dynamic channel allocation. This problem is formulated as a semi-Markov decision problem, and reinforcement learning techniques were used to minimize the probability of blocking a call. Reinforcement learning has become very important in the communications industry of late, as well as in queuing networks.
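
For readers who want a concrete picture of the ideas the review mentions (a policy, a reward, a value estimate, the exploration-exploitation tradeoff, and the Sarsa update), here is a minimal tabular sketch. It is not taken from the book or the review: the five-cell corridor environment, the function names, and the hyperparameters are all hypothetical choices made for illustration, and the reward of -1 per step until the goal only loosely echoes the acrobot setup described above.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment (an assumption): reward -1 per step until the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    return nxt, -1.0, nxt == N_STATES - 1


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit current estimates."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def sarsa_update(q, s, a, r, s2, a2, done, alpha, gamma):
    """One Sarsa step: move Q(s, a) toward r + gamma * Q(s2, a2)."""
    target = r if done else r + gamma * q[s2][a2]
    q[s][a] += alpha * (target - q[s][a])


def train(episodes=2000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Learn action values on-policy; all defaults are illustrative."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        a = epsilon_greedy(q, s, epsilon, rng)
        while not done:
            s2, r, done = step(s, a)
            a2 = epsilon_greedy(q, s2, epsilon, rng)  # next action from the same policy
            sarsa_update(q, s, a, r, s2, a2, done, alpha, gamma)
            s, a = s2, a2
    return q


q = train()
# Greedy action in each non-terminal cell after training.
greedy_policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Here the table `q` plays the role of the value function, `epsilon_greedy` is the policy, and `step` stands in for the environment. Because Sarsa bootstraps from the action the policy actually takes next, it is an on-policy temporal-difference method; replacing `q[s2][a2]` with the maximum over actions in the next state would give the Q-learning variant instead.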

### ⭐⭐⭐⭐⭐ The book.
*by J***. on May 3, 2024*

It's "The Reinforcement Learning Book"! But seriously, it is super useful for understanding classic forms of control.

### ⭐⭐⭐⭐⭐ Good book. Good store.
*by A***R on July 13, 2018*

It's a very good book and the sellers are very nice. They answered all the questions I had and tried to address all my concerns. I really appreciate their considerate service. And the book's quality is also very good. It's the best book on reinforcement learning. I like it so much!

---

## Why Shop on Desertcart?

- 🛒 **Trusted by 1.3+ Million Shoppers** — Serving international shoppers since 2016
- 🌍 **Shop Globally** — Access 737+ million products across 21 categories
- 💰 **No Hidden Fees** — All customs, duties, and taxes included in the price
- 🔄 **15-Day Free Returns** — Hassle-free returns (30 days for PRO members)
- 🔒 **Secure Payments** — Trusted payment options with buyer protection
- ⭐ **TrustPilot Rated 4.5/5** — Based on 8,000+ happy customer reviews

**Shop now:** [https://www.desertcart.pe/products/6268580-reinforcement-learning-an-introduction-adaptive-computation-and-machine-learning-first](https://www.desertcart.pe/products/6268580-reinforcement-learning-an-introduction-adaptive-computation-and-machine-learning-first)

---

*Product available on Desertcart Peru*
*Store origin: PE*
*Last updated: 2026-05-18*