K**A
Excellent text
First of all, as other reviewers have pointed out, the subtitle of the book should include the word 'Bayesian' in some form. This matters because the Bayesian approach, although an important one, is not adopted across the board in machine learning, and consequently an astonishing number of methods presented in the book (Bayesian versions of just about anything) are not mainstream. The recent Duda book gives a better idea of the mainstream in this sense, but because the field has evolved so rapidly, it excludes massive recent developments in kernel methods and graphical models, which Bishop includes.

Pedagogically, however, this book is almost uniformly excellent. I didn't like the presentation of some of the material (the first few sections on linear classification are relatively poor), but in general, Bishop does an amazing job. If you want to learn the mathematical basis of most machine learning methods in a practical and reasonably rigorous way, this book is for you. Pay particular attention to the exercises, which are the best I've seen so far in such a text: involved, but not frustrating, and always aiming to further elucidate the concepts. If you want to really learn the material presented, you should, at the very least, solve all the exercises that appear in the sections of the text (about half of the total). I've gone through almost the entire text and done just that, so I can say it's not as daunting as it looks. To judge your level, solve the exercises for the first two chapters (the second, a sort of crash course on probability, is quite formidable). If you can do these, you should be fine.
The author has solutions for many of them on his website, so you can check there if you get stuck. As far as the Bayesian methods are concerned, they are usually far more mathematically involved than their counterparts, so solving the equations that represent them only gives you more practice. Seeing the same material in a different light can never hurt, and I learned some important statistical/mathematical concepts from the book that I'd never heard of, such as the Laplace and evidence approximations. Of course, if you're not interested, you can simply skip the method altogether.

From the preceding, it should be clear that the book is written with a certain kind of reader in mind. It is not for people who want a quick introduction to some method without the gory details of its mathematical machinery. There is no pseudocode. The book assumes that once you get the math, the algorithm to implement the method will either become completely clear or, in the case of some more complicated methods (SVMs, for example), you will know where to head for details on an implementation. Therefore, the people who will benefit most from the book are those who will either be doing research in this area or implementing the methods in detail in lower-level languages (such as C). I know that sounds off-putting, but the good thing is that the level of math required to understand the methods is quite low: basic probability, linear algebra, and multivariable calculus. (Read the appendices in detail as well.) No knowledge is needed, for example, of measure-theoretic probability or function spaces (for kernel methods), so the book is accessible to most people with a decent engineering background who are willing to work through it. If you're one of the people the book is aimed at, you should seriously consider getting it.

Edited to add: I've changed my rating from 4 stars to 5. Even now, 4-5 years later, there is simply no good substitute for this book.
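The Laplace approximation the review mentions (covered in PRML) can be sketched in a few lines: fit a Gaussian to a distribution by matching its mode and the curvature of the log density at that mode. The Beta(6, 4) target below is a hypothetical example chosen for illustration, not something taken from the book's text.

```python
import math

def log_post(theta, a=6.0, b=4.0):
    # Unnormalized log density of a Beta(a, b) "posterior"
    # (a made-up example target, not from the book).
    return (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)

# 1. Locate the mode by a fine grid search over (0, 1).
grid = [i / 10000 for i in range(1, 10000)]
mode = max(grid, key=log_post)

# 2. Curvature of the log density at the mode (central finite difference).
h = 1e-5
curv = (log_post(mode + h) - 2 * log_post(mode) + log_post(mode - h)) / h**2

# 3. Laplace approximation: Gaussian with mean `mode`, variance -1/curvature.
var = -1.0 / curv
print(f"mode = {mode:.4f}, variance = {var:.5f}")
# The exact mode of Beta(6, 4) is (6-1)/(6+4-2) = 0.625, so the grid
# search should recover it.
```

The same recipe (mode plus inverse curvature, or inverse Hessian in higher dimensions) is what makes the method tractable for posteriors that have no closed form.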
E**E
Still (one of) the best
I recently had to quickly understand some facts about the probabilistic interpretation of PCA. Naturally I picked up this book, and it didn't disappoint. Bishop is absolutely clear, and an excellent writer as well.

In my opinion, despite the recent publication of Kevin Murphy's very comprehensive ML book, Bishop is still a better read. This is mostly because of his incredible clarity, but the book has other virtues: best-in-class diagrams, judiciously chosen; a lot of material, very well organized; excellent stage setting (the first two chapters). Now, sometimes he's a bit cryptic; for example, the proof that various kinds of loss lead to the conditional median or mode is left as an exercise (ex. 1.27). Murphy actually discusses it in some detail, and this is true in general: Murphy discusses many things that Bishop leaves to the reader. I thought chapters three and four could have been more detailed, but I really have no other complaints.

Please note that to get the most out of reading this book you should already have a little background in linear algebra, probability, calculus, and preferably some statistics. The first time I approached it was without any background, and I found it a bit unfriendly and difficult; this is no fault of the book, however. Still, you don't need that much, just the basics.

Update: I should note that there are some puzzling omissions from this book. For example, the F-score and confusion matrices are not mentioned (see Murphy section 5.7.2); it would have been very natural to cover these concepts in chapter 1, along with decision theory. Nor is there much on clustering, except for K-means (see Murphy chapter 25). Not a huge deal; it's easy to get these concepts from elsewhere.
I recommend using Murphy as and when you need it, to fill in gaps.

One more update: I've been getting into Hastie et al.'s ESL recently, and I'm really impressed with it so far. I think the practitioner should probably get familiar with both ESL and PRML, as they have complementary strengths and weaknesses. ESL is not very Bayesian at all; PRML is relentlessly so. ESL does not use graphical models or latent variables as a unifying perspective; PRML does. ESL is better on frequentist model selection, including cross-validation (ch. 7). I think PRML is better for graphical models, Bayesian methods, and latent variables (which correspond to chs. 8-13), and ESL is better on linear models and density-based methods (and other things besides). Finally, ESL is far better on "local" models, like kernel regression and loess. Your mileage may vary... They are both excellent books. ESL seems a bit more mathematically dense than PRML, and it is also better for people who are in industry as opposed to academia (I was in the latter but am now in the former).
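For reference, the exercise mentioned above (PRML ex. 1.27) concerns the Minkowski loss; this is my summary of the standard result, not a quote from the book:

```latex
\mathbb{E}[L_q] \;=\; \iint \lvert y(\mathbf{x}) - t \rvert^{\,q}\,
  p(\mathbf{x}, t)\,\mathrm{d}\mathbf{x}\,\mathrm{d}t
```

Setting the functional derivative with respect to $y(\mathbf{x})$ to zero shows that the minimizer is the conditional mean $\mathbb{E}[t \mid \mathbf{x}]$ for $q = 2$, the conditional median of $p(t \mid \mathbf{x})$ for $q = 1$, and, in the limit $q \to 0$, the conditional mode.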