Pro Hadoop (Expert's Voice in Open Source)

List Price: $39.99
Our Price: $26.25
You Save: $13.74 (34%)
Shipping: This item ships for FREE with Super Saver Shipping.
SKU: ACOMMP2_book_usedverygood_1430219424

In Stock
Usually ships in 1 business day

Note: Item may be sold and shipped by another company.
Product Promotions:
  • Buy $50 in qualifying physical textbooks, get $2 in Amazon MP3 Credit.  Here's how (restrictions apply)
Description:

You’ve heard the hype about Hadoop: it runs petabyte–scale data mining tasks insanely fast, it runs gigantic tasks on clouds for absurdly cheap, it’s been heavily committed to by tech giants like IBM, Yahoo!, and the Apache Project, and it’s completely open-source (thus free). But what exactly is it, and more importantly, how do you even get a Hadoop cluster up and running?

From Apress, the name you’ve come to trust for hands–on technical knowledge, Pro Hadoop brings you up to speed on Hadoop. You learn the ins and outs of MapReduce; how to structure a cluster, design, and implement the Hadoop file system; and how to build your first cloud–computing tasks using Hadoop. Learn how to let Hadoop take care of distributing and parallelizing your software: you just focus on the code, and Hadoop takes care of the rest.

Best of all, you’ll learn from a tech professional who’s been in the Hadoop scene since day one. Written from the perspective of a principal engineer with down–in–the–trenches knowledge of what can go wrong with Hadoop, the book shows you how to avoid the common, expensive first errors that everyone makes when creating their own Hadoop system or inheriting someone else’s.

Skip the novice stage and the expensive, hard–to–fix mistakes...go straight to seasoned pro on the hottest cloud–computing framework with Pro Hadoop. Your productivity will blow your managers away.

What you’ll learn

  • Set up a stand–alone Hadoop cluster the smart way, laid out simply and step by step so you can get up and running quickly to build your next data center, collaborative, data–intensive Internet services application, Software as a Service (SaaS), and more.
  • Optimize your Hadoop production tasks like an experienced pro.
  • Work with time–proven, bulletproof standard patterns that have been tested and debugged in high–volume production.
  • Understand just enough theoretical knowledge to know why something works in Hadoop, without getting bogged down in abstruse walls of theory.
  • Get detailed explanations of not only how to do something with Hadoop, but also why, from a front–line coder with years in the Hadoop game.
  • Turn someone else’s expensive cluster–wide “wrong” into an orderly, productive “right” with professional–level debugging and testing.

Who this book is for

IT professionals interested in investigating Hadoop and implementing it in their organizations, and existing Hadoop users who want to deepen their professional toolkits.

Table of Contents

  1. Getting Started with Hadoop Core
  2. The Basics of a MapReduce Job
  3. The Basics of Multimachine Clusters
  4. HDFS Details for Multimachine Clusters
  5. MapReduce Details for Multimachine Clusters
  6. Tuning Your MapReduce Jobs
  7. Unit Testing and Debugging
  8. Advanced and Alternate MapReduce Techniques
  9. Solving Problems with Hadoop
  10. Projects Based On Hadoop and Future Directions
Product Details:
Author: Jason Venner
Paperback: 440 pages
Publisher: Apress
Publication Date: June 22, 2009
Language: English
ISBN: 1430219424
Product Length: 9.2 inches
Product Width: 6.9 inches
Product Height: 1.0 inches
Product Weight: 1.3 pounds
Package Length: 9.0 inches
Package Width: 6.9 inches
Package Height: 1.1 inches
Package Weight: 1.3 pounds
Average Customer Rating: 3.5 (based on 6 reviews)
Customer Reviews:


Most Helpful Customer Reviews

17 of 18 found the following review helpful:

4 out of 5 stars: Less comprehensive than Tom White's Hadoop: The Definitive Guide but still a Good Buy (Jul 13, 2009)
By Techie Evan
The reason why I say this book's still a Good Buy is because Jason Venner has used Hadoop in several scenarios, and this book contains a lot of practical and time-saving tips on what mistakes to avoid or how to troubleshoot problems, making it an especially good book for Hadoop newbies. His materials on Testing and Debugging MapReduce Applications are also a value-add.

Chapter One provides detailed instructions on how to install Hadoop and how to run a test to verify that everything went fine. The author mentions that Hadoop 0.19 works best with Sun's JDK 1.6 and that although Hadoop will work on Windows with Cygwin installed, you have to be careful when specifying file paths.

Chapters Two and Three introduce basic concepts pertaining to MapReduce jobs and multimachine clusters, respectively, and how "master" and "slave" nodes are configured. Chapter Four teaches you how to install, configure, and troubleshoot the Hadoop Distributed File System.

Chapters Five and Six provide tutorials on the different types of inputs and outputs that a Hadoop MapReduce job can handle, and how to tune MapReduce jobs.

Chapter Seven is an excellent tutorial on how to unit test and debug MapReduce jobs, while Chapter Eight discusses more advanced MapReduce techniques for addressing more complex application requirements.

Chapter Nine walks you through the evolution of a (somewhat boring) real-world application, discussing the rationales behind design changes, etc. Chapter 10 provides a few descriptive paragraphs each for various projects related to Hadoop (e.g., Pig, HBase, Mahout, ZooKeeper, etc.). Finally, Appendix A is a detailed discussion of the JobConf API, JobConf being the object that controls information relating to a MapReduce job.

8 of 8 found the following review helpful:

3 out of 5 stars: Mostly configuration not too much conceptual (Feb 13, 2010)
By Sumit Pal
I had a lot of hopes for this book, but it was a letdown apart from the first two chapters.

The rest of the chapters mostly concentrate on minute details of configuring a host of different parameters.

I was looking for a book that gave readers more on the conceptual side of Hadoop and MapReduce, with examples of solving different flavours of problems.

I just skimmed the chapters from Chapter 3 onwards, since I found the configuration coverage too detailed.

However, considering how difficult it can be to set up Hadoop, maybe the configurations discussed from Chapter 3 onwards are essential.

Now that Cloudera has come up with an easy-to-install Hadoop distribution, going through configuration and setup in a book at a very detailed level seems unnecessary.

The pictures and diagrams (though there are very few in this book) are not very helpful, and I felt they were not thoughtfully made.

4 of 4 found the following review helpful:

5 out of 5 stars: Great book, couldn't have setup our Hadoop cluster without it. (Dec 03, 2009)
By Bryan Migliorisi
I've been hearing about Hadoop and the MapReduce paradigm for some time now and I have been wondering how it would work for me. I decided to pick up this book and learn even further how I could use Hadoop.

The author does a nice job of explaining what a MapReduce job is and how you can put it to use and get usable data out of seemingly incomprehensible junk. This was instrumental in pitching the idea to upper management.

Chapters 2 through 5 were quite helpful while installing and setting up a cluster (and single instance) of Hadoop. There is a lot of information out on the web, but it is very unstructured and difficult to follow. I don't think we could have done it without help from the book. It is worth mentioning that Cloudera does have a nice virtual machine image that you can download for free which already has everything set up. This VM image could save you a lot of time during a Proof of Concept.

Chapters 8 and 9 further explain different problems and the Hadoop approaches to solving them. I'm not sure how applicable these examples are in the real world, but they definitely illustrate how you should approach a problem that you intend to solve via MapReduce with Hadoop.

Since reading this book, my team and I have successfully built a four-machine Hadoop cluster to process logs from our application so that we can provide better analytics and better predict spammers. Pro Hadoop served as a good reference each time we hit a roadblock.

I'd recommend this book to anyone who is looking to learn more about Hadoop and MapReduce techniques and I'd say it is a must have for anyone who is looking to implement Hadoop.

13 of 17 found the following review helpful:

1 out of 5 stars: Kindle Edition is Unreadable (Apr 15, 2010)
By Chris Perkins
I bought the Kindle edition of this book, but unfortunately it is unreadable. All the code samples (which are an absolutely essential part of a book like this) have been included as images in which the text has been reduced to the point where you would need a scanning electron microscope to read any of it. Amazon did promptly give me a refund, so no harm done, but if you have a regular-sized Kindle, don't bother with this one. It may be OK on the Kindle DX.

1 of 1 found the following review helpful:

4 out of 5 stars: Good Hadoop overview, no issues reading on Kindle (Jul 01, 2011)
By F. Vines
This book has a good overview of Hadoop concepts and plenty of detail on Hadoop cluster setup. I also have Tom White's "Hadoop: The Definitive Guide" which has more detail on APIs.

The Kindle edition of this book is perfectly readable on my 6" Kindle 2, although the code samples are significantly lighter than the rest of the text.
