  • Fine-Grained Concurrency with the Guava Striped Class

    This post covers how to use the Striped class from Guava to achieve finer-grained concurrency. ConcurrentHashMap uses a lock-striping approach to increase concurrency, and the Striped class extends this principle by giving us the ability to have striped Locks, ReadWriteLocks and Semaphores. When accessing an...


  • Configuring Hadoop with Guava MapSplitters

    In this post we are going to provide a new twist on passing configuration parameters to a Hadoop Mapper via the Context object. Typically, we set configuration parameters as key/value pairs on the Context object when starting a map-reduce job. Then in the Mapper we use the key(s) to retrieve...


  • Request For Book Reviews - Completed!

    Note: I’ve been informed by the publisher, Packt, that we have enough reviewers as of now, so I am no longer accepting any more. Thanks to those who have signed up. Greetings all! This is not my typical post, but I have a good reason. I’ve recently written and published my...


  • MapReduce Algorithms - Understanding Data Joins Part 1

    In this post we continue our series on implementing the algorithms found in the Data-Intensive Text Processing with MapReduce book, this time discussing data joins. While we are going to discuss the techniques for joining data in Hadoop and provide sample code, in most cases you probably won’t be...


  • Book Review : Hadoop - Beginners Guide

    In this post I am going to review the book “Hadoop - Beginners Guide” by Gary Turkington. In the interest of full disclosure, while I received a free copy of the book for reviewing, I do not receive any compensation for books purchased from this blog. With that out of the...