We continue with our series on implementing MapReduce algorithms found in the book Data-Intensive Text Processing with MapReduce. Other posts in this series: Working Through Data-Intensive Text Processing with MapReduce; Working Through Data-Intensive Text Processing with MapReduce – Local Aggregation Part II; Calculating A Co-Occurrence Matrix with Hadoop; MapReduce Algorithms – Order Inversion. This post [...]
Google Guava BloomFilter
When the Guava project released version 11.0, one of the new additions was the BloomFilter class. A BloomFilter is a unique data structure used to indicate whether an element is contained in a set. What makes a BloomFilter interesting is that it will indicate whether an element is definitely not contained, or may be contained in [...]
What’s New in Java 7: WatchService
Of all the new features in Java 7, one of the more interesting is the WatchService, adding the capability to watch a directory for changes. The WatchService maps directly to the native file event notification mechanism, if available. If a native event notification mechanism is not available, then the default implementation will use polling. [...]
Creating An Asynchronous, Recursive DirectoryStream in Java 7
Continuing with my series on the Java 7 java.nio.file package, this time covering the DirectoryStream interface. In this post we are going to implement our own DirectoryStream that will iterate over the files in an entire directory tree, not just a single directory. Our goal in the end is to have something that works similar [...]
What’s New In Java 7: Copy and Move Files and Directories
This post is a continuation of my series on the Java 7 java.nio.file package, this time covering the copying and moving of files and complete directory trees. If you have ever been frustrated by Java’s lack of copy and move methods, then read on, for relief is at hand. Included in the coverage is [...]
Event Programming with Google Guava EventBus
It’s a given that in any software application there are objects that need to share information in order to get work done. In Java applications, one way of achieving information sharing is to have event listeners, whose sole purpose is to take some action when a desired event occurs. For the most part this process [...]
Simple WordPress Backups
Backing up your data is an important task. As we all know, it’s not a matter of if you are going to experience a crash or failure, but when. Blogs are no exception. I wanted to take a break from my regular style of posts to share my simple backup script. While I’m know [...]
Google Guava Concurrency – ListenableFuture
In my last post I covered using the Monitor class from the com.google.common.util.concurrent package in the Guava Library. In this post I am going to continue my coverage of Guava concurrency utilities and discuss the ListenableFuture interface. A ListenableFuture extends the Future interface from the java.util.concurrent package, by adding a method that accepts a [...]
Google Guava – Synchronization with Monitor
The Google Guava project is a collection of libraries that every Java developer should become familiar with. The Guava libraries cover I/O, collections, string manipulation, and concurrency, just to name a few. In this post I am going to cover the Monitor class. Monitor is a synchronization construct that can be used anywhere you [...]
Micro Benchmarking with Caliper
From time to time I think all developers have done some form of benchmarking. I recently discovered Caliper. According to the site, “Caliper is Google’s open-source framework for writing, running and viewing the results of Java Microbenchmarks”. I am aware that micro-benchmarking can be misleading depending on who is writing the [...]



