Is Facebook ditching Hadoop? Hardly.

14 Nov

There have been many sensational headlines lately about Facebook ditching Hadoop. In fact, the opposite is true.

“Hadoop Corona is the next version of Map-Reduce. The current Map-Reduce has a single Job Tracker that reached its limits at Facebook. The Job Tracker manages the cluster resource and tracks the state of each job. In Hadoop Corona, the cluster resources are tracked by a central Cluster Manager. Each job gets its own Corona Job Tracker which tracks just that one job. The design provides some key improvements…”
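The split described in that quote can be sketched in a few lines. This is only a toy illustration under my own assumptions, not Corona's actual code or API: the names `ClusterManager`, `JobTracker`, `request`, `release`, `schedule` and `finish` are all hypothetical, and "resources" are reduced to free task slots per node.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClusterManager:
    """Hypothetical central manager: tracks cluster resources only, never job state."""
    free_slots: Dict[str, int]  # node name -> free task slots

    def request(self, wanted: int) -> List[str]:
        """Grant up to `wanted` slots; returns one node name per granted slot."""
        grants: List[str] = []
        for node in sorted(self.free_slots):
            while self.free_slots[node] > 0 and len(grants) < wanted:
                self.free_slots[node] -= 1
                grants.append(node)
        return grants

    def release(self, node: str) -> None:
        """Return one slot on `node` to the free pool."""
        self.free_slots[node] += 1


@dataclass
class JobTracker:
    """Hypothetical per-job tracker: knows the state of exactly one job."""
    job_id: str
    pending: List[str]
    running: Dict[str, str] = field(default_factory=dict)  # task -> node
    done: List[str] = field(default_factory=list)

    def schedule(self, cm: ClusterManager) -> None:
        """Ask the cluster manager for slots and start as many pending tasks as granted."""
        for node in cm.request(len(self.pending)):
            task = self.pending.pop(0)
            self.running[task] = node

    def finish(self, task: str, cm: ClusterManager) -> None:
        """Mark a task done and hand its slot back to the cluster manager."""
        node = self.running.pop(task)
        self.done.append(task)
        cm.release(node)


# Two jobs share three slots; each job has its own tracker.
cm = ClusterManager(free_slots={"node-a": 2, "node-b": 1})
job1 = JobTracker("job-1", pending=["m0", "m1"])
job2 = JobTracker("job-2", pending=["m0", "m1"])

job1.schedule(cm)          # job-1 takes both slots on node-a
job2.schedule(cm)          # job-2 gets the one remaining slot on node-b
job1.finish("m0", cm)      # frees a slot on node-a
job2.schedule(cm)          # job-2's waiting task can now start
```

The thing to notice is that the cluster manager never sees a task name or a job's progress, and each tracker only ever holds one job's state. That separation is what the quote credits with removing the single Job Tracker bottleneck.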

Like others, Facebook improved on the JobTracker and named the much-improved result Corona, to remedy issues with the existing Map-Reduce implementation. This is no different from their creation of Hive as a query tool, Cloudera’s Impala, which adds a real-time query engine on top of HDFS and HBase, or Quantcast’s and MapR’s reworking of the file system layer.

The point is that HBase (the BigTable model) has already won over Cassandra (the Dynamo model) in architecture design to become the de facto standard for really big data. Various vendors and users are now using Hadoop, tweaking it and building upon it. As a young product, Hadoop/HBase will see more improvements and dramatic makeovers.

Separately, Google has also moved on from GFS/BigTable to Colossus/Spanner over the past few years.

Hopefully Hadoop can also make big quantum leaps soon, with increased adoption and battle testing. Until that happens, we still have to rely on the Hadoop/HBase of today.


