“Many years ago it became abundantly clear that some way had to be found to shield programmers from the complexity of the hardware… [T]his layer is the operating system.” (Andrew Tanenbaum, Operating Systems)
The cloud is the new operating system for enterprises, and services are the new applications. The cloud provides the computing fabric upon which the next generation of services, from Pinterest to ...
Posts by mike
Beyond Hadoop: Fast Queries from Big Data
November 4th, 2011 • mike
Filed in Technology
There's an unspoken truth lurking behind the scourge of Big Data and the heralding of Hadoop as its savior: while Hadoop shines as a processing platform, it is painfully slow as a query tool. Hive was developed by the folks at Facebook in 2008, as a means of providing an easy-to-use, SQL-like query language that would compile to MapReduce code. ...
The Rise of Interactive Data Visualization
June 28th, 2011 • mike
Filed in Technology
The visualization below highlights something only recently possible on the web: a dynamic, interactive canvas. Titled "Disaster Strikes: A World In Sight", it visualizes a century of floods, fires, droughts, and earthquakes around the globe. (Below is a snapshot of 1996, an apparently costly year for disasters.) It's not a passively animated graphic, but one that users can actively engage ...
Node.js and the JavaScript Age
April 8th, 2011 • mike
Filed in Technology
Three months ago, we decided to tear down the framework we were using for our dashboard, Python's Django, and rebuild it entirely in server-side JavaScript, using node.js. (If there is ever a time in a start-up's life to remodel parts of your infrastructure, it's early on, when your range of motion is highest.) This decision was driven by a realization: ...
Four Lessons for Building A Petabyte Data Platform
January 31st, 2011 • mike
Filed in Technology
In the last year, we at Metamarkets have built a big data stack for processing, analyzing, and visualizing nearly one trillion ad pricing events from our partners around the globe. In this post I share some of the thinking behind our choices for the Big Data stack that powers our petabyte platform.