
Posts by Eric Tschetter

Real Real-Time. For Real.

Filed in Druid

Danny Yuan, Cloud System Architect at Netflix, and I recently co-presented at the Strata Conference in Santa Clara. The presentation discussed how Netflix engineers leverage Druid, Metamarkets’ open-source, distributed, real-time, analytical data store, to ingest 150,000 events per second (billions per day), equating to about 500MB/s of data at peak (terabytes per hour) while still maintaining real-time, exploratory querying capabilities. Before and after ...


Introducing Druid: The Real-Time Analytics Data Store

Filed in Announcement, Druid, Technology

In April 2011, we introduced Druid, our distributed, real-time data store. Today I am extremely proud to announce that we are releasing the Druid data store to the community as an open source project. To mark this special occasion, I wanted to recap why we built Druid, and why we believe there is broader utility for Druid beyond Metamarkets’ ...


Scaling the Druid Data Store

Filed in Druid, Technology

"Give me a lever long enough... and I shall move the world" - Archimedes

Parallelism is computing’s leverage, a force multiplier acting against the weight of big data. Cloud-hosted, horizontally scalable systems have the power to move even planetary-sized data sets with speed. This blog post discusses our efforts to lift one such data set, achieving a scan rate ...


Druid, Part Deux: Three Principles for Fast, Distributed OLAP

Filed in Druid, Technology

In a previous blog post we introduced the distributed indexing and query processing infrastructure we call Druid. In that post, we characterized the performance and scaling challenges that motivated us to build this system in the first place. Here, we discuss three design principles underpinning its architecture.

1. Partial Aggregates + In-Memory + Indexes => Fast Queries

We work with ...


Introducing Druid: Real-Time Analytics at a Billion Rows Per Second

Filed in Druid, Technology

Here at Metamarkets we have developed a web-based analytics console that supports drill-downs and roll-ups of high-dimensional data sets – comprising billions of events – in real time. This is the first of two blog posts introducing Druid, the data store that powers our console. Over the last twelve months, we tried and failed to achieve scale and speed with ...
