
Posts by Fangjin Yang

Maximum Performance with Minimum Storage: Data Compression in Druid

Filed in Algorithms, Druid, Technology

The Metamarkets solution allows for arbitrary exploration of massive data sets. With Druid, our in-house distributed data store and processor, users can filter time series and top list queries based on Boolean expressions of dimension values. Given that some of our dataset dimensions contain millions of unique values, the subset of things that may match a particular filter expression ...


Fast, Cheap, and 98% Right: Cardinality Estimation for Big Data

Filed in Algorithms, Druid

The nascent era of big data brings new challenges, which in turn require new tools and algorithms. At Metamarkets, one such challenge is cardinality estimation: efficiently determining the number of distinct elements within a dimension of a large-scale data set. Cardinality estimation has a wide range of applications, from monitoring network traffic to data mining. If leveraged correctly, these ...
