LinkedIn Profile Insights: Apache Pinot vs Apache Druid & Real-Time Analytics
In this favorite from our archives, StarTree CEO Kishore Gopalakrishna discusses Apache Pinot and performance comparisons with Apache Druid at Crunch Data Conference 2018 in Budapest. Most analytical use cases serve internal users within a company. While they require sub-second latency, the number of concurrent requests is low. LinkedIn, however, has many site-facing applications, such as "Who viewed my profile", that serve a large user base (500+ million) and demand low-latency responses at very high QPS. Another class of applications, such as anomaly detection, generates bursty workloads. Even though the underlying data and the query patterns are the same, we used different systems to power these varied use cases. This resulted in duplicated data and functionality, along with the operational overhead of maintaining many systems. To address these challenges, we built Pinot, a real-time distributed OLAP engine. Pinot is a single system used at LinkedIn to power 50+ site-facing applications along with dozens of internal applications.