How a Major Car Retailer Uses StarTree for Personalized, Real-Time Search Results
A major national online car retailer migrated to StarTree for real-time recommendations, saving $260,000 annually while reducing query latency.
- Annual savings: $260K
- Average query latency: 6 ms
- Daily database updates: Millions
Summary
- A major online car retailer uses the power of StarTree’s real-time analytics platform to develop customer profiles and provide fast, personalized recommendations.
- Migrating to StarTree allowed the retailer to significantly improve its search performance and maintain a consistent user experience as it scaled.
- The streamlined infrastructure has also helped the retailer achieve a cost savings of approximately $260,000 annually.
- The upserts feature has allowed the retailer to manage millions of daily database updates efficiently.
Building personalized, real-time recommendations
A major national online car retailer provides a fast, convenient platform that allows car buyers to search for new and used cars from dealerships across the United States. This retailer must be able to process search queries quickly and learn customer preferences to suggest vehicles that meet their criteria. That’s why this company chose StarTree, powered by Apache Pinot, for real-time analytics.
To provide customers with the best search results, the retailer uses machine learning to build 360-degree customer profiles, taking into account customers’ search history and preferences for car model, color, and other features. Knowing the kind of vehicles users are looking for allows the company to provide accurate, real-time recommendations.
At the same time, the retailer’s machine learning (ML) model also takes into account the types of cars that are popular or in high demand, which further helps the retailer suggest cars that customers may be most interested in purchasing.
The challenge: Maintaining consistent performance
The retailer had originally architected its search platform on DocumentDB, a JSON database. This was adequate when volumes were smaller, but it proved difficult to scale as the business grew and became prohibitively expensive, costing the company $30,000 per month.
Moreover, in order to provide a good user experience, it’s critical for the retailer to be able to process queries quickly, with low latency. Maintaining that consistent performance proved challenging at higher volumes, which can hit 200 queries per second at peak for the retailer. With DocumentDB, the latency could be as high as 30 milliseconds, creating a less-than-ideal user experience.
Reducing costs and latency with Apache Pinot and StarTree
Migrating to StarTree, powered by Apache Pinot, allowed the retailer to both reduce its costs and improve performance. The retailer cut latency to 15 milliseconds at peak and 6 milliseconds on average, down from as much as 30 milliseconds with DocumentDB. Beyond the savings from replacing DocumentDB, the retailer was also able to switch from Redis to StarTree, saving an additional $4,000 to $5,000 per month. In total, adopting StarTree has saved the retailer approximately $260,000 annually.
Migrating from DocumentDB to Apache Pinot was straightforward. The retailer had been using Kafka Batch to load data into DocumentDB, and with Apache Pinot, it was able to simply send a fork of that data to Pinot in parallel. And since Apache Pinot provides a SQL interface, the retailer was able to use StarTree’s API to run queries.
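To make the SQL interface concrete, here is a minimal sketch of how a client might send a query to a Pinot broker over HTTP. It uses the open source Pinot broker's `/query/sql` endpoint, which accepts a JSON body with a `sql` field. The broker address, table name, and column names are hypothetical, and the case study does not describe the retailer's actual queries or the StarTree API it used.

```python
import json
import urllib.request

# Hypothetical broker address; 8099 is the usual default broker port.
BROKER_URL = "http://localhost:8099/query/sql"


def build_pinot_query_request(sql: str, broker_url: str = BROKER_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Pinot broker's SQL endpoint."""
    payload = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        broker_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql: str, broker_url: str = BROKER_URL, timeout: float = 5.0) -> dict:
    """Send the query to a running broker and return the parsed JSON response."""
    request = build_pinot_query_request(sql, broker_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Hypothetical table and columns, for illustration only.
sql = (
    "SELECT make, model, COUNT(*) AS views "
    "FROM vehicle_search_events "
    "WHERE customer_id = 'c-123' "
    "GROUP BY make, model "
    "ORDER BY COUNT(*) DESC "
    "LIMIT 5"
)
request = build_pinot_query_request(sql)
```

Calling `run_query(sql)` would require a reachable broker. Building the request separately keeps the example inspectable without one.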
Apache Pinot is uniquely suited to power personalization through real-time analytics due to its architecture, performance, and flexibility. Pinot is built to handle high-throughput, low-latency queries on massive data sets, an essential feature when a few milliseconds can make the difference between a successful transaction and a lost customer.
Streamlined architecture that increases efficiency and lowers costs
Another major advantage of using StarTree for real-time analytics is its ability to handle “upserts” (a combination of “update” and “insert”). The upsert operation allows a system to insert a new record if it doesn’t exist or update it if it does — effectively combining two common database operations into one. Instead of running separate insert and update operations, upserts streamline the process, and ensure that the latest data is always available for querying.
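The insert-or-update behavior described above can be sketched with a small in-memory table. This is an illustration of upsert semantics only, not Pinot's or StarTree's implementation. The `ProfileEvent` fields, the `UpsertTable` class, and the rule that a record with an older timestamp is ignored are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ProfileEvent:
    customer_id: str      # primary key (hypothetical column)
    updated_at: int       # comparison value, e.g. event time in milliseconds
    preferred_color: str  # example profile attribute


class UpsertTable:
    """Keeps only the latest record per primary key."""

    def __init__(self) -> None:
        self._rows: Dict[str, ProfileEvent] = {}

    def upsert(self, event: ProfileEvent) -> str:
        current = self._rows.get(event.customer_id)
        if current is None:
            # No record for this key yet: behave like an insert.
            self._rows[event.customer_id] = event
            return "inserted"
        if event.updated_at >= current.updated_at:
            # Newer (or equally recent) record: behave like an update.
            self._rows[event.customer_id] = event
            return "updated"
        # Assumed rule: an out-of-order, older record does not overwrite newer data.
        return "ignored"

    def get(self, customer_id: str) -> Optional[ProfileEvent]:
        return self._rows.get(customer_id)

    def __len__(self) -> int:
        return len(self._rows)


table = UpsertTable()
results = [
    table.upsert(ProfileEvent("c-1", 100, "red")),
    table.upsert(ProfileEvent("c-1", 200, "blue")),
    table.upsert(ProfileEvent("c-1", 150, "green")),
    table.upsert(ProfileEvent("c-2", 120, "white")),
]
```

After these four events the table holds two rows, and a query for customer `c-1` sees only the most recent preference.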
Doing upserts in real-time efficiently on very large tables is challenging, and it’s something that other analytic systems on the market often struggle with. The upsert feature as implemented in the open source Apache Pinot works well at medium scale but gets expensive and hard to use at larger scales, due to the problems of heap memory usage and bootstrapping time. StarTree solved those problems by leveraging RocksDB as the upsert metadata store and using the Pinot Minion task framework to prepare a metadata snapshot to speed up the bootstrapping. Those techniques make the upsert feature efficient and simple to use on very large tables for StarTree customers.
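For orientation, the sketch below shows roughly what enabling upserts looks like in open source Apache Pinot: a primary key declared in the schema, plus an `upsertConfig` block and strict replica-group routing in the real-time table config. The key names reflect the open source documentation as best understood here and should be checked against the Pinot version in use. The table and column names are hypothetical, the fragments are incomplete, and the StarTree-specific RocksDB and snapshot settings are not shown because the case study does not give them.

```python
import json

# Hypothetical schema fragment: the primary key identifies which record to overwrite.
schema_fragment = {
    "schemaName": "customer_profiles",
    "primaryKeyColumns": ["customer_id"],
}

# Hypothetical real-time table config fragment (not a complete config).
table_config_fragment = {
    "tableName": "customer_profiles",
    "tableType": "REALTIME",
    # FULL mode replaces the whole record for a primary key with the latest one.
    "upsertConfig": {"mode": "FULL"},
    # Upsert tables route each partition's queries to a consistent set of replicas.
    "routing": {"instanceSelectorType": "strictReplicaGroup"},
}

schema_json = json.dumps(schema_fragment, indent=2)
table_config_json = json.dumps(table_config_fragment, indent=2)
```

The per-key metadata that upserts must track is what drives the heap memory and bootstrapping costs mentioned above.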
This core strength of StarTree was a key feature for the car retailer and a major reason why it did not consider alternatives. The retailer was performing millions of updates to its database every day in order to build out customer profiles, a volume that threatened to significantly degrade performance and the overall user experience.
By using StarTree, the retailer was able to eliminate much of the duplication and complexity in its architecture, reducing costs and boosting performance. Now, instead of needing dozens of servers, the retailer has been able to reduce its footprint to just eight.
StarTree is currently working with the retailer to streamline its architecture even further and migrate other use cases from a batch to a streaming model.
Discover StarTree Cloud
If this case study sparks ideas for your own data architecture and use cases, you’re encouraged to learn more about StarTree Cloud and its advantages over open source Apache Pinot. If you have already heard enough, feel free to schedule a demo or sign up for a 30-day free trial.