Top Use Cases Driving Real-Time Analytics in the Impatience Economy
Over the past year, I gathered stories of open source Apache Pinot users and commercial StarTree customers. I wanted to understand their perspective. What compelled them to move to real-time analytics? How did it affect their industries, and their specific businesses? And most of all, I wanted to understand how they were using real-time analytics. Now I would like to share what I have learned with you, our user community.
There is a theme behind all of these use cases. It was described to me some years ago when I was at Aerospike by then-CMO, Monica Pal. She called it “The Impatience Economy.” It’s the impetus in both B2C and B2B business models to move from “soon” to “right now.” Blink-of-an-eye responsiveness in applications and communications. In database terms, shifting from batch to real-time processing. And if a business doesn’t move at that speed? Consumers change the channel, or find a different thing to spend their time and money on. And in business, if you are working too slowly, customers bolt to a competitor.
Over the past decade since I first heard it, the term has stuck with me. Others have picked up on it too; there’s a popular book available by the same name. Once you know to look for it, you can see it everywhere. Consider how we’ve moved from packages taking 3-5 days to cross the country to next-day delivery, then to same-day delivery, and now to within-the-hour delivery.
In Internet and mobile applications, we’ve moved from load times that could take upwards of a minute to single-digit seconds to sub-second delays. Likewise, data-intensive applications have gone from being only as current as yesterday to data freshness measured in seconds or milliseconds.
Initially it was mostly transactional systems like Aerospike that required such extremely low latency. Analytics a decade ago, for the most part, were still being done in data warehouses, with freshness delays often measured in hours or a day or more. Today businesses increasingly want their analytics to be kept in sync with their transactional systems.
Architectures for real-time analytics
Fortunately, Change Data Capture (CDC) technologies like Debezium allow you to move the deltas of data changes in your transactional database systems into a real-time data streaming service like Apache Kafka (or Kinesis, Redpanda, or Pulsar). From there you can do stream processing with Apache Flink (or an equivalent), for example to denormalize and pre-JOIN tables for ingestion into a real-time analytics database. Then Apache Pinot can produce fresh analytics for scalable, user-facing applications.
This creates a larger architectural pattern with three stages: streaming, processing, and analytics. I’m hearing more colleagues refer to this as the Kafka-Flink-Pinot (KFP) “stack.” Even when you swap out specific technologies for your own favorites, this metapattern predominates in current architectures.
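To make the three stages concrete, here is a minimal in-memory sketch in Python. It does not use Kafka, Flink, or Pinot at all; plain generators and dictionaries stand in for each stage. The event fields, the `customers` lookup, and every function name are assumptions made up for illustration, and the change events are a heavily simplified stand-in for what a CDC tool would emit.

```python
from collections import defaultdict

# Hypothetical, simplified change events ("c" = create, "u" = update).
# Real CDC payloads carry far more metadata than this.
order_events = [
    {"op": "c", "order_id": 1, "customer_id": 10, "amount": 25.0},
    {"op": "c", "order_id": 2, "customer_id": 11, "amount": 40.0},
    {"op": "u", "order_id": 1, "customer_id": 10, "amount": 30.0},
]

# Hypothetical dimension data that the processing stage joins in ahead of time.
customers = {10: {"region": "west"}, 11: {"region": "east"}}


def stream(events):
    """Stage 1 (streaming): yield change events in arrival order."""
    yield from events


def denormalize(events, customer_dim):
    """Stage 2 (processing): pre-join each event with its customer row."""
    for event in events:
        region = customer_dim[event["customer_id"]]["region"]
        yield {**event, "region": region}


def ingest(rows):
    """Stage 3 (analytics): keep only the latest row per order (an upsert)."""
    table = {}
    for row in rows:
        table[row["order_id"]] = row
    return table


def revenue_by_region(table):
    """Aggregate over flat, already-joined rows; no join at query time."""
    totals = defaultdict(float)
    for row in table.values():
        totals[row["region"]] += row["amount"]
    return dict(totals)


table = ingest(denormalize(stream(order_events), customers))
print(revenue_by_region(table))  # {'west': 30.0, 'east': 40.0}
```

The point of the sketch is the division of labor: the join happens once, upstream, so the query at the end only scans and aggregates flat rows. The update to order 1 replaces the earlier row, which is why the west total reflects 30.0 rather than 55.0.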
Top use cases for real-time analytics
What sorts of use cases are organizations using such real-time data architecture to solve? Here’s what I found:
- Business metrics monitoring — From top-line Key Performance Indicators (KPIs) to ad hoc drill-down reporting, companies want to understand how their business is operating in real time. Are they seeing sudden spikes or dips in demand? Is a trend localized to one particular demographic or region, or are there across-the-board changes in behavior? I’ve come to summarize real-time analytics in terms of a basic food delivery paradigm: “You don’t have until tomorrow to meet your lunchtime demand today.”
- Personalization — While there is a focus on fast aggregates in real-time analytics, you also need to ensure your efforts are conducted in service of the individual. There is no “average” consumer. And even an individual consumer’s choices may vary widely depending on the context. So your personalization needs to look for contextual clues of behavior and interest.
- Customer 360° — Your organization may have many different touchpoints with a consumer, from their social media interactions or customer service calls, to their browsing or purchasing history. In order to do personalization right, you need to draw insights from across myriad enterprise systems, and also from external data partners.
- Fraud detection — Businesses want to keep their brands, their partners, and their consumers safe. Otherwise rampant abuse can lead to a chilling effect on an entire business ecosystem. The acceptable time frame for fraud detection has dropped significantly. Whereas in the past it was mostly post-fraud detection and refund, there is increasing pressure to provide real-time detection and up-front prevention.
- Location-based services — With the revolution in smartphones and Internet of Things (IoT) devices, there has been an increased desire to present contextual information to individuals based on real-world location. That might be the list of restaurants located within a 30-minute delivery time, or alerts to local weather emergencies such as a tornado warning. Ride sharing and delivery services, marketing and advertising, healthcare, sports fitness: there are any number of applications that can be optimized to where you are, right now, in the world.
- Observability — Just as we, as humans, are increasingly reliant on real-time analytics of our interests and activities, our computers, networking devices, and machines are also producing a vast amount of data that needs to be analyzed in real time. Observability (often abbreviated as “o11y”) is the sum of the analysis of logs, traces, and metrics produced by our devices. What if that favorite mobile app keeps crashing? What bug is causing that? What if a website or a back-end microservice is performing poorly? Is it a global slowdown, or is it localized to one region, or a flaky server in a cluster?
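As one small illustration of the shift from after-the-fact review to up-front prevention described under fraud detection, the sketch below implements a sliding-window "velocity check" in Python: it flags a card that makes too many attempts within a short period, at the moment each attempt arrives. The class name, thresholds, and card identifiers are all hypothetical, and this is not a description of how Apache Pinot or any particular fraud system works. In practice a rule like this would be one signal among many.

```python
from collections import defaultdict, deque


class VelocityCheck:
    """Flag a card that makes too many attempts inside a sliding time window."""

    def __init__(self, max_attempts=3, window_seconds=60.0):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # card_id -> recent timestamps

    def is_suspicious(self, card_id, timestamp):
        recent = self._attempts[card_id]
        # Evict timestamps that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        recent.append(timestamp)
        return len(recent) > self.max_attempts


check = VelocityCheck(max_attempts=3, window_seconds=60.0)

# (card_id, seconds since start); assumed to arrive in time order.
events = [
    ("card-1", 0.0),
    ("card-1", 10.0),
    ("card-1", 20.0),
    ("card-1", 30.0),   # fourth attempt within 60 seconds
    ("card-2", 31.0),
    ("card-1", 200.0),  # earlier attempts have aged out by now
]
flags = [check.is_suspicious(card, ts) for card, ts in events]
print(flags)  # [False, False, False, True, False, False]
```

Because the decision is made per event, the fourth attempt can be challenged or declined immediately instead of being discovered in a batch report the next day.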
This is just a quick survey of the most prominent use cases I came across. Once you have a database like Apache Pinot, capable of processing data at petabyte scale within seconds or even sub-second times at a rate of 100,000 queries per second or more, there is no end to the applications it can serve. To find out more, make sure you download our free report.
Tell us about your own use case
If you have a story of your own to share about implementing real-time analytics on Apache Pinot, thoughts on the driving factors of the impatience economy for your business, or questions about how you’d go about implementing real-time data applications in your own organization, we’d love to hear about it. Contact us and let us know if you’re interested in sharing your success story, or if you’d like more information on how to get started.
Join your peers at Real-Time Analytics Summit
If you want to learn more directly from your peers, there’s no better opportunity this year than to attend the Real-Time Analytics Summit, this May 7-9, 2024, in San Jose, California. You’ll be able to hear from practitioners at DoorDash, Uber, Stripe, Atlassian, Moloco, Dialpad and more.