User Story

BurdaForward: Analyzing Quality News Performance Using Real-Time Analytics

BurdaForward, a leading German digital news delivery organization, supports real-time analytics using StarTree Cloud for advertisers, media brands, and consumers.

digital users: 40m
query SLA: 5-10 ms
user-facing applications served: 100

Summary

  • BurdaForward provides news and information to over 40 million users across Germany
  • BurdaForward supports real-time analytics using StarTree Cloud for advertisers, media brands, and consumers
  • BurdaForward can meet their query SLA of 5-10 ms with StarTree Cloud
  • The query failure rate fell from 80% to 18% during performance testing, and is close to zero in production
  • For the same price, StarTree Cloud stores more data than BurdaForward's Elasticsearch deployment did, enabling week-over-week analysis where previously only a single day of data was kept
BurdaForward performs real-time analytics queries in 5-10 ms with StarTree Cloud

BurdaForward is a leading German digital news delivery organization, publishing reliable, relevant and, above all, solution-oriented information. It is the critical modern digital component of Hubert Burda Media, the publishing powerhouse founded more than a century ago. BurdaForward has broad reach: its news and content are read by more than 40 million people, half the German population, who get information from the brands they trust, whether BUNTE, FOCUSonline, Finanzen100, TV SPIELFILM, or The Weather Channel. Its CHIP.de brand has grown to be the single largest German digital information service, reaching 23 million readers monthly on its own.

BurdaForward’s mission is to deliver high-quality news free of paywalls, using the latest technology and innovation. Moreover, it is dedicated to “constructive journalism,” focusing not just on the problems of society but also on constructive solutions to these problems. From this mission and ethos springs their motto, “Das sind gute Nachrichten” — “That’s good news.”

BurdaForward’s Many Use Cases for Real-Time Analytics

BurdaForward turned to StarTree Cloud, powered by Apache Pinot, for a real-time analytics platform. They believed Pinot would enable their businesses to increase engagement, allow their brands to extend customer reach, improve user experience, and increase accuracy for A/B testing.

BurdaForward has three main constituencies to serve:

BurdaForward's user groups for real-time analytics
  • Consumers — readers and viewers are the ultimate customer; only if their interests are met and their trust is kept will the BurdaForward community continue to grow and thrive
  • BurdaForward Media Brands — news producers and writers of the associated brands need to understand which of their articles are resonating with the public in real-time
  • Advertisers — since news on BurdaForward is not paywalled, providing advertisers with good data is vital for their continuing revenue growth

The first StarTree Cloud use case BurdaForward put into production was user behavior analytics, specifically clickstream analysis. They had to consolidate and integrate data coming from 16 different publishers, all part of the BurdaForward family, delivered across 100 separate applications. Queries had to be answered within a Service Level Agreement (SLA) of 5-10 ms.

This powers the internal dashboard that shows editors what is going on with their pages: What news is trending? What is being viewed most? Where is traffic coming from? What do they need to react to quickly? The vital charts shown on the dashboard, all powered by StarTree Cloud on the back end, now let news editors know in real-time which articles they need to promote to their home pages.

BurdaForward also uses Apache Pinot to power various internal systems consuming the data, such as widgets on partner websites displaying the hottest articles and trending news.

Migration from Elasticsearch to StarTree Cloud

BurdaForward had been using AWS Elasticsearch, but storing just one week’s worth of data for week-over-week comparison was deemed prohibitively expensive. Furthermore, the data team at BurdaForward couldn’t scale query throughput independently of storage.

In an internal hackathon to find a replacement solution, one team had heard of Apache Pinot, but didn’t wish to take on the administrative burden of running their own cluster. That’s when they discovered StarTree Cloud. Compared to another real-time analytics database they were considering, it worked with Amazon Kinesis right out of the box. They could ingest data directly from their existing infrastructure, using Snowbridge to do stream processing transformations.

StarTree Cloud’s underlying Apache Pinot architecture allows for independent scaling of query brokers and storage servers, which the BurdaForward team found immediately appealing.

StarTree Cloud’s use of tiered storage meant BurdaForward could spread their data across both performant Amazon Elastic Block Store (EBS) volumes for the most recent, frequently accessed data and lower-cost Amazon S3 for the older data. This allowed them to store far more data and perform their week-over-week comparisons while keeping within their budget.
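The idea behind age-based tiering can be sketched in a few lines of Python. This is only an illustration of the routing rule, not how StarTree Cloud is configured or implemented: the function name, the tier labels, and the one-day hot window are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical tier labels, used only for this illustration.
HOT_TIER = "ebs"   # fast block storage for recent, frequently queried data
COLD_TIER = "s3"   # lower-cost object storage for older data


def choose_tier(segment_end: datetime, now: datetime,
                hot_window: timedelta = timedelta(days=1)) -> str:
    """Return the tier for a data segment based on how old it is.

    The one-day default for hot_window is an assumed value, not one
    taken from BurdaForward's deployment.
    """
    age = now - segment_end
    return HOT_TIER if age <= hot_window else COLD_TIER


now = datetime(2024, 1, 8, tzinfo=timezone.utc)
recent = now - timedelta(hours=3)
last_week = now - timedelta(days=7)
print(choose_tier(recent, now), choose_tier(last_week, now))
```

With a rule like this, only the most recent data occupies the more expensive tier, while the previous week's data stays queryable at lower cost.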

The data engineers were able to use the multiple types of indexing supported in StarTree Cloud as needed for their different use cases, such as inverted indexing and timestamp indexing.
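As a rough sketch of what such index settings can look like, the Python snippet below assembles a fragment of an Apache Pinot table config with an inverted index and a timestamp index. The key names follow the Apache Pinot table config format as commonly documented, but should be checked against the current documentation. The column names (`publisher`, `article_id`, `event_time`) and the chosen granularities are assumptions for illustration, not BurdaForward's actual schema.

```python
import json


def build_index_config(inverted_columns, timestamp_column, granularities):
    """Build a table config fragment declaring an inverted and a timestamp index.

    All column names passed in are hypothetical examples.
    """
    return {
        "tableIndexConfig": {
            # Inverted indexes speed up filters such as WHERE publisher = '...'
            "invertedIndexColumns": list(inverted_columns),
        },
        "fieldConfigList": [
            {
                # Timestamp indexes speed up filtering and grouping by
                # truncated time buckets at the listed granularities.
                "name": timestamp_column,
                "encodingType": "DICTIONARY",
                "indexTypes": ["TIMESTAMP"],
                "timestampConfig": {"granularities": list(granularities)},
            }
        ],
    }


config = build_index_config(["publisher", "article_id"], "event_time", ["DAY", "WEEK"])
print(json.dumps(config, indent=2))
```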

The “cherry on top” for the BurdaForward team was that Apache Pinot uses standard SQL queries, which was easier than training their teams on the proprietary Elasticsearch query dialect.
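To give a feel for the kind of standard SQL involved, here is a minimal Python sketch that builds a trending-articles query and computes a week-over-week change. The table name `clickstream` and the columns `article_id` and `event_time_ms` are hypothetical; BurdaForward's actual schema and queries are not described in this story.

```python
WEEK_MS = 7 * 24 * 60 * 60 * 1000  # one week in milliseconds


def trending_articles_sql(start_ms: int, end_ms: int, limit: int = 10) -> str:
    """Build a SQL query for the most-viewed articles in [start_ms, end_ms)."""
    if start_ms >= end_ms:
        raise ValueError("start_ms must be earlier than end_ms")
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Table and column names are hypothetical placeholders.
    return (
        "SELECT article_id, COUNT(*) AS views "
        "FROM clickstream "
        f"WHERE event_time_ms >= {int(start_ms)} AND event_time_ms < {int(end_ms)} "
        "GROUP BY article_id "
        "ORDER BY COUNT(*) DESC "
        f"LIMIT {int(limit)}"
    )


def week_over_week_windows(end_ms: int):
    """Return ((start, end) of the current week, (start, end) of the week before)."""
    current = (end_ms - WEEK_MS, end_ms)
    previous = (end_ms - 2 * WEEK_MS, end_ms - WEEK_MS)
    return current, previous


def week_over_week_change(current: int, previous: int) -> float:
    """Percentage change in views from the previous week to the current one."""
    if previous == 0:
        raise ValueError("previous week has no views to compare against")
    return (current - previous) * 100.0 / previous
```

Running the same query over both windows returned by `week_over_week_windows` and passing the two view counts to `week_over_week_change` gives the comparison that requires at least two weeks of retained data.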

Furthermore, the integration of Apache Pinot with Apache Superset meant it would be easy to create the necessary dashboards and visualizations. Superset served as a replacement for Kibana in their real-time stack. According to the BurdaForward team, “Pinot’s integration with Superset was more appealing and super easy to use.”

Lastly, the BurdaForward team wrapped StarTree Cloud in a FastAPI service so that developers would not need direct access to the database. This provided a layer of abstraction and security. It also provided a cache with a Time-to-Live (TTL) of 1 minute, which helped prevent inadvertent overloading of the cluster.
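The caching behavior described here can be sketched with a small in-memory TTL cache. This is a standard-library illustration under stated assumptions, not BurdaForward's implementation: the `TTLCache` class, `run_query`, and `cached_query` are hypothetical names, and `run_query` is a stand-in for the real database call. In a FastAPI service, a route handler would call something like `cached_query` instead of querying the database directly.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        """Return the cached value for key, recomputing it once it has expired."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl_seconds:
            return hit[1]
        value = compute()
        self._entries[key] = (now, value)
        return value


def run_query(sql: str) -> list:
    """Hypothetical stand-in for the real call to the analytics database."""
    return [("placeholder-row", sql)]


cache = TTLCache(ttl_seconds=60.0)  # 1-minute TTL, as described in the text


def cached_query(sql: str) -> list:
    """Serve identical queries from the cache for up to one minute."""
    return cache.get_or_compute(sql, lambda: run_query(sql))
```

With a 1-minute TTL, many dashboard users refreshing the same chart trigger at most one database query per minute for that chart, at the cost of results being up to a minute stale.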

While StarTree Cloud itself has never caused an issue, data loading from their upstream Snowbridge service did fail one day. To make the architecture more resilient, they added an autoscaling mechanism for ingestion and added alerting to it.

BurdaForward architecture diagram for real-time analytics

Other Businesses Moving to StarTree Cloud and Apache Pinot from Elasticsearch

BurdaForward is not alone in migrating its business to StarTree Cloud and Apache Pinot from Elasticsearch. For example, Uniqode (formerly Beaconstac) moved its QR Code generator business for similar reasons and with similar beneficial results. Uber also fully replaced Elasticsearch with Apache Pinot for its time-sensitive real-time analytics, saving more than $2 million annually. Cisco Webex likewise moved from Elasticsearch to keep up with the demands of their rapid growth.

Learn more about why StarTree Cloud is a great alternative for Elasticsearch, and check out our six reasons to switch.

If you’d like to explore a migration path yourself, you can get started today with a free tier serverless environment for your development team.
