StarTree Cloud in 2024: Recapping a Year of Innovation

StarTree Cloud is a fully managed, cloud-native service for all your real-time analytical needs, built on top of Apache Pinot. In a previous blog post, we went through the exciting new features recently introduced in Apache Pinot. In this blog, we will do the same for StarTree Cloud.
2024 was a big year for us. StarTree Cloud improved its scalability, data freshness, security, and usability. We also added support for a new class of use cases—observability of logs, metrics, and traces. Here are some of the major highlights of what we built within StarTree Cloud.
Vertical use cases
Observability
StarTree Cloud is well positioned to be the storage platform of choice for building cost-effective observability solutions. Designed to handle terabytes of logs, metrics, and traces, it employs state-of-the-art indexing for ultra-low latency and cloud-tiered storage for historical data. Key features we added during 2024 include:
- Data format support: Ability to efficiently ingest, index and query JSON, Prometheus, and OTel formatted data. Advances like CLP make it possible to search through highly-compressed JSON data.
- Grafana plugin: Visualize data stored in Pinot using PinotQL (native) or PromQL dialects.
- MAP data type: Introduced a storage-efficient way to handle key-value pairs like metric tags.
With StarTree Cloud, you can ingest your telemetry data, visualize it in Grafana, and generate user-facing, real-time insights, all in one platform.
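To see why a map-style column suits metric tags, here is a minimal Python sketch. It is an illustration only, not Pinot's or StarTree's storage format; the sample metrics and the helper names `wide_cells` and `map_cells` are made up for this example. It compares how many cells a one-column-per-tag layout would need against storing only the key-value pairs each row actually carries.

```python
# Illustrative only: hypothetical sample data, not Pinot's storage layout.
samples = [
    {"metric": "http_requests_total", "tags": {"method": "GET", "status": "200"}},
    {"metric": "cpu_usage", "tags": {"host": "node-1", "core": "0"}},
    {"metric": "queue_depth", "tags": {"queue": "orders"}},
]


def wide_cells(rows):
    """Cells needed if every distinct tag key becomes its own column."""
    all_keys = set()
    for row in rows:
        all_keys.update(row["tags"])
    return len(rows) * len(all_keys)


def map_cells(rows):
    """Key-value pairs stored if tags live in a single map column."""
    return sum(len(row["tags"]) for row in rows)


print(wide_cells(samples))  # rows x distinct tag keys, mostly empty cells
print(map_cells(samples))   # only the pairs that are present
```

The gap between the two numbers grows as more metrics bring their own tag keys, which is the situation a key-value column is meant to handle.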
Case Study: Discover how Cisco WebEx uses Apache Pinot and Grafana for real-time observability
Anomaly detection
Our flagship vertical product, StarTree ThirdEye, focuses on real-time anomaly detection and root cause analysis for business metrics. Key 2024 updates include:
- Workspace support: Create multiple virtual workspaces for logical isolation of use cases across business entities.
- Performance improvements: A single instance can now handle thousands of real-time alerts.
- Improved anomaly detection accuracy: Better handling of missing or bad data. We also added Daylight Saving Time support.
- Revamped UX: Simplified alert creation with the new Simple Alert wizard and a modernized user interface.
- Free Tier experience: You can now use ThirdEye as part of StarTree Cloud Free Tier. You can prototype and test its capabilities without the pressure of a timed trial account.
Case Study: Learn how DoorDash manages on-time delivery with StarTree ThirdEye

Core engine enhancements
Fast restarts with real-time upserts and dedup
Real-time upserts are critical for accurate, up-to-the-minute business decisions. Apache Pinot has supported real-time upserts for years, setting it apart from alternatives like Druid and ClickHouse. However, the StarTree version has significantly improved scalability: it can handle billions of primary keys in a persistent data store without needing to maintain them entirely in on-heap memory, which substantially lowers hardware costs.
In 2024, we introduced segment-level metadata snapshots, archived in deep storage and leveraged during server restarts. This enhancement resulted in 5x faster cluster restarts, a significant benefit for large-scale deployments. You can read more about this feature in this blog: Reduce the Cost of Real-Time Upserts in Apache Pinot by 10X with StarTree.
Deduplication (“dedup”) is crucial in an OLAP platform: duplicates can skew metrics, inflate results, or misclassify data, so removing them ensures data accuracy, prevents misleading insights, and reduces storage costs. This feature is available in Apache Pinot (read more here). As with upserts, the StarTree Cloud implementation of dedup can manage billions of keys and store snapshots to enable faster cluster operations in large deployments.
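The difference between upsert and dedup semantics can be sketched in a few lines of Python. This is a conceptual model only, not how Pinot or StarTree Cloud implement either feature: plain in-memory dictionaries and sets stand in for the key store, and the field names (`order_id`, `ts`, `status`) are hypothetical. In this sketch, upsert keeps the row with the highest comparison value per primary key, while dedup keeps the first row seen per primary key and drops later ones.

```python
# Conceptual sketch only: plain dictionaries stand in for the key store.
# Field names ("order_id", "ts", "status") are hypothetical.
events = [
    {"order_id": "A", "ts": 1, "status": "placed"},
    {"order_id": "B", "ts": 2, "status": "placed"},
    {"order_id": "A", "ts": 3, "status": "delivered"},
    {"order_id": "B", "ts": 2, "status": "placed"},  # exact redelivery
]


def apply_upsert(stream, key="order_id", comparison="ts"):
    """Keep one row per key: the one with the highest comparison value."""
    latest = {}
    for row in stream:
        current = latest.get(row[key])
        if current is None or row[comparison] >= current[comparison]:
            latest[row[key]] = row
    return list(latest.values())


def apply_dedup(stream, key="order_id"):
    """Keep the first row per key and drop every later row with that key."""
    seen = set()
    kept = []
    for row in stream:
        if row[key] not in seen:
            seen.add(row[key])
            kept.append(row)
    return kept


upserted = apply_upsert(events)  # order A ends up as "delivered"
deduped = apply_dedup(events)    # order A stays at its first row, "placed"
```

Both functions hold one entry per primary key, which is why the size and location of the key store matter once a table reaches billions of keys.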
Case Study: Dialpad Powers Real-Time Customer Intelligence with StarTree Cloud’s Scalable Upserts
Pauseless ingestion
To ensure strict p99 SLA compliance for data freshness, we unveiled pauseless ingestion in StarTree Cloud. This innovation eliminates the brief lags in data freshness that occurred during real-time ingestion, so data is always fresh for generating insights. In the past, these occasional lags happened during a process named ‘segment commit’, which is responsible for persisting Pinot’s in-memory accumulated data to disk.
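A toy model can make the freshness argument concrete. The Python sketch below is not StarTree's implementation; it assumes, for illustration, that in the older flow events arriving during a segment commit become queryable only once the commit finishes, and that with pauseless ingestion they do not wait. The function name and all numbers are hypothetical.

```python
# Toy model, not StarTree's implementation. All numbers are hypothetical.

def max_freshness_lag(arrivals, commit_start, commit_duration, pauseless):
    """Worst-case delay (seconds) between an event arriving and being queryable.

    Pausing model: events arriving while a segment commit is in progress
    wait until the commit finishes. Pauseless model: they do not wait.
    """
    commit_end = commit_start + commit_duration
    worst = 0.0
    for t in arrivals:
        blocked = (not pauseless) and commit_start <= t < commit_end
        lag = commit_end - t if blocked else 0.0
        worst = max(worst, lag)
    return worst


arrivals = [float(t) for t in range(0, 120)]  # one event per second
with_pause = max_freshness_lag(arrivals, 60.0, 20.0, pauseless=False)
without_pause = max_freshness_lag(arrivals, 60.0, 20.0, pauseless=True)
```

In this model the worst-case lag equals the commit duration when ingestion pauses, and drops to zero when it does not. Tail percentiles such as p99 are sensitive to exactly this kind of occasional spike.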
Offline ingestion scalability and pre-processing
StarTree Cloud provides out-of-the-box support for ingesting data from offline sources such as Amazon S3, Google GCS, Snowflake, and BigQuery. Although users loved this experience, there were a few limitations in the past:
- Minions could not scale automatically with the workload and needed manual intervention.
- There was no tooling to re-partition or re-sort data in Pinot, which is essential in some cases to optimize query performance.
In 2024, we added critical improvements:
- Autoscaling Minions: Dynamically scale ingestion infrastructure up or down based on load.
- Re-partitioning: Efficiently re-partition data on the fly on specific columns.
- Data sorting: Added support for sorting data on the fly on specific columns.
These features eliminate the need for users to manage additional components like Flink or Spark solely for the purpose of ingestion into Pinot. This also made minion-based ingestion cost-efficient and seamless to use for new users.
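As a rough picture of what re-partitioning and sorting do to a batch of rows, consider the following Python sketch. It is a simplified stand-in, not the minion task itself: it assumes modulo partitioning on an integer column, and the column names and the `repartition_and_sort` helper are hypothetical.

```python
# Sketch under stated assumptions: modulo partitioning on an integer column,
# then sorting within each partition. Column names are hypothetical.
from collections import defaultdict

rows = [
    {"customer_id": 7, "event_time": 30},
    {"customer_id": 2, "event_time": 10},
    {"customer_id": 5, "event_time": 20},
    {"customer_id": 4, "event_time": 5},
]


def repartition_and_sort(records, partition_column, num_partitions, sort_column):
    """Group rows by partition_column % num_partitions, then sort each group."""
    partitions = defaultdict(list)
    for record in records:
        partitions[record[partition_column] % num_partitions].append(record)
    return {
        pid: sorted(group, key=lambda r: r[sort_column])
        for pid, group in sorted(partitions.items())
    }


result = repartition_and_sort(rows, "customer_id", 2, "event_time")
```

Rows that share a partition value end up together, so a query that filters on the partition column has fewer places to look, and sorted rows make range filters on the sort column cheaper to evaluate.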
Data Manager improvements
We made a bunch of enhancements to StarTree Data Manager, our low/no-code tool for data ingestion:
- Self-serve sources: Added support for new data sources like Amazon Kinesis, enabling effortless data onboarding.
- Reimagined UX: Delivered an intuitive user experience with a streamlined UI that makes it easier for new users to sample data and change the data model on the fly, surfaces errors earlier in the workflow, and minimizes the steps for table creation.

New query console
Interacting with data in Pinot is now more efficient with the new StarTree Query Console. Retaining all existing features of Pinot’s original query console, it introduces:
- Multi-tab support: Work on multiple queries simultaneously.
- Save query support: A one-click way to save frequently used queries.
- Syntax highlighting: Improve readability and reduce errors.
- Semantic validation: Ensure query accuracy before execution.
- Modern design: A sleek and user-friendly interface.

Cloud deployment model
StarTree Serverless
We announced a new Serverless offering for a free tier of StarTree Cloud. Registered users can enjoy a free-forever workspace, perfect for prototyping and development, and experience all the features of our Enterprise offering. This model has seen tremendous growth and served as the playground for the StarTree Mission Impossible contest held in Fall 2024. Feel free to give it a spin!
New cloud architecture
Our team introduced a new cloud deployment architecture, designed to be more modular, secure, and easy to maintain/upgrade. All new StarTree environments now operate on this architecture, which is shared across all deployment models: SaaS, BYOC (Bring Your Own Cloud), and Serverless.
Security
In 2024, we significantly improved our security posture:
- Role-Based Access Control (RBAC): Added RBAC for fine-grained control of access to various resources within StarTree Cloud, such as Pinot tables or workflows in Data Manager, the Query Console, or StarTree ThirdEye. RBAC works seamlessly with the configured identity provider (IdP). Users can assign either pre-defined or custom roles and policies to any IdP user or group, as well as to service accounts.
- ISO 27001 certification: Achieved this industry-standard certification, demonstrating our commitment to best-in-class security, integrity, privacy, and compliance. This builds on our existing SOC 2 Type II certification.
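The role-and-policy model described under RBAC can be pictured with a small Python sketch. This is a generic illustration of role-based checks, not StarTree Cloud's API or data model: every role, resource, action, and principal name below is invented for the example.

```python
# Hypothetical model: role, resource and action names are invented for illustration.
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    resource: str        # e.g. "table:orders"
    actions: frozenset   # actions permitted on that resource


ROLES = {
    "analyst": [Policy("table:orders", frozenset({"read"}))],
    "ingestion-admin": [Policy("table:orders", frozenset({"read", "write"}))],
}

# Role assignments for identity-provider users, groups or service accounts.
ASSIGNMENTS = {
    "group:data-team": ["analyst"],
    "service:loader": ["ingestion-admin"],
}


def is_allowed(principal, resource, action):
    """Return True if any role assigned to the principal permits the action."""
    for role in ASSIGNMENTS.get(principal, []):
        for policy in ROLES.get(role, []):
            if policy.resource == resource and action in policy.actions:
                return True
    return False


print(is_allowed("group:data-team", "table:orders", "read"))
print(is_allowed("group:data-team", "table:orders", "write"))
```

Keeping assignments separate from role definitions means a group's access changes by editing one mapping, while the policies attached to each role stay untouched.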
Conclusion
2024 was a transformative year for StarTree Cloud. We achieved major advancements in scalability, data freshness, usability, and security while expanding our capabilities to support observability use cases. From introducing pauseless ingestion to launching a robust serverless model and achieving ISO 27001 certification, our innovations have solidified StarTree Cloud’s position as a leader in real-time analytics. We’re excited to continue pushing boundaries and delivering value to our users in the years ahead!
If you’d like to stay abreast of the changes coming to StarTree Cloud in 2025, make sure you sign up now for RTA Summit 2025, which will be held online this year on May 14. There we’ll share more of our engineering advances, and you can hear from your industry peers on how they are using real-time analytics to transform their businesses.