
Indexing
Create flexible indexes
Harness multiple indexing options to analyze your data any way you want. StarTree supports a wide range of index types that you can apply to any column, letting you analyze your data flexibly and easily.
Handle large aggregation queries at scale with sub-second performance

Automatically apply the best index for your use case

Faster query performance with dynamic tuning

Immediately index your data
Results
- 50% reduction in storage costs
- 4x faster queries
- 100k spans supported in a single request
Use Cases
Add indexes instantly — no downtime, no rebuilds
Unlike competitors, StarTree optimizes query performance on the fly as your needs evolve, without re-ingesting data. When query patterns change, simply introduce new indexes and keep moving.
Vector Index
Run fast, efficient searches on high-dimensional vector data with StarTree. Ideal for generative AI (GenAI), retrieval-augmented generation (RAG), chatbots, natural language processing (NLP), and recommendation systems.
Star-Tree Index
Build an intelligent materialized view for pre-computing certain aggregations for a wide range of dimension filters while improving query latency and query throughput. Star-Tree Index, unique to Apache Pinot, provides an innovative way to control the degree of materialization, thus allowing the user to trade off query latency versus storage overhead.
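As a rough illustration of the pre-computation idea, the sketch below materializes a SUM for every combination of dimension values, using `*` to mean "any value". It is a simplified full cube written for this page, not Pinot's actual tree structure, which limits how much is materialized. The function name, the row layout, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import product

STAR = "*"  # wildcard meaning "aggregated over every value of this dimension"

def build_preaggregates(rows, dimensions, metric):
    """Pre-compute SUM(metric) for each combination of dimension values and wildcards."""
    cube = defaultdict(int)
    for row in rows:
        # For each dimension, either keep the concrete value or collapse it to STAR.
        options = [(row[d], STAR) for d in dimensions]
        for key in product(*options):
            cube[key] += row[metric]
    return dict(cube)

rows = [
    {"country": "US", "browser": "chrome", "impressions": 10},
    {"country": "US", "browser": "safari", "impressions": 5},
    {"country": "IN", "browser": "chrome", "impressions": 7},
]
cube = build_preaggregates(rows, ["country", "browser"], "impressions")

# A filter on country alone becomes a single lookup instead of a scan.
us_total = cube[("US", STAR)]
```

Materializing every combination gives the lowest latency but the largest footprint. Bounding which combinations are stored is the latency versus storage trade-off described above.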
JSON Index
Handle structured and semi-structured data while enabling extremely fast query processing on highly nested JSON columns. Users can analyze free-form JSON or text documents and ingest nested JSON columns as-is, without preprocessing or transforming them.
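One common way to index nested documents without preprocessing is to flatten each document into path and value pairs, then record which documents contain each pair. The sketch below shows that general idea only. The path syntax, function names, and sample documents are assumptions for illustration and do not describe Pinot's internal format.

```python
from collections import defaultdict

def flatten(doc, prefix="$"):
    """Yield (path, value) pairs for every leaf in a nested JSON-like object."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, f"{prefix}.{key}")
    elif isinstance(doc, list):
        for position, item in enumerate(doc):
            yield from flatten(item, f"{prefix}[{position}]")
    else:
        yield prefix, doc

def build_json_index(docs):
    """Map each (path, value) pair to the IDs of the documents that contain it."""
    index = defaultdict(list)
    for doc_id, doc in enumerate(docs):
        for path, value in flatten(doc):
            index[(path, value)].append(doc_id)
    return dict(index)

docs = [
    {"user": {"name": "ana", "tags": ["a", "b"]}},
    {"user": {"name": "bo", "tags": ["b"]}},
]
json_index = build_json_index(docs)
matches = json_index.get(("$.user.name", "ana"), [])
```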
Text Index
Perform powerful text search queries on arbitrary text data by configuring a Text Index to accelerate query processing.
Sorted Index
A sorted index is applied to a pre-sorted column. It enables efficient range queries through binary search and avoids scanning unnecessary data blocks.
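A minimal sketch of why sorting helps: on a sorted column, a range filter reduces to two binary searches that bound a contiguous run of rows. The function name and sample column are hypothetical.

```python
from bisect import bisect_left, bisect_right

def sorted_range_lookup(sorted_column, low, high):
    """Return (start, end) so that rows start..end-1 satisfy low <= value <= high."""
    start = bisect_left(sorted_column, low)    # first row with value >= low
    end = bisect_right(sorted_column, high)    # first row with value > high
    return start, end

ages = [18, 21, 21, 25, 30, 30, 42, 57]
start, end = sorted_range_lookup(ages, 21, 30)
matching_values = ages[start:end]
```

Each search takes logarithmic time in the number of rows, and only the matching slice is ever read.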
Range Index
Stores min-max values for column segments, allowing fast pruning of irrelevant data during range queries.
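The pruning step can be sketched as follows, assuming each segment keeps only the minimum and maximum of a column. Segment names, values, and helper names are hypothetical.

```python
def build_min_max(segments):
    """Record (min, max) of the column for each segment."""
    return {name: (min(values), max(values)) for name, values in segments.items()}

def prune_segments(min_max, low, high):
    """Keep only segments whose [min, max] interval overlaps the query range [low, high]."""
    return [
        name
        for name, (seg_min, seg_max) in min_max.items()
        if seg_max >= low and seg_min <= high
    ]

segments = {
    "seg_0": [3, 8, 5],
    "seg_1": [12, 19, 15],
    "seg_2": [22, 30, 27],
}
stats = build_min_max(segments)
candidates = prune_segments(stats, 10, 20)
```

Segments that fail the overlap test cannot contain a matching row, so they are skipped without being read.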
Inverted Index
Maps unique column values to the row IDs where they appear, allowing for fast filtering and lookups, especially on high-cardinality columns.
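The mapping described above can be sketched in a few lines. The column contents and function name are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(column):
    """Map each distinct value to the ascending list of row IDs where it appears."""
    index = defaultdict(list)
    for row_id, value in enumerate(column):
        index[value].append(row_id)
    return dict(index)

country = ["US", "IN", "US", "DE", "IN", "US"]
country_index = build_inverted_index(country)

# An equality filter becomes a dictionary lookup instead of a column scan.
matching_rows = country_index.get("US", [])
```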
Geospatial Index
In Pinot, you can use the geospatial index to optimize lookups based on geospatial data. This includes geospatial data types, such as point, line, and polygon; geospatial functions for querying spatial properties and relationships; and geospatial indexing, used to process spatial operations.
Bloom Filter
A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. In Pinot, a Bloom filter helps prune segments that do not contain any record matching an equality predicate.
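A minimal sketch of segment pruning with a Bloom filter follows. The bit-array size, hash construction, class name, and segment data are assumptions chosen for readability, not Pinot's implementation.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, value):
        # Derive several bit positions by hashing the value with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for position in self._positions(value):
            self.bits[position] = 1

    def might_contain(self, value):
        # False means definitely absent; True means possibly present.
        return all(self.bits[position] for position in self._positions(value))

segments = {"segment_a": ["u1", "u2", "u3"], "segment_b": ["u4", "u5"]}
filters = {}
for name, user_ids in segments.items():
    bloom = BloomFilter()
    for user_id in user_ids:
        bloom.add(user_id)
    filters[name] = bloom

def segments_to_scan(user_id):
    """Return the segments that cannot be ruled out for an equality predicate."""
    return [name for name, bloom in filters.items() if bloom.might_contain(user_id)]
```

A Bloom filter never reports a false negative, so a pruned segment is safe to skip. It can report false positives, so a segment that passes the check still has to be scanned to confirm the match.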
Timestamp Index
Pinot provides a timestamp index for optimizing queries on time series data. The timestamp data type stores each value internally as a long holding epoch milliseconds. Analytics queries typically don’t need that level of granularity, and on large datasets, scanning the data and converting time values at query time is costly. The timestamp index optimizes these queries.
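To illustrate the cost being avoided, the sketch below truncates epoch-millisecond values to a coarser granularity once, up front, so that a query grouping by day or hour can read the stored value instead of converting every row. The granularity names, helper name, and sample timestamps are assumptions for illustration.

```python
MILLIS_PER = {"HOUR": 3_600_000, "DAY": 86_400_000}

def truncate(epoch_millis, granularity):
    """Round an epoch-millisecond timestamp down to the start of its hour or day (UTC)."""
    step = MILLIS_PER[granularity]
    return epoch_millis - (epoch_millis % step)

event_times = [1_700_000_000_000, 1_700_000_500_000, 1_700_090_000_000]

# Pre-compute the truncated columns once, rather than on every query.
event_days = [truncate(ts, "DAY") for ts in event_times]
event_hours = [truncate(ts, "HOUR") for ts in event_times]
```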
Sparse Index
A lightweight index that selectively indexes only a subset of data, reducing storage overhead while still enabling faster lookups. It is particularly useful for optimizing queries on large datasets where full indexing would be too costly.
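This page does not say how the sparse index is built. As one textbook interpretation, the sketch below indexes every Nth row of a sorted column, then uses that small index to find the single block worth scanning. The stride, function names, and data are hypothetical.

```python
from bisect import bisect_right

def build_sparse_index(sorted_values, stride):
    """Keep one (value, row_id) entry per block of `stride` rows."""
    return [(sorted_values[i], i) for i in range(0, len(sorted_values), stride)]

def lookup(sorted_values, sparse_index, stride, target):
    """Return a row ID holding `target`, or None, scanning at most one block."""
    keys = [key for key, _ in sparse_index]
    slot = bisect_right(keys, target) - 1  # last block whose first value <= target
    if slot < 0:
        return None
    block_start = sparse_index[slot][1]
    block_end = min(block_start + stride, len(sorted_values))
    for row_id in range(block_start, block_end):
        if sorted_values[row_id] == target:
            return row_id
    return None

values = [2, 4, 6, 8, 10, 12, 14, 16, 18]
sparse = build_sparse_index(values, stride=3)
row = lookup(values, sparse, 3, 10)
```

With a stride of 3, the index holds a third as many entries as a full index, at the cost of scanning up to three rows per lookup.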
Blinkit uses real-time analytics [from StarTree] for three purposes: experimentation, observability, and alerting. [For] observability, we have CXO dashboards and sales dashboards built in real-time where all people — from top leadership to ground folks — can see what is happening in the business in any segment in real-time.

