Indexing

Create flexible indexes

Harness multiple indexing options to analyze your data any way you want. StarTree supports a wide range of index types that you can apply to any column, letting you analyze your data flexibly and easily.

Flexible indexing with StarTree Cloud

Handle large aggregation queries at scale with sub-second performance

Automatically apply the best index for your use case

Faster query performance with dynamic tuning

Immediately index your data

Real-Time Observability of Financial Transactions Using StarTree Cloud

Results

50% reduction in storage costs
4x faster queries
100k spans supported in a single request

Use Cases

Observability

Add indexes instantly — no downtime, no rebuilds

Unlike competitors, StarTree optimizes query performance on the fly as your needs evolve, without re-ingesting data. When query patterns change, simply introduce new indexes and keep moving.

Vector Index

Run fast, efficient similarity searches on high-dimensional vector data with StarTree. Ideal for use cases around Generative AI (GenAI), Retrieval Augmented Generation (RAG), chatbots and Natural Language Processing (NLP), and recommendation systems.

Star-Tree Index

Build an intelligent materialized view that pre-computes selected aggregations across a wide range of dimension filters, improving both query latency and query throughput. The Star-Tree Index, unique to Apache Pinot, lets you control the degree of materialization, so you can trade off query latency against storage overhead.

JSON Index

Handle structured and semi-structured data with extremely fast query processing on highly nested JSON columns. Users can analyze free-form JSON or text documents and ingest nested JSON columns as-is, without preprocessing or transforming them in any way.
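To make the idea concrete, here is a minimal Python sketch of indexing nested JSON without transforming it first: each leaf value is recorded under its path, so a filter on a nested field becomes a dictionary lookup. This is a conceptual illustration only, not Pinot's actual JSON index; the `$.a.b[*]` path notation, the function names, and the sample documents are all assumptions made for the example.

```python
import json
from collections import defaultdict

def flatten(value, prefix="$"):
    """Yield (path, scalar) pairs for every leaf in a nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}")
    elif isinstance(value, list):
        for child in value:
            yield from flatten(child, f"{prefix}[*]")
    else:
        yield prefix, value

def build_json_index(raw_docs):
    """Map each (path, scalar) pair to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, raw in enumerate(raw_docs):
        for path, scalar in flatten(json.loads(raw)):
            index[(path, scalar)].add(doc_id)
    return index

# Hypothetical nested documents, ingested as-is
docs = [
    '{"user": {"name": "ana", "tags": ["vip", "beta"]}}',
    '{"user": {"name": "raj", "tags": ["beta"]}}',
]
index = build_json_index(docs)
beta_docs = sorted(index[("$.user.tags[*]", "beta")])
vip_docs = sorted(index[("$.user.tags[*]", "vip")])
```

Here `beta_docs` holds both document IDs and `vip_docs` only the first, without any document being scanned at query time.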

Text Index

Perform powerful text search queries on arbitrary text data by configuring a Text Index to accelerate query processing.

Sorted Index

Applies to a pre-sorted column, enabling efficient range queries through binary search and avoiding scans of data blocks that cannot match.
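As a rough illustration of why a sorted column helps, the sketch below uses binary search to find the contiguous run of rows that satisfy a range filter. It is a conceptual Python example rather than Pinot's implementation; the function name `sorted_range_lookup` and the sample column are hypothetical.

```python
from bisect import bisect_left, bisect_right

def sorted_range_lookup(sorted_values, low, high):
    """Return (start, end) row IDs, end exclusive, where low <= value <= high."""
    start = bisect_left(sorted_values, low)
    end = bisect_right(sorted_values, high)
    return start, end

# Hypothetical pre-sorted column
member_ids = [3, 7, 7, 12, 15, 15, 15, 21, 30]
start, end = sorted_range_lookup(member_ids, 7, 15)
matching_rows = list(range(start, end))
```

Because matching rows are adjacent, the lookup costs two binary searches regardless of how many rows fall outside the range.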

Range Index

Stores min-max values for column segments, allowing fast pruning of irrelevant data during range queries.
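The min-max pruning described above can be sketched in a few lines of Python. This is a simplified illustration that follows the description here, not a statement of how Pinot stores its range index internally; the `SegmentStats` class, the segment names, and the values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SegmentStats:
    name: str
    min_value: int
    max_value: int

def prune_segments(segments, low, high):
    """Keep only segments whose [min, max] interval overlaps [low, high]."""
    return [s.name for s in segments if s.max_value >= low and s.min_value <= high]

# Hypothetical per-segment statistics for one column
segments = [
    SegmentStats("seg_0", 0, 99),
    SegmentStats("seg_1", 100, 199),
    SegmentStats("seg_2", 200, 299),
]
candidates = prune_segments(segments, 150, 220)
```

A query for values between 150 and 220 only needs to read `seg_1` and `seg_2`; `seg_0` is skipped without being opened.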

Inverted Index

Maps unique column values to the row IDs where they appear, allowing for fast filtering and lookups, especially on high-cardinality columns.
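A minimal Python sketch of that mapping, assuming a plain list as the column and list positions as row IDs (the function name and sample data are illustrative, and production systems typically use compressed bitmaps rather than Python lists):

```python
from collections import defaultdict

def build_inverted_index(column_values):
    """Map each distinct value to the ordered list of row IDs where it occurs."""
    index = defaultdict(list)
    for row_id, value in enumerate(column_values):
        index[value].append(row_id)
    return dict(index)

# Hypothetical column
countries = ["US", "IN", "US", "DE", "IN", "US"]
index = build_inverted_index(countries)
us_rows = index.get("US", [])
```

A filter such as country = 'US' then resolves to a single lookup instead of a scan over every row.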

Geospatial Index

In Pinot, you can use the geospatial index to optimize lookups based on geospatial data. This includes geospatial data types, such as point, line, and polygon; geospatial functions for querying spatial properties and relationships; and geospatial indexing, used to process spatial operations.

Bloom Filter

A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. In Pinot, a Bloom filter helps prune segments that do not contain any record matching an equality predicate.
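The sketch below shows the general technique in Python: one small Bloom filter per segment, consulted before deciding which segments to scan. It is a toy version under stated assumptions (SHA-256 based hashing, 1024 bits, 3 hash functions, hypothetical segment names) and does not reflect Pinot's internal implementation or defaults.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

def segments_to_scan(filters, value):
    """Return segments that cannot be ruled out for an equality predicate."""
    return [name for name, bf in filters.items() if bf.might_contain(value)]

# Hypothetical segments and their user IDs
filters = {}
for name, user_ids in {"seg_0": ["u1", "u2"], "seg_1": ["u3", "u4"]}.items():
    bf = BloomFilter()
    for uid in user_ids:
        bf.add(uid)
    filters[name] = bf

to_scan = segments_to_scan(filters, "u1")
```

A Bloom filter never produces false negatives, so a segment that holds the value is always kept; it may occasionally keep a segment that does not, which costs an extra scan but never a wrong result.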

Timestamp Index

Pinot provides a timestamp index type for optimizing queries on time series data. The timestamp data type stores each value internally as a long holding milliseconds since the epoch. Analytics queries typically don’t need this level of granularity, and on large datasets, scanning the data and converting time values at query time can be costly. The timestamp index optimizes these queries.
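To see what the per-row conversion involves, the Python sketch below truncates a millisecond epoch value to coarser buckets. The idea of computing such buckets once rather than on every query is the general motivation; the function name, constants, and sample timestamp are assumptions for illustration, not Pinot's API.

```python
MILLIS_PER_HOUR = 3_600_000
MILLIS_PER_DAY = 86_400_000

def truncate_millis(epoch_millis, bucket_millis):
    """Round an epoch-millisecond value down to the start of its bucket (UTC)."""
    return epoch_millis - (epoch_millis % bucket_millis)

# Hypothetical event time: 2023-11-14 22:13:20.123456... UTC in milliseconds
ts = 1_700_000_123_456
hour_bucket = truncate_millis(ts, MILLIS_PER_HOUR)
day_bucket = truncate_millis(ts, MILLIS_PER_DAY)
```

Grouping or filtering by `day_bucket` touches one precomputed value per row instead of repeating the arithmetic for every row in every query.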

Sparse Index

A lightweight index that selectively indexes only a subset of data, reducing storage overhead while still enabling faster lookups. It is particularly useful for optimizing queries on large datasets where full indexing would be too costly.

Blinkit uses real-time analytics [from StarTree] for three purposes: experimentation, observability, and alerting. [For] observability, we have CXO dashboards and sales dashboards built in real-time where all people — from top leadership to ground folks — can see what is happening in the business in any segment in real-time.

Abhinesh Hada, Engineering Manager, Blinkit

Ready to deploy real-time analytics?

We’re here to help!