Precise Fetching in Tiered Storage

Fast Analytics on Object Storage

Unlike lazy loading, which transfers large volumes of unnecessary data, precise fetching minimizes data movement between object storage and the query engine, delivering interactive performance without waste.


Reduce costs, increase query speeds


Better Performance, Lower Cost

Savings

Minimized Data Movement

Control

True Storage Decoupling

Lazy Loading vs Precise Fetching

StarTree minimizes inefficient data movement

Object storage isn’t inherently slow; the bottleneck is inefficient data movement. Most OLAP systems use ‘lazy loading’, downloading entire partitions even when a query needs only a small portion of the data, which results in:

  • Excessive data movement: Queries retrieve more data than needed, wasting bandwidth
  • Increased latency: Users wait longer for data transfers to complete
  • Higher costs: Processing unnecessary data consumes more computational resources
Overview

How Precise Fetching works

Columnar Database

Selective columnar fetch

Unlike systems that must retrieve entire partitions (called ‘segments’ in Pinot) with all of their columns, StarTree can selectively fetch only the columns or indexes needed for a specific query.
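To make the idea concrete, here is a minimal Python sketch of a selective columnar fetch. It is an illustration under stated assumptions, not StarTree's implementation: `FakeObjectStore`, `write_segment`, `fetch_columns`, and the JSON-based layout are all hypothetical names and choices made for the example. The only real-world assumption is that the object store supports ranged reads, so a column can be retrieved by offset and length.

```python
import json


class FakeObjectStore:
    """Hypothetical in-memory stand-in for an object store with ranged reads."""

    def __init__(self):
        self.objects = {}
        self.bytes_read = 0  # running count of bytes transferred

    def put(self, key, data):
        self.objects[key] = data

    def get_range(self, key, offset, length):
        self.bytes_read += length
        return self.objects[key][offset:offset + length]


def write_segment(store, key, columns):
    """Lay columns out back to back; return a column -> (offset, length) index."""
    blob = b""
    index = {}
    for name, values in columns.items():
        encoded = json.dumps(values).encode()
        index[name] = (len(blob), len(encoded))
        blob += encoded
    store.put(key, blob)
    return index


def fetch_columns(store, key, index, needed):
    """Issue one ranged read per needed column instead of downloading the object."""
    out = {}
    for name in needed:
        offset, length = index[name]
        out[name] = json.loads(store.get_range(key, offset, length))
    return out


store = FakeObjectStore()
columns = {
    "country": ["US", "DE", "US", "IN"],
    "clicks": [10, 3, 7, 12],
    "user_agent": ["Mozilla/5.0 ..." * 20] * 4,  # wide column the query never touches
}
index = write_segment(store, "segment_0", columns)

# SELECT SUM(clicks) WHERE country = 'US' needs only two of the three columns.
data = fetch_columns(store, "segment_0", index, ["country", "clicks"])
total = sum(c for country, c in zip(data["country"], data["clicks"]) if country == "US")
segment_size = len(store.objects["segment_0"])
print(f"read {store.bytes_read} of {segment_size} bytes, total clicks = {total}")
```

In this toy layout the wide `user_agent` column dominates the segment, so skipping it accounts for most of the saved transfer. A lazy-loading system would have downloaded all `segment_size` bytes to answer the same query.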

Precise

Block-level reads

Beyond column-level precision, StarTree reads only the specific blocks containing matching data. After applying filters, it fetches just the relevant blocks within columns, dramatically reducing data transfer volume.
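The following Python sketch shows one way block-level reads could work in principle. It assumes a column is stored as fixed-size blocks of rows and that a filter has already resolved the predicate to a list of matching row ids. `BLOCK_ROWS`, `blocks_for_rows`, and `read_matching` are hypothetical names for illustration, and the dictionary lookup stands in for ranged reads against object storage.

```python
BLOCK_ROWS = 4  # hypothetical block size, in rows


def blocks_for_rows(row_ids, block_rows=BLOCK_ROWS):
    """Map matching row ids to the sorted set of block numbers that hold them."""
    return sorted({row_id // block_rows for row_id in row_ids})


def read_matching(column_blocks, row_ids, block_rows=BLOCK_ROWS):
    """Fetch only the needed blocks, then pick the matching rows out of them."""
    needed = blocks_for_rows(row_ids, block_rows)
    fetched = {b: column_blocks[b] for b in needed}  # stands in for ranged reads
    values = [fetched[r // block_rows][r % block_rows] for r in row_ids]
    return values, needed


# A 16-row column stored as four blocks of four values.
revenue_blocks = [
    [5, 1, 9, 2],
    [7, 7, 3, 8],
    [4, 6, 0, 1],
    [2, 9, 5, 3],
]

# Suppose a filter index has already resolved the predicate to rows 1, 2 and 13.
values, blocks_read = read_matching(revenue_blocks, [1, 2, 13])
print(values, blocks_read)
```

Here only two of the four blocks are touched. The more selective the filter, the fewer blocks the matching rows fall into, and the less data has to leave object storage.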

Parallel

Pipeline execution

StarTree decouples data fetching from execution, beginning retrieval during query planning. By pipelining I/O and processing in parallel, StarTree reduces query latency by 5x or more.
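As a rough illustration of the pipelining idea, the Python sketch below issues all fetches as soon as the plan is known and processes each block when it arrives, compared with a fetch-then-process loop. It uses the standard library's `concurrent.futures`. `fetch_block`, `process`, and the simulated 50 ms latency are assumptions made for the example; it does not reproduce StarTree's engine or its measured speedup.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_block(block_id):
    """Stand-in for a ranged read from object storage (simulated latency)."""
    time.sleep(0.05)
    return [block_id * 10 + i for i in range(3)]


def process(block):
    """Stand-in for query execution over one fetched block."""
    return sum(block)


def run_pipelined(block_ids):
    """Issue all fetches up front, then process each block as soon as it arrives."""
    total = 0
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(fetch_block, b) for b in block_ids]
        for fut in as_completed(futures):
            total += process(fut.result())
    return total


def run_sequential(block_ids):
    """Baseline: fetch, then process, one block at a time."""
    return sum(process(fetch_block(b)) for b in block_ids)


plan = [0, 1, 2, 3]  # block ids the planner decided the query needs
pipelined_total = run_pipelined(plan)
sequential_total = run_sequential(plan)
print(pipelined_total, sequential_total)
```

Both paths compute the same answer. The difference is that the pipelined version overlaps the waits on storage with each other and with processing, so its wall-clock time approaches that of the slowest fetch rather than the sum of all fetches.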

Query Optimizations

Index pinning & pruning

StarTree uses metadata (min/max values, bloom filters) to skip irrelevant segments entirely. It selectively pins small, frequently used structures locally and leverages specialized indexes to identify only the most relevant data blocks, keeping queries fast even on object storage.
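The pruning step can be sketched in Python as follows. This is a simplified illustration, not StarTree's code: the `BloomFilter` class, `build_metadata`, `prune`, and the `day` and `publisher` columns are hypothetical. It assumes each segment keeps a small metadata record (min/max of one column, a Bloom filter over another) that is cheap enough to hold locally.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=256, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def build_metadata(name, rows):
    """Summarize a segment: min/max of day plus a Bloom filter on publisher."""
    bloom = BloomFilter()
    for _, publisher in rows:
        bloom.add(publisher)
    days = [day for day, _ in rows]
    return {"name": name, "min_day": min(days), "max_day": max(days), "bloom": bloom}


def prune(segments, day_from, day_to, publisher):
    """Keep only segments whose metadata says they could hold matching rows."""
    survivors = []
    for seg in segments:
        if seg["max_day"] < day_from or seg["min_day"] > day_to:
            continue  # min/max range check rules the segment out
        if not seg["bloom"].might_contain(publisher):
            continue  # Bloom filter says the value is definitely absent
        survivors.append(seg["name"])
    return survivors


segments = [
    build_metadata("seg_a", [(1, "acme"), (2, "globex")]),
    build_metadata("seg_b", [(3, "initech"), (4, "acme")]),
    build_metadata("seg_c", [(8, "acme"), (9, "umbrella")]),
]
candidates = prune(segments, day_from=1, day_to=5, publisher="acme")
print(candidates)
```

Only the surviving segments need any reads from object storage at all. Because a Bloom filter can return false positives, a surviving segment may still turn out to hold no matching rows, but a pruned segment never does.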


“We needed to empower our publishers to easily understand and optimize their content’s revenue impact. We’ve partnered with StarTree to move from 24-48 hour delayed data to near real-time data. We can return data in less than a second on key metrics around revenue, clicks, and page views.”

Ryan Chichiroco
Former VP of Engineering

Ready to deploy real-time analytics?

We’re here to help!