
StarTree Cloud Fabric — Bringing Order to Multi-Cloud Chaos

How StarTree tackled the complexity of running Apache Pinot across clouds and turned chaos into composable control.

Managing real-time analytics across multiple clouds is powerful — but doing it reliably, securely, and at scale isn’t simple.

In today’s data-infrastructure landscape, organizations increasingly adopt multi-cloud strategies to gain flexibility, cost efficiency, and resilience. Yet every layer of that flexibility introduces new operational friction — especially when deploying and managing distributed systems like Apache Pinot across heterogeneous environments.

At StarTree, we built an extensible cloud-management and deployment framework to simplify how StarTree Cloud operates across AWS, GCP, and Azure. This ‘Cloud Fabric’ enables both customers and internal teams to deliver consistent, secure, and scalable Pinot deployments — whether as a multi-tenant SaaS, a Bring-Your-Own-Cloud (BYOC) solution, or a Bring-Your-Own-Kubernetes (BYOK) setup.


Managing Complexity at Scale

As StarTree expanded its cloud footprint, we needed to support hundreds of clusters across different environments and deployment models. Each model — from fully managed SaaS to customer-hosted BYOC — introduced its own operational and architectural hurdles that compounded at scale.

Operational Challenges

  • Configuration sprawl : Every deployment required its own customized entity specifications and infrastructure definitions.
  • Upgrade friction : Updating tens to hundreds of clusters safely, without downtime or manual steps, was non-trivial.
  • Security and access boundaries : Customer data and environments had to remain fully isolated while still allowing centralized management.
  • Cloud heterogeneity : Different APIs, network policies, and identity frameworks across AWS, GCP, and Azure added significant operational overhead.

Architectural Challenges

  • Extensibility : As StarTree Cloud grew and new services appeared, the platform needed to absorb them quickly. Each capability — a service, component, or integration — had to plug in seamlessly without creating coupling or bottlenecks for the core platform team.
  • Composable architecture : Delivering multiple cloud products (BYOC, BYOK, and multi-tenant SaaS) from a single unified stack is inherently complex. The architecture had to stay modular and configurable enough to support diverse deployment models, yet cohesive enough to guarantee operational consistency, security, and scalability across them all.

The Turning Point

It became clear that traditional platform architectures wouldn’t scale. We needed an architecture grounded in composability — the ability to build new cloud products from reusable building blocks — combined with decoupled orchestration that cleanly separated the Control Plane from the Data Plane, all while remaining cloud-agnostic and extensible.

That realization became the foundation of StarTree Cloud Fabric — a control-plane-first architecture that brings order to multi-cloud chaos.

Cloud Fabric: A Framework for Composability and Control

StarTree Cloud Fabric brings structure, automation, and elegance to an inherently chaotic problem — managing distributed systems across clouds.

Let’s review the components of the Cloud Fabric infrastructure:

  • Control Plane: The brain of the architecture, which stores cluster metadata and maintains a per-environment state machine.
  • Infra-as-a-Service: A collection of services that each know how to provision a granular resource (e.g., a VPC with configurable subnets). They automatically manage Terraform state and run safe plan and apply operations.
  • Support Services: Custom services that handle dedicated concerns such as configuration management and versioning, scoped secrets and token generation, and upgrade orchestration.
  • Cloud Portal: Separate portals with views tailored to external customers and to internal SRE/Support teams.
  • Agent Framework: A collection of dedicated connectivity agents and Kubernetes operators that execute Control Plane intents and materialize them into valid cluster changes.
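
To make the "per-environment state machine" idea concrete, here is a minimal Python sketch. The state names and the allowed transitions are illustrative assumptions, not StarTree's actual lifecycle; the point is only that each environment carries its own state and that the Control Plane rejects transitions the table does not allow.

```python
from enum import Enum


class EnvState(Enum):
    # Hypothetical lifecycle states, chosen for illustration only.
    REQUESTED = "requested"
    PROVISIONING = "provisioning"
    READY = "ready"
    UPGRADING = "upgrading"
    FAILED = "failed"


# Allowed transitions: current state -> set of states it may move to.
TRANSITIONS = {
    EnvState.REQUESTED: {EnvState.PROVISIONING},
    EnvState.PROVISIONING: {EnvState.READY, EnvState.FAILED},
    EnvState.READY: {EnvState.UPGRADING},
    EnvState.UPGRADING: {EnvState.READY, EnvState.FAILED},
    EnvState.FAILED: {EnvState.PROVISIONING, EnvState.UPGRADING},
}


class Environment:
    """One environment with its own state and transition history."""

    def __init__(self, name):
        self.name = name
        self.state = EnvState.REQUESTED
        self.history = [self.state]

    def transition(self, target):
        # Reject any move that the transition table does not list.
        if target not in TRANSITIONS[self.state]:
            raise ValueError(
                f"{self.name}: {self.state.value} -> {target.value} is not allowed"
            )
        self.state = target
        self.history.append(target)
```

Because every environment follows the same table, an upgrade is the same sequence of transitions everywhere, which is what makes rollouts repeatable.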

StarTree Cloud Fabric embodies a few simple but powerful ideas:

  • Composability : Define cloud products as modular, reusable entities that can be combined to form any deployment model. This lets new services be added to the stack seamlessly.
  • Decoupling : Separate orchestration logic from execution so the system can scale safely across clouds.
  • Configuration as a first-class citizen : Manage complexity through centralized, versioned configuration rather than ad-hoc scripts.
  • Security and isolation by design : Build boundaries into the architecture instead of layering them on later.

These principles became the blueprint for a control-plane architecture capable of managing thousands of environments with predictability and grace.
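
As a rough illustration of configuration as a first-class citizen, the sketch below keeps an append-only, versioned history of each environment's configuration and can diff two versions. The class name `VersionedConfigStore`, its methods, and the in-memory storage are all assumptions made for this example; the article does not describe how StarTree's configuration service is implemented.

```python
class VersionedConfigStore:
    """Append-only config history per environment; versions start at 1."""

    def __init__(self):
        self._history = {}

    def put(self, environment, config):
        versions = self._history.setdefault(environment, [])
        # Store a copy so later mutations by the caller do not alter history.
        versions.append(dict(config))
        return len(versions)

    def get(self, environment, version=None):
        versions = self._history[environment]
        index = len(versions) if version is None else version
        if not 1 <= index <= len(versions):
            raise KeyError(f"{environment} has no version {index}")
        return dict(versions[index - 1])

    def diff(self, environment, old, new):
        # Map each changed key to its (old value, new value) pair.
        before = self.get(environment, old)
        after = self.get(environment, new)
        return {
            key: (before.get(key), after.get(key))
            for key in before.keys() | after.keys()
            if before.get(key) != after.get(key)
        }
```

With a store like this, a change from 3 to 5 replicas shows up as a single reviewable diff between two versions rather than as an edit buried in a script.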

Separating Intent from Execution

In traditional systems, orchestration and execution live side by side: the same control logic that provisions resources also manages them day to day. That coupling works for a few clusters, but it doesn’t scale.

StarTree Cloud Fabric flips that model. It separates intent (defined and enforced in the Control Plane) from execution (performed in the Data Plane).

For example, changing a configuration value from A to B (say, changing replicas from 3 to 5) is an intent. The Control Plane passes this change to the Data Plane, where the execution logic knows how to apply it to the right component in the right way.

The Control Plane is responsible for orchestration, policy, and metadata; the Data Plane focuses on doing — creating resources, configuring clusters, and monitoring health.
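
The split can be sketched in a few lines of Python. Everything here is hypothetical: the `Intent` fields, the `ControlPlane` and `DataPlaneAgent` classes, and the `pinot-server` component name are invented for illustration. The sketch only shows the shape of the idea: the Control Plane records what should be true, and the Data Plane owns the knowledge of how to make it true.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """A declarative change request; it carries no execution logic."""
    environment: str
    component: str
    key: str
    value: int


class ControlPlane:
    """Records desired state and queues intents for the Data Plane."""

    def __init__(self):
        self.desired = {}
        self.outbox = []

    def set_config(self, environment, component, key, value):
        intent = Intent(environment, component, key, value)
        self.desired[(environment, component, key)] = value
        self.outbox.append(intent)
        return intent


class DataPlaneAgent:
    """Owns the execution logic; handlers are keyed by (component, key)."""

    def __init__(self):
        self.applied = {}
        self._handlers = {("pinot-server", "replicas"): self._scale}

    def _scale(self, intent):
        # A real agent would change the cluster here; this sketch only
        # records the applied value.
        self.applied[(intent.component, intent.key)] = intent.value
        return f"scaled {intent.component} to {intent.value} replicas"

    def execute(self, intent):
        handler = self._handlers.get((intent.component, intent.key))
        if handler is None:
            raise KeyError(f"no handler for {intent.component}.{intent.key}")
        return handler(intent)
```

Calling `ControlPlane().set_config("env-1", "pinot-server", "replicas", 5)` produces an intent that any agent with a matching handler can execute, regardless of which cloud the agent runs in.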

This separation brings three big advantages:

  1. Consistency across clouds — a single orchestration model drives deployments on AWS, GCP, and Azure.
  2. Resilience at scale — failures in one environment don’t cascade across the system.
  3. Faster innovation — new services or integrations can plug in without re-architecting the platform.

Why Composability Matters

StarTree Cloud Fabric treats every part of StarTree Cloud — Pinot clusters, storage, networking, and auxiliary services — as entities. Each entity defines its behavior and relationships within a directed acyclic graph (DAG) that represents the full deployment topology. This model turns infrastructure into a set of reusable, declarative building blocks.

Need to build a BYOK environment? Combine the Kubernetes, Pinot, and Auth entities.

Need to roll out a multi-tenant SaaS region? Reuse the same blocks under a different orchestration graph.

The diagram above shows the DAG for a standard BYOC environment. The right side shows the same DAG executed with a “no-op” flag to produce a BYOK environment.

Composability replaces custom code and snowflake deployments with software-defined patterns that scale.
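
A minimal sketch of the entity DAG and the “no-op” flag, using Python's standard `graphlib` module. The entity names, their dependencies, and the choice of which entities a BYOK deployment skips are assumptions for illustration; the article does not list the actual graph.

```python
from graphlib import TopologicalSorter

# Hypothetical entity graph: entity -> entities it depends on.
BYOC_GRAPH = {
    "network": set(),
    "kubernetes": {"network"},
    "auth": {"kubernetes"},
    "pinot": {"kubernetes", "auth"},
}


def plan(graph, noop=frozenset()):
    """Walk the DAG in dependency order, marking no-op entities as skipped."""
    steps = []
    for entity in TopologicalSorter(graph).static_order():
        action = "skip" if entity in noop else "provision"
        steps.append((entity, action))
    return steps


# BYOC: the platform provisions every entity.
byoc_plan = plan(BYOC_GRAPH)

# BYOK (assumed): the customer already supplies the network and the
# Kubernetes cluster, so those entities run as no-ops.
byok_plan = plan(BYOC_GRAPH, noop={"network", "kubernetes"})
```

Both plans walk the same graph in the same order; only the actions differ. That is the sense in which one set of building blocks can serve more than one deployment model.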

The Early Impact

With Cloud Fabric, several pain points have disappeared:

  • Upgrades are now repeatable. Every cluster follows the same state-machine transitions, enabling zero-touch rollouts.
  • Configuration drift has vanished. Templates and versioned configs ensure that every environment looks the same, regardless of cloud.
  • Security boundaries are stronger. Outbound-only connectivity and cell-based isolation drastically reduce attack surface and blast radius.
  • Developer velocity has soared. Engineering teams can introduce new services without re-tooling the control logic or depending on cloud engineering.

StarTree’s Cloud Fabric proves that composable architecture and decoupled control can transform multi-cloud operations from a manual balancing act into an automated, elegant system.
