Real-Time Upserts
Scale up, costs down
StarTree performs upserts at ingest for millions of rows per second while reducing infrastructure costs.
StarTree performs upserts at ingest, making it incredibly efficient, fast, and scalable.
Reduce costs by minimizing compute- and memory-intensive operations
Increase data freshness by removing a step
Massive scalability through minimizing processing overhead
Amberdata: Real-Time Analytics for the Entire Cryptoeconomy
Results
- 66% reduced infrastructure costs
- 350k insert events per second
- Sub-second query latencies
Use Cases
The challenge with upserts
An upsert is a database operation that combines update and insert functions into a single operation, offering improved insight accuracy, enhanced data freshness and simplified architecture.
Upserts, while simple in operational databases, can pose challenges in analytical databases by disrupting efficient bulk-loading, conflicting with columnar storage, and degrading query performance due to re-indexing.
overview
What makes StarTree upserts unique
StarTree upserts
Ingestion-time reconciliation
StarTree revolutionizes upserts with ingestion-time reconciliation. It appends each new record, tracks record status in metadata, marks superseded records as obsolete, and balances storage between memory and external systems. This approach improves scalability and reduces memory overhead.
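To make the idea concrete, here is a minimal sketch of ingestion-time reconciliation in Python. It is a toy model, not StarTree's implementation: the `UpsertTable` class, its field names, and the in-memory dictionary used as metadata are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UpsertTable:
    """Toy append-only table with ingestion-time reconciliation (illustrative only)."""
    rows: list = field(default_factory=list)    # append-only storage
    valid: list = field(default_factory=list)   # valid[i] is True if rows[i] is the live version
    latest: dict = field(default_factory=dict)  # primary key -> index of the live row (metadata)

    def ingest(self, key, record):
        # Append the new record; existing rows are never rewritten.
        self.rows.append({"key": key, **record})
        new_idx = len(self.rows) - 1
        self.valid.append(True)
        # If the key was seen before, mark the older row obsolete.
        old_idx = self.latest.get(key)
        if old_idx is not None:
            self.valid[old_idx] = False
        self.latest[key] = new_idx

    def scan(self):
        # Queries read only rows still marked valid; nothing is reconciled at query time.
        return [row for row, ok in zip(self.rows, self.valid) if ok]


table = UpsertTable()
table.ingest("order-1", {"status": "placed"})
table.ingest("order-2", {"status": "placed"})
table.ingest("order-1", {"status": "shipped"})
live = table.scan()
```

All three records stay in storage, but only the latest version of each key is visible to a scan. The work of deciding which row wins happens once, when the record arrives.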
Traditional upserts
Pre-ingestion reconciliation
Many vendors attempt to determine if a record is an insert or an update before data enters the database. This method increases data pipeline complexity and memory usage. It requires large volumes of temporary data, slows ingestion, and leads to ballooning overhead as data volumes grow.
Traditional upserts
Query-time reconciliation
Query-time reconciliation works for low-volume scenarios but becomes costly and inefficient as queries increase. It adds processing time, requires extra resources, and impairs real-time analytics performance.
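For contrast, a rough sketch of query-time reconciliation might look like the following. The function name `query_time_latest` and the row layout are hypothetical; the point is only that every query repeats the same deduplication pass.

```python
def query_time_latest(rows):
    """Resolve the latest version per key on every query (illustrative only)."""
    latest = {}
    for row in rows:  # full scan, repeated for each query
        latest[row["key"]] = row
    return list(latest.values())


rows = [
    {"key": "k1", "v": 1},
    {"key": "k2", "v": 1},
    {"key": "k1", "v": 2},
]
result = query_time_latest(rows)
```

The cost of this pass grows with both the number of stored rows and the number of queries, which is why the approach suits low-volume workloads only.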
Ready for real-time?
StarTree Cloud customers experience over 50% infrastructure cost savings. Book a demo to learn how your organization can get started.
features
Perform upserts for millions of rows per second
Achieve real-time data insights with scalable, efficient storage.
Deletes
Remove obsolete data that you don't want to see in future queries. Once our real-time engine encounters a record flagged in the delete column, that record's primary key is no longer part of the queryable documents, saving you headaches.
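A minimal sketch of this behavior, assuming the delete column is a boolean flag on the incoming record. The column name `deleted` and the function `apply_events` are hypothetical and chosen only for illustration.

```python
def apply_events(events, delete_column="deleted"):
    """Return the queryable view after applying events in arrival order."""
    live = {}  # primary key -> latest record
    for event in events:
        key = event["key"]
        if event.get(delete_column):
            # A delete marker removes the key from the queryable set.
            live.pop(key, None)
        else:
            live[key] = event
    return live


events = [
    {"key": "user-1", "plan": "free"},
    {"key": "user-2", "plan": "pro"},
    {"key": "user-1", "deleted": True},  # delete marker for user-1
]
queryable = apply_events(events)
```

After the third event, `user-1` no longer appears in `queryable`, while `user-2` is unaffected.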
Partial upserts
Update some of the fields in a record, but not all of them. Partial upserts enable flexible handling of data streams with partial information — perfect for Change Data Capture (CDC) streams where only updated columns are received.
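As a sketch of the idea, assume a simple overwrite strategy in which fields present in the incoming event replace the stored values and all other fields are left alone. The helper `partial_upsert` and the sample account fields are hypothetical; other merge strategies are possible.

```python
def partial_upsert(table, key, changes):
    """Merge only the supplied fields into the existing record for `key`."""
    current = table.get(key, {})
    # Fields absent from `changes` keep their previous values.
    table[key] = {**current, **changes}
    return table[key]


table = {}
partial_upsert(table, "acct-7", {"email": "a@example.com", "tier": "basic", "balance": 10})
# A CDC-style event that carries only the column that changed.
merged = partial_upsert(table, "acct-7", {"balance": 25})
```

The second event carries only `balance`, yet the merged record still holds the original `email` and `tier`.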
Compaction
Minimize disk space and overall infrastructure costs by periodically replacing segments that hold obsolete records with newer, compacted segments that contain only valid records.
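Conceptually, compaction rewrites a segment so that only live rows remain. The sketch below assumes a segment is a list of rows with a parallel list of validity flags; both the layout and the `compact` function are illustrative assumptions rather than the actual segment format.

```python
def compact(rows, valid):
    """Build a new segment that keeps only rows still marked valid."""
    compacted = [row for row, ok in zip(rows, valid) if ok]
    # Every row in the new segment is live, so all flags start as True.
    return compacted, [True] * len(compacted)


rows = [
    {"key": "k1", "v": 1},  # superseded by the third row
    {"key": "k2", "v": 1},
    {"key": "k1", "v": 2},
]
valid = [False, True, True]
new_rows, new_valid = compact(rows, valid)
```

The compacted segment holds two rows instead of three, and query results are unchanged because the dropped row was already invisible.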
Configurable comparison column
Select a column other than the default time column for resolving upserts. Perfect for when you need to decide which record wins based on a different timestamp or ordering value.
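The effect of the comparison column is easiest to see with a late-arriving record. In this sketch the column names `event_time` and `ingest_time` and the function `upsert_by_column` are hypothetical, and ties are resolved in favor of the newer arrival as an assumption.

```python
def upsert_by_column(events, comparison_column):
    """Keep, per key, the record with the highest value in `comparison_column`."""
    live = {}
    for event in events:
        key = event["key"]
        current = live.get(key)
        # Accept the incoming record only if its comparison value is not older.
        if current is None or event[comparison_column] >= current[comparison_column]:
            live[key] = event
    return live


events = [
    {"key": "s1", "event_time": 100, "ingest_time": 1, "temp": 20},
    {"key": "s1", "event_time": 90, "ingest_time": 2, "temp": 18},  # late arrival
]
by_event = upsert_by_column(events, "event_time")
by_ingest = upsert_by_column(events, "ingest_time")
```

Resolving by `event_time` keeps the reading of 20, because the late arrival describes an earlier moment. Resolving by `ingest_time` keeps the reading of 18, because it arrived last.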
Bootstrap and backfill
Efficiently bootstrap (initialize) and backfill (update past data in) an upsert-enabled table, using a batch data pipeline to upload segments directly.