Using a star-tree index
In this recipe we’ll learn how to use the StarTree index.
To understand how this recipe processes data, examine the commands executed in the Makefile.
make recipe
This will build up the infrastructure: Pinot and Kafka, create the tables, and produce streaming data.
Three tables will be generated:
- webtraffic – a real-time table with any Pinot indexes.
- webstraffic_inverted – a real-time table with an
inverted index
on the columns: country, browserType, and deviceBrand.
"invertedIndexColumns": [
"country",
"browserType",
"deviceBrand"
],
- webtraffic_startree – a real-time table with a
startree index
.
"starTreeIndexConfigs": [
{
"dimensionsSplitOrder": [
"country",
"browserType",
"deviceBrand"
],
"skipStarNodeCreationForDimensions": [],
"functionColumnPairs": [
"COUNT__*",
"SUM__timeSpent",
"AVG__timeSpent"
],
"maxLeafRecords": 10000
}
],
Open your browser to the Pinot console(opens in a new tab) and execute the SQL statements below.
select browserType, count(*)
from webtraffic
WHERE country = 'Uruguay'
GROUP BY browserType
limit 10
select country, sum(timeSpent) AS totalTime
from webtraffic
group by country
order by totalTime DESC
limit 10
select count(*), sum(timeSpent) AS totalTime
from webtraffic
where country = 'United Kingdom'
order by totalTime DESC
limit 10
select browserType, count(*)
from webtraffic
WHERE country IN ('Germany', 'United Kingdom', 'Spain')
GROUP BY browserType
limit 10
Try changing FROM webtraffic
to FROM webtraffic_inverted
or FROM webtraffic_stree
Clean up
make clean
Troubleshooting
To clean up old Docker installations that may be interfering with your testing of this recipe, run the following command:
docker system prune