How to Build Data Visualization Dashboards with Apache Superset and Pinot


StarTree, released on June 26, 2024

Using Apache Pinot and Apache Superset for data visualization

This post is the first in a series on integrating various visualization platforms with Apache Pinot. Here, I’ll focus on how to use Apache Superset and Apache Pinot together for data visualization.

Apache Superset is an open-source data visualization and data exploration platform designed to be highly intuitive and visually appealing. It allows users to create and share interactive dashboards and visualizations, supporting various data sources. Superset’s strength lies in its ability to provide a user-friendly interface for exploring and visualizing large datasets, making data analysis accessible to non-technical users.

Apache Pinot is a real-time distributed OLAP datastore, optimized for low-latency, high-throughput analytical queries. It is designed to provide instant insights on data at scale, making it a popular choice for applications in time-sensitive data analysis scenarios.

Why use Apache Superset + Apache Pinot?

Pairing Apache Superset with Apache Pinot makes sense because the two Apache Software Foundation (ASF) open source projects have complementary strengths. Pinot provides the backend infrastructure capable of querying massive datasets at high speed, while Superset offers the front-end interface that lets users explore, visualize, and share those insights easily. Together, they form a powerful stack for real-time analytics, enabling users to derive actionable insights from their data rapidly.

Getting started with Superset and Pinot

As you start your own journey into Superset and Pinot, you may find the following resources of use.

How to build a dashboard using Pinot and Superset

We need to deploy both Pinot and Superset to get started. To make life easier, I created a docker-compose file for the deployment.

NOTE: The Pinot Superset image supports the latest version of Superset as well as the dependencies defined in requirements-db.txt.

You can get it from here: https://github.com/Barkha/apache-pinot-workshops/tree/main/SuperSetVisualization

1. Run the following command to start the Docker containers:

docker-compose up -d

Give it some time to start, then verify deployment by navigating to:

http://localhost:9000 (Pinot)

http://localhost:8088 (Superset)
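Rather than refreshing the browser, you can poll both services from a terminal. The snippet below is a minimal sketch, not part of the workshop repository: `wait_for_url` is a hypothetical helper, and the `/health` paths in the example calls are assumptions about this deployment (both projects expose a health endpoint, but your ports and paths may differ).

```shell
# Poll a URL until it answers successfully, or give up after N attempts.
# Usage: wait_for_url <url> [attempts] [delay_seconds]
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f turns HTTP error statuses (4xx/5xx) into a non-zero exit code.
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      echo "up: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out: $url" >&2
  return 1
}

# Example calls (ports from the URLs above; the /health paths are assumptions):
# wait_for_url http://localhost:9000/health
# wait_for_url http://localhost:8088/health
```

With the defaults, each call waits up to about two and a half minutes before reporting a timeout.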

You should see the Superset dashboard, like so:

How to start docker instances with Apache Superset

2. Create Superset admin

Before we go any further, we need to create the Superset admin account.

Let’s do that by running the following command in the Superset docker container:

    # Get the container ID
    docker ps

    # Create the admin user
    docker exec -it <containerid> superset fab create-admin --username admin --firstname Superset --lastname Admin --email admin@superset.com --password admin

    # Upgrade the database and initialize Superset
    docker exec -it <containerid> superset db upgrade
    docker exec -it <containerid> superset init
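If you would rather not copy the container ID by hand, the lookup can be scripted. This is a sketch under one loud assumption: that the Superset container's name contains the word "superset". Check the output of `docker ps` and pass a different pattern if yours is named otherwise. Both function names are hypothetical helpers.

```shell
# Print the ID of the first running container whose name matches a pattern.
# The default pattern "superset" is an assumption about the compose file.
superset_container_id() {
  docker ps --filter "name=${1:-superset}" --format '{{.ID}}' | head -n 1
}

# Run a Superset CLI command inside that container.
# -it is omitted so the helper also works from scripts without a TTY.
superset_exec() {
  local cid
  cid="$(superset_container_id)"
  if [ -z "$cid" ]; then
    echo "no running Superset container found" >&2
    return 1
  fi
  docker exec "$cid" superset "$@"
}

# Example:
# superset_exec db upgrade && superset_exec init
```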

This should complete the Superset setup. Next, we will connect to Pinot.

Before we connect to Pinot, verify that Pinot is up by navigating to http://localhost:9000

You can check that the tables were created and contain data by navigating to: http://localhost:9000/#/query

Create Superset admin

Notice that we have eight tables. Good.
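The same check can be made from the command line through the Pinot controller's REST API. A minimal sketch, assuming the controller is published on localhost:9000 as in this deployment; `pinot_tables` is a hypothetical helper name.

```shell
# Fetch the list of tables from the Pinot controller as JSON.
# Usage: pinot_tables [controller_url]
pinot_tables() {
  curl -fsS "${1:-http://localhost:9000}/tables"
}

# Example:
# pinot_tables
```

The response is a JSON object with a `tables` array, which should list the same tables you see in the query console.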

3. Connecting Superset to Pinot

Let’s connect Superset to Pinot. To do this, navigate to our Superset deployment at http://localhost:8088, then select Settings -> Database Connections.

Because we are running the image with a built-in Pinot connector, you will see Apache Pinot in the drop-down.

How to connect Pinot to Superset

The Superset connection string format is:

engine+driver://user:password@host:port/dbname

For our deployment of Pinot, the connection string is:

pinot://pinot:8000/query/sql?controller=http://pinot:9000

Here, the engine and driver are pinot, the Pinot broker is reachable inside the Docker network at pinot:8000/query/sql, and the controller URL is http://pinot:9000.

You can change the URLs based on your deployment, and add the user:password if needed.
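To make the pieces of the connection string explicit, here is a small sketch that assembles it from its parts. `pinot_uri` is a hypothetical helper, and the `analyst` / `s3cret` credentials are made-up placeholders used only to show where user:password goes.

```shell
# Build a Superset connection string for Pinot.
# Usage: pinot_uri <broker_host:port> <controller_url> [user] [password]
pinot_uri() {
  local broker="$1" controller="$2" user="${3:-}" password="${4:-}" auth=""
  # Only add the credentials segment when a user name is supplied.
  if [ -n "$user" ]; then
    auth="${user}:${password}@"
  fi
  printf 'pinot://%s%s/query/sql?controller=%s\n' "$auth" "$broker" "$controller"
}

pinot_uri pinot:8000 http://pinot:9000
# prints pinot://pinot:8000/query/sql?controller=http://pinot:9000

pinot_uri pinot:8000 http://pinot:9000 analyst s3cret
# prints pinot://analyst:s3cret@pinot:8000/query/sql?controller=http://pinot:9000
```

The helper does no escaping, so a password containing characters such as `@` or `/` would need to be URL-encoded first.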

At this point, click Test Connection, then save the connection.

4. Add dataset

Once you have a connection, you can add datasets. Select the Dataset option from the top menu:

How to add a dataset in Superset

Select the “+ Dataset” button, and select BaseBallStats:

How to select a Pinot dataset in Superset

This will take you to the chart creation screen:

How to create a new chart using Pinot and Superset

5. Create new charts

Select the Scatter Plot chart type, then set the dimension to YearID and the metric to SUM(hits):

How to set up a new chart with Apache Pinot and Superset
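If you want to sanity-check the numbers behind the chart, roughly the same aggregation can be run against Pinot directly. The sketch below posts SQL to the controller's `/sql` endpoint. The table and column names are taken from this walkthrough and their exact capitalization in your schema may differ; `pinot_sql` and `CHART_SQL` are hypothetical names, and `LIMIT 200` is an arbitrary choice.

```shell
# Send a SQL statement to Pinot through the controller's /sql endpoint.
# Usage: pinot_sql "<statement>" [controller_url]
# Note: the statement is embedded as-is, so it must not contain double quotes.
pinot_sql() {
  curl -fsS -X POST \
    -H 'Content-Type: application/json' \
    -d "{\"sql\": \"$1\"}" \
    "${2:-http://localhost:9000}/sql"
}

# Total hits per year, the aggregation the chart is built on.
CHART_SQL='SELECT YearID, SUM(hits) AS total_hits FROM BaseBallStats GROUP BY YearID ORDER BY YearID LIMIT 200'

# Example:
# pinot_sql "$CHART_SQL"
```

An explicit LIMIT matters here because Pinot returns only a small number of rows by default when none is given.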

Click the Create Chart button to see the chart. Next, name your chart and save it. Superset will prompt you to add the chart to a dashboard, and you can create a new dashboard at this point.

How to name and save a chart in Apache Superset

Next, navigate to the dashboard, drag and drop the chart where you want it, and save. At this point, you can play around and add some more charts.

Here’s one of my dashboards:

Example charts using Apache Pinot and Superset

There you have it! You have successfully created a dashboard using Superset, with a Pinot back end.

We have just scratched the surface of Superset functionality and the power of Pinot.

6. Import/export dashboards

Superset also allows users to export and import dashboards. You can export your newly created dashboard by selecting Dashboards from the top menu and choosing Export, as shown here:

How to export an Apache Superset dashboard

You can import previously created dashboards by choosing import as shown here:

How to import a previously created dashboard in Superset
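The same export and import can also be scripted with the Superset command-line tool inside the container. Treat this as a sketch: the flag names below match recent Superset releases as far as I know but can differ between versions, so confirm them with `superset export-dashboards --help` first. The `/tmp/dashboards.zip` path and both function names are illustrative assumptions.

```shell
# Export all dashboards from a running Superset container to a ZIP on the host.
# Usage: export_dashboards <container_id> <host_zip_path>
export_dashboards() {
  docker exec "$1" superset export-dashboards -f /tmp/dashboards.zip &&
    docker cp "$1:/tmp/dashboards.zip" "$2"
}

# Import a dashboard ZIP from the host into a running Superset container.
# Usage: import_dashboards <container_id> <host_zip_path> [username]
import_dashboards() {
  docker cp "$2" "$1:/tmp/dashboards.zip" &&
    docker exec "$1" superset import-dashboards -p /tmp/dashboards.zip -u "${3:-admin}"
}

# Example:
# export_dashboards <containerid> ./dashboards.zip
# import_dashboards <containerid> ./dashboards.zip admin
```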

For your convenience, you will find some ready-made dashboards for the Apache Pinot sample data here: https://github.com/Barkha/apache-pinot-workshops/tree/main/SuperSetVisualization

Concluding thoughts

Apache Superset serves as a crucial bridge between raw data and actionable insights. Its ability to integrate with various data sources like Apache Pinot, combined with its user-friendly interface and powerful visualization capabilities, makes it an invaluable tool in today’s data-driven landscape. Whether for real-time analytics, periodic reporting, or ad-hoc data exploration, Superset empowers organizations to harness the full potential of their data, fostering a culture of informed decision-making and strategic insight.

Interested in trying a fully-managed version of Apache Pinot? Check out StarTree Cloud Free Tier — perfect for development and prototyping — and start running queries in minutes.
