NO CODE

Getting Started - Connect Superset To Google Sheets

Tai Dupree

This post is now slightly out of date. We recommend this post instead or to check out our documentation.

Utilizing an external database such as Druid or Postgres requires that you have access to such a database or that you spin one up. Although docker makes this easier than ever, it can still require some technical knowledge and time that could be spent evaluating Superset. In this post, we'll cover how to install the gsheetsdb driver that can be used to query data in Google Sheets.

If you haven't already done so, check out our previous posts in our Getting Started series: Getting Started - Installing Apache Superset and Getting Started - Installing Additional Drivers.

Now that you have Superset up and running on your machine and know how to install some database drivers, let's query some data from Google Sheets.

Credit

Most of the work making this blog post possible was contributed by Beto de Almeida, one of Superset's prolific contributors. Beto wrote the underlying DBAPI driver, the SQLAlachemy dialect as well as Superset's compatibility layer to make all of this possible. This blog post is simply showing how to make that integration work in your environment.

Installing the gsheetsdb driver in your local environment

As covered in Getting Started - Installing Additional Drivers we'll add a driver to the ./docker/requirements-local.txt file and rebuild the Superset docker container.

# Add the `gsheetsdb` driver
echo "gsheetsdb" >> ./docker/requirements-local.txt

# Rebuild your local image with the new driver baked in
docker-compose build --force-rm

# Fire things back up
docker-compose up

Adding a Google Sheet as a database in Apache Superset

Now that we have the required driver installed, we can add the database in the Superset UI. Navigate to add a new database connection (Source > Databases > +). Fill out the form by giving the database a name and entering the connection string gsheets://.

Add database

Testing the connection will throw an error as we are not connecting to an actual database that can respond to our request.

Hit Save. You are now ready to start querying Google Sheets.

Running a query in SQL Lab

With the Google Sheets connection created you can now query any publicly available spreadsheet. Head over to SQL Lab (SQL Lab > SQL Editor). The creator of the gsheets driver, Beto De Almeida, has made a few example sheets available for testing which we will use.

  1. Select Google Sheets (or the name you gave it in the previous step) from the dropdown list of available databases on the left.

  2. Enter the following query in the query editor

    SELECT * FROM "https://docs.google.com/spreadsheets/d/1_rN3lm0R_bU3NemO0s9pbFkY5LQPcuy1pscv8ZXPtg8/"

    and hit run. You should see results come back.

Run Query

You have now successfully queried from Google Sheets! You can now make a copy of this sheet, make edits, find other publicly available sheets, or you can create your own (just make sure the sharing settings allow anyone with the link to view).

Creating a Chart

In order to create a Chart that uses Google Sheets as its datasource we will have to add the url as a datasource. Navigate to the Add Table form (Sources > Tables > +) and fill out the form by selecting the Google Sheets database and the url as the Table Name.

Add Table

Hit Save. You are now ready to start creating charts against this datasource.

Head over to the chart creation page (New > Chart) and select the Google Sheets datasource and Bar Chart as the Visualization Type the hit Create New Chart, which should take you to the explore page. An error will appear on the right saying Control labeled "Series" cannot be empty, which is telling us that we must select a series. Find the Series control (which should be marked red) and select "Country", then hit Run Query and you should see a bar chart with the counts of each time a country appears in the dataset.

Explore Chart

You can change the metrics and series on the left to find other interesting stories in your data. For example, try removing the COUNT(*) metric and adding SUM(cnt).

Congratulations you have now created a Chart from Google Sheets data! Using Google Sheets is an excellent way to get to visualizing something quickly, especially if you already have data stored in Google Drive or a csv file.

Running queries against Google Sheets is significantly slower than against a database such as PostgreSQL or MySQL so you should only use this approach for smaller datasets.

If you'd like a more production-ready setup for Superset, you can sign up for a hosted Superset solution here.

This post is now slightly out of date. We recommend this post instead or to check out our documentation.

Subscribe to our blog updates

Receive a weekly digest of new blog posts

Close