What you'll learn in this article: How to export a CSV from BigQuery using the three following methods:
- BigQuery console export
- bq command-line tool
- Client library
You probably already know Google BigQuery is an amazing data warehouse. You can set it up to automatically import data from a wide range of systems, use it as a source for a great number of dashboarding tools, and so much more that I could use the rest of the article detailing its use cases (that’s not what we’re going to do though, I promise).
Despite all of its capabilities, you can still find yourself in a situation where BigQuery just doesn't suit your needs for the task at hand and you want an old-fashioned CSV file. Maybe you're passing on data to someone who doesn't (and shouldn't) have access to BigQuery, perhaps you need that data to build something amazing in a tool that doesn't play well with BigQuery, or maybe you just want to know how to accomplish this for the sake of knowing it.
Whatever your motivation is, I've got you covered. Without further ado, here are three ways to export your Google BigQuery data to a CSV file for use in your destination apps.
Method #1: BigQuery console export
By far the easiest way of exporting your data to a CSV file is to use the web UI, also known as the console, which you can find here. This is a great option if you just need an easy way to export your data one time. To start, make sure you’re in the correct project, which you can confirm at the top of the screen, just left of the search bar. You can then complete the following steps:
- In the Explorer panel, expand the correct project and dataset. Select the table you want to export.
- On the far right, click on Export and then Export to GCS, which stands for Google Cloud Storage.
- In the menu that opens, choose the GCS bucket that you want to export the data to and the filename, for example, yourbucket/yourfolder/yourdata.csv. You can use the browse option to find the right bucket, or create a new one first if necessary. Optionally, you can choose a format other than CSV, as well as GZIP compressing the file to reduce its size.
- Go to GCS and navigate through the bucket to the location where you saved your CSV file. To the right of the filename is a download icon.
And voilà! You now should have a CSV with the data you care about for your use case (whatever it is).
Method #2: bq command-line tool
BigQuery comes with its own command-line tool called bq, which you can access in the console or in your local terminal by first installing the Google Cloud SDK. In the console, you'll find it here or by clicking on the terminal icon on the right side of the search bar. While it's not as straightforward as the previous option, it can be quicker than clicking through several menus once you've memorized the commands. The basic syntax to export a CSV file is:
bq extract [--FLAGS] RESOURCE DESTINATION
RESOURCE is the table that you are exporting and DESTINATION is the bucket and location in GCS that you are exporting to. FLAGS are optional and allow you to:
- Choose a compression type, if any.
- Choose a file format, CSV being the default.
- Choose a delimiter for CSV files.
- Choose whether to include the header, the default being true.
If you were to use all the available flags to export the well-known Austin bike-sharing data from BigQuery's public dataset, the command would look like this.
bq extract --compression=GZIP --destination_format=CSV --field_delimiter=tab --print_header=false bigquery-public-data:austin_bikeshare.bikeshare_stations gs://your_bucket/your_folder/your_file.csv.gzip
The reference documentation lists all the accepted values for the flags. The extract command will export the CSV file to your GCS. If you installed the Google Cloud SDK, you can use the following command in your local terminal to download the file from your GCS.
gsutil cp gs://your_bucket/your_folder/your_file.csv.gzip /your_local_folder
Using the Google Cloud SDK is useful if you frequently need to export your BigQuery data and especially if you know your way around cron jobs.
Method #3: Client library
Last but certainly not least is the option of exporting CSV files using the client library. If you’re familiar with BigQuery, it probably doesn't surprise you that you can access the BigQuery API through a client library in several different languages such as Node.js, Java, and Python.
This option takes the most effort to set up, but once you are up and running, you can integrate it perfectly in an automated workflow. Google has a quick start to get you ready for your first query. Once you are ready to run your first query, you can export your data by adapting the below example (Python) code to fit your needs.
# Import library
from google.cloud import bigquery
# Declare client
client = bigquery.Client()
Should declaring a Client result in a DefaultCredentialsError, you can easily resolve the error using this solution.
# Parameters for your case
project = 'bigquery-public-data'
dataset_id = 'austin_bikeshare'
table_id = 'bikeshare_stations'
table_location = 'US' # Open the table in the BigQuery web UI, go to the details tab and look for 'Data location'
bucket_name = 'your_bucket'
file_location = 'your_folder/austin_bikeshare.csv'
destination_uri = f'gs://{bucket_name}/{file_location}'
dataset_ref = bigquery.DatasetReference(project, dataset_id)
table_ref = dataset_ref.table(table_id)
# Exporting the entire table
extract_job = client.extract_table(
table_ref,
destination_uri,
location = table_location,
)
extract_job.result()
The entire table has now been stored in GCS as a CSV file. You can download the file with the below code.
from google.cloud import storage
bucket_name = 'your_bucket'
source_file_name = 'austin_bikeshare.csv'
destination_file_name = 'austin.csv'
storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(source_file_name)
blob.download_to_filename(destination_file_name)
Alternatively, you can use a standard SQL query to limit the data that is exported and immediately store the result in a CSV, bypassing GCS.
# Querying a subset of the table
sql_query = (""" SELECT station_id, name, status FROM bigquery-public-data.austin_bikeshare.bikeshare_stations WHERE status = 'active' """)
# Storing the data in a pandas DataFrame
df = client.query(sql_query).to_dataframe()
# Saving the DataFrame to a CSV
df.to_csv('filename.csv')
Then, once you’ve executed the code above, you’re all set to continue your data adventure, CSV in hand (or file structure). Before you go, however, there are some final notes on exporting CSV files from Google BigQuery you should know.
Know before you go: Limitations with exporting CSV files from BigQuery
There you have it. Whether you really needed this information for day-to-day data use, ad hoc data analysis, or were just curious if it was possible, you now know three different methods to export your data as a CSV file from BigQuery. There are still multiple limitations that you need to keep in mind. Pay special attention to the ones below to save yourself some troubleshooting.
First: You can export 1 GB of table data to a single file in GCS, at most.
If your data exceeds that limit, use a wildcard to export the data to multiple files, like so: gs://your_bucket/your_file_name_*.csv.
This will create files called gs://your_bucket/your_file_name_000000000000.csv, gs://your_bucket/your_file_name_000000000001.csv and so on.
Secondly, It’s not possible to export nested and repeated data to a CSV file. You’ll have to use the UNNEST function first, after which you can proceed(e.g. Google Analytics data is nested).
If, after taking a look at that limitations breakdown, you’re not so sure you want to repeat the easy methods over and over again or go through the effort of setting up the export through the BigQuery client library, I have great news for you: If you just use Census, you won’t have to do any of this.
That’s right. A reverse ETL tool makes it fast and easy to send your data from BigQuery to just about any destination you need your data in so you can spend less time reading tutorials like this and more time on data analytics. After playing around with the CSV export options above, go ahead and check out how easy life could be with Census instead.