Google BigQuery Prerequisites

Introduction

This document describes the prerequisite steps for using Google BigQuery with Gluent Data Platform.

Project

A new or existing project needs to be allocated for use by Gluent Data Platform. This project requires the BigQuery API to be enabled.
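
As a rough sketch, the BigQuery API can be enabled from the command line with `gcloud services enable`. The helper name `enable_bigquery_api` is hypothetical, and the example assumes the gcloud CLI is installed and authenticated with sufficient rights on the project.

```shell
# Hypothetical helper: enable the BigQuery API on a project.
# Assumes gcloud is installed and the caller is allowed to enable services.
enable_bigquery_api() {
    project_id="$1"   # the project allocated for Gluent Data Platform
    gcloud services enable bigquery.googleapis.com --project="$project_id"
}
```

It would be called as `enable_bigquery_api <project id>`.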

Bucket

A new or existing Cloud Storage bucket needs to be allocated for use by Gluent Data Platform.

Google’s documentation states the following:

If your BigQuery dataset is in a multi-regional location, the Cloud Storage bucket containing the data you’re loading must be in a regional or multi-regional bucket in the same location. If your dataset is in a regional location, your Cloud Storage bucket must be a regional bucket in the same location.
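
To illustrate the location rule, the sketch below creates a bucket in an explicitly chosen location using `gcloud storage buckets create`. The helper name, bucket name and location are placeholders rather than values required by Gluent Data Platform; the location should be chosen to match the BigQuery dataset.

```shell
# Hypothetical helper: create a Cloud Storage bucket in a given location.
# Arguments: <project id> <bucket name> <location>
create_gluent_bucket() {
    project_id="$1"
    bucket_name="$2"
    location="$3"     # should match the BigQuery dataset location
    gcloud storage buckets create "gs://${bucket_name}" \
        --project="$project_id" \
        --location="$location"
}
```

For a dataset in the `US` multi-region, for example, the call might look like `create_gluent_bucket <project id> <bucket name> US`.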

Service Account

A service account should be provisioned from the project allocated for use by Gluent Data Platform. The preferred approach is for the service account to be assigned to the instances used for Spark and Data Daemon. Creating a private JSON key for the service account and storing it in a readable path on the server(s) where Gluent Data Platform is installed is also supported.
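
A minimal sketch of provisioning the service account and, for the JSON key approach, creating a key file. The helper names, the display name and the key path are assumptions for illustration only; the commands assume the gcloud CLI is installed and authenticated with rights to manage service accounts in the project.

```shell
# Hypothetical helper: create the service account in the Gluent project.
# Arguments: <project id> <service account name>
create_gluent_service_account() {
    project_id="$1"
    sa_name="$2"
    gcloud iam service-accounts create "$sa_name" \
        --project="$project_id" \
        --display-name="Gluent Data Platform"
}

# Hypothetical helper: create a private JSON key for that service account.
# Arguments: <project id> <service account name> <key file path>
create_gluent_sa_key() {
    project_id="$1"
    sa_name="$2"
    key_path="$3"     # must be readable on the Gluent Data Platform server(s)
    gcloud iam service-accounts keys create "$key_path" \
        --iam-account="${sa_name}@${project_id}.iam.gserviceaccount.com"
}
```

The key step is only needed when the service account is not assigned directly to the instances.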

Role

A role named GLUENT_OFFLOAD_ROLE should be created with the privileges listed below and assigned to the service account for use with Gluent Data Platform:

bigquery.datasets.create
bigquery.datasets.get
bigquery.jobs.create
bigquery.readsessions.create
bigquery.readsessions.getData
bigquery.tables.create
bigquery.tables.delete
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list
bigquery.tables.update
bigquery.tables.updateData
cloudkms.cryptoKeys.get
storage.buckets.get
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list

Note

The cloudkms.cryptoKeys.get privilege is only required if customer-managed encryption keys (CMEK) for BigQuery will be used.

The role can be created using gcloud (assuming appropriate permissions) with the following syntax, as documented in https://cloud.google.com/iam/docs/creating-custom-roles:

gcloud iam roles create GLUENT_OFFLOAD_ROLE --project <project id> \
    --title="GLUENT_OFFLOAD_ROLE" --description="Gluent Data Platform role." \
    --permissions=<CSV list of permissions> \
    --stage=GA
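
One possible way to assemble the CSV list of permissions and then assign the role to the service account is sketched below. The helper names and the optional `cmek` argument are hypothetical; the permission names are those listed above, and the example assumes the gcloud CLI is installed and authenticated with rights to manage IAM roles and policies in the project.

```shell
# Hypothetical helper: print the permissions as a comma-separated list.
# Pass "cmek" as the first argument to include the Cloud KMS permission.
gluent_permissions_csv() {
    {
        printf '%s\n' \
            bigquery.datasets.create \
            bigquery.datasets.get \
            bigquery.jobs.create \
            bigquery.readsessions.create \
            bigquery.readsessions.getData \
            bigquery.tables.create \
            bigquery.tables.delete \
            bigquery.tables.get \
            bigquery.tables.getData \
            bigquery.tables.list \
            bigquery.tables.update \
            bigquery.tables.updateData \
            storage.buckets.get \
            storage.objects.create \
            storage.objects.delete \
            storage.objects.get \
            storage.objects.list
        if [ "${1:-}" = "cmek" ]; then
            printf '%s\n' cloudkms.cryptoKeys.get
        fi
    } | paste -sd, -   # join the lines with commas
}

# Hypothetical helper: create the custom role in the project.
# Arguments: <project id> [cmek]
create_gluent_role() {
    project_id="$1"
    gcloud iam roles create GLUENT_OFFLOAD_ROLE --project="$project_id" \
        --title="GLUENT_OFFLOAD_ROLE" --description="Gluent Data Platform role." \
        --permissions="$(gluent_permissions_csv "${2:-}")" \
        --stage=GA
}

# Hypothetical helper: assign the custom role to the service account.
# Arguments: <project id> <service account email>
grant_gluent_role() {
    project_id="$1"
    sa_email="$2"
    gcloud projects add-iam-policy-binding "$project_id" \
        --member="serviceAccount:${sa_email}" \
        --role="projects/${project_id}/roles/GLUENT_OFFLOAD_ROLE"
}
```

With these definitions, `create_gluent_role <project id> cmek` would include `cloudkms.cryptoKeys.get`, while omitting `cmek` leaves it out.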
