Google BigQuery Custom Partition Functions

Installation

To use custom functions (UDFs) to generate partition keys during an offload, the UDFs must be created in BigQuery before offloading with Gluent Data Platform. BigQuery UDFs reside in a dataset. The name of the partition function (and optionally its location) is supplied using the --partition-functions option.

The UDFs can be stored in more than one place, and the most appropriate location depends on the nature of the environment. At a high level the options are:

  1. Create UDFs in the same dataset the data will be offloaded to

  2. Create UDFs in a different dataset

Note

If the partition function will only be used for offloading a single schema, storing the UDFs in the offload destination dataset is a logical approach. If the same function is to be used across multiple source schemas, a common, centralized dataset for partition functions avoids duplication.
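The two storage options can be illustrated with a minimal Python sketch that only builds text: the BigQuery DDL for a UDF and a value that could be passed to --partition-functions. Everything named here is hypothetical and chosen for illustration: the datasets sales_offload and shared_udfs, the function string_to_bucket, and the helper names. The STRING-to-INT64 signature and the dataset.function form of the option value are assumptions, so check them against the Gluent Data Platform reference before relying on them.

```python
from typing import Optional


def create_udf_ddl(dataset: str, function: str) -> str:
    """Build BigQuery DDL for a hypothetical SQL UDF that maps a STRING to an INT64 bucket."""
    # FARM_FINGERPRINT returns INT64; MOD keeps the value in (-100, 100)
    # and ABS then folds it into the range 0..99.
    return (
        f"CREATE OR REPLACE FUNCTION {dataset}.{function}(val STRING)\n"
        "RETURNS INT64\n"
        "AS (ABS(MOD(FARM_FINGERPRINT(val), 100)));"
    )


def partition_functions_value(function: str, dataset: Optional[str] = None) -> str:
    """Return the function name, qualified by its dataset when one is given.

    The 'dataset.function' form is an assumption about the option's syntax.
    """
    return f"{dataset}.{function}" if dataset else function


if __name__ == "__main__":
    # Option 1: UDF lives in the dataset the data is offloaded to.
    print(create_udf_ddl("sales_offload", "string_to_bucket"))
    print("--partition-functions=" + partition_functions_value("string_to_bucket"))

    # Option 2: UDF lives in a shared dataset used by several schemas.
    print(create_udf_ddl("shared_udfs", "string_to_bucket"))
    print("--partition-functions=" + partition_functions_value("string_to_bucket", "shared_udfs"))
```

The generated DDL would be run once in BigQuery before the first offload. With the shared dataset, a single copy of the function serves every schema that references it.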

Permissions

The following privileges must be granted to the service account used by Gluent Data Platform in order for partition functions to be used:

bigquery.routines.get
bigquery.routines.list
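As a rough aid, the check below compares a set of granted permissions against the two listed above and reports any that are missing. It is a sketch only: the helper name missing_permissions is hypothetical, and obtaining the set of permissions actually granted to the service account (for example from an IAM role definition) is assumed to happen elsewhere.

```python
from typing import Iterable, List

# The two permissions this section lists as required.
REQUIRED_PERMISSIONS = frozenset({
    "bigquery.routines.get",
    "bigquery.routines.list",
})


def missing_permissions(granted: Iterable[str]) -> List[str]:
    """Return the required permissions absent from 'granted', sorted by name."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


if __name__ == "__main__":
    # Hypothetical grant set that lacks bigquery.routines.list.
    granted = ["bigquery.routines.get", "bigquery.tables.get"]
    for permission in missing_permissions(granted):
        print("missing:", permission)
```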

Documentation Feedback

Send feedback on this documentation to: feedback@gluent.com