Security

Documentation Conventions

  • Commands and keywords are in this font.

  • $OFFLOAD_HOME is set when the environment file (offload.env) is sourced, unless it is already set, and refers to the directory named offload that is created when the software is unpacked. In sections of this guide where the environment file has not yet been created or sourced, this directory is referred to as <OFFLOAD_HOME>.

  • Third-party vendor product names might be aliased or shortened for simplicity. See Third Party Vendor Products for cross-references to full product names and trademarks.

Introduction

Gluent Data Platform acts as an interface between traditional proprietary relational database systems and backend data platforms. As such, Gluent Data Platform utilizes the security features of these systems.

In a data lake architecture, where large volumes of business data are brought together in a single system, it is more important than ever to apply a diligent approach to security. The remainder of this guide covers security functionality in Gluent Data Platform and how Gluent Data Platform interacts with other systems.

Details of how to configure security features in Oracle Database and backend data platforms are beyond the scope of this guide.

System Accounts

In order for Gluent Data Platform to operate, system accounts are required in both the RDBMS (such as Oracle Database) and backend data platforms (such as Cloudera Data Hub and Google BigQuery).

Oracle Database

In the source Oracle Database instance, the following users, roles and privileges are provisioned during installation (see Install Oracle Database Components):

Table 1: Oracle Database Users

| Username | Purpose |
| --- | --- |
| GLUENT_ADM | An administrative user with access to create objects in the hybrid schema |
| GLUENT_APP | A read-only user |
| GLUENT_REPO | Owner schema of Gluent Metadata Repository |

Note

GLUENT is the default system user prefix in the Oracle Database instance but this can be changed during installation if desired. The suffixes of _ADM, _APP and _REPO are mandatory.

At installation time, the above accounts are created with random 30-character passwords. The passwords are not retained, and the installing administrator is expected to set the passwords for the “ADM” and “APP” accounts to known values as part of the configuration process.

Separating responsibilities across GLUENT_ADM, GLUENT_APP and GLUENT_REPO allows the principle of least privilege to be followed.
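The account naming and password conventions described above can be illustrated with a short sketch. This is not the installer's code: the character set, the function names and the use of Python's secrets module are assumptions made purely for illustration.

```python
import secrets
import string

# Assumption: this alphabet is illustrative. The character set used by
# the installer is not documented in this guide.
ALPHABET = string.ascii_letters + string.digits

# The three suffixes are mandatory; only the prefix can be changed.
SUFFIXES = ("ADM", "APP", "REPO")


def system_accounts(prefix="GLUENT"):
    """Return the three system account names for a given prefix."""
    return [f"{prefix}_{suffix}" for suffix in SUFFIXES]


def random_password(length=30):
    """Return a random password built with a cryptographically secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    for account in system_accounts():
        # The generated value is printed only as a length to mirror the
        # fact that the initial passwords are not retained.
        print(account, len(random_password()))
```

Because the initial passwords are discarded, an administrator would still need to set known passwords for the ADM and APP accounts afterwards.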

Table 2: Oracle Database Roles

| Role | Grants |
| --- | --- |
| GLUENT_OFFLOAD_ROLE | READ and EXECUTE on OFFLOAD_BIN directory |
| | READ on OFFLOAD_CACHE directory |
| | READ on OFFLOAD_DATA directory |
| | READ and WRITE on OFFLOAD_LOG directory |
| | GLUENT_OFFLOAD_SQLMON_ROLE role |
| GLUENT_OFFLOAD_REPO_ROLE | SELECT on GLUENT_REPO tables and views |
| | EXECUTE on GLUENT_REPO.OFFLOAD_REPO |
| | EXECUTE on GLUENT_REPO.OFFLOAD_METADATA_OT |
| GLUENT_OFFLOAD_SQLMON_ROLE | SELECT on GLUENT_ADM.OFFLOAD_SQLMON_SUMMARY |
| | SELECT on GLUENT_ADM.OFFLOAD_SQLMON_HYBRID_OBJECTS |
| | EXECUTE on GLUENT_ADM.OFFLOAD_TOOLS |

Table 3: Oracle Database User Privileges

| User | Grants |
| --- | --- |
| GLUENT_ADM | CREATE SESSION |
| | SELECT ANY DICTIONARY |
| | GRANT ANY OBJECT PRIVILEGE |
| | SELECT ANY TABLE |
| | ANALYZE ANY |
| | EXECUTE on SYS.DBMS_LOCK |
| | EXECUTE on SYS.DBMS_FLASHBACK |
| | SELECT_CATALOG_ROLE |
| | GLUENT_OFFLOAD_ROLE |
| | GLUENT_OFFLOAD_REPO_ROLE |
| GLUENT_APP | CREATE SESSION |
| | SELECT ANY DICTIONARY |
| | SELECT ANY TABLE |
| | FLASHBACK ANY TABLE |
| | GLUENT_OFFLOAD_ROLE |
| GLUENT_REPO | CREATE SESSION |
| | SELECT ANY DICTIONARY |
Oracle Database Server

Gluent Data Platform must be installed as the same user that owns the Oracle software (typically oracle).

Metadata Daemon runs on the Oracle Database server and under certain conditions is required to run as the root user. Refer to Metadata Daemon OS User.

Cloudera Data Hub

A Gluent Data Platform OS user (typically named gluent, although any valid operating system username is supported) is required on the Hadoop node(s) on which Gluent Offload Engine will be run. There are no specific group membership requirements for this user. Refer to Provision a Gluent Data Platform OS User.

A Kerberos principal may be required for either password-less SSH or authentication with a Kerberized cluster.

An LDAP user may be required for authentication with an LDAP-enabled Impala.

Google BigQuery

A service account is required for use by Gluent Data Platform.

A role named GLUENT_OFFLOAD_ROLE should be created with the privileges listed below and assigned to the service account for use with Gluent Data Platform:

bigquery.datasets.create
bigquery.datasets.get
bigquery.jobs.create
bigquery.readsessions.create
bigquery.readsessions.getData
bigquery.tables.create
bigquery.tables.delete
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list
bigquery.tables.update
bigquery.tables.updateData
storage.buckets.get
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list

The creation of the service account and role allows the least privilege principle to be followed.
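
One way to capture the permission list above is as a custom role definition file. The sketch below builds such a definition as JSON using the field names of the Google Cloud IAM role resource (title, description, stage, includedPermissions). The description text and the function name are assumptions made for illustration.

```python
import json

# Permissions listed in this guide for GLUENT_OFFLOAD_ROLE.
PERMISSIONS = [
    "bigquery.datasets.create",
    "bigquery.datasets.get",
    "bigquery.jobs.create",
    "bigquery.readsessions.create",
    "bigquery.readsessions.getData",
    "bigquery.tables.create",
    "bigquery.tables.delete",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "bigquery.tables.update",
    "bigquery.tables.updateData",
    "storage.buckets.get",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.get",
    "storage.objects.list",
]


def role_definition(title="GLUENT_OFFLOAD_ROLE"):
    """Return a custom role definition as a plain dictionary."""
    return {
        "title": title,
        # Assumption: free-text description chosen for this example.
        "description": "Privileges for the Gluent Data Platform service account",
        "stage": "GA",
        "includedPermissions": list(PERMISSIONS),
    }


if __name__ == "__main__":
    print(json.dumps(role_definition(), indent=2))
```

The printed JSON could be saved to a file and supplied to a tool such as gcloud iam roles create with its --file option. The project and the service account binding are site-specific and are not shown here.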

Hybrid Schemas

Gluent Data Platform creates a hybrid schema in the Oracle Database instance for each application schema that is offloaded. These accounts are created with random 30-character passwords that are not stored.

The following grants are made to each hybrid schema:

Table 4: Hybrid Schema Grants

| User | Grants |
| --- | --- |
| Hybrid schema | CREATE SESSION |
| | CREATE ANY TRIGGER |
| | CREATE MATERIALIZED VIEW |
| | CREATE SEQUENCE |
| | CREATE TABLE |
| | CREATE VIEW |
| | GLOBAL QUERY REWRITE |
| | QUERY REWRITE |
| | SELECT ANY TABLE |
| | EXECUTE on SYS.DBMS_ADVANCED_REWRITE |
| | EXECUTE on SYS.DBMS_FLASHBACK |
| | GLUENT_OFFLOAD_ROLE |

Note

GLUENT_ADM is granted CONNECT THROUGH for each hybrid schema.
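
To make the grants in Table 4 and the proxy connection in the note concrete, the following sketch renders the equivalent Oracle SQL statements for one hybrid schema. It is an illustration only and not the code Gluent Data Platform runs. The schema name SALES_HYBRID is hypothetical, and the default GLUENT_ADM user name is assumed.

```python
import re

SYSTEM_PRIVILEGES = [
    "CREATE SESSION",
    "CREATE ANY TRIGGER",
    "CREATE MATERIALIZED VIEW",
    "CREATE SEQUENCE",
    "CREATE TABLE",
    "CREATE VIEW",
    "GLOBAL QUERY REWRITE",
    "QUERY REWRITE",
    "SELECT ANY TABLE",
]
EXECUTE_ON = ["SYS.DBMS_ADVANCED_REWRITE", "SYS.DBMS_FLASHBACK"]
ROLES = ["GLUENT_OFFLOAD_ROLE"]


def _check_identifier(name):
    """Reject anything that is not a simple unquoted Oracle identifier."""
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_$#]*", name):
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def hybrid_schema_statements(hybrid_schema, adm_user="GLUENT_ADM"):
    """Return the SQL statements matching Table 4 for one hybrid schema."""
    schema = _check_identifier(hybrid_schema)
    adm = _check_identifier(adm_user)
    statements = [f"GRANT {p} TO {schema}" for p in SYSTEM_PRIVILEGES]
    statements += [f"GRANT EXECUTE ON {o} TO {schema}" for o in EXECUTE_ON]
    statements += [f"GRANT {r} TO {schema}" for r in ROLES]
    # Proxy authentication: the ADM user may connect through the schema.
    statements.append(f"ALTER USER {schema} GRANT CONNECT THROUGH {adm}")
    return statements


if __name__ == "__main__":
    # SALES_HYBRID is a hypothetical hybrid schema name.
    for statement in hybrid_schema_statements("SALES_HYBRID"):
        print(statement + ";")
```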

Encryption

Password Encryption

Gluent Data Platform supports password encryption as follows.

Table 5: Password Encryption Support

| Backend Scope | Password Source | Details |
| --- | --- | --- |
| All | Gluent Data Platform Environment File | Clear-text passwords stored in this file can be encrypted. Refer to Gluent Data Platform Environment File Passwords |
| Cloudera Data Hub | Sqoop Command Line | Hadoop Credential Provider API authentication to Oracle Database instances during the data transport phase of Offload prevents clear-text passwords from being exposed on the Sqoop command line. Refer to Hadoop Credential Provider API |
| Cloudera Data Hub | Sqoop Command Line | Oracle Wallet authentication to Oracle Database instances during the data transport phase of Offload prevents clear-text passwords from being exposed on the Sqoop command line. Refer to Oracle Wallet |
| Google BigQuery, Cloudera Data Hub | Spark Command Line | Oracle Wallet authentication to Oracle Database instances during the data transport phase of Offload prevents clear-text passwords from being exposed on the Spark command line. Refer to Oracle Wallet |

Network Encryption

Gluent Data Platform supports network encryption as follows.

Table 6: Network Encryption Support

| Backend Scope | Software Engine / Component | Details |
| --- | --- | --- |
| All | Data Daemon ↔ Metadata Daemon, Smart Connector | TLS encryption in transit can be enabled (not enabled by default). Refer to Securing Data Daemon |
| All | Gluent Offload Engine ↔ Oracle Database | Oracle Native Encryption connections can be enabled (not enabled by default). Refer to Oracle Native Network Encryption |
| Cloudera Data Hub | Sqoop, YARN, Spark ↔ Oracle Database | Oracle Native Encryption connections can be enabled (not enabled by default). Refer to Oracle Native Network Encryption, Sqoop Encryption and Data Integrity in Transit and Spark Encryption and Data Integrity in Transit |
| Cloudera Data Hub | Data Daemon, Gluent Offload Engine ↔ Impala | TLS encryption in transit to Impala is supported. Refer to SSL_ACTIVE and SSL_TRUSTED_CERTS |
| Cloudera Data Hub | Gluent Offload Engine ↔ WebHDFS | TLS encryption in transit to WebHDFS is supported. Refer to WEBHDFS_VERIFY_SSL |
| Cloudera Data Hub | Data Daemon ↔ HDFS | TLS encryption in transit to Secure DataNodes is supported. Refer to HDFS Client Configuration File |
| Google BigQuery | Data Daemon ↔ Google BigQuery | TLS encryption in transit is enabled by default for Google Cloud Services. No Gluent Data Platform configuration is required |
| Google BigQuery | Spark ↔ Oracle Database | Oracle Native Encryption can be enabled (not enabled by default). Refer to Oracle Native Network Encryption and Spark Encryption and Data Integrity in Transit |
| Google BigQuery | Spark ↔ Google Cloud Storage | TLS encryption in transit is enabled by default for Google Cloud Services. No Gluent Data Platform configuration is required |
| Google BigQuery | Gluent Offload Engine ↔ Google BigQuery, Google Cloud Storage | TLS encryption in transit is enabled by default for Google Cloud Services. No Gluent Data Platform configuration is required |

Note

In addition to Oracle Native Encryption, Gluent Data Platform supports TLS-enabled connections to Oracle Database instances for both Gluent components and backend components acting on behalf of Gluent Data Platform. Contact Gluent Support for further details.

Encryption at Rest

Gluent Data Platform supports encryption at rest as follows.

Table 7: Encryption at Rest Support

| Scope | Details |
| --- | --- |
| Oracle Database | Oracle Transparent Data Encryption (TDE). No Gluent Data Platform configuration is required |
| Cloudera Data Hub | HDFS Transparent Encryption (Encryption Zones). Configuration may be required depending on the HDFS_LOAD location. Refer to HDFS Client Configuration File |
| Google BigQuery | Google BigQuery and Google Cloud Storage encrypt and decrypt all data written to disk by default. Encryption keys can be managed by Google or by customers. No Gluent Data Platform configuration is required |

Data Integrity

Gluent Data Platform supports Oracle Network Data Integrity, which protects the integrity of data in transit on connections to Oracle Database instances. This applies both to Gluent Offload Engine and to Sqoop or Spark components acting on behalf of Gluent Data Platform. Refer to Oracle Network Data Integrity.
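
For orientation, Oracle Native Network Encryption and Network Data Integrity are commonly controlled on the client side through sqlnet.ora parameters. The sketch below renders such a fragment. It assumes an Oracle client that reads sqlnet.ora, and the chosen algorithms (AES256, SHA256) are examples only. How each Gluent Data Platform component is configured is covered in the sections referenced above, not by this sketch.

```python
# Negotiation levels accepted by the Oracle Net encryption and
# checksum parameters.
LEVELS = {"REJECTED", "ACCEPTED", "REQUESTED", "REQUIRED"}


def sqlnet_client_fragment(level="REQUIRED", cipher="AES256", checksum="SHA256"):
    """Return client-side sqlnet.ora lines for encryption and integrity.

    Assumption: the same negotiation level is used for both features;
    in practice they can be set independently.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    return [
        f"SQLNET.ENCRYPTION_CLIENT = {level}",
        f"SQLNET.ENCRYPTION_TYPES_CLIENT = ({cipher})",
        f"SQLNET.CRYPTO_CHECKSUM_CLIENT = {level}",
        f"SQLNET.CRYPTO_CHECKSUM_TYPES_CLIENT = ({checksum})",
    ]


if __name__ == "__main__":
    print("\n".join(sqlnet_client_fragment()))
```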

Authentication

Gluent Data Platform supports authentication as follows.

Table 8: Authentication Support

| Backend Scope | Software Engine / Component | Details |
| --- | --- | --- |
| All | Metadata Daemon, Gluent Offload Engine → Oracle Database | Password based authentication to Oracle Database instances |
| Cloudera Data Hub | Data Daemon, Gluent Offload Engine → HDFS, Impala | SASL/GSSAPI (Kerberos) authentication to HDFS (including Secure DataNodes) and Impala. Refer to KERBEROS_KEYTAB, KERBEROS_PRINCIPAL, KERBEROS_SERVICE, KERBEROS_TICKET_CACHE_PATH and HDFS Client Configuration File |
| Cloudera Data Hub | Data Daemon, Gluent Offload Engine → Impala | SASL (LDAP) authentication to Impala. Refer to HIVE_SERVER_USER and HIVE_SERVER_PASS |
| Cloudera Data Hub | Sqoop → Oracle Database | Sqoop password file authentication to Oracle Database instances during the data transport phase of Offload. Refer to Sqoop Password File |
| Cloudera Data Hub | Sqoop → Oracle Database | Hadoop Credential Provider API authentication to Oracle Database instances during the data transport phase of Offload. Refer to Hadoop Credential Provider API |
| Cloudera Data Hub | Sqoop, Spark → Oracle Database | Oracle Wallet authentication to Oracle Database instances during the data transport phase of Offload. Refer to Oracle Wallet |
| Google BigQuery | Gluent Offload Engine, Data Daemon → Google BigQuery, Google Cloud Storage | Private key based authentication to Google Cloud Services |
| Google BigQuery | Spark → Oracle Database | Oracle Wallet authentication to Oracle Database instances during the data transport phase of Offload. Refer to Oracle Wallet |
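
Table 8 names several configuration parameters for Kerberos and LDAP authentication. The sketch below reports which of those parameters are unset in a given environment mapping. The grouping and the function name are assumptions for illustration. This guide does not state which parameters must be set together, so the output is a checklist and not a validation result.

```python
import os

# Parameter names as referenced in Table 8, grouped by mechanism.
AUTH_PARAMETERS = {
    "Kerberos": [
        "KERBEROS_KEYTAB",
        "KERBEROS_PRINCIPAL",
        "KERBEROS_SERVICE",
        "KERBEROS_TICKET_CACHE_PATH",
    ],
    "LDAP": [
        "HIVE_SERVER_USER",
        "HIVE_SERVER_PASS",
    ],
}


def unset_parameters(env):
    """Return, per mechanism, the parameters that are missing or empty."""
    return {
        mechanism: [name for name in names if not env.get(name)]
        for mechanism, names in AUTH_PARAMETERS.items()
    }


if __name__ == "__main__":
    # os.environ reflects the current shell, for example after the
    # environment file has been sourced.
    for mechanism, names in unset_parameters(os.environ).items():
        print(mechanism, "unset:", ", ".join(names) or "none")
```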

Authorization

Gluent Data Platform supports authorization as follows.

Table 9: Authorization Support

| Backend Scope | Software Engine / Component | Details |
| --- | --- | --- |
| All | Oracle Database | Oracle Database’s authorization mechanism is the primary access control in an environment where Gluent Data Platform is used. This is because Smart Connector is invoked by accessing a table in an Oracle Database instance that has either been previously offloaded or presented. The principle of least privilege is followed. Refer to Oracle Database |
| Cloudera Data Hub | Impala | Authorization in Impala is controlled by Sentry. The Gluent Data Platform user requires Sentry privileges to function. Refer to Sentry |
| Cloudera Data Hub | HDFS | The Gluent Data Platform user must have read and write access to HDFS locations. Refer to Create HDFS Directories |
| Cloudera Data Hub | User-Defined Functions (UDFs) | A user with authorization to create UDFs in Impala is required. This does not have to be the Gluent Data Platform user, but the Gluent Data Platform user must be able to access the UDFs once installed. Refer to Creation of User-Defined Functions |
| Google BigQuery | Google BigQuery API | The service account used by Gluent Data Platform requires authorization to interact with Google BigQuery datasets. The principle of least privilege is followed. Refer to Google BigQuery |
| Google BigQuery | Google Cloud Storage | The service account used by Gluent Data Platform requires authorization to interact with Google Cloud Storage underlying Google BigQuery tables. The principle of least privilege is followed. Refer to Google BigQuery |
| Google BigQuery | Spark | The service account used by Gluent Data Platform requires authorization to write to a Google Cloud Storage bucket during the data transport phase of Offload. Refer to Bucket |

Documentation Feedback

Send feedback on this documentation to: feedback@gluent.com