Security¶
Documentation Conventions¶
Commands and keywords are in this font.
$OFFLOAD_HOME is set when the environment file (offload.env) is sourced, unless already set, and refers to the directory named offload that is created when the software is unpacked. This is also referred to as <OFFLOAD_HOME> in sections of this guide where the environment file has not been created/sourced.
Third party vendor product names might be aliased or shortened for simplicity. See Third Party Vendor Products for cross-references to full product names and trademarks.
Introduction¶
Gluent Data Platform acts as an interface between traditional proprietary relational database systems and backend data platforms. As such, Gluent Data Platform utilizes the security features of these systems.
In a data lake architecture, where large volumes of business data are brought together in a single system, it is more important than ever to apply a diligent approach to security. The remainder of this guide covers security functionality in Gluent Data Platform and how Gluent Data Platform interacts with other systems.
The configuration of security features in Oracle Database and backend data platforms is beyond the scope of this guide.
System Accounts¶
In order for Gluent Data Platform to operate, system accounts are required in both the RDBMS (such as Oracle Database) and backend data platforms (such as Cloudera Data Hub and Google BigQuery).
Oracle Database¶
In the source Oracle Database instance the following users, roles and privileges are provisioned during installation (see Install Oracle Database Components):
Table 1: Oracle Database Users¶
Username | Purpose |
---|---|
GLUENT_ADM | An administrative user with access to create objects in the hybrid schema |
GLUENT_APP | A read-only user |
GLUENT_REPO | Owner schema of Gluent Metadata Repository |
Note
GLUENT is the default system user prefix in the Oracle Database instance but this can be changed during installation if desired. The suffixes of _ADM, _APP and _REPO are mandatory.
At installation time, the above accounts are created with 30-character random string passwords. The passwords are not retained, and the installing administrator is expected to set the passwords for the “ADM” and “APP” accounts to known values as part of the configuration process.
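As an illustration of the password reset step, the following is a minimal sketch that generates a 30-character random password and builds a standard Oracle `ALTER USER` statement for it. The function names and the choice of an alphanumeric character set are assumptions made for this example; the generator used by the installer is not described in this guide, and local password policies may require different rules.

```python
import re
import secrets
import string

PASSWORD_LENGTH = 30  # length quoted in this guide for the generated passwords


def random_password(length: int = PASSWORD_LENGTH) -> str:
    """Return a random password: one letter followed by letters and digits."""
    if length < 1:
        raise ValueError("length must be at least 1")
    alphabet = string.ascii_letters + string.digits
    first = secrets.choice(string.ascii_letters)
    rest = "".join(secrets.choice(alphabet) for _ in range(length - 1))
    return first + rest


def alter_user_statement(username: str, password: str) -> str:
    """Build an ALTER USER statement (hypothetical helper, not a Gluent tool)."""
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", username):
        raise ValueError(f"not a simple Oracle identifier: {username!r}")
    if not password.isalnum():
        raise ValueError("this sketch only handles alphanumeric passwords")
    # The password is double-quoted so that its case is preserved.
    return f'ALTER USER {username.upper()} IDENTIFIED BY "{password}"'


if __name__ == "__main__":
    for account in ("GLUENT_ADM", "GLUENT_APP"):
        print(alter_user_statement(account, random_password()))
```

The `secrets` module is used rather than `random` because it draws from the operating system's cryptographically secure source.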
The creation of GLUENT_ADM, GLUENT_APP and GLUENT_REPO allows the least privilege principle to be followed.
Table 2: Oracle Database Roles¶
Role | Grants |
---|---|
Table 3: Oracle Database User Privileges¶
User | Grants |
---|---|
Oracle Database Server¶
Gluent Data Platform must be installed as the same user that owns the Oracle software (typically oracle).
Metadata Daemon runs on the Oracle Database server and under certain conditions is required to run as the root user. Refer to Metadata Daemon OS User.
Cloudera Data Hub¶
A Gluent Data Platform OS user (typically named gluent, although any valid operating system username is supported) is required on the Hadoop node(s) on which Gluent Offload Engine will be run. There are no specific group membership requirements for this user. Refer to Provision a Gluent Data Platform OS User.
A Kerberos principal may be required either for password-less SSH or authentication with a Kerberized cluster.
An LDAP user may be required for authentication with an LDAP-enabled Impala.
Google BigQuery¶
A service account is required for use by Gluent Data Platform.
A role named GLUENT_OFFLOAD_ROLE should be created with the privileges listed below and assigned to the service account for use with Gluent Data Platform:
bigquery.datasets.create
bigquery.datasets.get
bigquery.jobs.create
bigquery.readsessions.create
bigquery.readsessions.getData
bigquery.tables.create
bigquery.tables.delete
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.list
bigquery.tables.update
bigquery.tables.updateData
storage.buckets.get
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list
The creation of the service account and role allows the least privilege principle to be followed.
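The role described above could be created in several ways. As one hedged sketch, the snippet below assembles a `gcloud iam roles create` command line from the permission list in this section. The project ID `my-project` and the role title are placeholders chosen for this example, and the function name is hypothetical; the command is only printed, not executed.

```python
import shlex

ROLE_ID = "GLUENT_OFFLOAD_ROLE"

# Permissions copied from the list in this section.
PERMISSIONS = [
    "bigquery.datasets.create",
    "bigquery.datasets.get",
    "bigquery.jobs.create",
    "bigquery.readsessions.create",
    "bigquery.readsessions.getData",
    "bigquery.tables.create",
    "bigquery.tables.delete",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "bigquery.tables.update",
    "bigquery.tables.updateData",
    "storage.buckets.get",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.get",
    "storage.objects.list",
]


def role_create_command(project_id: str) -> list[str]:
    """Return the argument list for creating the custom role in one project."""
    return [
        "gcloud", "iam", "roles", "create", ROLE_ID,
        f"--project={project_id}",
        "--title=Gluent Offload Role",  # placeholder title
        "--permissions=" + ",".join(PERMISSIONS),
    ]


if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    print(shlex.join(role_create_command("my-project")))
```

Keeping the permissions in a single list makes it straightforward to compare the role definition against this guide when either one changes.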
Hybrid Schemas¶
Gluent Data Platform creates a hybrid schema in the Oracle Database instance for each application schema that is offloaded. These accounts are created with 30-character random string passwords that are not stored.
The following grants are made to each hybrid schema:
Table 4: Hybrid Schema Grants¶
User | Grants |
---|---|
Hybrid schema | |
Note
GLUENT_ADM is granted CONNECT THROUGH for each hybrid schema.
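CONNECT THROUGH is Oracle Database's proxy authentication grant. As a rough illustration, the sketch below builds the standard Oracle statement for such a grant and the matching proxy user string in `proxy[target]` form. The hybrid schema name `SH_H` and both function names are hypothetical, and the statements that Gluent Data Platform actually issues are not shown in this guide.

```python
import re

# Simple (unquoted) Oracle identifier: a letter, then letters, digits, _ $ #.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def _checked(name: str) -> str:
    """Validate a simple identifier and return it in upper case."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a simple Oracle identifier: {name!r}")
    return name.upper()


def grant_connect_through(hybrid_schema: str, proxy_user: str = "GLUENT_ADM") -> str:
    """Build the statement that lets proxy_user connect as hybrid_schema."""
    return (
        f"ALTER USER {_checked(hybrid_schema)} "
        f"GRANT CONNECT THROUGH {_checked(proxy_user)}"
    )


def proxy_connect_user(hybrid_schema: str, proxy_user: str = "GLUENT_ADM") -> str:
    """Return the user string for a proxy connection, e.g. GLUENT_ADM[SH_H]."""
    return f"{_checked(proxy_user)}[{_checked(hybrid_schema)}]"


if __name__ == "__main__":
    # SH_H is a hypothetical hybrid schema name used only for this example.
    print(grant_connect_through("SH_H"))
    print(proxy_connect_user("SH_H"))
```

With such a grant in place, the proxy user authenticates with its own password, so the hybrid schema's unstored random password is never needed.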
Encryption¶
Password Encryption¶
Gluent Data Platform supports password encryption as follows.
Table 5: Password Encryption Support¶
Backend Scope | Password Source | Details |
---|---|---|
All | Gluent Data Platform Environment File | Clear-text passwords stored in this file can be encrypted using Gluent Data Platform Environment File Passwords |
Cloudera Data Hub | Sqoop Command Line | Hadoop Credential Provider API authentication to Oracle Database instances during the data transport phase of Offload prevents the clear-text password from being exposed on the Sqoop command line. Refer to Hadoop Credential Provider API |
Cloudera Data Hub | Sqoop Command Line | Oracle Wallet authentication to Oracle Database instances during the data transport phase of Offload prevents the clear-text password from being exposed on the Sqoop command line. Refer to Oracle Wallet |
Google BigQuery, Cloudera Data Hub | Spark Command Line | Oracle Wallet authentication to Oracle Database instances during the data transport phase of Offload prevents the clear-text password from being exposed on the Spark command line. Refer to Oracle Wallet |
Network Encryption¶
Gluent Data Platform supports network encryption as follows.
Table 6: Network Encryption Support¶
Backend Scope | Software Engine / Component | Details |
---|---|---|
All | Data Daemon ↔ Metadata Daemon, Smart Connector | TLS encryption in transit can be enabled (not enabled by default). Refer to Securing Data Daemon |
All | Gluent Offload Engine ↔ Oracle Database | Oracle Native Encryption connections can be enabled (not enabled by default). Refer to Oracle Native Network Encryption |
Cloudera Data Hub | Sqoop, YARN, Spark ↔ Oracle Database | Oracle Native Encryption connections can be enabled (not enabled by default). Refer to Oracle Native Network Encryption, Sqoop Encryption and Data Integrity in Transit and Spark Encryption and Data Integrity in Transit |
Cloudera Data Hub | Data Daemon, Gluent Offload Engine ↔ Impala | TLS encryption in transit to Impala is supported. Refer to |
Cloudera Data Hub | Gluent Offload Engine ↔ WebHDFS | TLS encryption in transit to WebHDFS is supported. Refer to |
Cloudera Data Hub | Data Daemon ↔ HDFS | TLS encryption in transit to Secure DataNodes is supported. Refer to HDFS Client Configuration File |
Google BigQuery | Data Daemon ↔ Google BigQuery | TLS encryption in transit is enabled by default for Google Cloud Services. No Gluent Data Platform configuration is required |
Google BigQuery | Spark ↔ Oracle Database | Oracle Native Encryption can be enabled (not enabled by default). Refer to Oracle Native Network Encryption and Spark Encryption and Data Integrity in Transit |
Google BigQuery | Spark ↔ Google Cloud Storage | TLS encryption in transit is enabled by default for Google Cloud Services. No Gluent Data Platform configuration is required |
Google BigQuery | Gluent Offload Engine ↔ Google BigQuery, Google Cloud Storage | TLS encryption in transit is enabled by default for Google Cloud Services. No Gluent Data Platform configuration is required |
Note
In addition to Oracle Native Encryption, Gluent Data Platform supports TLS enabled connections to Oracle Database instances for both Gluent and backend components acting on behalf of Gluent Data Platform. Contact Gluent Support for further details.
Encryption at Rest¶
Gluent Data Platform supports encryption at rest as follows.
Table 7: Encryption at Rest Support¶
Scope | Details |
---|---|
Oracle Database | Oracle Transparent Data Encryption (TDE). No Gluent Data Platform configuration is required |
Cloudera Data Hub | HDFS Transparent Encryption (Encryption Zones). Configuration may be required depending on the |
Google BigQuery | Google BigQuery and Google Cloud Storage encrypt and decrypt all data written to disk by default. Encryption keys can be managed by Google or by customers. No Gluent Data Platform configuration is required |
Data Integrity¶
Gluent Data Platform supports Oracle Network Data Integrity for ensuring the integrity of data in transit for connections to Oracle Database instances for both Gluent Offload Engine and Sqoop or Spark components acting on behalf of Gluent Data Platform. Refer to Oracle Network Data Integrity.
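For JDBC-based components, Oracle's thin driver exposes native network encryption and data integrity through connection properties. The sketch below assembles those properties as a dictionary. The function name and the default algorithm choices are assumptions made for this example, algorithm availability depends on the driver and database versions, and the way Gluent Data Platform passes such settings to Sqoop or Spark is covered by the referenced guides rather than here.

```python
# Negotiation levels accepted by Oracle native network encryption and integrity.
LEVELS = ("REJECTED", "ACCEPTED", "REQUESTED", "REQUIRED")


def oracle_native_network_properties(
    level: str = "REQUIRED",
    encryption_type: str = "AES256",   # assumed default for this sketch
    checksum_type: str = "SHA256",     # assumed default for this sketch
) -> dict[str, str]:
    """Return Oracle JDBC thin driver properties for encryption and integrity."""
    level = level.upper()
    if level not in LEVELS:
        raise ValueError(f"level must be one of {LEVELS}, got {level!r}")
    return {
        # Native network encryption
        "oracle.net.encryption_client": level,
        "oracle.net.encryption_types_client": f"({encryption_type})",
        # Network data integrity (checksumming)
        "oracle.net.crypto_checksum_client": level,
        "oracle.net.crypto_checksum_types_client": f"({checksum_type})",
    }


if __name__ == "__main__":
    for name, value in oracle_native_network_properties().items():
        print(f"{name}={value}")
```

Setting the level to REQUIRED makes the connection fail when the server cannot negotiate the feature, which is usually preferable to silently falling back to an unprotected session.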
Authentication¶
Gluent Data Platform supports authentication as follows.
Table 8: Authentication Support¶
Backend Scope | Software Engine / Component | Details |
---|---|---|
All | Metadata Daemon, Gluent Offload Engine → Oracle Database | Password based authentication to Oracle Database instances |
Cloudera Data Hub | Data Daemon, Gluent Offload Engine → HDFS, Impala | SASL/GSSAPI (Kerberos) authentication to HDFS (including Secure DataNodes) and Impala. Refer to |
Cloudera Data Hub | Data Daemon, Gluent Offload Engine → Impala | SASL (LDAP) authentication to Impala. Refer to |
Cloudera Data Hub | Sqoop → Oracle Database | Sqoop password file authentication to Oracle Database instances during the data transport phase of Offload. Refer to Sqoop Password File |
Cloudera Data Hub | Sqoop → Oracle Database | Hadoop Credential Provider API authentication to Oracle Database instances during the data transport phase of Offload. Refer to Hadoop Credential Provider API |
Cloudera Data Hub | Sqoop, Spark → Oracle Database | Oracle Wallet authentication to Oracle Database instances during the data transport phase of Offload. Refer to Oracle Wallet and Oracle Wallet |
Google BigQuery | Gluent Offload Engine, Data Daemon → Google BigQuery, Google Cloud Storage | Private key based authentication to Google Cloud Services |
Google BigQuery | Spark → Oracle Database | Oracle Wallet authentication to Oracle Database instances during the data transport phase of Offload. Refer to Oracle Wallet |
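Private key based authentication to Google Cloud Services normally relies on a service account key file in JSON format. As a hedged sketch, the snippet below performs a basic sanity check on such a file before it is put to use. The set of fields checked and the function names are choices made for this example, and how Gluent Data Platform locates the key file is not covered here.

```python
import json

# Fields this sketch expects in a service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def key_problems(key: dict) -> list[str]:
    """Return a list of problems found in a parsed service account key."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    private_key = key.get("private_key", "")
    if private_key and not private_key.startswith("-----BEGIN PRIVATE KEY-----"):
        problems.append("private_key is not PEM encoded")
    return problems


def load_key(path: str) -> dict:
    """Read a JSON key file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    import sys

    # Usage: python check_key.py /path/to/key.json
    for problem in key_problems(load_key(sys.argv[1])):
        print(problem)
```

Because the file contains a private key, it should be readable only by the operating system user that runs Gluent Data Platform.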
Authorization¶
Gluent Data Platform supports authorization as follows.
Table 9: Authorization Support¶
Backend Scope | Software Engine / Component | Details |
---|---|---|
All | Oracle Database | Oracle Database’s authorization mechanism is the primary access control in an environment where Gluent Data Platform is used. This is because Smart Connector is invoked by accessing a table in an Oracle Database instance that has either been previously offloaded or presented. The principle of least privilege is followed. Refer to Oracle Database |
Cloudera Data Hub | Impala | Authorization in Impala is controlled by Sentry. The Gluent Data Platform user requires Sentry privileges to function. Refer to Sentry |
Cloudera Data Hub | HDFS | The Gluent Data Platform user must have read and write access to HDFS locations. Refer to Create HDFS Directories |
Cloudera Data Hub | User-Defined Functions (UDFs) | A user with authorization to create UDFs in Impala is required. This does not have to be the Gluent Data Platform user, but the Gluent Data Platform user must be able to access the UDFs once installed. Refer to Creation of User-Defined Functions |
Google BigQuery | Google BigQuery API | The service account used by Gluent Data Platform requires authorization to interact with Google BigQuery datasets. The principle of least privilege is followed. Refer to Google BigQuery |
Google BigQuery | Google Cloud Storage | The service account used by Gluent Data Platform requires authorization to interact with Google Cloud Storage underlying Google BigQuery tables. The principle of least privilege is followed. Refer to Google BigQuery |
Google BigQuery | Spark | The service account used by Gluent Data Platform requires authorization to write to a Google Cloud Storage bucket during the data transport phase of Offload. Refer to Bucket |