Gluent Express

Documentation Conventions

  • Commands and keywords are in this font.

  • $OFFLOAD_HOME is set when the environment file (offload.env) is sourced, unless already set, and refers to the directory named offload that is created when the software is unpacked. This is also referred to as <OFFLOAD_HOME> in sections of this guide where the environment file has not been created/sourced.

  • Third party vendor product names might be aliased or shortened for simplicity. See Third Party Vendor Products for cross-references to full product names and trademarks.

Introduction

The Gluent Express sandbox image provides a fully functional Oracle and Cloudera Data Hub environment in which to learn more about Gluent Data Platform.

Gluent Data Platform supports other backends as noted in Supported Backend Data Platforms.

The tutorials in this guide will give you examples of how to get started with some of the tools provided in Gluent Data Platform.

If you are not familiar with Gluent Data Platform then it is recommended that you begin by watching the Gluent Overview video (duration: approximately 10 minutes) and reading the Concepts guide.

Should you require any assistance with the use of Gluent Express, please refer to Support.

Administrative Information

Unless specified otherwise, the following should be adhered to:

  • Run all operating system commands and Impala Shell commands as the oracle OS user. Use sudo to gain access to the oracle OS user:

centos$ sudo -iu oracle
  • Run all Oracle SQL commands as the SYSTEM database user. To connect to the XEPDB1 pluggable database in the XE container:

oracle$ sqlplus <username>/<password>@localhost:1521/xepdb1 [as sysdba]

The following table lists credentials for Oracle Database users:

Username

Password

SYS

Gluent123

SYSTEM

Gluent123

SH

sh

Pre-loaded Data

Oracle

Table Name

Num Rows

Partitioned

Num Partitions

CAL_MONTH_SALES_MV

54

NO

_

CHANNELS

5

NO

_

COSTS

1,787,540

YES

229

COUNTRIES

23

NO

_

CUSTOMERS

55,500

NO

_

FWEEK_PSCAT_SALES_MV

11,266

NO

_

PRODUCTS

72

NO

_

PROMOTIONS

503

NO

_

SALES

2,923,527

YES

229

SUPPLEMENTARY_DEMOGRAPHICS

4,500

NO

_

TIMES

1,826

NO

_

Backend

Table Name

Num Rows

Partitioned

Num Partitions

SALES_HISTORY

48,685,528

NO

_

Tutorials

The tutorials presented below are designed to give you an understanding of how to use some of the core components in Gluent Data Platform.

Preparation

Confirm that the Gluent Express sandbox is configured correctly by running the validation component of Gluent Data Platform called connect:

oracle$ . /u01/app/gluent/offload/conf/offload.env
oracle$ $OFFLOAD_HOME/bin/connect

Tip

The offload.env script only needs to be sourced once for a session. It is included in all tutorials below, although strictly unnecessary unless using a new shell or the shell environment changes.

Ensure that all validation checks show Passed.

Note

Should any unrecoverable errors be encountered then contact Gluent Support. See Support.

Offload: Incrementally Offload a Partitioned Table

Objective

The objective of this tutorial is to demonstrate the Offload process of Gluent Data Platform, which provides a simple and flexible way for users to copy data from an Oracle Database instance to backends while retaining the ability to query offloaded data.

Overview

In this tutorial we will:

  1. Offload all partitions up to and including year 2011 from the SALES table

  2. Incrementally offload Q1 2012 partitions from the SALES table

Step 1.1: Source parameter file

To source the parameter file for the duration of this scenario, run the following commands:

oracle$ . /u01/app/gluent/offload/conf/offload.env

Step 1.2: Offload up to year 2011 partitions

To offload all partitions up to and including year 2011 from the SALES table, run the following command:

oracle$ $OFFLOAD_HOME/bin/offload -t SH.SALES -x --older-than-date=2012-01-01

Step 1.3: Offload Status Report

To report on the offload status of the SALES table, run the following command:

oracle$ $OFFLOAD_HOME/bin/offload_status_report -s sh -t sales

Note

A text format report containing the results will be produced in the $OFFLOAD_HOME/log directory. The exact report location will be displayed in the Generate Report section of the command output.

See also

Offload Status Report section in the Utilities Guide.

Step 1.4: Offload Q1 2012 partitions

To incrementally offload Q1 2012 partitions from the SALES table, run the following command:

oracle$ $OFFLOAD_HOME/bin/offload -t SH.SALES -x --older-than-date=2012-04-01

Note

Gluent Data Platform detects that all 2011 partitions are already offloaded and skips them. This demonstrates the Partition-Based Offload capability of Offload.

Step 1.5: Offload Status Report

To report on the new offload status of the SALES table, run the following command:

oracle$ $OFFLOAD_HOME/bin/offload_status_report -s sh -t sales

Present: Present a Table to Oracle Database

Objective

The objective of this tutorial is to demonstrate the Present process of Gluent Data Platform, which enables backend-only tables to be presented to an Oracle Database instance so they can be queried.

Overview

In this tutorial we will:

  1. Present the backend SALES_HISTORY table

  2. Query the data in the Oracle Database instance

Step 2.1: Source parameter file

To source the parameter file for the duration of this scenario, run the following commands:

oracle$ . /u01/app/gluent/offload/conf/offload.env

Step 2.2: Present the backend table

To present the backend SALES_HISTORY table, run the following command:

oracle$ $OFFLOAD_HOME/bin/present -t legacy.sales_history -x --target-name=sh.sales_history

Step 2.3: Query the data in the Oracle Database instance

To query the data, run the following command:

SQL> SELECT COUNT(*) FROM sh_h.sales_history;

Query: Run Queries Against Hybrid Schema

Objective

The objective of this tutorial is to demonstrate querying data through the Hybrid Schema, and to show a number of the optimizations that Gluent Data Platform provides.

Overview

In this tutorial we will:

  1. Run a query demonstrating Predicate Pushdown

  2. Run a query demonstrating partition pruning

  3. Run a query demonstrating Advanced Aggregation Pushdown

  4. Run a query demonstrating (Adaptive) Join Filter Pulldown

Step 3.1: Run a query demonstrating predicate pushdown

To execute the query and view the execution plan, run the following commands:

SQL> SET LINESIZE 200 PAGESIZE 100

SQL> ALTER SESSION SET STATISTICS_LEVEL = ALL;

SQL> SELECT prod_id
     ,      cust_id
     ,      amount_sold
     FROM   sh_h.sales
     WHERE  time_id <= DATE '2011-01-01'
     AND    prod_id = 23;

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

Note

Setting the statistics level to ALL is used for demonstration purposes only, it is not required to run hybrid queries.

Note

The SALES_EXT external table is the interface to the offloaded SALES data in the backend. When the SALES_EXT table is accessed, Smart Connector is invoked to build an Impala query, execute it and format the results for the external table.

To view the generated Impala query for this example, run the following command:

SQL> @/home/oracle/backend_sql_text 80b42fts9ctz4

Note

Observe that the filter predicates on both TIME_ID and PROD_ID have been pushed down to Impala.

Step 3.2: Run a query demonstrating partition pruning

To execute the query and view the execution plan, run the following commands:

SQL> SET LINESIZE 200 PAGESIZE 100

SQL> ALTER SESSION SET STATISTICS_LEVEL = ALL;

SQL> SELECT prod_id
     ,      cust_id
     ,      amount_sold
     FROM   sh_h.sales
     WHERE  time_id BETWEEN DATE '2012-03-01' AND DATE '2012-03-02'
     AND    prod_id = 23;

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

To view the generated Impala query for this example, run the following command:

SQL> @/home/oracle/backend_sql_text c89c3ma4ycphu

Note

Observe that Smart Connector has pushed the filter predicates on the RDBMS partition key column (TIME_ID) to the equivalent backend partition key column (GL_PART_M_TIME_ID) to achieve partition pruning.

To view the HDFS scanning section of the Impala query profile, run the following command:

SQL> @/home/oracle/backend_profile c89c3ma4ycphu

Note

Notice the partitions=1/15 line in the profile showing that Impala only scanned 1 of the 15 offloaded partitions.

Step 3.3: Run a query demonstrating Advanced Aggregation Pushdown

To execute the query and view the execution plan, run the following commands:

SQL> SET LINESIZE 200 PAGESIZE 100

SQL> ALTER SESSION SET STATISTICS_LEVEL = ALL;

SQL> SELECT prod_category
     ,      SUM(amount_sold) AS total_sold
     FROM   sh_h.sales
     ,      sh.products
     WHERE  sales.prod_id = products.prod_id
     AND    sales.quantity_sold < 2
     AND    sales.time_id BETWEEN DATE '2012-03-01' AND DATE '2012-03-02'
     GROUP BY
            prod_category
     ORDER BY
            total_sold DESC;

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

Note

The SALES hybrid view is referenced in the SQL statement above. Note that Oracle Database used a rewrite rule, created automatically when the SALES table was offloaded, to rewrite the query to use the SALES_AGG hybrid view.

To view the generated Impala query for this example, run the following command:

SQL> @/home/oracle/backend_sql_text 6tm4h4fp7g3xp

Note

Smart Connector detected that it was invoked for an aggregate Hybrid External Table and constructed and executed a corresponding aggregate Impala query in Impala.

Step 3.4: Run a query demonstrating (Adaptive) Join Filter Pulldown

To execute the query and view the execution plan, run the following commands:

SQL> SET LINESIZE 200 PAGESIZE 100

SQL> ALTER SESSION SET STATISTICS_LEVEL = ALL;

SQL> SELECT prod_category
     ,      SUM(amount_sold) AS total_sold
     FROM   sh_h.sales
     ,      sh.products
     WHERE  sales.prod_id = products.prod_id
     AND    sales.time_id BETWEEN DATE '2012-03-01' AND DATE '2012-03-02'
     AND    products.prod_category IN ('Hardware','Photo')
     GROUP BY
            prod_category
     ORDER BY
            total_sold DESC;

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

Note

There are no filter predicate values against the PROD_ID column on the SALES_AGG_EXT table.

To view the generated Impala query for this example, run the following command:

SQL> @/home/oracle/backend_sql_text 7g3djzp6scm1d

Note

Smart Connector detected the join between the SALES Hybrid View and the PRODUCTS table and the filter predicate on the PRODUCTS table. It resolved the PROD_ID values corresponding to the PRODUCTS filter predicate and pulled those values down to Impala to be included in the query of the offloaded SALES data.

Advisor

Gluent Advisor is a set of tools, including Offload Advisor that allows organizations to get immediate visibility into the benefits that Gluent Data Platform offers.

Data is gathered from a candidate database via simple SQL scripts that extract storage and workload information from ASH, AWR and catalog views. This data is uploaded to Gluent Advisor Portal and is ingested and analyzed by Offload Advisor to generate rich HTML reports on the offload potential of the database.

For more information and to download Gluent Advisor Data Extractor visit this link.

Next Steps

To discuss how Gluent can help your company embrace the hybrid world, please get in touch.

Support

For support with Gluent Express contact Gluent Support via gluent-express@gluent.com

Documentation Feedback

Send feedback on this documentation to: feedback@gluent.com