Gluent Express¶
Table of Contents
Documentation Conventions¶
Commands and keywords are in this
font
.$OFFLOAD_HOME
is set when the environment file (offload.env
) is sourced, unless already set, and refers to the directory namedoffload
that is created when the software is unpacked. This is also referred to as<OFFLOAD_HOME>
in sections of this guide where the environment file has not been created/sourced.Third party vendor product names might be aliased or shortened for simplicity. See Third Party Vendor Products for cross-references to full product names and trademarks.
Introduction¶
The Gluent Express sandbox image provides a fully functional Oracle and Cloudera Data Hub environment in which to learn more about Gluent Data Platform.
Gluent Data Platform supports other backends as noted in Supported Backend Data Platforms.
The tutorials in this guide will give you examples of how to get started with some of the tools provided in Gluent Data Platform.
If you are not familiar with Gluent Data Platform then it is recommended that you begin by watching the Gluent Overview video (duration: approximately 10 minutes) and reading the Concepts guide.
Should you require any assistance with the use of Gluent Express, please refer to Support.
Administrative Information¶
Unless specified otherwise, the following should be adhered to:
Run all operating system commands and Impala Shell commands as the oracle OS user. Use
sudo
to gain access to the oracle OS user:
centos$ sudo -iu oracle
Run all Oracle SQL commands as the SYSTEM database user. To connect to the XEPDB1 pluggable database in the XE container:
oracle$ sqlplus <username>/<password>@localhost:1521/xepdb1 [as sysdba]
The following table lists credentials for Oracle Database users:
Username |
Password |
---|---|
SYS |
oracle |
SYSTEM |
oracle |
SH |
sh |
Pre-loaded Data¶
Oracle¶
Table Name |
Num Rows |
Partitioned |
Num Partitions |
---|---|---|---|
CAL_MONTH_SALES_MV |
54 |
NO |
_ |
CHANNELS |
5 |
NO |
_ |
COSTS |
1,787,540 |
YES |
229 |
COUNTRIES |
23 |
NO |
_ |
CUSTOMERS |
55,500 |
NO |
_ |
FWEEK_PSCAT_SALES_MV |
11,266 |
NO |
_ |
PRODUCTS |
72 |
NO |
_ |
PROMOTIONS |
503 |
NO |
_ |
SALES |
2,923,527 |
YES |
229 |
SUPPLEMENTARY_DEMOGRAPHICS |
4,500 |
NO |
_ |
TIMES |
1,826 |
NO |
_ |
Backend¶
Table Name |
Num Rows |
Partitioned |
Num Partitions |
---|---|---|---|
SALES_HISTORY |
48,685,528 |
NO |
_ |
Tutorials¶
The tutorials presented below are designed to give you an understanding of how to use some of the core components in Gluent Data Platform.
Preparation¶
Confirm that the Gluent Express sandbox is configured correctly by running the validation component of Gluent Data Platform called connect
:
oracle$ . /u01/app/gluent/offload/conf/offload.env
oracle$ $OFFLOAD_HOME/bin/connect
Tip
The offload.env
script only needs to be sourced once for a session. It is included in all tutorials below, although strictly unnecessary unless using a new shell or the shell environment changes.
Ensure that all validation checks show Passed.
Note
Should any unrecoverable errors be encountered then contact Gluent Support. See Support.
Offload: Incrementally Offload a Partitioned Table¶
Objective¶
The objective of this tutorial is to demonstrate the Offload process of Gluent Data Platform, which provides a simple and flexible way for users to copy data from an Oracle Database instance to backends while retaining the ability to query offloaded data.
Overview¶
In this tutorial we will:
Offload all partitions up to and including year 2011 from the SALES table
Incrementally offload Q1 2012 partitions from the SALES table
Step 1.1: Source parameter file¶
To source the parameter file for the duration of this scenario, run the following commands:
oracle$ . /u01/app/gluent/offload/conf/offload.env
Step 1.2: Offload up to year 2011 partitions¶
To offload all partitions up to and including year 2011 from the SALES table, run the following command:
oracle$ $OFFLOAD_HOME/bin/offload -t SH.SALES -x --older-than-date=2012-01-01
Step 1.3: Offload Status Report¶
To report on the offload status of the SALES table, run the following command:
oracle$ $OFFLOAD_HOME/bin/offload_status_report -s sh -t sales
Note
A text format report containing the results will be produced in the $OFFLOAD_HOME/log
directory. The exact report location will be displayed in the Generate Report section of the command output.
See also
Offload Status Report section in the Utilities Guide.
Step 1.4: Offload Q1 2012 partitions¶
To incrementally offload Q1 2012 partitions from the SALES table, run the following command:
oracle$ $OFFLOAD_HOME/bin/offload -t SH.SALES -x --older-than-date=2012-04-01
Note
Gluent Data Platform detects that all 2011 partitions are already offloaded and skips them. This demonstrates the Partition-Based Offload capability of Offload.
Step 1.5: Offload Status Report¶
To report on the new offload status of the SALES table, run the following command:
oracle$ $OFFLOAD_HOME/bin/offload_status_report -s sh -t sales
Present: Present a Table to Oracle Database¶
Objective¶
The objective of this tutorial is to demonstrate the Present process of Gluent Data Platform, which enables backend-only tables to be presented to an Oracle Database instance so they can be queried.
Overview¶
In this tutorial we will:
Present the backend SALES_HISTORY table
Query the data in the Oracle Database instance
Step 2.1: Source parameter file¶
To source the parameter file for the duration of this scenario, run the following commands:
oracle$ . /u01/app/gluent/offload/conf/offload.env
Step 2.2: Present the backend table¶
To present the backend SALES_HISTORY table, run the following command:
oracle$ $OFFLOAD_HOME/bin/present -t legacy.sales_history -x --target-name=sh.sales_history
Step 2.3: Query the data in the Oracle Database instance¶
To query the data, run the following command:
SQL> SELECT COUNT(*) FROM sh_h.sales_history;
Query: Run Queries Against Hybrid Schema¶
Objective¶
The objective of this tutorial is to demonstrate querying data through the Hybrid Schema, and to show a number of the optimizations that Gluent Data Platform provides.
Overview¶
In this tutorial we will:
Run a query demonstrating Predicate Pushdown
Run a query demonstrating partition pruning
Run a query demonstrating Advanced Aggregation Pushdown
Run a query demonstrating (Adaptive) Join Filter Pulldown
Step 3.1: Run a query demonstrating predicate pushdown¶
To execute the query and view the execution plan, run the following commands:
SQL> SET LINESIZE 200 PAGESIZE 100
SQL> ALTER SESSION SET STATISTICS_LEVEL = ALL;
SQL> SELECT prod_id
, cust_id
, amount_sold
FROM sh_h.sales
WHERE time_id <= DATE '2011-01-01'
AND prod_id = 23;
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));
Note
Setting the statistics level to ALL is used for demonstration purposes only, it is not required to run hybrid queries.
Note
The SALES_EXT external table is the interface to the offloaded SALES data in the backend. When the SALES_EXT table is accessed, Smart Connector is invoked to build an Impala query, execute it and format the results for the external table.
To view the generated Impala query for this example, run the following command:
SQL> @/home/oracle/impala_sql_text 80b42fts9ctz4
Note
Observe that the filter predicates on both TIME_ID and PROD_ID have been pushed down to Impala.
Step 3.2: Run a query demonstrating partition pruning¶
To execute the query and view the execution plan, run the following commands:
SQL> SET LINESIZE 200 PAGESIZE 100
SQL> ALTER SESSION SET STATISTICS_LEVEL = ALL;
SQL> SELECT prod_id
, cust_id
, amount_sold
FROM sh_h.sales
WHERE time_id BETWEEN DATE '2012-03-01' AND DATE '2012-03-02'
AND prod_id = 23;
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));
To view the generated Impala query for this example, run the following command:
SQL> @/home/oracle/impala_sql_text c89c3ma4ycphu
Note
Observe that Smart Connector has pushed the filter predicates on the RDBMS partition key column (TIME_ID) to the equivalent backend partition key column (GL_PART_M_TIME_ID) to achieve partition pruning.
To view the HDFS scanning section of the Impala query profile, run the following command:
SQL> @/home/oracle/impala_profile c89c3ma4ycphu
Note
Notice the partitions=1/15 line in the profile showing that Impala only scanned 1 of the 15 offloaded partitions.
Step 3.3: Run a query demonstrating Advanced Aggregation Pushdown¶
To execute the query and view the execution plan, run the following commands:
SQL> SET LINESIZE 200 PAGESIZE 100
SQL> ALTER SESSION SET STATISTICS_LEVEL = ALL;
SQL> SELECT prod_category
, SUM(amount_sold) AS total_sold
FROM sh_h.sales
, sh.products
WHERE sales.prod_id = products.prod_id
AND sales.quantity_sold < 2
AND sales.time_id BETWEEN DATE '2012-03-01' AND DATE '2012-03-02'
GROUP BY
prod_category
ORDER BY
total_sold DESC;
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));
Note
The SALES hybrid view is referenced in the SQL statement above. Note that Oracle Database used a rewrite rule, created automatically when the SALES table was offloaded, to rewrite the query to use the SALES_AGG hybrid view.
To view the generated Impala query for this example, run the following command:
SQL> @/home/oracle/impala_sql_text 6tm4h4fp7g3xp
Note
Smart Connector detected that it was invoked for an aggregate Hybrid External Table and constructed and executed a corresponding aggregate Impala query in Impala.
Step 3.4: Run a query demonstrating (Adaptive) Join Filter Pulldown¶
To execute the query and view the execution plan, run the following commands:
SQL> SET LINESIZE 200 PAGESIZE 100
SQL> ALTER SESSION SET STATISTICS_LEVEL = ALL;
SQL> SELECT prod_category
, SUM(amount_sold) AS total_sold
FROM sh_h.sales
, sh.products
WHERE sales.prod_id = products.prod_id
AND sales.time_id BETWEEN DATE '2012-03-01' AND DATE '2012-03-02'
AND products.prod_category IN ('Hardware','Photo')
GROUP BY
prod_category
ORDER BY
total_sold DESC;
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));
Note
There are no filter predicate values against the PROD_ID column on the SALES_AGG_EXT table.
To view the generated Impala query for this example, run the following command:
SQL> @/home/oracle/impala_sql_text 7g3djzp6scm1d
Note
Smart Connector detected the join between the SALES Hybrid View and the PRODUCTS table and the filter predicate on the PRODUCTS table. It resolved the PROD_ID values corresponding to the PRODUCTS filter predicate and pulled those values down to Impala to be included in the query of the offloaded SALES data.
Advisor¶
Gluent Advisor is a set of tools, including Offload Advisor that allows organizations to get immediate visibility into the benefits that Gluent Data Platform offers.
Data is gathered from a candidate database via simple SQL scripts that extract storage and workload information from ASH, AWR and catalog views. This data is uploaded to Gluent Advisor Portal and is ingested and analyzed by Offload Advisor to generate rich HTML reports on the offload potential of the database.
For more information and to download Gluent Advisor Data Extractor visit this link.
See also
Next Steps¶
To discuss how Gluent can help your company embrace the hybrid world, please get in touch.
Support¶
For support with Gluent Express contact Gluent Support via gluent-express@gluent.com