Friday, December 13, 2019

Exadata : Hybrid Columnar Compression HCC



Exadata : Hybrid Columnar Compression

Introduction


Hybrid Columnar Compression (HCC) is a storage-related Oracle Database feature that causes the database to store the same column for a group of rows together.
Its storing column data together in this way can dramatically increase the storage savings achieved from compression. Because database operations work transparently against compressed objects, no application changes are required.
The HCC feature is available on database deployments created on Oracle Database Cloud Service using the Enterprise Edition - High Performance or Enterprise Edition - Extreme Performance software edition
Purpose

·       Hybrid Columnar Compression on Exadata enables the highest levels of data compression and provides enterprises with tremendous cost-savings and performance improvements due to reduced I/O.

·       HCC is optimized to use both database and storage capabilities on Exadata to deliver tremendous space savings AND revolutionary performance.

·       Average storage savings can range from 10x to 15x depending on which Hybrid Columnar Compression level is implemented



Best usage in data warehouse bulk loading techniques.

To maximize storage savings with Hybrid Columnar Compression, data must be loaded using data warehouse bulk loading techniques. Examples of bulk load operations commonly used in data warehouse environments are:
  • Insert statements with the APPEND hint 
  • Parallel DML
  • Direct Path SQL*LDR  
  • Create Table as Select (CTAS)



  1. Queries  on hybrid columnar compressed data often run in the Exadata storage cells with Smart Scans, using a high performance query engine that utilizes special columnar processing techniques.


Note
Data remains compressed not only on disk, but also remains compressed in the Exadata Smart Flash Cache, on Infiniband, in the database server buffer cache, as well as when doing back-ups or log shipping to Data Guard.

The best part of this feature is data process without sacrificing the robust feature set of the Oracle Database.

Migration and Best Practices 

Building type of compression in tables as per business requirement

·         COMPRESS FOR QUERY HIGH
·         COMPRESS FOR QUERY LOW
·         COMPRESS FOR ARCHIVE LOW
·         COMPRESS FOR ARCHIVE HIGH

For new tables and partitions, enabling Hybrid Columnar Compression is as easy as simply Creating the table or partition and specifying a compression level, such as “COMPRESS FOR QUERY HIGH”.

Example:
CREATE TABLE dbaocm (sno  NUMBER, ename VARCHAR2(128), dept VARCHAR2(128)) COMPRESS FOR QUERY HIGH;
CREATE TABLE dbaocm (sno  NUMBER, ename VARCHAR2(128), dept VARCHAR2(128)) COMPRESS FOR QUERY LOW;
CREATE TABLE dbaocm (sno  NUMBER, ename VARCHAR2(128), dept VARCHAR2(128)) COMPRESS FOR ARCHIVE LOW;
CREATE TABLE dbaocm (sno  NUMBER, ename VARCHAR2(128), dept VARCHAR2(128)) COMPRESS FOR ARCHIVE HIGH;

Practice

1.    Establish a terminal connection to 192.168.1.140 (rac140.ora.com) exadata compute node as the oracle user.

2.    Connect to your database with SQL*Plus. Login as the sh user.

[oracle@192.168.1.140 ~]$ sqlplus sh/sh

3.    Determine the size of the uncompressed INVENTORY table in sh schema

SQL> SET LIN200 PAGES 200
COL SEGMENT_NAME FORMAT A40
SELECT SEGMENT_NAME, SUM(BYTES)/1024/1024 SIZE_MB
FROM USER_SEGMENTS
WHERE SEGMENT_NAME LIKE 'INVENT%'
GROUP BY SEGMENT_NAME;

SEGMENT_NAME                                SIZE_MB
---------------------------------------- ----------
INVENTORY                                       366

4.    Verify that the INVENTORY table is uncompressed

SQL> select table_name, compression, compress_for
from user_tables
where table_name like 'INVENT%';


TABLE_NAME                     COMPRESS COMPRESS_FOR
------------------------------ -------- ------------
INVENTORY                      DISABLED


5.    Exadata Hybrid Columnar Compression achieves its highest levels of compression with data that is direct-path inserted.

alter session force parallel query;
alter session force parallel ddl;
alter session force parallel dml;

6.    Create a compressed copy of the INVENTORY table using the QUERY HIGH warehouse compression mode.

SQL> CREATE TABLE INVENT_QUERY
COMPRESS FOR QUERY HIGH
PARALLEL 4 NOLOGGING
AS SELECT * FROM INVENTORY;

Table created.


7.    Create a compressed copy of the INVENTORY table using the ARCHIVE HIGH archive compression mode.

SQL> SQL> CREATE TABLE INVENT_ARCHIVE
COMPRESS FOR ARCHIVE HIGH
PARALLEL 4 NOLOGGING
AS SELECT * FROM INVENTORY;

Table created.

8.    Verify the compression mode settings for the tables just created

SQL> SELECT TABLE_NAME, COMPRESSION, COMPRESS_FOR
FROM USER_TABLES
WHERE TABLE_NAME LIKE 'INVENT%';


TABLE_NAME                     COMPRESS COMPRESS_FOR
------------------------------ -------- ------------
INVENT_QUERY                   ENABLED  QUERY HIGH
INVENT_ARCHIVE                 ENABLED  ARCHIVE HIGH
INVENTORY                      DISABLED


Compare the size of the original uncompressed table with the two compressed copies you created. Calculate the compression ratios achieved using the formula:

SQL> SELECT SEGMENT_NAME,SUM(BYTES)/1024/1024 MB
FROM USER_SEGMENTS
WHERE SEGMENT_NAME LIKE 'INVENT%'
GROUP BY SEGMENT_NAME;


SEGMENT_NAME                                     MB
---------------------------------------- ----------
INVENTORY                                       366
INVENT_ARCHIVE                              34.1875
INVENT_QUERY                                     55

So here we can easily see the size of tables using HCC which is having same number of rows in different size , which gives 10X storage saving  using HCC storage feature.

CONCLUSION

The massive growth in data volume, experienced by enterprises, introduces significant challenges. Industries  must quickly adapt to the changing business landscape without influencing the bottom line. Its responsibility of IT managers need to efficiently manage their existing infrastructure to control costs and deliver extraordinary application performance. Advanced Compression  provide a robust set of compression, performance and data storage optimization capabilities that succeed in this complex environment.