Best Practices

From LucidDB Wiki

Best workflow for query performance

1. Load data
2. Create Indexes
3. Analyze Tables

Background:
Statistics generated during ANALYZE TABLE cover the full range from 'data' to 'indexes'. If you get out of order, you may have indexes, but the query planner will not use them because no statistics exist indicating their usefulness.

What does a complete ETL lifecycle look like?

This is a real-life example and may need to be adjusted if you are using MERGE updates, in which case you will not drop tables etc. You also would have data warehouse labels as an option for providing consistency of data presentation between ETL runs for your users if you do not have to drop tables each ETL run. This example does not reflect that scenario.

A clear out and pull all data cycle is illustrated below.

1. SET SCHEMA to your existing data to be updated.

SET SCHEMA 'TESTING';

2. DROP TABLE on the old stuff. This will also delete old indexes you had on the tables.

3. Compact the database on disk.

ALTER SYSTEM DEALLOCATE OLD;

4. Get rid of your old foreign schema

DROP SCHEMA MYSQL_TESTING CASCADE;

5. Get rid of your old foreign server connection definition

DROP SERVER jdbc_link_testing CASCADE;

6. Recreate a foreign server connection definition

create server jdbc_link_testing

  foreign data wrapper sys_jdbc
  options(
     driver_class 'com.mysql.jdbc.Driver',
     url 'jdbc:mysql://172.27.0.204:3306/testing?useCursorFetch=true',
     user_name 'sg',
     password 'Complex38403BlahPass',   
     login_timeout '10',
     fetch_size '1000',
     validation_query 'select 1',
     schema_name 'testing',
     table_types 'TABLE');

7. Give LucidDB a schema where it can show the foreign tables

CREATE SCHEMA MYSQL_TESTING; 
import foreign schema "testing" limit to ("FACT_ANSWERS","DIM_QUESTION","DIM_SUBMISSION_TYPE","FACT_VBP_CENSUS",
"FACT_SUBMISSION_RESULTS","DIM_PATIENT","DIM_HOSPITAL","DIM_DISEASE",
"DIM_READMISSION_REASON","DIM_DOCTOR")
from server jdbc_link_testing into MYSQL_TESTING;

8. Switch active schema to the destination for your imported tables

SET SCHEMA 'TESTING';

9. Perhaps use applib.create_table to help replicate the foreign tables from source to destination tables and schema

   CALL applib.create_table_as(null,'DIM_SUBMISSION_TYPE','select * from "MYSQL_TESTING"."DIM_SUBMISSION_TYPE"',true);
   CALL applib.create_table_as(null,'FACT_VBP_CENSUS','select * from "MYSQL_TESTING"."FACT_VBP_CENSUS"',true);
.....

10. Create indexes on all the columns involved in joining and filtering in all the tables.
11. Tell LucidDB to calculate statistics about indexes and all tables for the query planner.

....
ANALYZE TABLE DIM_QUESTION COMPUTE STATISTICS FOR ALL COLUMNS;
ANALYZE TABLE FACT_ANSWERS COMPUTE STATISTICS FOR ALL COLUMNS;
... more tables

12. Grant access to users for the new schema you created during the process if necessary

CALL APPLIB.GRANT_SELECT_FOR_SCHEMA( 'TESTING', 'pentaho_lucid' );

Importing tables from other sources:

CALL applib.create_table_as(null,'DIM_SUBMISSION_TYPE','select * from "MYSQL_TESTING"."DIM_SUBMISSION_TYPE"',true);

The above lets LucidDB help you out when importing data from other locations by recreating a suitable table structure for the incoming data. Keep in mind that no indexes or primary key information is transferred over as a result. Be sure to create your indexes then update your statistics.

Easy ways to create indexes:

Discover what may make for a great index
AppLib_SHOW_IDX_CANDIDATES

Have LucidDB automatically create indexes
AppLib_CREATE_INDEXES

Mondrian Schemas with LucidDB/Other DB

Sometimes you will want to create a Mondrian schema and begin testing it in MySQL or other database. It may make design easier if you specify your default schema/database in both databases. This way your Mondrian schema will not require the following entries

 <Table name="DIM_FACILITY" schema="TESTING">

You should instead be able to get away with just

 <Table name="DIM_FACILITY">

Best Practices

Contents

Best workflow for query performance

What does a complete ETL lifecycle look like?

Importing tables from other sources:

Easy ways to create indexes:

Mondrian Schemas with LucidDB/Other DB

Views

Personal tools

Product Documentation

Eigenbase Projects

Wiki Tools

Search

Toolbox