Best Practices
Contents |
Best workflow for query performance
1. Load data
2. Create Indexes
3. Analyze Tables
Background:
Statistics generated during ANALYZE TABLE cover the full range from 'data' to 'indexes'. If you get out of order, you may have indexes, but the query planner will not use them because no statistics exist indicating their usefulness.
What does a complete ETL lifecycle look like?
This is a real-life example and may need to be adjusted if you are using MERGE updates, in which case you will not drop tables etc. You also would have data warehouse labels as an option for providing consistency of data presentation between ETL runs for your users if you do not have to drop tables each ETL run. This example does not reflect that scenario.
A clear out and pull all data cycle is illustrated below.
1. SET SCHEMA to your existing data to be updated.
SET SCHEMA 'TESTING';
2. DROP TABLE on the old stuff. This will also delete old indexes you had on the tables.
3. Compact the database on disk.ALTER SYSTEM DEALLOCATE OLD;4. Get rid of your old foreign schema
DROP SCHEMA MYSQL_TESTING CASCADE;5. Get rid of your old foreign server connection definition
DROP SERVER jdbc_link_testing CASCADE;
6. Recreate a foreign server connection definition
create server jdbc_link_testing
foreign data wrapper sys_jdbc
options(
driver_class 'com.mysql.jdbc.Driver',
url 'jdbc:mysql://172.27.0.204:3306/testing?useCursorFetch=true',
user_name 'sg',
password 'Complex38403BlahPass',
login_timeout '10',
fetch_size '1000',
validation_query 'select 1',
schema_name 'testing',
table_types 'TABLE');
7. Give LucidDB a schema where it can show the foreign tables
CREATE SCHEMA MYSQL_TESTING;
import foreign schema "testing" limit to ("FACT_ANSWERS","DIM_QUESTION","DIM_SUBMISSION_TYPE","FACT_VBP_CENSUS",
"FACT_SUBMISSION_RESULTS","DIM_PATIENT","DIM_HOSPITAL","DIM_DISEASE",
"DIM_READMISSION_REASON","DIM_DOCTOR")
from server jdbc_link_testing into MYSQL_TESTING;
8. Switch active schema to the destination for your imported tables
SET SCHEMA 'TESTING';
9. Perhaps use applib.create_table to help replicate the foreign tables from source to destination tables and schema
CALL applib.create_table_as(null,'DIM_SUBMISSION_TYPE','select * from "MYSQL_TESTING"."DIM_SUBMISSION_TYPE"',true); CALL applib.create_table_as(null,'FACT_VBP_CENSUS','select * from "MYSQL_TESTING"."FACT_VBP_CENSUS"',true); .....
10. Create indexes on all the columns involved in joining and filtering in all the tables.
11. Tell LucidDB to calculate statistics about indexes and all tables for the query planner.
.... ANALYZE TABLE DIM_QUESTION COMPUTE STATISTICS FOR ALL COLUMNS; ANALYZE TABLE FACT_ANSWERS COMPUTE STATISTICS FOR ALL COLUMNS; ... more tables
12. Grant access to users for the new schema you created during the process if necessary
CALL APPLIB.GRANT_SELECT_FOR_SCHEMA( 'TESTING', 'pentaho_lucid' );
Importing tables from other sources:
CALL applib.create_table_as(null,'DIM_SUBMISSION_TYPE','select * from "MYSQL_TESTING"."DIM_SUBMISSION_TYPE"',true);
The above lets LucidDB help you out when importing data from other locations by recreating a suitable table structure for the incoming data. Keep in mind that no indexes or primary key information is transferred over as a result. Be sure to create your indexes then update your statistics.
Easy ways to create indexes:
Discover what may make for a great index
AppLib_SHOW_IDX_CANDIDATES
Have LucidDB automatically create indexes
AppLib_CREATE_INDEXES
Mondrian Schemas with LucidDB/Other DB
Sometimes you will want to create a Mondrian schema and begin testing it in MySQL or other database.
It may make design easier if you specify your default schema/database in both databases. This way your Mondrian schema will not require the following entries
<Table name="DIM_FACILITY" schema="TESTING">
You should instead be able to get away with just
<Table name="DIM_FACILITY">