Monday 9 April 2012

Oracle STATS collection. 



Optimizer statistics are a collection of data that describe more details about the database and the objects in the database. These statistics are used by the query optimizer to choose the best execution plan for each SQL statement. Optimizer statistics include table statistics (number of rows,number of blocks,average row length etc..),column statistics (number of  NULLs, cardinality, histograms etc..),Index statistics and System statistics. Since the data in the DB  constantly  changes its very important the STATs are updated properly. Otherwise this will cause serious performance issues.

Automatic Optimizer STATS Collection.

By default 11g have Auto STATs collection enabled during DB  maintenance windows. This is the suggested method for STAT collection. 
Automatic optimizer statistics collection gathers optimizer statistics by calling the DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure. The GATHER_DATABASE_STATS_JOB_PROC procedure collects statistics on database objects when the object has no previously gathered statistics or the existing statistics are stale because the underlying object has been modified significantly (more than 10% of the rows).The DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC is an internal procedure, but it operates in a very similar fashion to the DBMS_STATS.GATHER_DATABASE_STATSprocedure using the GATHER AUTO option. The primary difference is that the DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure prioritizes the database objects that require statistics, so that those objects which most need updated statistics are processed first. This ensures that the most-needed statistics are gathered before the maintenance window closes

Disabling auto stat collection (use ENABLE if at all you need to enable)
BEGIN
  DBMS_AUTO_TASK_ADMIN.DISABLE(
    client_name => 'auto optimizer stats collection', 
    operation => NULL, 
    window_name => NULL);
END;
/

Use following query to verify scheduled jobs are running. Here you can find AUTO STAT ran fine


select * from dba_autotask_client ;
You can check the history of this job using the following query . This way we can figure out Auto STATs on a given date ran fine.
SELECT *FROM dba_autotask_client_history
select * from dba_autotask_job_history

make sure background process CJQ0 is running for AUTO STAT to be successful. You can enable the parameter JOB_QUEUE_PROCESSES (maybe to 1000 ) and this will start the CJQ0 process. 

Manual STATS Collection

Even though we have an automated job for collecting STATS regularly, lot of scenarios will come where we need to collect STATS manually for a table,schema or for an index etc..Examples are say you have a table which got truncated and loaded with huge amount of data, the subsequent select on this table will become slow. You have a table where data changes continually than 10%, these are good candidates for collecting STATS.

The DBMS_STATS package can gather statistics on table and indexes and individual columns and partitions of tables
GATHER_INDEX_STATSIndex statistics
GATHER_TABLE_STATSTable, column, and index statistics
GATHER_SCHEMA_STATSStatistics for all objects in a schema
GATHER_DICTIONARY_STATSStatistics for all dictionary objects
GATHER_DATABASE_STATSStatistics for all objects in a database

Collecting Schema Stats

EXEC DBMS_STATS.gather_schema_stats (ownname => 'SYSADM', cascade =>true,estim
ate_percent => dbms_stats.auto_sample_size);
Collecting table stats

EXEC DBMS_STATS.gather_table_stats ('SCHEMA_NAME','TABLE_NAME',cascade=>true,estimate_percent => 5)
Few parameters of interest

cascade- will include all related indexes also.
Using this option is equivalent to runninggather_table_stats plus running gather_index_stats for each index on the table
estimated_percent- decides the sample size. Auto is default in 11g and oracle decides what percentage (sample_size) to take. In Oracle 11g, the AUTO_SAMPLE_SIZE constant is the preferred sample size as the mechanism for determining the actual sample size has been improved. In addition, the statistics estimate based on the auto sampling are near to 100% accurate and much faster to gather than in previous versions. Sample size is notyhing but how much rows you are taking for calculating the STATS. If you are not giving this parameter it will take all rows and SAMPLE_SIZE will be equel to number of rows. SAMPLE_SIZE and NUM_ROWS can be found from dba_tables. 
Degree- to parallelize the collection of statistics. degree=>x (where is the degree of parallelism, depends processors available and other parallel settings on DB)