Monday 24 June 2013

PeopleSoft Upgrades - A High-Level Process Flow


This doc explains what exactly happens in a PeopleSoft upgrade. The process is generic to all application upgrades, but for the explanations I am taking the example of an HR 9.0 to HR 9.2 upgrade.
How many DBs do we need for an initial upgrade? Let's say I am doing an HR 9.0 (Tools 8.48) to HR 9.2 (Tools 8.53) upgrade:
1.       Copy of Production (the target database, referred to below as COP)
2.       Old release Demo DB (HR 9.0 Demo)
3.       New release Demo DB (HR 9.2 Demo)

What I am trying to do here is explain the important steps involved in a PeopleSoft upgrade and, in depth, what these steps do. Since there are a couple of changes introduced in PeopleSoft 9.2, the article is based on 9.2 upgrades.

UPGCUST lifecycle during Upgrade
UPGCUST is a very important project throughout the upgrade process and is the main project that handles customizations during the upgrade. This is how PeopleSoft takes care of customizations on application objects during the upgrade process.

In the first chapter, the UPGCUST project is created in your target DB by comparing the Copy of Production against the copy of the current (old release) Demo database. That means UPGCUST will contain all the customized objects, and it remains in the target DB.

Now, in the Tools Upgrade chapter, the target is converted to the same tools release as the source, in this case 8.53. So what do we have now?
  1. In the target we have UPGCUST, which contains all customizations
  2. We have a new release Demo, which has all the new objects we need

Now, at the step Running New Release UPGCUST Compare, we compare the UPGCUST project in the COP with the new release Demo. This compare report needs to be analyzed in depth, and we have to decide which customizations to keep and which ones to delete. This may need involvement from the Dev/Functional teams. Once the decisions are made, UPGCUST with the customizations we decided to preserve gets copied to the new release Demo. So now the new release Demo has all the customizations in the COP plus all the new objects.

SQL Renames
Why do we do some SQL renames in chapter 1?
This step renames the physical table to a "temporary" name at the database level. It is needed to handle cases where a record name is changed in the new release. Let me explain with an example.
Let's say a record PS_RECORD1 in PeopleSoft 9.0 is renamed to PS_RECORD2 in PeopleSoft 9.2. That means PS_RECORD2 should contain PS_RECORD1's data after the upgrade. Let's see how this happens:
  1. PS_RECORD1 exists in 9.0. In chapter 1 we rename PS_RECORD1 to PS_RECORD1_TMP at the database level.
  2. When the new release definitions plus customizations get copied to the COP, RECORD2 gets copied in and RECORD1 gets deleted from the COP.
  3. During the create-table step, PS_RECORD2 gets built but is empty.
  4. Data then gets populated into PS_RECORD2 during data conversion (DC) from the temporary table created in step 1.
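
A minimal sketch of the mechanics above (PS_RECORD1/PS_RECORD2 and the column names are illustrative only, not delivered objects):

-- Chapter 1: preserve the old-release table under a temporary name
RENAME PS_RECORD1 TO PS_RECORD1_TMP;

-- After the new release import and the create-table step, PS_RECORD2 exists but is empty.
-- A (greatly simplified) data conversion step repopulates it from the preserved table:
INSERT INTO PS_RECORD2 (KEY_FIELD, DATA_FIELD)
  SELECT KEY_FIELD, DATA_FIELD
  FROM PS_RECORD1_TMP;
COMMIT;
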
Data Mover Renames
You can also see field renames done using Data Mover. These are needed when a field is renamed in the new release.
For example, let's say a field FIELD1 is renamed to FIELD2 in 9.2. When the upgrade is complete, ideally all records that used FIELD1 should have FIELD2 carrying FIELD1's data.
Let's assume we don't use the Data Mover rename:
  1. In this case, every record that uses FIELD1 gets the new column FIELD2 when the new release import happens.
  2. Now, when we alter such a record, the build drops column FIELD1 (as it is no longer in App Designer after the new release import) and adds a new column FIELD2 (as FIELD2 gets copied during the new release import). Since the alter has no way of knowing that FIELD2 is the new name of FIELD1, this results in data loss, as the generated script below shows.
-- 
-- Alters for record PS_RECORD
--      FIELD2- add
--       FIELD1 - delete

-- Start the Transaction


-- Create temporary table

CREATE TABLE PSY_RECORD (OTHERFIELD VARCHAR2(10) NOT NULL,
    FIELD2 VARCHAR2(10) NOT NULL
) TABLESPACE XXX STORAGE (INITIAL
 40000 NEXT 100000 MAXEXTENTS UNLIMITED PCTINCREASE 0) PCTFREE 10
 PCTUSED 80
/

-- Copy from source to temp table

INSERT INTO PSY_RECORD (
    OTHERFIELD,
    FIELD2
    )
  SELECT
    OTHERFIELD,
    ' ' -- Note: FIELD2 gets populated with the default value, not with the FIELD1 value, which results in data loss
  FROM PS_RECORD
/

-- CAUTION: Drop Original Table

DROP TABLE PS_RECORD
/

-- Rename Table

RENAME PSY_RECORD TO PS_RECORD
/

Now, ideally we need the FIELD1 data to end up in FIELD2 to avoid data loss during the field rename. This is where the Data Mover rename comes into play:
  1. In chapter 1 we rename FIELD1 to FIELD2 using Data Mover. This makes an entry in the PSOBJCHNG table to indicate that FIELD1 was renamed to FIELD2.
  2. Now all the records have FIELD2 instead of FIELD1, and after the new release import, when we alter any such record, the process first checks PSOBJCHNG, sees that there is an entry for FIELD2, and identifies it as a rename of FIELD1. So in this case FIELD2 is populated with the FIELD1 values during the alter.

SQL> select count(*) from PSOBJCHNG;

  COUNT(*)
----------
         0

_________________________________________

rename field recordname.fieldname as newfieldname;


Started:  Fri Mar 22 08:00:13 2013
Data Mover Release: 8.53
Database: PF9U (ENG)
Rename  FIELD1  ->  FIELD2
Ended: Fri Mar 22 08:00:15 2013
Successful completion
Script Completed.


_________________________________________


SQL> select * from PSOBJCHNG;

E  OLDNAME   NEWNAME   OBJCHG_STAMP                   OLDNAME2
-  --------  --------  -----------------------------  --------
3  FIELD1    FIELD2    22-MAR-13 08.00.15.208645 AM   RECORD1


_______________________________________________


Copy the new definition (FIELD2) to the target using App Designer during the upgrade


________________________________________________


Alter

________________________________________________


-- Do not perform deletes at this time


-- 
-- Alters for record PS_RECORD
--       FIELD2 - add to rename FIELD1 (76,51)
--       FIELD1 - delete to rename FIELD2 (76,51)

-- Start the Transaction


-- Create temporary table

CREATE TABLE PSY_RECORD (OTHERFIELD VARCHAR2(5) NOT NULL,
   FIELD2 VARCHAR2(10) NOT NULL) TABLESPACE XXX STORAGE (INITIAL
 40000 NEXT 100000 MAXEXTENTS UNLIMITED PCTINCREASE 0) PCTFREE 10
 PCTUSED 80
/

-- Copy from source to temp table

INSERT INTO PSY_RECORD (
    OTHERFIELD,
    FIELD2
    )
  SELECT
    OTHERFIELD,
    FIELD1
  FROM PS_RECORD
/

-- CAUTION: Drop Original Table

DROP TABLE PS_RECORD
/

-- Rename Table

RENAME PSY_RECORD TO PS_RECORD
/

-- Done

ALLTABS Build
ALLTABS is a build process which handles the following
Alter existing tables to:
ADD new columns
CHANGE columns
RENAME columns/tables
So this is a build process in which we don't do any deletes. Let's see why we are not doing deletes.
Let's say a record RECORD1 has the following structure in the COP:
RECORD1
FIELD1 VARCHAR(10)
FIELD2 NUMBER
FIELD3 VARCHAR(5)
The same record in the new release (source) has the following structure:
RECORD1
FIELD1 VARCHAR(10)
FIELD2 NUMBER
NEWFIELD VARCHAR(10)

So in the target DB we have a table PS_RECORD1 with the three-field structure shown first. Now RECORD1 gets overwritten when the new release import happens, so after the new release copy RECORD1 in App Designer has FIELD1, FIELD2 and NEWFIELD (FIELD3 is gone from the definition), while the physical table PS_RECORD1 in the COP DB still has FIELD1, FIELD2 and FIELD3. During the ALLTABS alter we don't select deletes, so PS_RECORD1 gets built with the new column added but FIELD3 is not dropped at this point. So PS_RECORD1 after ALLTABS has a hybrid structure as shown below:
RECORD1
FIELD1 VARCHAR(10)
FIELD2 NUMBER
FIELD3 VARCHAR(5)
NEWFIELD VARCHAR(10)

The reason we need FIELD3 is that during DC this field will be read and its data will get pumped into other fields or other tables as per the new design. Once the data from this field is handled properly, the build process after DC will delete this extra column.
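
A minimal sketch of the net effect on PS_RECORD1 in this example (illustrative only; the real build script uses the create/insert/drop/rename pattern shown earlier rather than a simple ALTER):

-- ALLTABS adds the new column but performs no deletes, so FIELD3 survives
ALTER TABLE PS_RECORD1 ADD (NEWFIELD VARCHAR2(10) DEFAULT ' ' NOT NULL);
-- FIELD3 is still present here; it is dropped only by the alter that runs after data conversion
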
As I already mentioned, if there were any field renames in chapter 1, those are handled by the ALLTABS build as explained above.
ALLTABS also handles datatype changes that can be handled by the build process, such as a field changing from varchar to char, or the length of a field changing in the new release. Any field datatype or length change that can be handled by an alter-by-table-rename can be handled by the ALLTABS alter.
But there will be some complex datatype or length changes which cannot be handled by the ALLTABS alter. Say, for example, a LOB field is changed to a number in the new release. We cannot simply do an insert-from-select as we normally do in the ALLTABS alter. These are complex transformations, and DC handles them.
Data Conversion Process
As the name indicates, data conversion (DC) is the process of converting the data in the COP to the new structure. In order to convert data we typically need the old release data (and the columns/tables we preserved by renaming) plus the new release tables/columns that come in with the new release import. This way the DC Application Engines can read from the old structures and write to the new ones.
The ALLTABS build takes care of the normal conversions that can be carried out directly through build SQL, like simply increasing a column size and straightforward datatype changes. What DC handles are the difficult conversions and those that need some logic when moving to the new release.
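
For example, a conversion step that needs logic might look roughly like this sketch (the tables, columns and code values are illustrative, not a delivered conversion):

-- Map an old status code to the new-release code while moving the data
INSERT INTO PS_NEW_TBL (EMPLID, STATUS_NEW)
  SELECT EMPLID,
         CASE STATUS_OLD WHEN 'A' THEN 'ACT'
                         WHEN 'I' THEN 'INA'
                         ELSE 'UNK'
         END
  FROM PS_OLD_TBL_TMP;
COMMIT;
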


What Happens after DC
After DC there may be a new set of setup data that we need to export from the new release Demo and import into the target.

Alters after DC
So after DC, the next alter handles deletes as well. In the ALLTABS alter the extra columns were not deleted because they were needed for DC; in this alter, the columns which are no longer needed are deleted, since DC is over. This completes the structural changes.
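
For the earlier RECORD1 example, the post-DC alter effectively does something like this (illustrative; the generated script again uses the table-rename pattern):

-- Post-DC alter: the obsolete column can now be dropped safely
ALTER TABLE PS_RECORD1 DROP (FIELD3);
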






Monday 10 September 2012

Peoplesoft VERSION Application Engine




We normally run the VERSION AE in PeopleSoft under various circumstances, but it's essential to understand what the VERSION AE actually does.
PSVERSION - this table stores an acronym (AEM - Application Engine Management, PJM - Projects Management, PPC - PeopleCode Management, UPM - User Profile Management, RDM - Record Data Management, etc.) for each object type within the PeopleSoft application and its corresponding version value. PSVERSION is updated by the panel processor, and every time the PeopleSoft component processor tries to retrieve an object definition, it first checks the PSVERSION table to obtain the most up-to-date version of the specific object.

Therefore, whenever an object definition is changed, the PSVERSION table is updated correspondingly. For example, if a record is modified in Application Designer, the application first updates the version in PSRECDEFN (UPDATE PSRECDEFN SET VERSION = <version from PSVERSION> + 1 WHERE RECNAME = 'YourRecord') and afterwards it updates the PSVERSION table, as shown below:
UPDATE PSVERSION SET VERSION = VERSION + 1 WHERE
OBJECTTYPENAME = 'RDM'
UPDATE PSVERSION SET VERSION = VERSION + 1 WHERE
OBJECTTYPENAME = 'SYS'
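Written out as real SQL, the PSRECDEFN side of the same save is roughly the following (record name is illustrative):

UPDATE PSRECDEFN
   SET VERSION = (SELECT VERSION + 1 FROM PSVERSION WHERE OBJECTTYPENAME = 'RDM')
 WHERE RECNAME = 'YOUR_RECORD';
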
This way, the next time the record object is accessed, the Application Server will issue the following SQL statements:
SELECT MAX(VERSION) FROM PSVERSION WHERE OBJECTTYPENAME = 'RDM'
SELECT MAX(VERSION) FROM PSRECDEFN
and will verify that the values returned by the two selects are the same. If these values are different, you might encounter this kind of error when running SYSAUDIT:
Version Check Audits - Exception(s) Found, Manager - OBJECTTYPENAME (e.g. RDM, UPM, etc.) Version check of table xxxxDEFN against PSVERSION failed.
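A quick way to perform the same comparison manually is a query along these lines (RDM/PSRECDEFN is just the example object type):

SELECT (SELECT MAX(VERSION) FROM PSVERSION WHERE OBJECTTYPENAME = 'RDM') AS PSVERSION_RDM,
       (SELECT MAX(VERSION) FROM PSRECDEFN) AS PSRECDEFN_MAX
FROM DUAL;
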
The same approach is taken when one tries to modify a user profile: PSVERSION is updated and the VERSION field of the corresponding user in the PSOPRDEFN table is changed accordingly.
UPDATE PSVERSION SET VERSION = VERSION + 1 WHERE
OBJECTTYPENAME = 'UPM'
UPDATE PSVERSION SET VERSION = VERSION + 1 WHERE
OBJECTTYPENAME = 'SYS'
UPDATE PSOPRDEFN SET VERSION = (SELECT MAX(VERSION) FROM PSVERSION WHERE OBJECTTYPENAME = 'UPM') WHERE OPRID = 'YourOPRID'

When running the SQR report SYSAUDIT from Sample Processes (PT < 8.4x) or System Process Requests (PT >= 8.4x), one might encounter the following error:
Version Check Audits - Exception(s) Found, Manager - OBJECTTYPENAME (e.g. RDM, UPM, etc.) Version check of table xxxDEFN against PSVERSION failed

This might occur because the panel processor failed to update PSVERSION. Normally, the Application Server compares the version number in the file RDM.KEY (IDX) to that in PSVERSION. If the PSVERSION database object type version is greater than that of the .IDX (KEY) file, then the object type is retrieved from the database and the RDM.DAT file is updated with the latest version of the object.

The PS objects are stored, at any given time, in .DAT files on the Application Server and in the cache memory of the same server. The issues generally arise when projects containing various objects are migrated from one environment to another without previously stopping and clearing the Application Server cache to ensure that all the version fields get updated correctly in both the XXXDEFN and PSVERSION tables. Of course, deleting the Application Server cache every time a project/object is migrated is not the desired solution, taking into consideration that this will impact the end users and generate useless disk load while the newly migrated objects are re-cached.
In order to fix this error one should try the following solutions:
- Execute the VERSION Application Engine program from the command line: psae -CD -CT -CO -CP -R -AI VERSION
Afterwards, ensure that all the version values are 1 by issuing the following command on the database:
SELECT * FROM PSVERSION;
and, in the end, empty the Application Servers' cache (it is recommended to have more than one application server, with the available servers set up in a load-balancing architecture; this way the users are not affected when the servers are stopped, the cache cleaned, and the servers restarted, one by one). The VERSION AE resets all version numbers for all XXXDEFN tables and sets VERSION=1 in the PSVERSION and PSLOCK tables (it will not force every object to be re-cached the way deleting the cache does). In the end, the VERSION Application Engine updates the LASTREFRESHDTTM field in the PSSTATUS table, which tells the caching mechanism to compare objects in the cache files and the database and synchronize.
- Another solution, which is usually not recommended due to its various implications, is to update the LASTREFRESHDTTM date directly in the database to today's date in the PSSTATUS table (UPDATE PSSTATUS SET LASTREFRESHDTTM = SYSDATE), which does not require shutting down the application servers. When the LASTREFRESHDTTM column gets updated, the result is the purging of the cache files: when a cache file is accessed by a process for the first time, the process first reads LASTREFRESHDTTM from PSSTATUS, then compares this datetime with the datetime on the cache file (a cache file contains a datetime to reflect when the file was created). If the two datetime values are different, the cache file is purged. But this comparison occurs only when a cache file is accessed by a process for the first time; after that, the process uses the cache file without comparing the datetime values again. In three-tier mode, because the application server has been up, the comparison will not be done, so the cache files are not purged and refreshed.
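
As a single statement, that alternative is roughly the following (verify the column name against your PSSTATUS definition before using it):

UPDATE PSSTATUS SET LASTREFRESHDTTM = SYSDATE;
COMMIT;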



The effect of the VERSION Application Engine program is fairly violent
upon the PeopleTools system tables, and as such should be used only in
certain, specific instances. Technically speaking, it should only be
run when there is reason to believe that the versions in the PSVERSION
table are no longer coherent, or when the versions in one of the
managed objects tables are out of sync with PSVERSION.

Generally speaking it should only be run when indicated by one of the following:

1. The SYSAUDIT report indicates that there are objects out of synch.

2. A GSC analyst recommends its use.

3. The PeopleTools development team recommendation for a specific issue.

4. Following a major PeopleTools upgrade or major application upgrade.

5. An Installation Guide indicates its need.

NOTE: VERSION should NOT be run as a matter of standard
operating procedure.

Due to some side effects from the Application Designer program
(PSIDE.EXE) VERSION must be run only when all Application Designer
clients are completely logged out and shut down. Many customers
choose to restart the database to ensure that this is the case. All
Application Servers, Process Schedulers, and other PeopleTools clients
should be completely shut down while it is being run.

PROCESS SCHEDULER OPERATION

Logically following the previous point, the use of the Process
Definition that allows the VERSION AE to be run from the Process
Scheduler is no longer recommended. The VERSION AE should only be run
from a command line, and then only when that connection is the only
one active against the database. (Note: this does not mean that the
database is in "single user mode".)

If the VERSION program is run incorrectly, performance can be
dramatically impacted. It is not uncommon for the stability of the
Application Server processes to be compromised. Additionally,
application integrity can possibly be affected; simple saving efforts
from Application Designer can fail.


PROPER PROCEDURE

The proper steps to follow when running VERSION AE are:

1. Shutdown all Application Servers, Process Schedulers, and
other PeopleTools clients.

2. *critical step* Ensure that all Application Designer sessions
are logged out and shut down. If necessary, shut down and restart
the database and its communication software.

3. Establish the proper shell environment. Normally this
includes logging in as your PeopleSoft id, changing to the PSFT bin
directory (i.e. cd $PS_HOME/bin), and setting the PS_SERVER_CFG
environment variable (export
PS_SERVER_CFG=$PS_HOME/appserv/prcs/<dbname>/psprcs.cfg)


4. Execute the command from a command line:
psae -CD <dbname> -CT <dbtype> -CO <oprid> -CP <pswd> -R
INSTALL [where INSTALL is a valid run control] -AI VERSION
(Note: INSTALL must be a valid run control for <oprid>)

5. Issue the following SQL and archive the output to ensure that
the program ran (all the versions should be 1).
SELECT * FROM PSVERSION ORDER BY VERSION DESC;

6. Clear the cache files from the Application Server, Web
Servers, Process Schedulers, and Client machines.

7. Restart the system per normal operational procedures. There
will be expected performance impact as the system rebuilds its cache
files.

8. Over the course of the following days, every 4 hours or so,
rerun the SQL from Step #5. You should observe a gradual growth of
the versions, typically in the order of dozens per day. The version
associated with SYS should always be equal to or greater than all
other values in the table.

Should you observe one of the following conditions contact the GSC
immediately for further advice.

1. The version value associated with SYS is no longer greater than
or equal to all other values in the PSVERSION table.

2. Some of the values increase dramatically, on the order of
several thousand, and then remain fairly constant. Normal behavior is
for the values to increase by increments of 1. One exception would be
during the migration of a project with many records. Some values will
increase by the number of records migrated.



Histograms in Oracle




Where there is a high degree of skew in the column distribution, called a non-uniform distribution of data, histograms should lead to a better estimation of selectivity. This should produce plans that are more likely to be optimal. In other words, histograms contain information about the nature of the table data.

What is a histogram? Histograms are a feature of the CBO that helps the optimizer determine how data is skewed (distributed) within a column. A histogram is good to create for a column that is included in WHERE clauses and is highly skewed. Histograms help the optimizer decide whether to use an index or a full-table scan, and help it determine the fastest table join order.

What is a bucket? When histograms are created, the number of buckets can be specified. It is this number that controls the type of histogram created; each bucket corresponds to one row of histogram information.
When building histograms, the stored information is interpreted differently depending on whether the number of buckets requested is less than the number of distinct values or the same. Specifically, ENDPOINT_NUMBER and
ENDPOINT_VALUE in dba/user/all_histograms have different meanings in the two cases.

Oracle uses two types of histograms for column statistics: height-balanced histograms and frequency histograms

Frequency histograms
Each value of the column corresponds to a single bucket of the histogram (this is also called a value-based histogram), and each bucket contains the number of occurrences of that single value. Frequency histograms are automatically created instead of height-balanced histograms when the number of distinct values is less than or equal to the number of histogram buckets specified.
Data          Sorted
Y              Y
Y              Y
Y              Y
N             Y
N             N
NA          N
N             N
Y              NA
NA          NA

Results
Bucket1 Y=4
Bucket2 N=3
Bucket3 NA=2

Height-balanced histogram
In a height-balanced histogram, the column values are divided into bands so that each band contains approximately the same number of rows. The useful information that the histogram provides is where in the range of values the endpoints fall.

A histogram is created using DBMS_STATS.GATHER_TABLE_STATS with METHOD_OPT => 'FOR COLUMNS SIZE <# of buckets> <column name>'.
SIZE determines the number of buckets to be created.
execute dbms_stats.gather_table_stats(ownname => 'scott', tabname => 'employee', method_opt => 'FOR COLUMNS SIZE 10 gender'); -- for a particular column
execute dbms_stats.gather_table_stats(ownname => 'scott', tabname => 'employee', method_opt => 'FOR ALL COLUMNS SIZE 10'); -- for all columns
execute dbms_stats.gather_table_stats(ownname => 'scott', tabname => 'employee', method_opt => 'FOR ALL INDEXED COLUMNS SIZE 10'); -- for all indexed columns




EXAMPLE
-------

Table TAB1

SQL> desc tab1
 Name                            Null?    Type
 ------------------------------- -------- ----
 A                                        NUMBER(6)
 B                                        NUMBER(6)

Column A contains unique values from 1 to 10000.

Column B contains 10 distinct values. The value '5' occurs 9991 times. Values
'1, 2, 3, 4, 9996, 9997, 9998, 9999, 10000' occur only once.

Test queries:

(1) select * from tab1 where b=5;
(2) select * from tab1 where b=3;

Both the above queries would use a FULL TABLE SCAN as there is no other
access method available.

Then we create an index on column B.

select lpad(INDEX_NAME,10), lpad(TABLE_NAME,10),
       lpad(COLUMN_NAME,10), COLUMN_POSITION, COLUMN_LENGTH
from user_ind_columns
where table_name='TAB1'

SQL> /

LPAD(INDEX LPAD(TABLE LPAD(COLUM COLUMN_POSITION COLUMN_LENGTH
---------- ---------- ---------- --------------- -------------
    TAB1_B       TAB1          B               1            22

Now,

(1) select * from tab1 where b=5;
(2) select * from tab1 where b=3;

Both do an INDEX RANGE SCAN to get the ROWID to do a lookup in the table.

With an INDEX present, it would be preferable to do an INDEX RANGE SCAN for
query (2), but a FULL TABLE SCAN for query (1).


ANALYZING THE TABLE
-------------------

Next, analyze the table using compute statistics:

SQL> execute dbms_stats.gather_table_stats(ownname => 'scott', tabname => 'tab1')

From dba_tables:

  NUM_ROWS     BLOCKS EMPTY_BLOCKS  AVG_SPACE  CHAIN_CNT AVG_ROW_LEN
---------- ---------- ------------ ---------- ---------- -----------                             
     10000         64            0         86          0          10

From dba_tab_columns:

NUM_DISTINCT LOW  HIGH   DENSITY  NUM_NULLS NUM_BUCKETS LAST_ANALYZ SAMPLE_SIZE
------------ ---- ---- --------- ---------- ----------- ----------- -----------
       10000 Full Full     .0001          0           1 30-JUN-1999       10000
          10 Full Full        .1          0           1 30-JUN-1999       10000


SQL> select lpad(TABLE_NAME,10), lpad(COLUMN_NAME, 10),
  2  bucket_number, endpoint_value
  3  from user_histograms
  4  where table_name='TAB1';

TABLE_NAME COLUMN_NAME BUCKET_NUMBER ENDPOINT_VALUE
---------- ----------- ------------- --------------
      TAB1           A             0              1
      TAB1           A             1          10000
      TAB1           B             0              1
      TAB1           B             1          10000



The statistics gathering has created 1 bucket for each column, so all values for a column
are in the same bucket. BUCKET_NUMBER is the bucket number, and
ENDPOINT_VALUE is the last column value in that bucket.

Now query (1) and (2) ; both do a FULL TABLE SCAN.

So, the fact that you have statistics about the table and columns does not
help the optimizer to distinguish between how many of each value we have.
The reason it does a FULL TABLE SCAN is because there is a 1 BUCKET histogram
and any value selected for should be in that bucket.


CREATING HISTOGRAMS
-------------------

What you need now is to create histograms so the Optimizer knows how many
values occur for each column.

Query (1): select * from tab1 where b=5;
           should do a FULL TABLE SCAN   and

Query (2): select * from tab1 where b=3;
           should do an INDEX RANGE SCAN

SQL> execute dbms_stats.gather_table_stats(ownname => 'scott', tabname => 'tab1', method_opt => 'FOR COLUMNS SIZE 10 b');

select lpad(TABLE_NAME,10), lpad(COLUMN_NAME, 5),
       endpoint_number, endpoint_value
from user_histograms;

TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
      TAB1           B               1              1
      TAB1           B               2              2
      TAB1           B               3              3
      TAB1           B               4              4
      TAB1           B            9995              5
      TAB1           B            9996           9996
      TAB1           B            9997           9997
      TAB1           B            9998           9998
      TAB1           B            9999           9999
      TAB1           B           10000          10000

So, now there are statistics on the table and on the columns.

You requested 10 buckets (size 10) and there are 10 distinct values.

The ENDPOINT_VALUE shows the column value and the ENDPOINT_NUMBER
shows the cumulative number of rows.

For example, for ENDPOINT_VALUE 2, it has an ENDPOINT_NUMBER 2, the previous
ENDPOINT_NUMBER is 1, hence the number of rows with value 2 is 1. 

Another example is ENDPOINT_VALUE 5. Its ENDPOINT_NUMBER is 9995. The previous
bucket ENDPOINT_NUMBER is 4, so 9995 - 4 = 9991 rows containing the value 5.
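The same arithmetic can be expressed in SQL with an analytic function (a sketch):

SELECT endpoint_value AS column_value,
       endpoint_number
         - NVL(LAG(endpoint_number) OVER (ORDER BY endpoint_number), 0) AS num_rows
  FROM user_histograms
 WHERE table_name = 'TAB1'
   AND column_name = 'B'
 ORDER BY endpoint_number;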

So, now QUERY (1) does in fact do a Full Table Scan.

SQL> select * from tab1 where b=5
SQL> /

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=10 Card=9991 Bytes=99910)

1    0   TABLE ACCESS (FULL) OF 'TAB1' (Cost=10 Card=9991 Bytes=99910)


And, QUERY (2) does do an Index Range Scan.

SQL> select * from tab1 where b=3
SQL> /

Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=6 Card=500 Bytes=5000)
1    0   TABLE ACCESS (BY ROWID) OF 'TAB1' (Cost=6 Card=500 Bytes=5000)
2    1     INDEX (RANGE SCAN) OF 'TAB1_B' (NON-UNIQUE)

This is fine if you have a low number of distinct values, but there can
be tables with a huge number of distinct values.  You don't want to
create a bucket for each value. There would be too much overhead.
In this case you would request less buckets than distinct values.


CREATING HISTOGRAMS WITH LESS BUCKETS THAN DISTINCT VALUES
----------------------------------------------------------

SQL> execute dbms_stats.gather_table_stats(ownname => 'scott', tabname => 'tab1', method_opt => 'FOR COLUMNS SIZE 8 b');


SQL> select lpad(TABLE_NAME,10), lpad(COLUMN_NAME, 5),
  2>       endpoint_number, endpoint_value
  3> from user_histograms;

LPAD(TABLE LPAD( ENDPOINT_NUMBER ENDPOINT_VALUE
---------- ----- --------------- --------------
TAB1     B               0              1
TAB1     B               7              5
TAB1     B               8          10000

Here, Oracle creates the requested number of buckets but puts approximately the same
number of rows into each bucket, so a frequently occurring value spans several
buckets whose endpoint values are all the same.

The ENDPOINT_NUMBER is the actual bucket number and ENDPOINT_VALUE is
the endpoint value of the bucket determined by the column value.

From above, bucket 0 holds the low value for the column. Buckets 1 to 6 are not
shown because consecutive buckets with the same endpoint are compressed to save space.

But we have bucket 1 with an endpoint of 5,
                    bucket 2 with an endpoint of 5,
                    bucket 3 with an endpoint of 5,
                    bucket 4 with an endpoint of 5,
                    bucket 5 with an endpoint of 5,
                    bucket 6 with an endpoint of 5,
                    bucket 7 with an endpoint of 5 AND
                    bucket 8 with an endpoint of 10000

So bucket 8 contains values between 5 and 10000.
All buckets contain the same number of values (which is why they are called
height-balanced histograms), except that the last bucket may have fewer values
than the other buckets.

If the data is uniform, you would not use histograms. However, if you request
the same number of buckets as distinct values, Oracle creates 1 bucket. If
you request fewer buckets, Oracle uses an algorithm to balance values into each
bucket, and any values that remain (which have to be fewer than the number
stored in each height-balanced bucket) go into the last bucket.


STORING CHARACTER VALUES IN HISTOGRAMS
--------------------------------------

Character columns have some exceptional behaviour, in as much as we store
histogram data for the first 32 bytes of any string.  Any predicates that
contain strings greater than 32 characters will not use histogram information
and the selectivity will be 1 / DISTINCT.

Data in histogram endpoints is normalized to double precision floating point
arithmetic.

EXAMPLE
-------

SQL> select * from morgan;

A
----------
a
b
c
d
e
e
e
e


The table contains 5 distinct values. There is one occurrence each of 'a', 'b', 'c'
and 'd', and 4 occurrences of 'e'.

Create a histogram with 5 buckets.

SQL> analyze table morgan compute statistics for columns a size 5;

Looking in user_histograms:

LPAD(TABLE LPAD( ENDPOINT_NUMBER ENDPOINT_VALUE
---------- ----- --------------- --------------
    MORGAN     A               1     5.0365E+35
    MORGAN     A               2     5.0885E+35
    MORGAN     A               3     5.1404E+35
    MORGAN     A               4     5.1923E+35
    MORGAN     A               8     5.2442E+35

So, ENDPOINT_VALUE 5.0365E+35 represents 'a',
                   5.0885E+35 represents 'b',
                   5.1404E+35 represents 'c',
                   5.1923E+35 represents 'd',
                   5.2442E+35 represents 'e'.

Then if you look at the cumulative values for ENDPOINT_NUMBER,
the corresponding  ENDPOINT_VALUE's are correct.

Friday 13 July 2012

When To Apply Bundles and MPs.


Normally, bundles are collections of fixes, and one or more bundles constitute a Maintenance Pack (MP).
MPs are normally of two types: cumulative and delta.

When to apply cumulative MP:

Apply a cumulative MP if you have not applied any patches, bundles or MPs (delta or cumulative). A cumulative MP is the quickest way to bring a vanilla database current.
After applying a cumulative MP, a subsequent delta MP or bundle can be applied.

When to apply delta MP:

Apply a delta MP after you have applied a cumulative or delta MP and have not applied any patch or bundle since. After that you can apply the next delta MP or the next bundle. Once you apply a bundle you can no longer apply MPs; you need to keep applying subsequent bundles.

When to Apply a bundle:

Apply a bundle if you have applied a bundle or MP previously, or if you are on an MP and cannot wait for the next delta MP.
After a bundle is applied you need to keep applying the next bundles; you cannot apply an MP on top of a bundle. This means that once you apply a bundle to your DB you cannot apply MPs any further, and you need to proceed with bundles as they get released.

Tuesday 29 May 2012

Things to note before and after a PeopleTools patch.




Because of heavy customizations, we often end up with lots of issues after applying a tools patch. From my experience, these are a few things we should take care of before and after a tools patch. This is a fairly generic explanation, and things may change per tools release.

File Server
Back up the file server. If you have edited delivered SQRs, COBOL programs, etc., chances are the patch will overwrite those. In that case you may need to reapply the changes.

From the DB side
Make sure to back up the DB completely before applying the patch.
It's always better to apply the patch on a Demo DB first and then compare the PTPATCH project against your PeopleSoft DB (Dev, SIT or Prod, whichever you are going to patch). This gives you an idea of which objects are going to get overwritten or changed. Usually, if you have customizations on things like single sign-on PeopleCode, chances are high that these will get overwritten by the tools patch and the functionality won't work properly after the patch. In such a case, make sure that once the patch is applied you restore these customizations from the backup or from another DB.
On the DB side, after the patching always run the VERSION AE, clear the rowset cache, and run the prcsclear DMS.

App and Prcs server
Normally the app and process scheduler servers won't cause too many issues after a patch. A tools patch won't overwrite the cfg files which store the configuration values we provided, so simply reconfiguring and clearing the cache will do for the app and batch servers. Don't recreate an app or batch server after a tools patch, as that will overwrite all your custom configurations.

Webserver
The webserver is always tricky, especially when you have customizations in the login/logout pages. Normally after a tools patch we have to redeploy the PIA. Before doing so, make a backup of the PS_HOME/webserv folder. Then redeploy the PIA; when you do so, make sure you select the correct web profile, domain name, etc. Also make sure that the webserver cache is cleared.
Once the redeploy is done you will be able to log in, but there are a few files which you may need to restore from the backup:
Configuration.properties , especially when you have a Appserver load balancing setup before the patch.( PORTAL.war/WEB-INF/psftdocs/<SITE>/configuration.properties)
Setenv.sh when you changed values in this file before the tools patch (inside webserv bin)
integrationGateway.properties when you have nodes configured (./applications/peoplesoft/PSIGW.war/WEB-INF/integrationGateway.properties). After restoring, make sure to change the tools version in this file to reflect the patch.
web.xml(./applications/peoplesoft/PORTAL.war/WEB-INF/web.xml), especially if you have custom MIME types
weblogic.xml- contains custom values. (./applications/peoplesoft/PORTAL.war/WEB-INF/weblogic.xml)
Copy the customized login and logout files (if present) into /<PS_HOME>/webserver/…../PORTAL.war/WEB-INF/psftdocs/<websitename>/. This directory will also contain any customized HTML files, so it is better to compare this directory against the backup.
For applications like portal sometimes there will be separate folders which will store custom scripts. If these are overwritten copy them from backup. Always compare the backup webserv and the new one.


Webserver issues

Issue 1: While redeploying the PIA after a tools patch, the following error comes up, even though PIA redeployment should not prompt for an HTTP port.

Invalid HTTP port. Please enter a number.

Solution:
1) Make a backup of file weblogic.xml which is located in <PS_HOME>/webserv/<DOMAIN_NAME>/applications/peoplesoft/PORTAL.war/WEB-INF

2) Edit weblogic.xml and remove any hyphens from the machine/host name in the 'CookieName' parameter
Example: Change this:
<session-param>
<param-name>CookieName</param-name>
<param-value>MY-WEB-SERVER-80-PORTAL-PSJSESSIONID</param-value>
</session-param>
To this:
<session-param>
<param-name>CookieName</param-name>
<param-value>MYWEBSERVER-80-PORTAL-PSJSESSIONID</param-value>
</session-param>

3) Now you should be able to proceed with the redeploy and/or the installation of another site

4) After successful installation, restore the backup of weblogic.xml so that the host name is back to its original value

Monday 9 April 2012

Oracle STATS collection. 



Optimizer statistics are a collection of data that describe the database and the objects in the database in more detail. These statistics are used by the query optimizer to choose the best execution plan for each SQL statement. Optimizer statistics include table statistics (number of rows, number of blocks, average row length, etc.), column statistics (number of NULLs, cardinality, histograms, etc.), index statistics and system statistics. Since the data in the DB changes constantly, it's very important that the stats are kept up to date; otherwise this will cause serious performance issues.

Automatic Optimizer STATS Collection.

By default, 11g has automatic stats collection enabled during the DB maintenance windows. This is the suggested method for stats collection.
Automatic optimizer statistics collection gathers optimizer statistics by calling the DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure. This procedure collects statistics on database objects when an object has no previously gathered statistics or when its existing statistics are stale because the underlying object has been modified significantly (more than 10% of the rows). DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC is an internal procedure, but it operates in a very similar fashion to the DBMS_STATS.GATHER_DATABASE_STATS procedure using the GATHER AUTO option. The primary difference is that GATHER_DATABASE_STATS_JOB_PROC prioritizes the database objects that require statistics, so that the objects which most need updated statistics are processed first. This ensures that the most-needed statistics are gathered before the maintenance window closes.

Disabling auto stat collection (use ENABLE if at all you need to enable)
BEGIN
  DBMS_AUTO_TASK_ADMIN.DISABLE(
    client_name => 'auto optimizer stats collection', 
    operation => NULL, 
    window_name => NULL);
END;
/

Use the following query to verify that the scheduled jobs are running. Here you can confirm that the auto stats job ran fine:


select * from dba_autotask_client;
You can check the history of this job using the following queries. This way we can figure out whether auto stats ran fine on a given date.
SELECT * FROM dba_autotask_client_history;
select * from dba_autotask_job_history;

Make sure the background process CJQ0 is running for auto stats collection to be successful. You can set the parameter JOB_QUEUE_PROCESSES (to 1000, for example), and this will start the CJQ0 process.
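For example (the value shown is only illustrative):

-- Check the current setting, then enable job queue processes if it is 0
SHOW PARAMETER job_queue_processes
ALTER SYSTEM SET job_queue_processes = 1000;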

Manual STATS Collection

Even though we have an automated job for collecting stats regularly, many scenarios will come up where we need to collect stats manually for a table, schema, index, etc. For example, say you have a table which got truncated and reloaded with a huge amount of data; the subsequent selects on this table will become slow. Or you have a table whose data continually changes by more than 10%. These are good candidates for manual stats collection.

The DBMS_STATS package can gather statistics on tables and indexes, as well as individual columns and partitions of tables:
GATHER_INDEX_STATS - Index statistics
GATHER_TABLE_STATS - Table, column, and index statistics
GATHER_SCHEMA_STATS - Statistics for all objects in a schema
GATHER_DICTIONARY_STATS - Statistics for all dictionary objects
GATHER_DATABASE_STATS - Statistics for all objects in a database

Collecting Schema Stats

EXEC DBMS_STATS.gather_schema_stats(ownname => 'SYSADM', cascade => true, estimate_percent => dbms_stats.auto_sample_size);
Collecting table stats

EXEC DBMS_STATS.gather_table_stats ('SCHEMA_NAME','TABLE_NAME',cascade=>true,estimate_percent => 5)
Few parameters of interest

cascade - will include all related indexes as well. Using this option is equivalent to running gather_table_stats plus running gather_index_stats for each index on the table.
estimate_percent - decides the sample size. AUTO is the default in 11g, and Oracle decides what percentage (sample size) to take. In Oracle 11g, the AUTO_SAMPLE_SIZE constant is the preferred sample size, as the mechanism for determining the actual sample size has been improved. In addition, the statistics estimates based on auto sampling are close to 100% accurate and much faster to gather than in previous versions. The sample size is nothing but how many rows you use for calculating the stats. If you do not give this parameter, it will take all rows and SAMPLE_SIZE will be equal to the number of rows. SAMPLE_SIZE and NUM_ROWS can be found in dba_tables (see the sketch after this list).
degree - to parallelize the collection of statistics: degree => x (where x is the degree of parallelism; it depends on the processors available and other parallel settings on the DB).
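
A couple of sketches tying these parameters together (the schema, table name and degree value are illustrative):

-- Gather table stats in parallel with automatic sampling
EXEC DBMS_STATS.gather_table_stats(ownname => 'SYSADM', tabname => 'PS_JOB', cascade => true, estimate_percent => dbms_stats.auto_sample_size, degree => 4);

-- Afterwards, compare SAMPLE_SIZE against NUM_ROWS
SELECT table_name, num_rows, sample_size, last_analyzed
  FROM dba_tables
 WHERE owner = 'SYSADM' AND table_name = 'PS_JOB';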