Wednesday, June 18, 2008

Block Size and Oracle Database Performance

Recently, on the Oracle OTN Forums, there has been a major debate over the
Oracle block size and database performance. Some top Oracle experts including Jonathan Lewis and Steve Karam have joined the debate. More details can be found here and also here on the issue of whether or not block size for Oracle determines whether or not performance can be improved or not.

In a nutshell, some of the discussion touched on potential bugs in Oracle database code with respect to the use of ASSM and block sizes.

I may or may not create some actual benchmark tests to see how my results differ from the financial services customer I worked a few years ago to resolve the poor performance in the past. I saw major improvement by changing the block size from 8k to 16k. Due to NDA and non-disclosure laws, I was unable to post the actual statistics. Also, one item from the above block size and Oracle performance debate that has been interesting to note is we have two groups of thought at odds with each other: one such as Jonathan Lewis and Tom Kyte advocate extensive test cases and posting results to argue a point on Oracle and the second group represented by Steve Karam and Don Burleson advocate real world cases to prove how Oracle database really works in the real world.

It really depends on how much time I have.

Personally, both camps of thought have some interesting points to consider. I argue this because testing is important and actual results from real world production systems are critical as well. However actual benchmark testing from at the bare minimum a subset of a live production system appears to be more accurate than a simple test case.

Monday, June 9, 2008

My Favorite New Feature in Oracle 11g- Backup and Recovery

Oracle 11g has tons of cool new features that developers and DBAs should be excited about since the first release last summer. For me, as a DBA and consultant, I like the new Data Recovery advisors for Oracle 11g Backup and Recovery with RMAN.

Oracle ACE Arup Nanda has an excellent guide that covers all the details.

The two main features that I particular like about the new Data Recovery Advisor feature new to Oracle 11g release are the automatic repair scripts and the validate database command in RMAN. First lets take a look at an example on how these function in Oracle 11g Release 1(11.1).

Let's say that I have problems with my USERS tablespace and the datafile associated with this tablespace is corrupted or a disk media failure loses the data.

Assuming that I have been a prudent and more likely wise paranoid Oracle DBA, I have good backups tested and available to do the database restore and recovery operation.

Instead of using the normal RMAN commands that I, as the seasoned DBA know by heart to perform the database recovery operation for the lost datafile for the USER tablespace while half asleep at lets say 2am on a Saturday night when the operator pages me, I can make use of the Data Recovery Advisor in RMAN to help guide me to a quick and accurate recovery for my missing or corrupted data file.

First I simulate a data loss by deleting the data file from the operating system.

When I shutdown and startup my test Oracle 11g database, Oracle complains about the missing data files.




And we see the infamous words:



ORA-01157: cannot identify/lock data file 4 - see DBWR trace file
ORA-01110: data file 4: '/u01/app/oracle/oradata/ora11g/users01.dbf'



OK so how can we quickly diagnose the solution for our database recovery operation with Oracle 11g? Well thanks to the new Data Recovery Advisor, it is a piece of cake if we have a good backup.

First, lets start a new RMAN session and connect to our target or catalog if we are using an RMAN catalog. For sake of simple example, we just use target and no catalog for the test.

rman target /

Once we connect to the target database that requires database recovery, we use the new list failure and list failure detail commands with RMAN to find out the root cause of the database failure.





OK, so now I know what the problem is. Now let's try out the cool new feature of the Oracle 11g Data Recovery Advisor to help us with a repair script and recovery strategy for our lost data files.

We use the advise failure command new to Oracle 11g for RMAN to give us the details on how to recover from the data loss. Below is an example for Oracle 11g on Red Hat Linux.





Notice that the Data Recovery advisor informs us that the data files in question are missing or have been lost as well as recommended actions. As an experienced DBA, we should know these steps fairly well and can verify the steps with the advisor. One nice touch before we implicitly believe the Data Recovery Advisor is to run the repair failure preview command to test out a simulation using the automated repair scripts provided to us by the Data Recovery Advisor.


Let's check out what the Data Recovery Advisor tells us when we run the repair failure preview command from within RMAN for our test Oracle 11g database.




Notice that the Data Recovery Advisor provides us with both the list of steps to perform the database recovery option as well as the impact to the database for the recovery operation as well as location of the repair scripts. One nice feature of the Data Recovery Advisor is that these tasks can be scripted in the Unix shell scripting language of your choice and then scheduled to run via a daily cron job for detection and repair of database failures.

Now that we are satisfied with the output for the recovery scripts from the Data Recovery Advisor, let's go ahead and use the repair scripts above to complete our database restore and recovery.

The repair failure command will use the default repair scripts created by the Data Recovery Advisor to execute these tasks as shown in the following example.




First, the Data Recovery Advisor for Oracle 11g in RMAN shows us the tasks to be executed and then prompts us whether or not we really really want to do this. Since we like the choices given, we choose yes and the advisor quickly carries out the database recovery operation for us right away.




Once the restore and recovery operation has completed successfully, Oracle 11g RMAN asks us if we would like to open the database. Since we were unable to open the database last time due to the missing or lost data files, we choose yes to test whether the restore has been successful.

Since it takes a while to open my test Oracle 11g database, I get a cup of coffee and come back.




Now that RMAN has completed the database recovery, lets test to see if we can access our missing data from our USERS and EXAMPLE tablespaces that were lost earlier.

But first, I would like to illustrate the other way cool feature of Oracle 11g for Backup and Recovery with RMAN: validate database

In the past when an Oracle DBA has to check for corrupt blocks and repair corrupt blocks in earlier releases before 11g, the dbverify utility was the tool of choice along with the block recover feature in Oracle RMAN.
Not only was this time consuming tracking down which corrupt blocks belong to which data file and associated tablespaces, but it was a pain in the rear to deal with.

Well now with 11g for Oracle RMAN, we have the validate command. It is fantastic!

Let's view a few examples.

To check the entire database for block corruption- we have the validate database command.

RMAN will cycle through the entire database as shown in the example below:





My advice for Oracle DBAs out there is to run a validate command after each database recovery operation to verify that no block corruption exists. It also helps to do this on a regular basis just as part of good preventive maintenance and sanity checks.

Now lets run the validate tablespace command to check for block corruption with the recently restored EXAMPLE and USERS tablespaces:



And for our EXAMPLE tablespace:




We also have the validate datafile command to check for block corruption with a specific data file. For example, to check for corruption with data file 1 which happens to belong to the SYSTEM tablespace:





I like this a lot better than the old dbverify utility which frankly to be honest was a royal pain to deal with.

Now that we have covered the new features for Oracle 11g RMAN with backup and recovery for the new Data Recovery Advisor, lets verify that our data has been restored by logging on to our database and checking to see that our data has been restored for these tables in the USERS and EXAMPLE tablespaces.




Now that our data has been recovered we are satisfied with the power and ease that the new Oracle 11g Data Recovery Advisor affords us busy Oracle DBAs.

Cheers,
Ben