Tuesday, June 21, 2011

High commit class leads to database hang

Visit the Below Website to access unlimited exam questions for all IT vendors and Get Oracle Certifications for FREE
http://www.free-online-exams.com
Problem Description: High commit class leads to database hang


awr_report_8230_8231.html
##########################
Begin Snap: 8230 23-مايو -11 10:00:17 153 3.9
End Snap: 8231 23-مايو -11 11:00:15 267 3.9
Elapsed: 59.96 (mins)
DB Time: 10,449.88 (mins)



Top 5 Timed Events

Event Waits Time(s) Avg Wait(ms) % Total Call Time Wait Class
log file sync 3,991,487 406,741 102 64.9 Commit
buffer busy waits 54,237 45,214 834 7.2 Concurrency
enq: TX - row lock contention 32,661 45,169 1,383 7.2 Application
row cache lock 10,708 28,410 2,653 4.5 Concurrency
CPU time 21,986 3.5


user commits 76,902 21.38 0.71

log switches (derived) 0 0.00

Redo size: 59,194.47 1,972.70


log file sync 3,991,487 99 406,741 102 36.97
log file parallel write 40,583 1,763 43 0.38


Operating System Statistics
BUSY_TIME 3,009,982
IDLE_TIME 2,750,363

SYS_TIME 1,626,376
USER_TIME 1,383,606



Action:
================
Hi,
1. It shows sys% of CPU was more than 50% . If cpu is busy enough then lgwr will be slow to wake up to flush redo.
Please collect OS watcher to check all CPU statistics . (NOTE.301137.1 OS Watcher User Guide ) . This need to be collected during the issue period.

2. Please collect an errorstack of LGWR and upload the same. (should be collected during the issue period).

connect / as sysdba
oradebug setospid <OSPID> -- ospid of lgwr
oradebug unlimit
oradebug dump errorstack 3
wait 20 seconds
oradebug dump errorstack 3
wait 20 seconds
oradebug dump errorstack 3

Trace file will be at user_dump_dest. Please upload the trace file.



3. collect a system state dump during the issue period only.
sqlplus /nolog
SQL> connect / as sysdba
SQL> oradebug setmypid
SQL> oradebug unlimit
SQL> oradebug dump systemstate 266
-- wait for 20-30 seconds after completion of above command and then issue the same command again
SQL> oradebug dump systemstate 266
SQL> oradebug tracefile_name -- This will give trace file name
SQL> exit



Explanation:

From ecs_ora_23777.trc
=========================

waiting for 'log file parallel write' wait_time=0, seconds since wait started=0
files=1, blocks=7, requests=1
blocking sess=0x0 seq=33775
Dumping Session Wait History
for 'log file parallel write' count=1 wait_time=0.140560 sec
files=1, blocks=6f, requests=1
for 'log file parallel write' count=1 wait_time=0.741955 sec
files=1, blocks=51, requests=1
for 'log file parallel write' count=1 wait_time=0.157550 sec
files=1, blocks=4, requests=1
for 'rdbms ipc message' count=1 wait_time=0.001759 sec
timeout=9d, =0, =0
for 'rdbms ipc message' count=1 wait_time=0.000006 sec
timeout=9d, =0, =0
for 'log file parallel write' count=1 wait_time=0.003976 sec
files=1, blocks=b, requests=1
for 'log file parallel write' count=1 wait_time=0.005834 sec
files=1, blocks=2, requests=1
for 'rdbms ipc message' count=1 wait_time=0.020856 sec
timeout=a0, =0, =0
for 'rdbms ipc message' count=1 wait_time=0.000008 sec
timeout=a0, =0, =0
for 'log file parallel write' count=1 wait_time=0.026097 sec
files=1, blocks=26, requests=1
Sampled Session History of session 2750 serial 3



adsunecs1_iostat_11.05.24.1100.dat
====================================

683.3 12.0 23337.7 37.0 0.0 6.0 0.0 8.6 0 100 c3t60060160C3831E004E870C08A369DF11d0
1.0 0.0 8.0 0.0 0.0 0.2 0.0 213.0 0 21 c3t60060160C3831E006CF4B589E1DDDF11d0


248.9 0.0 6223.5 0.0 0.0 11.1 0.0 44.7 0 100 c3t60060160C3831E004E870C08A369DF11d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t60060160C3831E006CF4B589E1DDDF11d0

291.4 1.0 6872.6 8.0 0.0 11.2 0.0 38.5 0 100 c3t60060160C3831E004E870C08A369DF11d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c3t60060160C3831E006CF4B589E1DDDF11d0
.........
........
Solution:
==================
LGWR was mostly waiting upon 'log file parallel write' and that contributes to log file sync wait. This implies the waits were due to writing to disk.
Looking as iostat report it shows the above IO disk device was 100% busy many of the time.
I would request to check the device where log files are placed.
- Reduce other I/O activity on the disks containing the redo logs, or use dedicated disks.
- Move the redo logs to faster disks or a faster I/O subsystem.
- raid 1+0 is fine for redolog files. ( RAID 5 is not efficient for writes.)

For log file sync:
- The application might be committing after every INSERT, rather than batching COMMITs.
Applications can reduce this wait by committing after inserting a number of rows, rather than
every row.

The disks or I/O subsystems where the redologs are placed may be too busy.
- Reduce other I/O activity on the disks containing the redo logs, or use dedicated disks.
- Move the redo logs to faster disks or a faster I/O subsystem.
- Move the redolog files from RAID 5 devices. RAID 5 is not efficient for writes.
- Alternate redo logs on different disks to minimize the effect of the archiver on the log writer.

Example:

separate archives and redos from each other:

So, as an example

disk 1 - redo log group 3 - member number 2
redo log group 1 - member number 1

disk 2 - redo log group 1 - member number 2
redo log group 2 - member number 1

disk 3 - redo log group 2 - member number 2
redo log group 3 - member number 1

when the log files are operating, they can create contention
on their own, LGWR is writing to the members of one group,
while ARCH is reading from another group.

When the database archives it will use the 1st log file
of the group, so if you separate out the log files as
noted above, lgwr will be writing to group 1, which is
on disk 1 and disk 2, while ARCH is reading from disk
3(log group 3, member 1),
then LGWR switches to group 2 which is on disks 2 and 3
and ARCH starts reading from disk 1, etc.

This will cut down the contention on the LOG groups.

So put the log groups and archives on seaprate disks
with no other files, to keep the activity to zero, except
for the redo activity.
Get Oracle Certifications for all Exams
Free Online Exams.com

1 comment:

Anonymous said...

I'm gone to tell my little brother, that he should also pay a visit this web site on regular basis to take updated from latest reports.
Also see my site > diet plans that work