RAID and Oracle - 20 Common Questions and Answers
=================================================
1.  What is RAID?

    RAID is an acronym for Redundant Array of Independent Disks. A RAID
    system consists of an enclosure containing a number of disk volumes,
    connected to each other and to one or more computers by a fast
    interconnect. Six levels of RAID are defined: RAID-0 simply stripes
    data across several disks with no redundancy, and RAID-1 is a
    mirrored set of two or more disks. The only other widely used level
    is RAID-5, which is the subject of this article. Other RAID levels
    exist, but tend to be vendor-specific, and there is no generally
    accepted standard for the features they include.

2.  What platforms is RAID available for?

    Third-party vendors supply RAID systems for most of the popular UNIX
    (including Linux) platforms, and for Windows. Hardware vendors often
    provide their own RAID options.

3.  What does RAID do?

    The main feature of RAID-5 is prevention of data loss. If a disk is
    lost because of a head crash, for example, the contents of that disk
    can be reconstituted using the information stored on the other disks
    in the array. In RAID-5, redundancy is provided by parity
    information, a simple form of error-correcting code (ECC), which is
    stored with the data and striped across several physical disks.
    (The intervening RAID levels between 1 and 5 work in a similar way,
    but with differences in the way the ECCs are stored.)

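    The reconstruction described in this answer can be sketched with
    byte-wise XOR parity. This is a minimal illustration, assuming one
    parity block per stripe; the block contents and the function name
    xor_blocks are made up for the example and do not describe any
    vendor's implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, as they might sit on three disks of one stripe
# (hypothetical contents).
data = [b"ORCL", b"REDO", b"DATA"]
parity = xor_blocks(data)  # would be stored on a fourth disk

# Simulate losing the second disk and rebuilding it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'REDO'
```

    Because XOR is its own inverse, any single missing block of a stripe
    can be recovered from the remaining blocks plus the parity block;
    losing two blocks of the same stripe cannot be repaired this way.
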
4.  What are the performance implications of using RAID-5?

    Depending on the application, performance may be better or worse.
    The basic principle of RAID-5 is that files are not stored on a
    single disk, but are divided into sections, which are stored on a
    number of different disk drives. This means that several spindles
    can service a request in parallel, which makes reads faster.
    However, the involvement of more disks and the more complex nature
    of a write operation, which must also update the parity information,
    mean that writes will be slower. So applications where the majority
    of transactions are reads are likely to give better response times,
    whereas write-intensive applications may show worse performance.

    Only hardware-based striping should be used on Windows. Software
    striping, from Disk Administrator, gives very poor performance.

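    One way to see why writes cost more is to count the physical I/Os
    behind a single small write. The sketch below assumes XOR parity, a
    write that touches only one block of a stripe, and no controller
    cache; the function names and block contents are hypothetical.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (full-stripe parity)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def raid5_small_write(old_data, old_parity, new_data):
    """Update one block of a stripe; return (new_parity, physical_ios).

    The old data and old parity must be read (2 reads) before the new
    data and new parity can be written (2 writes).
    """
    new_parity = bytes(
        p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data)
    )
    return new_parity, 4


stripe = [b"AAAA", b"BBBB", b"CCCC"]  # hypothetical data blocks
parity = xor_blocks(stripe)

new_block = b"ZZZZ"
new_parity, ios = raid5_small_write(stripe[1], parity, new_block)
stripe[1] = new_block

# The incremental update matches a full recomputation of the parity.
print(new_parity == xor_blocks(stripe))
print(ios, "physical I/Os for one logical write")
```

    Under these assumptions one logical write becomes four physical
    I/Os, whereas a read of the same block needs only one. Real
    controllers may hide part of this cost with caching or full-stripe
    writes.
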
5.  How does RAID-5 differ from RAID-1?

    RAID-1 (mirroring) is a strategy that aims to prevent downtime due
    to loss of a disk, whereas RAID-5 in effect divides a file into
    chunks and places each on a separate disk. RAID-1 maintains a copy
    of the contents of a disk on another disk, referred to as a
    mirrored disk. Writes to a mirrored disk may be a little slower, as
    more than one physical disk is involved, but reads should be faster,
    as there is a choice of disks (and hence head positions) from which
    to seek the required location.

5.  How do I decide between RAID-5 and RAID-1?

    RAID-1 is indicated for systems where complete redundancy of data
    is considered essential and disk space is not an issue. RAID-1 may
    not be practical if disk space is not plentiful. On a system where
    uptime must be maximised, Oracle recommends mirroring at least the
    control files, and preferably the redo log files.

    RAID-5 is indicated in situations where avoiding downtime due to
    disk problems is important or when better read performance is
    needed and mirroring is not in use.

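    The disk-space side of this decision is simple arithmetic. The
    sketch below assumes two-way mirroring for RAID-1 and one disk's
    worth of parity per array for RAID-5, with identical disks; the
    function name and the sizes are hypothetical.

```python
def usable_capacity_gb(level, disks, disk_size_gb):
    """Return usable capacity in GB for an array of identical disks."""
    if level == "RAID-1":
        # Assumption: two-way mirroring, so half the disks hold copies.
        if disks < 2 or disks % 2:
            raise ValueError("two-way mirroring needs an even number of disks")
        return (disks // 2) * disk_size_gb
    if level == "RAID-5":
        # One disk's worth of space is consumed by parity.
        if disks < 3:
            raise ValueError("RAID-5 needs at least three disks")
        return (disks - 1) * disk_size_gb
    raise ValueError("unsupported level: " + level)


# Six hypothetical 100 GB disks.
print(usable_capacity_gb("RAID-1", 6, 100))  # 300
print(usable_capacity_gb("RAID-5", 6, 100))  # 500
```

    With the same six disks, mirroring yields half of the raw space
    while RAID-5 yields five sixths, which is why RAID-1 may be
    impractical when disk space is not plentiful.
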
6.  Do all drives used for RAID-5 have to be identical?

    Most UNIX systems allow a failed disk to be replaced with one of
    the same size or larger. This is highly implementation-specific, so
    the vendor should be consulted.

7.  Is RAID-5 enough to provide full fault-tolerance?

    No. A truly fault-tolerant system will need to have a separate
    power supply for each disk to allow for swapping of one disk
    without having to power down the others in the array. A fully
    fault-tolerant system has to be purpose-designed.

8.  What is hot swapping?

    This refers to the ability to replace a failed drive without having
    to power down the whole disk array, and is now considered an
    essential feature of RAID-5. An extension of this is to have a hot
    standby disk that eliminates the time taken to swap a replacement
    disk in - it is already present in the disk array, but not used
    unless there is a problem.

9.  What is a logical drive, and how does it relate to a physical drive?

    A logical drive is a virtual disk constructed from one or (usually)
    more physical disks. It is the RAID-5 equivalent of a UNIX logical
    volume; the latter is a software device, whereas RAID-5 uses
    additional hardware.

10. What are the disadvantages of RAID-5?

    The need to tune an application via placement of 'hot' (i.e.
    heavily accessed) files on different disks is reduced by using
    RAID-5. However, if this is still desired, it is harder to
    accomplish, as the file has already been divided up and distributed
    across disk drives. Some vendors, for example EMC, allow striping
    in their RAID systems, but this generally has to be set up by the
    vendor. There is an additional consideration for Oracle, in that if
    a database file needs recovery, several physical disks may be
    involved in the case of a striped file, whereas only one would be
    involved in the case of a normal file. This is a side-effect of the
    capability of RAID-5 to withstand the loss of a single disk.

11. What variables can affect the performance of a RAID-5 device?

    The major ones are:

      - Access speed of constituent disks
      - Capacity of internal and external buses
      - Number of buses
      - Size of caches
      - Number of caches
      - The algorithms used to specify how reads and writes are done

12. What types of files are suitable for placement on RAID-5 devices?

    Placement of data files on RAID-5 devices is likely to give the
    best performance benefits, as these are usually accessed randomly.
    Greater benefits will be seen in situations where reads predominate
    over writes. Rollback segments and redo logs are accessed
    sequentially (usually for writes) and therefore are not suitable
    candidates for being placed on a RAID-5 device. Also, datafiles
    belonging to temporary tablespaces are not suitable for placement
    on a RAID-5 device.

    Another reason redo logs should not be placed on RAID-5 devices is
    related to the type of caching (if any) being done by the RAID
    system. Given the critical nature of the contents of the redo logs,
    catastrophic loss of data could ensue if the contents of the cache
    were not written to disk, e.g. because of a power failure, after
    Oracle had been notified that they had been written. This is
    particularly true of write-back caching, where the write is regarded
    as having been written to disk when it has only been written to the
    cache. Write-through caching, where the write is only regarded as
    having completed when it has reached the disk, is much safer, but
    still not recommended for redo logs for the reason mentioned
    earlier.

13. What about using multiple Database Writers as an alternative to RAID-5?

    Using at least as many DBWR processes as you have database disks
    will maximise synchronous write capability, by avoiding one disk
    having to wait for a DBWR process that is busy writing to another
    disk. However, this is not an alternative to RAID-5, because it
    improves write efficiency, whereas RAID-5 usually results in writes
    being slower.

14. What about other strategies?

    Two strategies that can be used as alternatives to RAID-5, or in
    addition to it, are Asynchronous I/O (aio) and List I/O (listio).
    These are briefly described in the following points.

    In addition, recent Oracle Database releases (10g and 11g) offer a
    number of powerful and sophisticated features for managing storage.
    For more information on these, see the books listed in the
    References section.

15. What is Asynchronous I/O?

    Asynchronous I/O (aio) is a means by which a process can proceed
    with the next operation without having to wait for a write to
    complete. For example, without aio, the DBWR process blocks (waits)
    after starting a write operation until the write has been completed.
    If aio is used, DBWR can continue almost straight away. aio is
    activated by the relevant init.ora parameter, which will either be
    ASYNC_WRITE or USE_ASYNC_IO, depending on the platform. If aio is
    used, there is no need to have multiple DBWRs.

    Asynchronous I/O is optional on many UNIX platforms. It is used by
    default on Windows.

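    The "start the write, carry on, collect the result later" pattern
    can be illustrated as follows. This is only a sketch of the idea: it
    emulates asynchronous writes with a thread pool rather than using
    kernel aio, and it is not how DBWR is implemented. The helper
    write_block, the 8 KB block size and the temporary file are
    assumptions made for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 8192  # hypothetical block size


def write_block(path, offset, payload):
    """Blocking write of one block; returns the number of bytes written."""
    with open(path, "r+b") as f:
        f.seek(offset)
        written = f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # wait until the data has reached the device
        return written


# Create a scratch file large enough for two blocks.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(2 * BLOCK))
    path = tmp.name

blocks = [(0, b"A" * BLOCK), (BLOCK, b"B" * BLOCK)]

with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately, so the caller does not block here.
    pending = [pool.submit(write_block, path, off, buf) for off, buf in blocks]
    # ... the caller is free to prepare the next batch at this point ...
    results = [future.result() for future in pending]  # collect completions

with open(path, "rb") as f:
    contents = f.read()
os.remove(path)

print(results)  # [8192, 8192]
```

    The synchronous equivalent would call write_block twice in a row and
    sit idle during each call; here both writes are in flight while the
    caller continues.
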
16. What are the advantages and disadvantages of Asynchronous I/O?

    In the above DBWR example, the idle time is eliminated, resulting
    in more efficient DBWR operation. However, aio availability and
    configuration are very platform-dependent; while many UNIX versions
    support it, some do not. Raw devices must be used to store the
    files, so the use of aio adds some complexity to the system
    administrator's job. Also, the applications must be able to utilise
    aio.

17. What is List I/O?

    List I/O is a feature found on many SVR4 UNIX variants. As the
    name implies, it allows a number of I/O requests to be batched
    into a "list", which is then read or written in a single
    operation. It does not exist on Windows.

18. What are the advantages and disadvantages of List I/O?

    I/O should be much more efficient when done in this manner. You
    also get the benefits of aio, so aio is not needed if listio is
    available. However, listio is only available on some UNIX systems,
    and as in the case of aio, the system administrator needs to set
    it up and make sure key applications are configured to use it.

19. How do Logical Volume Managers (LVMs) affect use of RAID-5?

    Many UNIX vendors now include support for an LVM in their standard
    product. Under AIX, all filesystems must reside on logical volumes.
    Performance of a UNIX system using logical volumes can be very good
    compared with standard UNIX filesystems, particularly if the stripe
    size (the size of the chunks into which files are divided) is
    small. Performance will generally not be as good as that of RAID-5,
    given that the latter uses dedicated hardware with fast
    interconnects. In practice, many small and medium-sized systems
    will find that logical volumes (with a suitable stripe size for the
    type of application) perform just as well as RAID-5. This
    particularly applies to systems where there is no I/O problem.
    Larger systems, though, are more likely to need the extra
    performance benefits of RAID-5.

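    The effect of stripe size can be sketched by mapping a logical byte
    range onto the disks of a striped volume. The example assumes plain
    round-robin striping with no parity, which is a simplification; the
    function names, disk count and sizes are hypothetical.

```python
def locate(offset, stripe_size, disks):
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    chunk = offset // stripe_size              # which chunk of the volume
    disk = chunk % disks                       # chunks rotate across disks
    disk_offset = (chunk // disks) * stripe_size + offset % stripe_size
    return disk, disk_offset


def disks_touched(offset, length, stripe_size, disks):
    """Return the sorted disk indexes that a single request has to visit."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return sorted({chunk % disks for chunk in range(first, last + 1)})


KB = 1024
# A 64 KB request on a four-disk volume, with two different stripe sizes.
print(disks_touched(0, 64 * KB, 16 * KB, 4))  # [0, 1, 2, 3]
print(disks_touched(0, 64 * KB, 64 * KB, 4))  # [0]
print(locate(70000, 16 * KB, 4))              # (0, 20848)
```

    With the smaller stripe size the same request is spread over all
    four disks and can be serviced in parallel; with the larger one it
    lands on a single disk. Which is preferable depends on the type of
    application, as noted above.
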
20. How can I tell if my strategy to improve I/O performance is working?

    On UNIX, there are several commands that can help you determine
    if a disk device is contributing to I/O problems. On SVR4, use the
    'sar' command with the appropriate flag, usually '-d'. On BSD, use
    the 'iostat' command. A healthy disk has a short average request
    queue length, ideally zero. Disks with more than a few entries in
    the queue may need attention. Also check the percent busy value, as
    a disk might have a short average queue length yet be very active.

    On Windows, the Performance Monitor allows I/O statistics to be
    monitored easily and in a graphical manner.

    On any platform, it is essential to obtain baseline figures for
    normal system operation, so you will know when a performance
    problem develops and when your corrective action has restored (or
    improved upon) the expected performance.

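    The checks described in this answer can be sketched as a small
    filter over per-disk figures, however they were collected. The
    device names, sample values, baseline figures and thresholds below
    are all hypothetical; suitable limits depend on the system and
    should come from your own baseline.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    device: str
    avg_queue: float  # average request queue length
    pct_busy: float   # percent busy


def flag_disks(samples, baseline_busy, max_queue=2.0, max_busy=80.0):
    """Return {device: [reasons]} for disks that may need attention."""
    flagged = {}
    for s in samples:
        reasons = []
        if s.avg_queue > max_queue:
            reasons.append("long queue")
        if s.pct_busy > max_busy:
            reasons.append("very busy")
        base = baseline_busy.get(s.device)
        # Assumed rule of thumb: twice the baseline counts as a change.
        if base is not None and s.pct_busy > 2 * base:
            reasons.append("busier than baseline")
        if reasons:
            flagged[s.device] = reasons
    return flagged


baseline = {"sd0": 10.0, "sd1": 40.0, "sd2": 85.0}  # made-up baseline
samples = [
    DiskSample("sd0", 0.1, 12.0),
    DiskSample("sd1", 5.3, 95.0),
    DiskSample("sd2", 0.4, 91.0),
]
print(flag_disks(samples, baseline))
```

    In this made-up data, sd1 is flagged on all three counts, while sd2
    shows the case mentioned above: a short queue, yet a very active
    disk.
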
References:
RAID and Oracle - 20 Common Questions and Answers [ID 38281.1]