Saturday, April 20, 2024

Setting the Oracle ASM Disk Repair Timer

 

 

As soon as an Exadata storage server hits a disk failure and the disk is no longer available, ASM marks the disk offline and a timer starts running until the disk_repair_time value is reached. If the issue is not fixed and the disk is still unavailable at that point, ASM drops the disk from the disk group. Once the disk is dropped, a rebalance operation is triggered, which can take a long time to complete depending on factors such as the power limit used and the amount of data to rebalance.

 

Once the disk is available to the server again, you add it back to the disk group, and another rebalance operation takes place, with its duration again depending on the amount of data and the power limit.

 

SQL> SELECT GROUP_NUMBER, PASS, STATE FROM V$ASM_OPERATION;
 
GROUP_NUMBER PASS      STAT
------------ --------- ----
           1 RESYNC    RUN
           1 REBALANCE WAIT
           1 COMPACT   WAIT
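
To gauge how much work a running operation still has ahead of it, V$ASM_OPERATION also exposes progress columns. The following is a minimal sketch, assuming the DATA disk group used elsewhere in this post; the power value 8 is only an illustration, not a recommendation, and the allowed range depends on the disk group compatibility settings.

```sql
-- Progress of running ASM operations: allocation units moved so far,
-- estimated total work, and estimated minutes remaining
SELECT group_number, operation, state, power, sofar, est_work, est_minutes
  FROM v$asm_operation;

-- Change the power of the rebalance for one disk group
-- (8 is an arbitrary example value)
ALTER DISKGROUP data REBALANCE POWER 8;
```

A higher power value generally shortens the rebalance at the cost of more I/O competing with the database workload.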

 

The Oracle ASM disk repair timer represents the amount of time a disk can remain offline before it is dropped by Oracle ASM. While the disk is offline, Oracle ASM tracks the changed extents so the disk can be resynchronized when it comes back online. The default disk repair time is 3.6 hours. If the default is inadequate, then the attribute value can be changed to the maximum amount of time it might take to detect and repair a temporary disk failure. The following command is an example of changing the disk repair timer value to 8.5 hours for the DATA disk group:

 

SQL> ALTER DISKGROUP data SET ATTRIBUTE 'disk_repair_time' = '8.5h';

 

The disk_repair_time attribute does not change the repair timer for disks currently offline. The repair timer for those offline disks is either the default repair timer or the repair timer specified on the command line when the disks were manually set to offline. To change the repair timer for currently offline disks, use the OFFLINE command and specify a repair timer value. The following command is an example of changing the disk repair timer value for disks that are offline:

 

SQL> ALTER DISKGROUP data OFFLINE DISK data_CD_06_cell11 DROP AFTER 20h;
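
To see how much time an offline disk has left before it is dropped, the REPAIR_TIMER column of V$ASM_DISK can be queried. This is a minimal sketch; the disk name in the ONLINE statement is simply the one from the example above, and it assumes the disk has been repaired before the timer expires.

```sql
-- Offline disks and the seconds remaining before ASM drops them
SELECT group_number, name, mode_status, repair_timer
  FROM v$asm_disk
 WHERE mode_status = 'OFFLINE';

-- After the disk is repaired and still within the timer, bring it back
-- online so ASM resynchronizes only the extents that changed
ALTER DISKGROUP data ONLINE DISK data_CD_06_cell11;
```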

 

 

To check the repair time for all mounted disk groups, log in to the ASM instance and run the following query:

 

SQL> select dg.name, a.value
  from v$asm_diskgroup dg, v$asm_attribute a
  where dg.group_number = a.group_number
  and a.name = 'disk_repair_time';

 

Note:

 

Vulnerability to a double failure increases as the disk repair time value increases, because the disk group runs with reduced redundancy for as long as the disk stays offline.

 
