Maintaining PMEM Devices on Oracle Exadata Storage Servers
Persistent memory (PMEM) devices reside in Exadata X8M-2 and X9M-2 storage server models with High Capacity (HC) or Extreme Flash (EF) storage.

If a PMEM device fails, Oracle Exadata System Software isolates the failed device and automatically recovers the cache using the remaining devices.

If the cache is in write-back mode, the recovery operation, also known as resilvering, restores the lost data by reading a mirrored copy. During resilvering, the grid disk status is ACTIVE -- RESILVERING WORKING. If the cache is in write-through mode, then the data in the failed PMEM device is already stored in the data grid disk, and no recovery is required.
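For example, while resilvering is in progress, you can observe the transitional status with a simple attribute query (a minimal sketch; the grid disk names on your system will differ):

CellCLI> LIST GRIDDISK ATTRIBUTES name, status

Grid disks backed by the recovering cache report ACTIVE -- RESILVERING WORKING until the resilvering operation completes, after which the status returns to ACTIVE.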
- Replacing a PMEM Device Due to Device Failure
  If the PMEM device has a status of failed, you should replace the PMEM device on the Oracle Exadata Storage Server.
- Replacing a PMEM Device Due to Degraded Performance
  If a PMEM device has degraded performance, you might need to replace the module.
Replacing a PMEM Device Due to Device Failure

If the PMEM device has a status of failed, you should replace the PMEM device on the Oracle Exadata Storage Server.
A PMEM fault could cause the server to reboot. The failed device should be replaced with a new PMEM device at the earliest opportunity. Until the PMEM device is replaced, the corresponding cache size is reduced. If the PMEM device is used for commit acceleration (XRMEMLOG or PMEMLOG), then the size of the corresponding commit accelerator is also reduced.
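To gauge the effect on commit acceleration, you can compare the PMEM log size before and after the failure. A sketch, assuming the standard PMEMLOG object on PMEM-equipped cells:

CellCLI> LIST PMEMLOG ATTRIBUTES name, size, status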
An alert is generated when a PMEM device failure is detected. The alert message includes the slot number and cell disk name. If you have configured the system for alert notification, then an alert is sent by e-mail message to the designated address.
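You can also review the alert directly on the cell. A sketch using the standard alert history object (the regular-expression filter is illustrative):

CellCLI> LIST ALERTHISTORY WHERE alertMessage LIKE '.*PMEM.*' DETAIL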
To identify a failed PMEM device, you can also use the following command:

CellCLI> LIST PHYSICALDISK WHERE disktype=PMEM AND status=failed DETAIL
         name:                   PMEM_0_1
         diskType:               PMEM
         luns:                   P0_D1
         makeModel:              "Intel NMA1XBD128GQS"
         physicalFirmware:       1.02.00.5365
         physicalInsertTime:     2019-09-28T11:29:13-07:00
         physicalSerial:         8089-A2-1838-00001234
         physicalSize:           126.375G
         slotNumber:             "CPU: 0; DIMM: 1"
         status:                 failed
In the above output, the slotNumber shows the socket number and DIMM slot number.
1. Locate the storage server that contains the failed PMEM device.

   A white Locator LED is lit to help locate the affected storage server. When you have located the server, you can use the Fault Remind button to locate the failed DIMM.

   Caution: Do not attempt to remove a faulty DCPMM DIMM when the Do Not Service LED indicator is illuminated.

2. Power down the storage server with the failed PMEM device and unplug the power cable for the server.

3. Replace the failed PMEM device.
   - X9M-2: See "Servicing the DIMMs" in Oracle Exadata Storage Server X9-2 EF, HC, and XT Service Manual at https://docs.oracle.com/en/servers/x86/x9-2l/exa-storage/servicing-dimms.html
   - X8M-2: See "Servicing the DIMMs" in Oracle Exadata Storage Server X8-2 EF, HC, and XT Service Manual at https://docs.oracle.com/en/servers/x86/x8-2l/exadata-storage-service-manual/gqtcm.html
4. Restart the storage server.

   Note: During the restart, the storage server will shut down a second time to complete the initialization of the new PMEM device.

The new PMEM device is automatically used by the system. If the PMEM device is used for caching, then the effective cache size increases. If the PMEM device is used for commit acceleration, then commit acceleration is enabled on the device.
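To confirm the outcome, you can verify that the replaced device is healthy and inspect the cache details. A minimal sketch (output not shown):

CellCLI> LIST PHYSICALDISK ATTRIBUTES name, status WHERE diskType=PMEM
CellCLI> LIST PMEMCACHE DETAIL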
Replacing a PMEM Device Due to Degraded Performance

If a PMEM device has degraded performance, you might need to replace the module.

If degraded performance is detected on a PMEM device, the module status is set to warning - predictive failure and an alert is generated. The alert includes specific instructions for replacing the PMEM device. If you have configured the system for alert notifications, then the alerts are sent by e-mail message to the designated address.

The predictive failure status indicates that the PMEM device will fail soon, and it should be replaced at the earliest opportunity. No new data is cached in the PMEM device until it is replaced.
To identify a PMEM device with the status predictive failure, you can also use the following command:

CellCLI> LIST PHYSICALDISK WHERE disktype=PMEM AND status='warning - predictive failure' DETAIL
         name:                   PMEM_0_6
         diskType:               PMEM
         luns:                   P0_D6
         makeModel:              "Intel NMA1XBD128GQS"
         physicalFirmware:       1.02.00.5365
         physicalInsertTime:     2019-11-30T21:24:45-08:00
         physicalSerial:         8089-A2-1838-00001234
         physicalSize:           126.375G
         slotNumber:             "CPU: 0; DIMM: 6"
         status:                 warning - predictive failure
You can also locate the PMEM device using the information in the LIST DISKMAP command:

CellCLI> LIST DISKMAP
   Name       PhysicalSerial         SlotNumber          Status     PhysicalSize  CellDisk     DevicePartition  GridDisks
   PMEM_0_1   8089-a2-0000-00000460  "CPU: 0; DIMM: 1"   normal     126G          PM_00_cel01  /dev/dax5.0      PMEMCACHE_PM_00_cel01
   PMEM_0_3   8089-a2-0000-000004c2  "CPU: 0; DIMM: 3"   normal     126G          PM_02_cel01  /dev/dax4.0      PMEMCACHE_PM_02_cel01
   PMEM_0_5   8089-a2-0000-00000a77  "CPU: 0; DIMM: 5"   normal     126G          PM_03_cel01  /dev/dax3.0      PMEMCACHE_PM_03_cel01
   PMEM_0_6   8089-a2-0000-000006ff  "CPU: 0; DIMM: 6"   warning -  126G          PM_04_cel01  /dev/dax0.0      PMEMCACHE_PM_04_cel01
   PMEM_0_8   8089-a2-0000-00000750  "CPU: 0; DIMM: 8"   normal     126G          PM_05_cel01  /dev/dax1.0      PMEMCACHE_PM_05_cel01
   PMEM_0_10  8089-a2-0000-00000103  "CPU: 0; DIMM: 10"  normal     126G          PM_01_cel01  /dev/dax2.0      PMEMCACHE_PM_01_cel01
   PMEM_1_1   8089-a2-0000-000008f6  "CPU: 1; DIMM: 1"   normal     126G          PM_06_cel01  /dev/dax11.0     PMEMCACHE_PM_06_cel01
   PMEM_1_3   8089-a2-0000-000003bb  "CPU: 1; DIMM: 3"   normal     126G          PM_08_cel01  /dev/dax10.0     PMEMCACHE_PM_08_cel01
   PMEM_1_5   8089-a2-0000-00000708  "CPU: 1; DIMM: 5"   normal     126G          PM_09_cel01  /dev/dax9.0      PMEMCACHE_PM_09_cel01
   PMEM_1_6   8089-a2-0000-00000811  "CPU: 1; DIMM: 6"   normal     126G          PM_10_cel01  /dev/dax6.0      PMEMCACHE_PM_10_cel01
   PMEM_1_8   8089-a2-0000-00000829  "CPU: 1; DIMM: 8"   normal     126G          PM_11_cel01  /dev/dax7.0      PMEMCACHE_PM_11_cel01
   PMEM_1_10  8089-a2-0000-00000435  "CPU: 1; DIMM: 10"  normal     126G          PM_07_cel01  /dev/dax8.0      PMEMCACHE_PM_07_cel01
If the PMEM device is used for write-back caching, then the data is flushed from the PMEM device to the flash cache. To ensure that data is flushed from the PMEM device, check the cachedBy attribute of all the grid disks and ensure that the affected PMEM device is not listed.
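For example, the check looks like the following; after the flush completes, only flash cell disks (FD_*) should appear in the cachedBy values:

CellCLI> LIST GRIDDISK ATTRIBUTES name, cachedBy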
1. Locate the storage server that contains the failing PMEM device.

   A white Locator LED is lit to help locate the affected storage server. When you have located the server, you can use the Fault Remind button to locate the failed DIMM.

   Caution: Do not attempt to remove a faulty DCPMM DIMM when the Do Not Service LED indicator is illuminated.

2. Power down the storage server with the failing PMEM device and unplug the power cable for the server.

3. Replace the failing PMEM device.
   - X9M-2: See "Servicing the DIMMs" in Oracle Exadata Storage Server X9-2 EF, HC, and XT Service Manual at https://docs.oracle.com/en/servers/x86/x9-2l/exa-storage/servicing-dimms.html
   - X8M-2: See "Servicing the DIMMs" in Oracle Exadata Storage Server X8-2 EF, HC, and XT Service Manual at https://docs.oracle.com/en/servers/x86/x8-2l/exadata-storage-service-manual/gqtcm.html
4. Restart the storage server.

   Note: During the restart, the storage server will shut down a second time to complete the initialization of the new PMEM device.

The new PMEM device is automatically used by the system. If the PMEM device is used for caching, then the effective cache size increases. If the PMEM device is used for commit acceleration, then commit acceleration is enabled on the device.
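To confirm that the replacement module was recognized, you can compare the insert time for the device in the affected slot. A sketch using the attributes shown earlier in this section:

CellCLI> LIST PHYSICALDISK ATTRIBUTES name, status, physicalInsertTime WHERE diskType=PMEM

The replaced slot should report a normal status and a recent physicalInsertTime.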
Enabling and Disabling Write-Back PMEM Cache

Prior to Oracle Exadata System Software release 23.1.0, you can configure PMEM cache to operate in write-back mode. Also known as write-back PMEM cache, this mode enables the cache to service write operations.

Note: The best practice recommendation is to configure PMEM cache in write-through mode. This configuration provides the best performance and availability.

Commencing with Oracle Exadata System Software release 23.1.0, PMEM cache only operates in write-through mode.
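To check which mode a cell is currently using, query the pmemCacheMode attribute (the same attribute verified at the end of the procedures below):

CellCLI> LIST CELL ATTRIBUTES pmemCacheMode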
- Enable Write-Back PMEM Cache
- Disable Write-Back PMEM Cache
  Use these steps if you need to disable Write-Back PMEM cache on the storage servers.
Enable Write-Back PMEM Cache

Write-back PMEM cache is only supported in conjunction with write-back flash cache. Consequently, to enable write-back PMEM cache you must also enable write-back flash cache.

Note: Commencing with Oracle Exadata System Software release 23.1.0, you cannot enable write-back PMEM cache because PMEM cache only operates in write-through mode.

Note: To reduce the performance impact on your applications, change the cache mode during a period of reduced workload.

The following command examples use a text file named cell_group that contains the host names of the storage servers that are the subject of the procedure.
1. Check the current flash cache mode setting (flashCacheMode):

   # dcli -l root -g cell_group cellcli -e "list cell detail" | grep flashCacheMode
2. If the flash cache is in write-back mode:

   a. Validate that all the physical disks are in NORMAL state before modifying the flash cache.

      # dcli -l root -g cell_group cellcli -e "LIST PHYSICALDISK ATTRIBUTES name,status" | grep -v NORMAL

      The command should return no rows.

   b. Determine the amount of dirty data in the flash cache.

      # dcli -g cell_group -l root cellcli -e "LIST METRICCURRENT ATTRIBUTES name,metricvalue WHERE name LIKE \'FC_BY_DIRTY.*\'"

   c. Flush the flash cache.

      If the flash cache utilizes all available flash cell disks, you can use the ALL keyword instead of listing the flash disks.

      # dcli -g cell_group -l root cellcli -e "ALTER FLASHCACHE CELLDISK=\'FD_02_dm01celadm12,FD_03_dm01celadm12,FD_00_dm01celadm12,FD_01_dm01celadm12\' FLUSH"

   d. Check the progress of the flash cache flush operation.

      The flushing process is complete when the metric FC_BY_DIRTY is zero.

      # dcli -g cell_group -l root cellcli -e "LIST METRICCURRENT ATTRIBUTES name, metricvalue WHERE name LIKE \'FC_BY_DIRTY.*\'"

      Or, you can check to see if the flushstatus attribute is set to Completed.

      # dcli -g cell_group -l root cellcli -e "LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror" | grep FD

   e. After the flash cache is flushed, drop the flash cache.

      # dcli -g cell_group -l root cellcli -e "drop flashcache"

   f. Modify the cell to use flash cache in write-back mode.

      # dcli -g cell_group -l root cellcli -e "ALTER CELL flashCacheMode=writeback"

   g. Re-create the flash cache.

      If the flash cache utilizes all available flash cell disks, you can use the ALL keyword instead of listing the cell disks. If the size attribute is not specified, then the flash cache consumes all available space on each cell disk.

      # dcli -l root -g cell_group cellcli -e "create flashcache celldisk=\'FD_02_dm01celadm12,FD_03_dm01celadm12,FD_00_dm01celadm12,FD_01_dm01celadm12\'"

   h. Verify that flashCacheMode is set to writeback.

      # dcli -l root -g cell_group cellcli -e "list cell detail" | grep flashCacheMode
3. Flush the PMEM cache.

   If the PMEM cache utilizes all available PMEM cell disks, you can use the ALL keyword as shown here.

   # dcli -l root -g cell_group cellcli -e "ALTER PMEMCACHE ALL FLUSH"

   Otherwise, list the specific disks using the CELLDISK="cdisk1 [,cdisk2] ..." clause, as in the sketch that follows.
4. Drop the PMEM cache.

   # dcli -l root -g cell_group cellcli -e "DROP PMEMCACHE"
5. Modify the cell to use PMEM cache in write-back mode.

   # dcli -l root -g cell_group cellcli -e "ALTER CELL pmemCacheMode=WriteBack"

   Starting with Oracle Exadata System Software release 20.1.0, this command warns about the best practice recommendation to use PMEM cache in write-through mode and prompts for confirmation of the change.
6. Re-create the PMEM cache.

   If the PMEM cache utilizes all available PMEM cell disks, you can use the ALL keyword as shown here. Otherwise, list the specific disks using the CELLDISK="cdisk1 [,cdisk2] ..." clause (see the sketch following the command below). If the size attribute is not specified, then the PMEM cache consumes all available space on each cell disk.

   # dcli -l root -g cell_group cellcli -e "CREATE PMEMCACHE ALL"
7. Verify that pmemCacheMode is set to writeback.

   # dcli -l root -g cell_group cellcli -e "list cell detail" | grep pmemCacheMode
Disable Write-Back PMEM Cache

Use these steps if you need to disable Write-Back PMEM cache on the storage servers.

You do not have to stop the cellsrv process or inactivate grid disks when disabling Write-Back PMEM cache. However, to reduce the performance impact on the application, disable the Write-Back PMEM cache during a period of reduced workload.
1. Validate that all the physical disks are in NORMAL state before modifying the PMEM cache.

   The following command should return no rows:

   # dcli -l root -g cell_group cellcli -e "LIST PHYSICALDISK ATTRIBUTES name,status" | grep -v NORMAL
2. Flush the PMEM cache.

   # dcli -g cell_group -l root cellcli -e "ALTER PMEMCACHE ALL FLUSH"

   The PMEM cache flushes the dirty data to the lower-layer Write-Back Flash Cache.
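   Optionally, you can also watch flush progress on the PMEM cell disks, analogous to the flash cache check earlier (a sketch; the grep pattern assumes the PMEM cell disk names start with PM_, as in the LIST DISKMAP output above):

   # dcli -g cell_group -l root cellcli -e "LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror" | grep PM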
3. Check that the flushing operation for the PMEM cache has completed.

   The flushing process is complete when the PMEM devices do not show up in the cachedBy attribute for the grid disks.

   CellCLI> LIST GRIDDISK ATTRIBUTES name, cachedBy
      DATA_CD_00_cel01    FD_00_cel01
      DATA_CD_01_cel01    FD_01_cel01
      DATA_CD_02_cel01    FD_03_cel01
      DATA_CD_03_cel01    FD_02_cel01
      DATA_CD_04_cel01    FD_00_cel01
      DATA_CD_05_cel01    FD_02_cel01
      ...
4. Drop the PMEM cache.

   # dcli -g cell_group -l root cellcli -e drop pmemcache all

5. Set the pmemCacheMode attribute to writethrough.

   # dcli -g cell_group -l root cellcli -e "ALTER CELL pmemCacheMode=writethrough"

6. Re-create the PMEM cache.

   # dcli -l root -g cell_group cellcli -e create pmemcache all

7. Verify that pmemCacheMode has been set to writethrough.

   CellCLI> LIST CELL ATTRIBUTES pmemcachemode
      WriteThrough