Mirror the root drive after installing RHEL on a Power system

We had a Power10 system with 2x NVMe drives, and installed RHEL 8 in text mode, where configuring RAID is not possible. So the OS was installed on the first drive (/dev/nvme0n1) and the second drive (/dev/nvme1n1) was left unused. To add RAID1 afterwards, the steps were:

Partitioning + PReP partition

First we copied the partitioning from nvme0n1 to nvme1n1:

# sfdisk -d /dev/nvme0n1 | sfdisk /dev/nvme1n1

and verified they looked the same:

# fdisk -l /dev/nvme0n1
# fdisk -l /dev/nvme1n1
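
It can also be wise to save a copy of the original partition table to a file, in case it needs to be restored later (the path here is just an example):

# sfdisk -d /dev/nvme0n1 > /root/nvme0n1.parttable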

The first partition is the “PPC PReP Boot” partition. It doesn’t need to be kept in sync, so we don’t use RAID for it, but it should be tagged as prep:

# parted /dev/nvme0n1 set 1 prep on
# parted /dev/nvme1n1 set 1 prep on

The two other partitions (nvme0n1p2 and nvme0n1p3) were for /boot and for the LVM volume group. These we will place on separate RAID1 devices, so we mark the partitions as RAID members:

# parted /dev/nvme0n1 set 2 raid on
# parted /dev/nvme0n1 set 3 raid on
# parted /dev/nvme1n1 set 2 raid on
# parted /dev/nvme1n1 set 3 raid on
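
To verify that the flags were set as intended, the partition tables can be printed again:

# parted /dev/nvme0n1 print
# parted /dev/nvme1n1 print

Both should now show the raid flag on partitions 2 and 3, and prep on partition 1.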

Boot partition

To move the /boot partition to RAID, we first create a degraded RAID1 array on the second disk:

(if the mdadm package is missing, install it first: “rpm -ivh mdadm-4.2-7.el8.ppc64le.rpm”)

# mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/nvme1n1p2
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
Continue creating array? yes
mdadm: Defaulting to version 1.2 metadata
[ 1131.975403] md/raid1:md0: active with 1 out of 2 mirrors
[ 1131.975443] md0: detected capacity change from 0 to 1071644672
mdadm: array /dev/md0 started.

mdadm warns that the array may not be suitable as a boot device. This can safely be ignored, so continue creating the array: Power boots from the PPC PReP Boot partition, not from /boot.

Then create a file system on it, mount it, and copy everything over from the original /boot partition:

# mkfs.xfs /dev/md0
# mount /dev/md0 /mnt
# cp -a /boot/. /mnt
# umount /boot
# umount /mnt
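
If wanted, the copy can be sanity-checked by comparing the two trees; this has to be done between the cp and umount steps above, while both are still mounted:

# diff -r /boot /mnt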

Then we find the UUID of the md0 device, update /etc/fstab with it, and mount it:

# blkid | grep md0
/dev/md0: UUID="hjsdh-sdsds-s-ss-ss" TYPE="xfs"
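
The bare UUID can also be fetched directly, which is handy if scripting this step:

# blkid -s UUID -o value /dev/md0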

# vi /etc/fstab
   UUID=hjsdh-sdsds-s-ss-ss /boot    xfs defaults 0 0 

# systemctl daemon-reload
# mount /boot
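
A quick check that /boot now really is on the md device:

# findmnt /boot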

and finally add the old disk into the RAID1 array:

# mdadm /dev/md0 -a /dev/nvme0n1p2

Now the RAID array should be rebuilding, and the status can be monitored with “mdadm --detail /dev/md0”.
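
The overall state of all md arrays, including rebuild progress, can also be followed with e.g.:

# watch -n 5 cat /proc/mdstat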

LVM VG

On /dev/nvme0n1p3 we have the main LVM VG, named “rhel”. This can be moved online to a RAID1 volume by adding and removing physical volumes in the VG. First we create a degraded RAID1 array on the unused partition:

# mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/nvme1n1p3 --metadata=1.0

add it to the VG:

# vgextend rhel /dev/md1

then we can move data from the old disk to the RAID array with:

# pvmove /dev/nvme0n1p3 /dev/md1 
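
pvmove prints its progress as it runs; from another terminal the move can also be observed through the hidden pvmove LV, e.g.:

# lvs -a -o +devices rhel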

and when that completes, we can remove the old partition from the VG and add it to the RAID array:

# vgreduce rhel /dev/nvme0n1p3
# pvremove /dev/nvme0n1p3
# mdadm /dev/md1 -a /dev/nvme0n1p3

and monitor that it’s now syncing with “watch mdadm --detail /dev/md1”.

GRUB

After both RAID1 arrays are in sync, we also need to update the GRUB configuration. First find the UUID of the md devices:

# mdadm -D /dev/md0 | awk '$1 == "UUID" {print "rd.md.uuid=" $3}'
rd.md.uuid=b14d2fd8:d4f871af:c0e0d839:54fd1c76

# mdadm -D /dev/md1 | awk '$1 == "UUID" {print "rd.md.uuid=" $3}'
rd.md.uuid=118ce9ce:2627ecb0:d6b7436c:e530d2cd

then update GRUB_CMDLINE_LINUX in /etc/default/grub to include the rd.md.uuid= entry for each of these:

GRUB_CMDLINE_LINUX="rd.md.uuid=cvxcvxcv:cvcxvxcvx:cvxcvxc:xcvcvxcv rd.md.uuid=dsadsasd:dasDasdsad:dasdsadsa:dadsd crashkernel=2G-4G:384M,etc...."
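
As an alternative on RHEL 8, the same arguments can also be appended to all installed kernels with grubby (the UUIDs below are placeholders, use the real ones from above):

# grubby --update-kernel=ALL --args="rd.md.uuid=cvxcvxcv:cvcxvxcvx:cvxcvxc:xcvcvxcv rd.md.uuid=dsadsasd:dasDasdsad:dasdsadsa:dadsd"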

Then update the active grub config:

# grub2-mkconfig -o /boot/grub2/grub.cfg

Manually add the second disk to /boot/grub2/device.map if missing. It should look like:

(hd0)    /dev/nvme0n1
(hd1)    /dev/nvme1n1

Update the PPC PReP Boot partitions (the dd wipes the second PReP partition before grub is installed there):

# grub2-install /dev/nvme0n1p1
# dd if=/dev/zero of=/dev/nvme1n1p1
# grub2-install /dev/nvme1n1p1

and rebuild the initrd with the mdadm config:

# mdadm --examine --scan > /etc/mdadm.conf
# dracut -f --mdadmconf
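
To sanity-check that the new initrd actually picked up the mdadm config:

# lsinitrd | grep mdadm.conf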

Then verify again that the RAID arrays are in sync, before crossing fingers and hoping the machine comes back up after a reboot.

This routine has been tested multiple times so far. Seems good.