Adding two SATA SSDs to a six-drive SATA RAID10 array.

I needed a bit more space on my NAS’s RAID10 array, which was 6 x 2TB drives. To be honest I’m not sure why I am using RAID10. The array was initially on my main Linux workstation; then I decided I needed a separate NAS as my existing one was way too small, so I moved it across. RAID5/6 would give me more space, but I guess I like the flexibility and the ability to survive the failure of two disks (as long as they are not in the same mirror pair), even though it does reduce usable space by 50%!

The server is running Debian and is headless. I know there are loads of NAS OSs, but I do prefer to do things myself. The boot/root partitions are on a single SATA SSD and the (now) eight drives are plugged into a Seagate Smart Host Bus Adapter H240.

I found two “consumer” Crucial 2TB SSDs on Black Friday for £65, which seemed reasonable. I did wonder how well two SSDs would do in a RAID array alongside spinning drives. Let’s find out! So this is what I did (which I am blogging about so I do not have to remember next time!). Interestingly, the last time I blogged about growing a RAID array was quite some time ago. It’s also the only time I’ve ever got comments on my blog (127 to be exact!). I think growing RAID arrays with mdadm was quite new back then.

The procedure to add the new devices and grow the RAID array is:

Procedure to grow an array with two new disks

  • Physically add the new disks
  • Partition them
  • Add the disks to the array
  • Increase the number of active disks to grow the array
  • Grow the filesystem

The new disks are /dev/sda and /dev/sdd.

Partition

For GPT, use sgdisk to copy the partition table from an existing disk to a new one.

Back up first, of course!
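sgdisk itself can dump the GPT to a file, which is the easiest backup here (the file path is just an example):

sgdisk --backup=/root/sdb-gpt.bak /dev/sdb
# restore later, if needed, with: sgdisk --load-backup=/root/sdb-gpt.bak /dev/sdb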

sgdisk /dev/sdX -R /dev/sdY

sgdisk -G /dev/sdY

The first command copies the partition table of sdb to sda and then to sdd:

sgdisk /dev/sdb -R /dev/sda

sgdisk /dev/sdb -R /dev/sdd

Now randomise the GUID of each device:

sgdisk -G /dev/sdd

sgdisk -G /dev/sda

Add new devices

mdadm --add /dev/md1 /dev/sdd1 /dev/sda1

mdadm: added /dev/sdd1

mdadm: added /dev/sda1

These are added as spares as the number of active devices does not change. Let’s check:

# cat /proc/mdstat
Personalities : [raid10] [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4]
md1 : active raid10 sda1[8](S) sdd1[7](S) sdc1[5] sdg1[0] sde1[6] sdh1[3] sdb1[1] sdf1[4]
5860141056 blocks super 1.2 512K chunks 2 near-copies [6/6] [UUUUUU]
bitmap: 4/22 pages [16KB], 131072KB chunk

Increase number of devices to include new ones

mdadm --grow --raid-devices=8 --backup-file=/mnt/USB/grown_md1.bak /dev/md1

The --backup-file option creates a backup file in case the power goes. Not essential as I have a UPS, and the filesystem stays mounted throughout. However, to speed things up I turned off all services except the DNS/DHCP server; the less disk activity, the quicker the reshape will finish.
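To keep an eye on the reshape (and, if you are impatient, nudge the kernel’s resync speed limits, which are in KB/s), something like this works; the numbers are only examples:

watch cat /proc/mdstat
echo 50000 > /proc/sys/dev/raid/speed_limit_min
echo 500000 > /proc/sys/dev/raid/speed_limit_max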

The reshaping took about 20 hours. Much less than I thought.

Now we need to resize the filesystem. Unmounting is not essential for growing an ext4 filesystem (although it is for shrinking), but it’s a lot safer, so I shut everything off and unmounted it:

systemctl stop smbd
systemctl stop docker
umount /mnt/storage

resize2fs /dev/md1

This gave an error that the filesystem needed checking first.

e2fsck -f /dev/md1

resize2fs /dev/md1

This took about 30 minutes.

Finishing off.

Now let’s get it all back up and running.

mount /mnt/storage

systemctl start smbd

systemctl start docker

The entry for the mdadm device does not need updating. Previously it did, but I think that was back when I was using the 0.90 metadata format.

mdadm --detail --scan

cat /etc/mdadm/mdadm.conf
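For reference, with 1.2 metadata the ARRAY line that --detail --scan produces looks something like this (name and UUID made up here); there’s no device count in it, so nothing goes stale when the array grows:

ARRAY /dev/md1 metadata=1.2 name=nas:1 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx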

One bizarre issue was that when I restarted all the Docker containers they downloaded new images rather than using the existing ones. I have no idea why that happened.

Migrating My Root Drive RAID1 Array to a Pair of NVMe Drives.

I’ve always tried to keep upgrading my own Linux boxes. I enjoy it, and as I found out a few years ago, if I do not regularly keep updating hardware then I completely lose the knowledge of how to do it.
My latest project is to move all the services from my workstation to a separate box. I’ve got a few RPis around the house running a few services, but I’ve always used my workstation as a games machine/server/development box/everything else. In fact, for a number of years I stopped using Linux as a workstation at all and this machine was a headless server.
Because of this cycle of continuous upgrades this computer has existed for probably twenty years, always running some form of Linux (mainly Gentoo).
Currently it’s using some space heater of a server setup: a pair of Xeon E5 2967v2s on a Supermicro X9DRi-LN4 motherboard with 128GB of ECC DDR3 RAM (very cheap!). That’s fine over winter (my office has no heating), but I do need to fully transfer all the services to the low-power Debian box under my desk instead!
Anyway, this computer boots from a pair of SATA SSDs in a RAID1 array, with a six-disk RAID10 array for data. That data array needs to be replaced by a single large drive when I’ve finished moving services to the new machine!
The motherboard is too old to EFI boot from NVMe drives. However, whilst browsing Reddit I came across some people talking about using an adaptor card to add four NVMe drives, using PCIe bifurcation to give each drive the four PCIe lanes that NVMe devices need: x4/x4/x4/x4 instead of x16.
This was not supported on this board, but it turns out Supermicro did release a newer BIOS that does support bifurcation.
So I bought the card they suggested and a pair of 1TB NVMe drives. The drives are only PCIe v3, as that’s all the motherboard supports. PCIe is backwards/forwards compatible, but PCIe v4 drives are considerably more expensive than v3 ones, so I may as well get a pair of these; when I upgrade to a PCIe v4 motherboard the available drives will likely be larger and cheaper!
– Asus M.2. X16 Gen 4
– 2 x WD Blue SN570 NVMe SSD 1TB
The adaptor and drives came. The adaptor has a lovely heatsink that sandwiches the drives in, with a small low-noise fan.
The adaptor took ten minutes to install. Once booted up, the BIOS setup was a little tricky for enabling bifurcation, as the slots are numbered from the bottom; this one was CPU1/Slot 1.
I had to recompile the kernel to add NVMe support, but once booted the pair of drives were there.
After many, many years of using /dev/sdX to refer to storage devices (I was using SCSI hardware before SATA), it does seem a little strange to be running parted on /dev/nvme1n1 and then getting partition devices like /dev/nvme1n1p2.
I know I should probably move to ZFS, but I’m knowledgeable enough about mdadm not to completely mess things up! And replacing a pair of RAID1 devices is just so easy with mdadm.

Workflow is:
– Partition the new drive.
– Add it to the RAID array as a spare.
– Fail the drive to be removed, then remove it.
– Wait until the RAID1 array is synced again.
– Repeat with the second drive.
– Resize the array, then resize the filesystem.

Procedure

fdisk /dev/nvme0n1

We can use fdisk again, as fdisk is now GPT-aware. Previously we’d always had to use parted, but I prefer fdisk as I know it! In fdisk:
– Label the drive as GPT.
– Make a 256MB partition and mark it as the EFI boot partition.
– Make a second partition covering the rest of the drive and mark it as type Linux RAID (a scripted sgdisk equivalent is sketched below).
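If you’d rather script it than drive fdisk interactively, the same layout can be produced with sgdisk, roughly like this (type codes: ef00 = EFI System, fd00 = Linux RAID):

sgdisk -o /dev/nvme0n1
sgdisk -n 1:0:+256M -t 1:ef00 /dev/nvme0n1
sgdisk -n 2:0:0 -t 2:fd00 /dev/nvme0n1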

Now add that drive to our RAID1 array. For some reason it was not added as a spare, but was instead immediately synced to make a three-drive RAID1 array. I think this is because I previously created this array as a three-drive array (for reasons I forget); I guess that’s stored in the array’s metadata.

mdadm /dev/md127 --add /dev/nvme0n1p2

We can watch the progress:

watch cat /proc/mdstat

Once that has completed we can fail and then remove the old drive:

mdadm --manage /dev/md127 --fail /dev/sdh3
mdadm /dev/md127 --remove /dev/sdh3

Then let’s update our mdadm.conf file

mdadm --detail --scan >> /etc/mdadm/mdadm.conf

Then remove the old lines:

vi /etc/mdadm/mdadm.conf

Finally, let’s wipe the RAID data from that old partition so it does not get re-assembled into the array.

wipefs -a /dev/sdh3

A reboot is a good idea now, to ensure the array is correctly assembled (and the new partition table re-read).

Now let’s copy the partition table to the second new drive.

sgdisk /dev/nvme0n1 -R /dev/nvme1n1

Then randomise the GUIDs:

sgdisk -G /dev/nvme1n1

Check all is OK
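Just printing the table and listing the block devices is enough to confirm the copy worked:

sgdisk -p /dev/nvme1n1
lsblk /dev/nvme1n1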

Now repeat adding the second new drive:

mdadm /dev/md127 --add /dev/nvme1n1p2
mdadm --manage /dev/md127 --fail /dev/sdg3
mdadm /dev/md127 --remove /dev/sdg3
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
wipefs -a /dev/sdg3

Resize after adding the new devices

mdadm --grow --size=max /dev/md127
df -h

Then resize the filesystem:

resize2fs -p /dev/md127

Benchmarks

dd if=/dev/zero of=/home/chris/TESTSDD bs=1G count=2 oflag=dsync 

2+0 records in
2+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 4.7162 s, 455 MB/s

dd if=/dev/zero of=/mnt/storage/TESTSDD bs=1G count=2 oflag=dsync

2+0 records in
2+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 10.0978 s, 213 MB/s

dd if=/home/chris/TESTSDD of=/dev/null bs=8k

262144+0 records in
262144+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 0.666261 s, 3.2 GB/s

dd if=/mnt/storage/TESTSDD of=/dev/null bs=8k

262144+0 records in
262144+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 0.573111 s, 3.7 GB/s

I did think that the write speed would be faster, but I guess dd is not the most accurate of benchmarking tools; the read tests are almost certainly just coming back from the page cache, since I’d only just written the files.
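If I wanted a more meaningful number I’d probably reach for fio rather than dd; a rough sequential-write test would look something like this (the file name and sizes are just examples):

fio --name=seqwrite --filename=/mnt/storage/fio.test --rw=write --bs=1M --size=2G --ioengine=libaio --direct=1 --group_reporting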

Zsh and Its Searchable History

So I guess this says a lot about me and the limited things I do on my Linux box, as well as the power of zsh’s searchable history, but I find myself rarely typing commands from scratch. Instead I type the first few letters of the command and then use the up cursor key to search for the last time I ran that command. It is so, so useful. Rarely do I need to grep my way through my .history file. For commands such as checking a Duplicity backup to Backblaze’s B2 buckets, where I need long strings of keys, it is essential. But even for simple commands like updating my Gentoo setup it is just so useful.

Can you remember this every time?
duplicity collection-status b2://[22 character string]:[22 character string]@BucketName/folder

I should remember this one, but I never remember those parameters…

emerge -uDNav @world --keep-going

For bash that was:
history | grep xxxx
– then typing the line number,
– hitting Ctrl-C to kill that command,
– then the up cursor key and editing the command before hitting return.

With zsh I just type the first few characters and then press the up cursor key; the matched command is put on the command line, ready to edit before running it.

By default this behaviour is not enabled. But edit your .zshrc/.zprofile and bind the up/down cursor keys to these two widgets:
bindkey "^[[A" history-beginning-search-backward
bindkey "^[[B" history-beginning-search-forward

Oh, and if you use zsh (it’s the default shell on macOS nowadays) then you really should use Oh My Zsh.

Upgrading Linux boxes

After returning to upgrading my main Linux box (due to playing with Docker and using some stuff (genome annotation pipelines) that needs more than the system’s max of 16GB RAM), I came across this blog post about a similar situation (albeit with more time spent): The beautiful machine.
I’ve generally always upgraded my own computers. My main Linux box has been upgraded in bits and pieces for some time; I think the oldest current part is the case, which is at least 15 years old. It’s been heavily updated with sound insulation from the car audio scene, although TBH it is much quieter now than when the motherboard was an Asus PC-DL running a pair of power-hungry overclocked Xeons.
However, for one reason or another (lack of time, stable hardware, iPadOS etc.) I’ve not done this for some time. My box was last opened two years ago when the TV tuner card (a PCIe TBS6980) died and I replaced it with an almost identical model. The previous real upgrade was seven or more years ago.
So I got myself a 10-year-old server board with dual Xeons and a max of 32GB RAM: an Intel S5500BC board with a pair of Xeon E6240s. The setup only cost £70, but each CPU is far faster than the previous single Xeon X3470, plus the maximum RAM is double.
However, where I could previously swap over a motherboard in less than an hour, this time I made multiple beginner errors.
– The first error was the board refusing to boot due to a grounding issue. I’d assumed there would be the usual nine standoffs in the ATX format. Nope, there is no motherboard hole for a middle-bottom standoff. What compounded the grounding error was that the lower-left standoff was too short. Whoops!
– Then I’d not inserted the RAM correctly. It turns out a proper server board does not fail to POST, but just disables the offending DIMM slots and allows the rest to work. Luckily a red LED indicates the problem DIMM slots.
– Then all 32GB of RAM (8 x 4GB) was recognised by the BIOS, but once booted into Linux only 24GB was seen. Turns out it was another beginner error: one DIMM was inserted far enough to be recognised, but not far enough to work properly.
As well as the beginner errors, this is a server board and the CPU fans ran so fast that it was difficult to think over the noise. It turns out most Intel server boards are intended to be paired with an Intel chassis; if the board does not detect the chassis it just runs the fans at full speed instead of managing them according to CPU temperature. Noisy!
I found a few blog posts on reflashing the BIOS to a more recent one AND also the firmware for something called a Baseboard Management Controller (when did they come along?). This adds a non-Intel chassis profile for the fans and allows their speed to be managed in line with CPU temperature.
Even though Intel have EOLed these boards, I still found the latest BIOS on their site. The current BIOS was so old, though, that I needed to update to an intermediate build, BIOS 66, and then flash to BIOS 69, which is the latest. Flashing the BIOS on server boards is easy! The board uses EFI and can be booted to an EFI console, which allows you to flash the BIOS from a USB stick. Even easier, there’s a batch script on the USB to do this for you. Funky!
BUT the BMC firmware was very difficult to find. I eventually found it via a niche YouTube video.
Anyway, the lesson I should learn is that hardware upgrades are only easy if you spend a lot of money. If you want to save money then you need to do them regularly to keep your skills up!

Old-fashioned kernel upgrading

I keep the kernel on my Linux box fairly up to date. With more or less every point release, after my distro, Gentoo, has released a fairly ‘mature’ patched version, I upgrade. However, I’m thinking that I’m using some pretty old-fashioned techniques in doing so. For example, I manually configure the kernel, my boot loader is LILO, and I do not use any of the distro’s helpers.

My usual procedure is:

Copy the old config directly from /proc and, using the ‘oldconfig’ target, update it with all the new options for the new kernel. Since I rarely leave this more than one version apart, there are generally only 20 or so differences:

cp /proc/config.gz .
gunzip config.gz
cp config /usr/src/NEWKERNEL/.config
cd /usr/src/NEWKERNEL
make oldconfig

Once that’s done, I compile the kernel bzImage, then compile the modules and install them at the same time.

make bzImage && make modules && make modules_install

Incidentally, that double ampersand is a handy shortcut: the next command only runs if the previous one succeeded.

Once compiled, I change the /usr/src/linux link to the new kernel, copy the image to the boot folder, add the new kernel to lilo.conf, run lilo, and reboot with a prayer to whatever humanist non-deity you don’t believe in!

rm /usr/src/linux
ln -s /usr/src/NEWKERNEL linux
vi /etc/lilo.conf
lilo
reboot
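For completeness, the copy-to-/boot step and the lilo.conf entry look roughly like this (the kernel name, label and root device here are only examples, not my real setup):

cp arch/x86/boot/bzImage /boot/vmlinuz-NEWKERNEL
# then the /etc/lilo.conf stanza:
image=/boot/vmlinuz-NEWKERNEL
  label=newkernel
  root=/dev/sda3
  read-only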

Incidentally, if you use a distro that stores the Linux headers, or rather uses the kernel’s own headers, in /usr/src/linux, then be careful changing this link. Luckily the distro I use stores these elsewhere, so you can upgrade kernels willy-nilly without affecting what glibc was compiled against.

Use of Backup in Anger!

I lost my entire RAID10 array yesterday. In a fit of “too much noise in the office” I removed the hot-swap SCSI array box from my workstation, attached it to a wooden platform, and suspended it in a large plastic box using an old inner tube from my bike. This really reduced the noise; however, like a moron, I did not attach the SCSI cable properly and two drives got kicked from the array. That was not a problem. What was a problem was that I then tried to re-assemble the array without checking the cable and ended up wiping one of the RAID partitions. Still not a major issue, except I subsequently zeroed out the superblock of the missing drive in order to add it back in. Anyway, that was my array lost!

As my main backup strategy I use a homebrewed incremental rsync script to back up my Linux workstation every night to a 2TB ReadyNAS+ box (the Macs are backed up with a combination of Time Machine and SuperDuper). So now I had a chance to test it out. After recreating the array and copying the data back across the network, I was back up and running!

mdadm --create /dev/md1 --chunk=256 -R  -l 10 -n 4 -p f2 /dev/sd[abcd]3 
echo 300000 >> /sys/block/md1/md/sync_speed_max 
watch cat /proc/mdstat 
mkfs.xfs /dev/md1
mount /home
rsync -avP /mnt/backup/SCOTGATEHome/current/ /home/

It took about an hour to sync the new array, and then three hours to copy the 156GB of files across the network.

It all worked great, and I’m very pleased to know that my backup strategy is working!

Now back to complete the “silent and suspended hard drive array!”

Google Browser Sync now open source!

One of the downsides to Firefox 3 is the mouse scrolling speed on Linux (not on OS X), and the fact that I can no longer share my passwords, bookmarks, and cookies between my Mac and Linux box. I did take a look at Weave, but it seemed pretty poor so far (although I’m sure it will get better).

Well, the first step in Firefox 3 browser syncing has been taken:

http://google-opensource.blogspot.com/2008/07/open-sourcing-browser-sync.html

Cool. Now all I need to do is wait for somebody to port it to FF3 and I’m set! That may take a while, but at least it’s now possible.

One thing that seems not to have been noticed is that without an encrypted backend to store the synced info, the plugin is useless. The utility did not sync directly between browsers; it synced each browser with an encrypted backend at Google. I wonder if Google would still allow that? Still, at least it’s now possible.

Perhaps those Weave people will take this code to better their own? GBS did work pretty damned well, so that would be a good idea.

Don’t just leave your backups alone, check them!

I use a ReadyNAS+ to back up all my Linux boxes using a homegrown incremental rsync script, and my Mac using SuperDuper. For the last few weeks the space on the little NAS has been getting pretty low. I just thought I had a lot of stuff… but thinking further, 1.4TB is a hell of a lot of stuff. Where on earth was my space going? A few uses of du -h showed that I had 200GB of music and film. My Mac took over 300GB (3 x rotated SuperDuper sparse images). My main Linux box took a lot, but surprisingly my MythTV backup took 0.5TB. WTF? I only keep three incremental backups (plus an archived one every now and again) of that machine’s root partition, i.e. 15GB max! Then it dawned on me. With Myth 0.21’s new Disc Groups, I added a new 300GB partition for recording storage, BUT I never added it to the exclude list for my rsync backup. So I’d been backing up all my recordings every night for the last few months. Doh!!

That’s quite a lot to backup on a nightly basis!
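The fix, for the record, was just an extra exclude in the rsync script, something along these lines (the recordings path and backup destination are made up for illustration):

rsync -a --exclude='/var/lib/mythtv/recordings/' / /mnt/backup/mythbox/current/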

SMBFS to CIFS

Note to self: if you are ever going to migrate some Samba shares from the deprecated SMBFS to CIFS, then instead of just changing the filesystem type in fstab and wondering why the damned thing refuses to mount, try reading up a bit and realise that the actual mount helper is a different utility and is likely not installed. (There’s an example fstab line at the end of this post.)

What a numpty. Anyway, two tips for anybody who’s doing this:

  1. The following is a great way to increase the debug output:
    echo 1 > /proc/fs/cifs/cifsFYI
  2. Install mount.cifs.
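For reference, the mount.cifs helper usually comes from a separate package, and a CIFS line in /etc/fstab ends up looking something like this (server, share and credentials path are just examples):

//nas/share  /mnt/share  cifs  credentials=/root/.smbcredentials,iocharset=utf8  0  0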

Linux Raid Wiki

There’s a lot of outdated stuff concerning Linux software RAID out there. Over the last six months there’s been a concerted effort by people on the linux-raid mailing list to improve this situation. The continuing results can be seen on this wiki:

http://linux-raid.osdl.org/

It’s a great resource already, and getting better. Go have a look if you want to know about the current status of just what cool stuff you can do.