This post is part of the ZFS SSD pool series.

I moved my main hypervisor, Alpha, into a new server case: a compact Inter-Tech with 8 hot-swappable bays in the front.

I also moved the four SSDs over from the failed ZFS SSD pool project, and this time it worked! 😃

The solution

I’ve been thinking about my failed ZFS SSD pool lately. I find it very strange that I’m having so many issues with the Samsung drives: I use the 850 EVO in other computers, and I’ve never had any issues with them.

I read the following in a Kernel.org bug report:

[…] I’ve since swapped that SSD onto an Intel 8086:8d62 controller, and it hasn’t so much as hiccupped since, with full NCQ and queued trim. - Solomon Peachy

Interesting… Maybe the controller was playing a part in my issues as well.

The new server case has 8 hot-swappable bays in the front. I moved the four 500 GB SSDs there, and installed an LSI card to drive them.

And I am happy to report that I’ve had no issues since. So the problem I previously had could have been caused by any of three things:

  • The onboard SATA controller
  • The 4 bay 2.5" drive bay
  • SATA cables

During my initial testing, I found that the error moved with the disk when I changed bays. That leads me to believe that the issue was not with a single port, bay or cable, but more testing is required to figure out exactly what went wrong.

There seems to be something fishy with the Samsung SSDs: some controllers handle it, others do not.
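When a controller is a suspect, it helps to know which controller each disk is actually attached to. The following is a minimal sketch, assuming a Linux system with sysfs; the function name `disk_controller_addr` and the disk name `sdb` are only illustrations, not something from my setup notes.

```shell
#!/bin/sh
# Print the PCI address of the controller a disk is attached to.
# Takes a resolved sysfs path, e.g. the output of: readlink -f /sys/block/sdb
# The last PCI address in that path is the controller closest to the disk.
disk_controller_addr() {
    printf '%s\n' "$1" \
        | grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' \
        | tail -n 1
}

# Only runs if the example disk (an assumption) exists on this machine.
if [ -e /sys/block/sdb ]; then
    disk_controller_addr "$(readlink -f /sys/block/sdb)"
fi
```

Passing the printed address to `lspci -nn -s <address>` shows the controller name together with its vendor:device ID, the same kind of ID (8086:8d62) that is quoted in the bug report.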

The pool

Just a quick recap on how I created the pool and dataset:

$ sudo zpool create -o ashift=12 spool0 \
    mirror /dev/sdb /dev/sdc \
    mirror /dev/sdd /dev/sde

$ sudo zfs set compression=lz4 spool0
$ sudo zfs set mountpoint=/srv/spool0 spool0

$ sudo zfs create spool0/home
$ sudo chown hebron:hebron /srv/spool0/home/

The speed

During my speed testing of the new SSD pool I noticed that the transfer speeds from my desktop computer were rather slow.

sent 9.26G bytes  received 4.02K bytes  179.86M bytes/sec

I figured maybe the 10 GbE card wasn’t performing well, and a look at dmesg on Alpha revealed that it wasn’t running at full capacity.

Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
PCI Express bandwidth of 16GT/s available
(Speed:5.0GT/s, Width: x4, Encoding Loss:20%)
This is not sufficient for optimal performance of this card.
For optimal performance, at least 20GT/s of bandwidth is required.
A slot with more lanes and/or higher speed is suggested.
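The numbers in that driver message can be reproduced with a bit of arithmetic. This is only a sketch: the function name `pcie_usable_gts` is made up for the example, and it assumes the driver simply multiplies the per-lane rate by the lane count and subtracts the encoding loss.

```shell
#!/bin/sh
# Usable PCIe bandwidth in GT/s, computed the way the driver message reports it:
# per-lane rate x lane count, minus the encoding overhead.
# $1: per-lane rate in tenths of GT/s (50 = 5.0 GT/s)
# $2: number of lanes
# $3: encoding loss in percent
pcie_usable_gts() {
    echo $(( $1 * $2 * (100 - $3) / 1000 ))
}

# Values from the dmesg output: Speed 5.0GT/s, Width x4, Encoding Loss 20%.
echo "usable: $(pcie_usable_gts 50 4 20) GT/s"
```

The result matches the 16GT/s in the log, which is below the 20GT/s the driver asks for. The negotiated link speed and width can also be read from the `LnkSta` line in the output of `sudo lspci -vv`.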

I’m using a dual-port 10 GbE Intel NIC, so the 20GT/s is probably what is needed to fully utilize both ports, and I’m only using one. With iperf I confirmed that I did indeed have (close to) 10 Gbit/s between the two computers.

[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  10.9 GBytes  9.40 Gbits/sec

I did a test from the local NVMe drive on Alpha to the SSD pool and got decent speeds!

$ sudo zpool iostat spool0 5
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
spool0      2.53G   925G      0  8.38K  4.00K   785M
spool0      4.81G   923G      3  10.2K  14.4K  1.08G
spool0      6.55G   921G      0  6.56K    819   640M
spool0      8.35G   920G      0  7.32K  3.20K   659M
spool0      9.71G   918G      0  6.03K      0   543M

So the slow transfer speed was caused by my desktop computer. Looking at smartctl for the desktop SSD showed it was only running at 3 Gb/s.

SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)

My desktop is very old: I bought the motherboard 10 years ago, in 2011. It only supports SATA-300, which has a maximum uncoded transfer rate of 2.4 Gbit/s (300 MB/s). So no wonder I wasn’t able to fully utilize the 10 GbE network and the new SSD pool.
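To put the SATA limit next to the network speed, here is a rough sketch of the conversions. The function names are made up for the example; the only assumption is that SATA uses 8b/10b encoding, so every payload byte costs 10 bits on the wire.

```shell
#!/bin/sh
# Payload rate in MB/s for a link using 8b/10b encoding (10 line bits per byte).
# $1: line rate in Mbit/s, e.g. 3000 for SATA 3.0 Gb/s
encoded_link_mbs() {
    echo $(( $1 / 10 ))
}

# Payload rate in MB/s for a rate already given in payload Mbit/s (8 bits per byte).
# $1: rate in Mbit/s, e.g. 9400 for the iperf result
mbit_to_mbs() {
    echo $(( $1 / 8 ))
}

echo "SATA 3.0 Gb/s: $(encoded_link_mbs 3000) MB/s"
echo "SATA 6.0 Gb/s: $(encoded_link_mbs 6000) MB/s"
echo "iperf 9.40 Gbit/s: $(mbit_to_mbs 9400) MB/s"
```

Even in the best case, the desktop SSD tops out at roughly a quarter of what the network delivered in the iperf test.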

I think I need a new desktop computer, with an NVMe drive 😛

The disks

You’ll find lots of warnings on the internet that using regular SSDs in a ZFS pool will burn out the drives quickly. My pool is two striped mirrors, and I’m going to use them for documents, photos, projects, etc. No heavy write operations.

Since the SSDs I’m using have TLC NAND and only a limited amount of SLC, they are not great for large write operations.

But for my planned usage I think they are going to be just fine. I’ll keep an eye on the TBW (terabytes written).
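One way to keep an eye on the TBW is to read it from SMART. The sketch below assumes the drive reports a `Total_LBAs_Written` attribute counted in 512-byte units, which is what Samsung SATA drives commonly do; other vendors may use a different attribute or unit. The function names and the device name are illustrative.

```shell
#!/bin/sh
# Print the raw Total_LBAs_Written value from `smartctl -A` output on stdin.
# In the attribute table the name is column 2 and the raw value is column 10.
total_lbas_written() {
    awk '$2 == "Total_LBAs_Written" { print $10 }'
}

# Convert a number of LBAs to whole terabytes (10^12 bytes) written.
# $1: LBA count
# $2: bytes per LBA (assumed to be 512 if omitted)
lbas_to_tb() {
    echo $(( $1 * ${2:-512} / 1000000000000 ))
}

# Example on a real drive (the device name is an assumption):
#   lbas_to_tb "$(sudo smartctl -A /dev/sdb | total_lbas_written)"
```

The result can then be compared with the TBW rating the manufacturer states for the drive.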