This post is part of the ZFS SSD pool series.
I moved my main hypervisor, Alpha, into a new server case: a compact Inter-Tech with 8 hot-swappable bays in the front.
I also moved the four SSDs over from the failed ZFS SSD pool project, and this time it worked! 😃
I’ve been thinking about my failed ZFS SSD pool lately. I find it very strange that I had so many issues with the Samsung drives: I use the 850 EVO in other computers, and I’ve never had any issues with them.
I read the following in a Kernel.org bug report:
[…] I’ve since swapped that SSD onto an Intel 8086:8d62 controller, and it hasn’t so much as hiccupped since, with full NCQ and queued trim. — Solomon Peachy
Interesting… Maybe the controller played a part in my issues as well.
The new server case has 8 hot-swappable bays in the front. I moved the four 500 GB SSDs there, and installed an LSI card to drive them.
And I am happy to report that I’ve had no issues since. So the problem I previously had could have been caused by any of three things:
- The onboard SATA controller
- The 4-bay 2.5" drive bay
- SATA cables
During my initial testing, I found that the error moved with the disk when I changed bays. That leads me to believe that the issue was not with a single port, bay, or cable. But more testing is required to figure out exactly what went wrong.
There seems to be something fishy with the Samsung SSDs: some controllers handle it, others do not.
Just a quick recap on how I created the pool and dataset:
$ sudo zpool create spool0 -o ashift=12 \
    mirror /dev/sdb /dev/sdc \
    mirror /dev/sdd /dev/sde
$ sudo zfs set compression=lz4 spool0
$ sudo zfs set mountpoint=/srv/spool0 spool0
$ sudo zfs create spool0/home
$ sudo chown hebron:hebron /srv/spool0/home/
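As a small sketch of what those options imply: `ashift` is the base-2 logarithm of the sector size ZFS assumes, and a stripe of two-way mirrors yields roughly one disk's worth of space per mirror. The function names below are made up for illustration, and the capacity figure ignores ZFS metadata and reserved space.

```shell
# Sector size in bytes for a given ashift (ashift is log2 of the sector size).
sector_bytes() {
    echo $(( 1 << $1 ))
}

# Approximate usable capacity of a stripe of two-way mirrors:
# each mirror contributes the capacity of a single disk.
# Ignores ZFS overhead, so the real number will be somewhat lower.
striped_mirror_gb() {
    mirrors=$1
    disk_gb=$2
    echo $(( mirrors * disk_gb ))
}

echo "ashift=12 -> $(sector_bytes 12) byte sectors"
echo "2 mirrors of 500 GB disks -> about $(striped_mirror_gb 2 500) GB usable"
```

With `ashift=12` the pool is aligned to 4096-byte sectors, and the four 500 GB SSDs give about 1000 GB of usable space before overhead.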
During my speed testing of the new SSD pool I noticed that the transfer speeds from my desktop computer were rather slow.
sent 9.26G bytes received 4.02K bytes 179.86M bytes/sec
I figured maybe the 10 GbE card wasn’t performing well. A look in dmesg on Alpha revealed that the card wasn’t running at full capacity.
Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
PCI Express bandwidth of 16GT/s available
(Speed:5.0GT/s, Width: x4, Encoding Loss:20%)
This is not sufficient for optimal performance of this card.
For optimal performance, at least 20GT/s of bandwidth is required.
A slot with more lanes and/or higher speed is suggested.
I’m using a dual-port 10 GbE Intel NIC, so the required bandwidth is probably what it takes to fully utilize both ports, and I’m only using one. With iperf I confirmed that I did indeed have (close to) 10 Gbit/s between the two computers.
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  10.9 GBytes  9.40 Gbits/sec
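The numbers in the dmesg warning can be reproduced with a bit of arithmetic. The following is a rough sketch, assuming the usable figure is simply speed × lanes minus the encoding loss, and treating one GT/s after encoding as roughly one Gbit/s (PCIe protocol overhead is ignored). The function name is hypothetical.

```shell
# Usable PCIe bandwidth in GT/s after encoding loss.
# The per-lane speed is given in tenths of GT/s (50 = 5.0 GT/s)
# so that plain integer arithmetic is enough.
pcie_usable_gts() {
    speed_tenths=$1
    lanes=$2
    loss_pct=$3
    echo $(( speed_tenths * lanes * (100 - loss_pct) / 1000 ))
}

usable=$(pcie_usable_gts 50 4 20)
echo "usable: ${usable} GT/s"

# Compare against what one and two 10 GbE ports would need.
for ports in 1 2; do
    needed=$(( ports * 10 ))
    if [ "$usable" -ge "$needed" ]; then
        echo "${ports} port(s): enough bandwidth"
    else
        echo "${ports} port(s): slot is the bottleneck"
    fi
done
```

Under these assumptions the x4 slot at 5.0 GT/s provides 16 GT/s, which covers a single 10 GbE port but not both. That matches the iperf result on one port.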
I ran a test copying from the local NVMe drive on Alpha to the SSD pool and got decent speeds!
$ sudo zpool iostat spool0 5
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
spool0      2.53G   925G      0  8.38K  4.00K   785M
spool0      4.81G   923G      3  10.2K  14.4K  1.08G
spool0      6.55G   921G      0  6.56K    819   640M
spool0      8.35G   920G      0  7.32K  3.20K   659M
spool0      9.71G   918G      0  6.03K      0   543M
So the slow transfer speed was caused by my desktop computer. The smartctl output for the desktop SSD showed that its link was only running at 3 Gb/s.
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
My desktop is very old: I bought the motherboard 10 years ago, in 2011. It only supports SATA-300, which has a maximum uncoded transfer rate of 2.4 Gbit/s (300 MB/s), i.e. what remains of the 3.0 Gbit/s line rate after 8b/10b encoding. So no wonder I wasn’t able to fully utilize the 10 GbE network and the new SSD pool.
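To put the SATA limit next to the network speed, here is a minimal sketch of the arithmetic. It assumes 8b/10b encoding on the SATA link (10 line bits per data byte) and ignores any further protocol overhead; the function names are made up for this example.

```shell
# Payload rate in MB/s for a SATA link with 8b/10b encoding:
# every data byte costs 10 bits on the wire.
sata_payload_mbs() {
    line_rate_mbit=$1   # 3000 for 3.0 Gb/s, 6000 for 6.0 Gb/s
    echo $(( line_rate_mbit / 10 ))
}

# Network rate in MB/s from Mbit/s (8 bits per byte, no overhead considered).
net_mbs() {
    echo $(( $1 / 8 ))
}

echo "SATA 3.0 Gb/s link: $(sata_payload_mbs 3000) MB/s"
echo "SATA 6.0 Gb/s link: $(sata_payload_mbs 6000) MB/s"
echo "iperf at 9400 Mbit/s: $(net_mbs 9400) MB/s"
```

The desktop's SATA link tops out at 300 MB/s, roughly a quarter of what the network delivered in the iperf test, so the desktop SSD's link is the narrowest part of the path.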
I think I need a new desktop computer — with an NVMe drive 😛
You’ll find lots of warnings on the internet that using regular SSDs in a ZFS pool will burn out the drives — quickly. My pool is two striped mirrors, and I’m going to use them for documents, photos, projects, etc. No heavy write operations.
Since the SSDs I’m using have TLC NAND and only a limited amount of SLC cache, they are not great for large write operations.
But for my planned usage I think they are going to be just fine. I’ll keep an eye on the TBW (terabytes written).
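One way to keep an eye on TBW is to convert the drive's SMART write counter into terabytes. The sketch below assumes the drive reports an attribute named `Total_LBAs_Written` in `smartctl -A`, with the raw value in the last column and 512-byte LBAs. That is common on Samsung SATA SSDs, but it is worth verifying for each drive. The function name, the device path, and the sample line are illustrative, not real readings.

```shell
# Read `smartctl -A` output on stdin and print terabytes written.
# Assumption: attribute name Total_LBAs_Written, raw value in the
# last column, one LBA = 512 bytes.
tbw_from_smart() {
    awk '$2 == "Total_LBAs_Written" {
        printf "%.2f\n", $NF * 512 / 1000000000000
    }'
}

# On a real system (device name is only an example):
#   sudo smartctl -A /dev/sdb | tbw_from_smart

# Made-up sample line in smartctl's attribute table layout:
sample='241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       1953125000'
echo "$sample" | tbw_from_smart
```

The result can then be compared with the TBW rating in the drive's datasheet.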
Last commit 2021-05-31, with message: add dedicated computer pages