I’ve been having a strange problem with outgoing WireGuard traffic. The problem has probably always been there, I just haven’t noticed it until now. Outgoing WireGuard traffic is very slow, while incoming is what I’d expect with my 750/750 fiber internet connection.
This led me down a rabbit hole of testing performance internally, which I documented in a previous blog post. That turned out to be a queue issue on the SFP+ port on my MikroTik CHR router. Could this also be queue related?
I was “working” with the Kagi Assistant while troubleshooting this, asking it for suggestions and to analyse results. It felt like having someone to bounce ideas off.
This was my first time using an AI assistant, and I must say I am fairly impressed. While it did hallucinate a bit, it provided me with useful analyses and suggestions.
I’m pretty sure it cut down on my time spent troubleshooting this problem.
This post is not written by AI. I have pasted a few responses from the AI, and those are clearly marked as “AI quotes”.
The problem
First a quick look at the problem:
My internet connection is 750/750, I expect the WireGuard traffic to be pretty close to that — and to be symmetrical. I already knew a few things:
- My CHR router is capable of handling 3.2 Gbit/s WireGuard traffic
- My internet connection is actually 750/750
My working theory was that the WireGuard UDP traffic was getting congested somewhere, like last time, and that the reason I hadn’t noticed it before is that I had never tested with that high a UDP traffic rate.
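To measure it, I ran plain iperf3 TCP tests through the WireGuard tunnel. A rough sketch of the commands I used, with a placeholder server address:

# Egress test: I send, the remote iperf3 server receives
iperf3 -c 192.0.2.10 -t 10
# Ingress test: reversed, the remote server sends to me
iperf3 -c 192.0.2.10 -t 10 -R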
Let’s first look at the speeds I was getting. Outgoing (egress) was terrible:
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 4.76 MBytes 39.9 Mbits/sec 52 123 KBytes
[ 5] 1.00-2.00 sec 6.19 MBytes 51.9 Mbits/sec 0 154 KBytes
[ 5] 2.00-3.00 sec 5.83 MBytes 48.9 Mbits/sec 7 135 KBytes
[ 5] 3.00-4.00 sec 5.52 MBytes 46.3 Mbits/sec 1 120 KBytes
[ 5] 4.00-5.00 sec 6.19 MBytes 52.0 Mbits/sec 0 155 KBytes
[ 5] 5.00-6.00 sec 5.58 MBytes 46.8 Mbits/sec 9 136 KBytes
[ 5] 6.00-7.00 sec 6.32 MBytes 53.0 Mbits/sec 5 119 KBytes
[ 5] 7.00-8.00 sec 4.84 MBytes 40.6 Mbits/sec 16 107 KBytes
[ 5] 8.00-9.00 sec 5.52 MBytes 46.3 Mbits/sec 0 139 KBytes
[ 5] 9.00-10.00 sec 5.64 MBytes 47.3 Mbits/sec 7 126 KBytes
Incoming (ingress) was what I expected:
[ ID] Interval Transfer Bitrate
[ 6] 0.00-1.00 sec 77.4 MBytes 648 Mbits/sec
[ 6] 1.00-2.00 sec 88.6 MBytes 743 Mbits/sec
[ 6] 2.00-3.00 sec 86.5 MBytes 726 Mbits/sec
[ 6] 3.00-4.00 sec 86.0 MBytes 722 Mbits/sec
[ 6] 4.00-5.00 sec 85.4 MBytes 716 Mbits/sec
[ 6] 5.00-6.00 sec 86.8 MBytes 728 Mbits/sec
[ 6] 6.00-7.00 sec 86.1 MBytes 722 Mbits/sec
[ 6] 7.00-8.00 sec 86.1 MBytes 722 Mbits/sec
[ 6] 8.00-9.00 sec 86.0 MBytes 722 Mbits/sec
[ 6] 9.00-10.00 sec 86.1 MBytes 722 Mbits/sec
Investigation
With the previous queue problem in mind, I went to check if there were any TX queue drops on the CHR WAN interface, and there were, quite a lot actually.
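A quick way to see that is the interface statistics. Something along these lines, assuming ether1 is the WAN interface (counter names can vary between RouterOS versions):

# Print interface counters and look at the tx-queue-drop counter
/interface print stats where name=ether1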
Nice… That solves that, I thought, and changed the queue setting on the WAN interface from only-hardware-queue to multi-queue-ethernet-default.
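In RouterOS that is a single command. A sketch, again assuming the WAN interface is named ether1:

# Change the transmit queue on the WAN interface
/queue interface set ether1 queue=multi-queue-ethernet-default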
Then I repeated the testing:
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 35.7 MBytes 300 Mbits/sec 50 573 KBytes
[ 5] 1.00-2.00 sec 29.8 MBytes 250 Mbits/sec 45 448 KBytes
[ 5] 2.00-3.00 sec 27.5 MBytes 231 Mbits/sec 21 353 KBytes
[ 5] 3.00-4.00 sec 25.9 MBytes 217 Mbits/sec 0 402 KBytes
[ 5] 4.00-5.00 sec 29.5 MBytes 247 Mbits/sec 0 453 KBytes
[ 5] 5.00-6.00 sec 32.6 MBytes 273 Mbits/sec 0 502 KBytes
[ 5] 6.00-7.00 sec 35.6 MBytes 299 Mbits/sec 0 552 KBytes
[ 5] 7.00-8.00 sec 23.2 MBytes 195 Mbits/sec 74 313 KBytes
[ 5] 8.00-9.00 sec 23.1 MBytes 193 Mbits/sec 0 363 KBytes
[ 5] 9.00-10.00 sec 26.5 MBytes 222 Mbits/sec 0 414 KBytes
Well, it was definitely better, and the TX queue drops were reduced. But it wasn’t great, and I still had TX queue drops, just fewer than before.
Could it be MTU related? Probably not, that would have affected both directions. The CPU on the hypervisor was humming along just fine. So what then?
I asked the Kagi Assistant if there were any other queue types in CHR better suited for the WAN interface:
AI quote
For your high-speed WireGuard use case, fq (Fair Queuing) would likely be the best choice because:
- It’s designed for high-throughput scenarios
- It handles UDP traffic well
- It provides better fairness and reduced bufferbloat
- It’s more modern than pfifo
There is a queue kind called fq-codel, but it isn’t defined as a queue type by default. So first I had to create it:
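A sketch of the command, assuming the RouterOS v7 fq-codel queue kind with its default parameters:

# Define a new queue type backed by the fq-codel kind
/queue type add name=fq-codel kind=fq-codel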
Then I could use it on the WAN interface:
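Again assuming ether1 is the WAN interface:

# Use the new fq-codel queue type on the WAN interface
/queue interface set ether1 queue=fq-codel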
The solution
Boom! Setting the queue type on the WAN interface to fq-codel was the solution. My WireGuard speed shot up:
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 89.6 MBytes 752 Mbits/sec 0 1.97 MBytes
[ 5] 1.00-2.00 sec 86.2 MBytes 724 Mbits/sec 0 1.97 MBytes
[ 5] 2.00-3.00 sec 86.2 MBytes 724 Mbits/sec 0 1.97 MBytes
[ 5] 3.00-4.00 sec 86.2 MBytes 723 Mbits/sec 0 1.97 MBytes
[ 5] 4.00-5.00 sec 86.2 MBytes 724 Mbits/sec 0 1.97 MBytes
[ 5] 5.00-6.00 sec 86.2 MBytes 724 Mbits/sec 0 1.97 MBytes
[ 5] 6.00-7.00 sec 85.0 MBytes 713 Mbits/sec 0 1.97 MBytes
[ 5] 7.00-8.00 sec 86.2 MBytes 724 Mbits/sec 0 1.97 MBytes
And the TX queue drops went away. A day later I still have 0 TX queue drops.
I was now getting 724/722 Mbits/sec WireGuard traffic 🥳
I asked the Kagi Assistant if I should be using fq-codel on my SFP+ interface as well, to which it replied:
AI quote
No, you typically don’t need to enable fq-codel on the internal NIC because:
- Bufferbloat and queue management issues primarily occur at bottleneck points
- Your WAN interface (750 Mbps) is likely the bottleneck compared to internal NICs (usually running at 1-10 Gbps)
- Internal network traffic doesn’t typically face the same congestion issues
- Adding unnecessary queuing can add latency without benefits
The combination you’re using is optimal:
- fq-codel on WAN (manages bottleneck point)
- mq on internal 10 Gbit (maximizes throughput)
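Spelled out as RouterOS commands, that end state would look roughly like this. The interface names are assumptions, ether1 being the WAN and sfp-sfpplus1 the internal 10 Gbit NIC:

# WAN: fq-codel keeps the bottleneck queue short and fair
/queue interface set ether1 queue=fq-codel
# Internal 10 Gbit: multi-queue spreads TX work across CPU cores
/queue interface set sfp-sfpplus1 queue=multi-queue-ethernet-default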
Queue types explained
The Kagi Assistant went on to further explain the two queue types I’m using:
AI quote
FQ-CoDel (Fair Queuing with Controlled Delay)
- Modern queue management algorithm
- Actively manages bufferbloat
- Uses packet sojourn time (how long packets stay in queue) to detect congestion
- Provides fair bandwidth allocation between flows
- Works particularly well with mixed TCP/UDP traffic
- Ideal for WAN interfaces where congestion occurs
MQ (Multi-Queue)
- Designed for high-speed interfaces
- Distributing packet processing across multiple CPU cores
- Enabling parallel packet processing
- Using hardware transmit descriptor ring buffers
- Improving performance on high-speed NICs
Ending thoughts
Having solved this problem is a huge relief; it’s been bugging me for a while. And to find and understand the root cause and solution is very rewarding 🙂
I found great value in “working” with the Kagi Assistant on solving this, and have done so a few more times since, on other problems. I’m very new to using AI assistants, but in methodical troubleshooting like this it provided real value.
Going forward I can see myself continuing to use such tools to aid in analyzing, troubleshooting and documenting. I see it as a research tool, not as a creative aid.
This blog will not have blog posts written by AI, not now, and not in the future. Any quotes by an AI will be clearly marked.