When I was trying out Netdata last year, I noticed lots of inbound_packets_dropped_ratio warnings on multiple nodes.
Time to investigate 👇
I ran a tcpdump capture on one of the nodes reporting dropped packets:
sudo tcpdump -i vmbr0 -w trace.pcap
The capture was stored as a pcap file, which I then analyzed in Wireshark.
No. Time Source Destination Protocol Length
50 0.204016 TPLink_4f:26:5f Broadcast Realtek 60
123 1.201995 TPLink_4f:26:5f Broadcast Realtek 60
272 2.202889 TPLink_4f:26:5f Broadcast Realtek 60
350 3.203498 TPLink_4f:26:5f Broadcast Realtek 60
467 4.206708 TPLink_4f:26:5f Broadcast Realtek 60
575 5.204777 TPLink_4f:26:5f Broadcast Realtek 60
673 6.205612 TPLink_4f:26:5f Broadcast Realtek 60
796 7.206345 TPLink_4f:26:5f Broadcast Realtek 60
889 8.209804 TPLink_4f:26:5f Broadcast Realtek 60
1004 9.207812 TPLink_4f:26:5f Broadcast Realtek 60
1107 10.208800 TPLink_4f:26:5f Broadcast Realtek 60
1188 11.209873 TPLink_4f:26:5f Broadcast Realtek 60
1353 12.213628 TPLink_4f:26:5f Broadcast Realtek 60
1425 13.211084 TPLink_4f:26:5f Broadcast Realtek 60
1512 14.211925 TPLink_4f:26:5f Broadcast Realtek 60
1616 15.212710 TPLink_4f:26:5f Broadcast Realtek 60
1747 16.216115 TPLink_4f:26:5f Broadcast Realtek 60
1881 17.214192 TPLink_4f:26:5f Broadcast Realtek 60
1966 18.215312 TPLink_4f:26:5f Broadcast Realtek 60
2092 19.215959 TPLink_4f:26:5f Broadcast Realtek 60
2230 20.219056 TPLink_4f:26:5f Broadcast Realtek 60
I found something interesting: once every second, the TP-Link router in my garage was sending out a broadcast frame. Why?
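To double-check the one-per-second cadence rather than eyeballing it, the timestamps from the packet list above can be run through a few lines of Python. This is only a quick sketch: the values are copied by hand from the Wireshark time column, and the variable names are my own.

```python
# Timestamps (seconds since capture start) copied from the packet list above.
timestamps = [
    0.204016, 1.201995, 2.202889, 3.203498, 4.206708, 5.204777, 6.205612,
    7.206345, 8.209804, 9.207812, 10.208800, 11.209873, 12.213628, 13.211084,
    14.211925, 15.212710, 16.216115, 17.214192, 18.215312, 19.215959, 20.219056,
]

# Gap between each broadcast frame and the one before it.
intervals = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]

mean_interval = sum(intervals) / len(intervals)

print(f"frames: {len(timestamps)}")
print(f"mean interval: {mean_interval:.3f} s")
print(f"min/max interval: {min(intervals):.3f} s / {max(intervals):.3f} s")
```

Every gap lands within a few milliseconds of one second, which looks like a timer-driven broadcast and not like ordinary network chatter.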
Looking at the frame in Wireshark didn’t reveal much:
Frame 50: 60 bytes on wire (480 bits), 60 bytes captured (480 bits)
Ethernet II, Src: TPLink_4f:26:5f (34:60:f9:4f:26:5f), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Realtek Layer 2 Protocols
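For reference, here is roughly what those three lines of dissection boil down to: a 14-byte Ethernet II header followed by padding. The sketch below rebuilds a stand-in for the frame and parses it with Python's struct module. Two things in it are not from my capture and should be treated as assumptions: the EtherType 0x8899 (the value Wireshark's Realtek dissector is registered on, but not shown in the output above) and the all-zero payload, which is only a placeholder for the real frame contents.

```python
import struct

# Assumption: EtherType used by Realtek layer 2 protocols; not shown in the
# dissection above.
REALTEK_ETHERTYPE = 0x8899


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet_header(frame: bytes) -> dict:
    """Split an Ethernet II frame into destination, source, and EtherType."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "ethertype": ethertype,
        "is_broadcast": dst == b"\xff" * 6,
        "length": len(frame),
    }


# Synthetic stand-in for frame 50: real addresses from the capture,
# assumed EtherType, placeholder zero payload padded to 60 bytes.
header = bytes.fromhex("ffffffffffff" "3460f94f265f" "8899")
frame = header + bytes(60 - len(header))

info = parse_ethernet_header(frame)
print(info)
print("Realtek frame:", info["ethertype"] == REALTEK_ETHERTYPE)
```

If that EtherType assumption holds, a capture filter such as `ether proto 0x8899` should narrow a tcpdump capture down to just these frames.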
But a quick internet search led me to an issue posted on the Netdata repository:
I found the RRCP broadcasts from both TP-link TL-SG105e and Netgear GS908e were stopped by disabling “loop prevention” on the switch. — jw-g19
I tried disabling “loop prevention” on my TP-Link switch.
And voilà, the steady 1 drop/s went away.
One frame per second dropped isn’t the end of the world — but I’d rather avoid it if possible. Finding small issues like this isn’t possible without some kind of monitoring.
I still have some dropped packets, due to improper VLAN filtering — I need to get on that, soon™.
I used Netdata for almost one year, but recently dropped it for Checkmk. That is a topic for another blog post 🙂
🖖