Wireguard not completing handshake

Lucio Crusca asked:

I have two Debian GNU/Linux systems (bullseye/sid), both running wireguard on port 33456, both behind NAT. Both run a kernel version > 5.6.

System A is the server. It dynamically updates a dedicated "A record" on the authoritative nameserver for its internet domain with the public IP address currently assigned to its internet-facing router A (ZyWALL USG 100). It does so once every minute, but the public IP address usually changes only when the router reboots.

System B is behind router B and acts as the wireguard client, pointing to the dynamically updated "A record" and port 33456. Router B is a consumer-grade VDSL router that allows everything in the outbound direction and only replies inbound.

Router A (ZyWALL USG 100) is configured to allow UDP packets on port 33456 through it and forwards them to server A. Here is the relevant configuration screen:

[Screenshot: ZyWALL USG 100 wireguard-behind-NAT configuration]

Here is the server A wireguard configuration file (keys in this snippet, despite being valid, aren’t the real ones):

[Interface]
Address = 10.31.33.100/24, fc00:31:33::1/64
ListenPort = 33456
PrivateKey = iJE/5Qy4uO55uUQg8nnDKQ/dFT1MEq+tDfFXrGNj3GY=
PreUp = iptables -t nat -A POSTROUTING -s 10.31.33.0/24  -o enp1s0 -j MASQUERADE; ip6tables -t nat -A POSTROUTING -s fc00:31:33::/64 -o enp1s0 -j MASQUERADE
PostDown = iptables -t nat -D POSTROUTING -s 10.31.33.0/24  -o enp1s0 -j MASQUERADE; ip6tables -t nat -D POSTROUTING -s fc00:31:33::/64 -o enp1s0 -j MASQUERADE

# Simon
[Peer]
PublicKey = QnkTJ+Qd9G5EybA2lAx2rPNRkxiQl1W6hHeEFWgJ0zc=
AllowedIPs = 10.31.33.211/32, fc00:31:33::3/128

And here is client B wireguard configuration (again, keys and domain aren’t the real ones):

[Interface]
PrivateKey = YA9cRlF4DgfUojqz6pK89poB71UFoHPM6pdMQabWf1I=
Address = 10.31.33.211/32

[Peer]
PublicKey = p62kU3HoXLJACI4G+9jg0PyTeKAOFIIcY5eeNy31cVs=
AllowedIPs = 10.31.33.0/24, 172.31.33.0/24
Endpoint = wgsrv.example.com:33456
PersistentKeepalive = 25

Here is a dirty diagram that depicts the situation:

Server A -- LAN A -- ZyWALL (NAT) -- the internet -- VDSL Router B (NAT) -- LAN B -- Client B

Starting wireguard on both systems does not establish the VPN connection. After enabling debug messages on the client and adding an iptables LOG rule for OUTPUT packets, I get lots of these:

[414414.454367] IN= OUT=wlp4s0 SRC=10.150.44.32 DST=1.2.3.4 LEN=176 TOS=0x08 PREC=0x80 TTL=64 ID=2797 PROTO=UDP SPT=36883 DPT=33456 LEN=156 
[414419.821744] wireguard: wg0-simon: Handshake for peer 3 (1.2.3.4:33456) did not complete after 5 seconds, retrying (try 2)
[414419.821786] wireguard: wg0-simon: Sending handshake initiation to peer 3 (1.2.3.4:33456)

However I’ve also added a LOG iptables rule to the server, in order to diagnose router configuration problems.

# iptables -t nat -A PREROUTING -i enp1s0 -p udp -m udp --dport 33456 -j LOG

It logs absolutely nothing. But. There’s always a ‘but’. But if I run, on the client, the following nmap command:

$ sudo nmap -sU -p 33456 wgsrv.example.com

then I can see two UDP packets coming into the server enp1s0 interface:

# dmesg
...
[316518.138618] IN=enp1s0 OUT= MAC=52:54:00:41:8a:24:c8:6c:87:34:5c:bd:08:00 SRC=4.3.2.1 DST=192.168.0.249 LEN=28 TOS=0x00 PREC=0x00 TTL=44 ID=31039 PROTO=UDP SPT=50864 DPT=33456 LEN=8 
[316518.241170] IN=enp1s0 OUT= MAC=52:54:00:41:8a:24:c8:6c:87:34:5c:bd:08:00 SRC=4.3.2.1 DST=192.168.0.249 LEN=28 TOS=0x00 PREC=0x00 TTL=47 ID=23573 PROTO=UDP SPT=50865 DPT=33456 LEN=8

so I’m inclined to assume that router A (ZyWALL USG 100) is correctly configured to let the packets into the server’s local network. But if it is, why don’t the wireguard UDP packets reach the server?

My answer:


OK, you mentioned that the client is on VDSL, so I suspect you have an MTU problem.

The normal MTU of a wired (and these days, wireless) network connection is 1500 bytes, but on *DSL the PPPoE layer takes up 8 bytes, making the usable MTU actually 1492. (It’s also possible your network connection has been set to an even lower MTU.)

Wireguard’s packet overhead is 80 bytes, meaning the tunnel MTU is 1420 by default. Try lowering this by the same 8 bytes, to 1412. (Or lower if you already had a lower MTU than 1492.)
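
The arithmetic above can be sketched as a small shell helper. The function name wg_tunnel_mtu is made up for illustration, and the 80-byte overhead is simply the figure quoted above. Substitute your own link MTU if it differs.

```shell
#!/bin/sh
# Hypothetical helper: tunnel MTU = link MTU minus WireGuard's per-packet overhead.
# 80 bytes is the overhead figure used in the answer above.
WG_OVERHEAD=80

wg_tunnel_mtu() {
    link_mtu="$1"
    echo $(( link_mtu - WG_OVERHEAD ))
}

wg_tunnel_mtu 1500   # plain Ethernet link: prints 1420
wg_tunnel_mtu 1492   # PPPoE link (1500 - 8): prints 1412
```

If your link MTU is already below 1492, pass that value instead and use the result for the MTU setting.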

You also need to have the client tell the server to send smaller packets through the tunnel. This can be done with an iptables rule that clamps the TCP MSS.

On the client side wg0.conf you will need something like:

[Interface]
MTU = 1412
PostUp = iptables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
PostDown = iptables -D FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
# ...the rest
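
To check whether the link MTU really is 1492 before settling on a value, one option is to send pings with the Don't Fragment bit set. This is only a sketch: the function names are hypothetical, it assumes the iputils ping shipped with Debian, and it assumes the target answers ICMP echo requests, which a firewall may prevent.

```shell
#!/bin/sh
# ICMP echo payload that yields an IPv4 packet of exactly the given size:
# ping adds a 20-byte IPv4 header and an 8-byte ICMP header to the payload.
ping_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Hypothetical wrapper: send 3 pings with the Don't Fragment bit set
# (-M do), sized to fill a packet of the given MTU.
probe_mtu() {
    host="$1"
    mtu="$2"
    ping -M do -c 3 -s "$(ping_payload_for_mtu "$mtu")" "$host"
}

# Example, run on the client outside the tunnel (hostname from the question):
# probe_mtu wgsrv.example.com 1500
# probe_mtu wgsrv.example.com 1492
```

If the 1500-byte probe fails while the 1492-byte probe gets replies, the PPPoE explanation fits; if both fail, try lower values or check whether ICMP is being filtered.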

View the full question and any other answers on Server Fault.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.