Wireless networking insanity at OS X con
by Rob Flickenger
All day today, we've noticed some strange behavior on the wireless network at OS X con. Every so often, clients would get a "Connection Refused" when trying to go to a web page, or would be kicked off of iChat spontaneously, only to be reconnected a moment later. Hitting Reload would usually work, but would sometimes show another "Connection Refused", and then just as mysteriously go away. Strangely, ssh connections seemed to work just fine, and never got dropped. Many other people (beyond the usual wireless ultra geeks) also reported similar problems.
We finally got tired of putting up with this state of affairs, and took a look at network traffic directly with tcpdump. What we found was very interesting. (Lines below are broken and highlighted for readability).
root@jojo~# tcpdump -eni en1 host 192.168.2.234
22:51:09.343951 0:30:65:aa:bb:cc 0:a0:c9:00:11:22 0800 74:
192.168.2.234.52774 > 22.214.171.124.80: S 4010194416:4010194416(0)
Here we see a client with the MAC address 0:30:65:aa:bb:cc and IP address 192.168.2.234 make an initial SYN setup request to a web page at 126.96.36.199, via the router with the MAC address 0:a0:c9:00:11:22.
22:51:09.344882 0:30:65:42:23:ff 0:30:65:aa:bb:cc 0800 54:
188.8.131.52.80 > 192.168.2.234.52774: R 0:0(0) ack
4010194417 win 0 (DF)
22:51:09.345665 0:30:65:42:23:ff 0:30:65:aa:bb:cc 0800 54:
184.108.40.206.80 > 192.168.2.234.52774: R 0:0(0) ack
1 win 0 (DF)
What's this? Almost immediately (a little under 1 ms later), a different host entirely (0:30:65:42:23:ff) sends two TCP RESET packets back to the client machine. How rude! Of course, the client machine aborts the current connection attempt (as a "Connection Reset by Peer") and forgets about it.
22:51:09.417171 0:a0:c9:00:11:22 0:30:65:aa:bb:cc 0800 74:
220.127.116.11.80 > 192.168.2.234.52774: S
4010194417 win 65535
Next we see the router (at 0:a0:c9:00:11:22) returning with the SYN ACK from the original web page...
22:51:09.418044 0:30:65:aa:bb:cc 0:a0:c9:00:11:22 0800 54:
192.168.2.234.52774 > 18.104.22.168.80: R
4010194417:4010194417(0) win 0
...which of course the client naturally refuses (sending a RESET back to the web site), since it has already received a RESET from the misbehaving peer!
22:51:09.863614 0:30:65:aa:bb:cc 0:a0:c9:00:11:22 0800 74:
192.168.2.234.52775 > 22.214.171.124.80: S
4246974705:4246974705(0) win 32768
And finally, since the connection didn't go through the first time, the client retries, and the cycle repeats again, and again, and again... Until the client's browser finally gives up a few seconds later.
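To put the timing in perspective, the timestamps in the capture above can simply be subtracted. Here is a minimal Python sketch; the helper name `ms_between` is made up for this example.

```python
from datetime import datetime

def ms_between(start: str, end: str) -> float:
    """Return the gap between two tcpdump timestamps in milliseconds."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() * 1000

syn = "22:51:09.343951"         # client sends its SYN
bogus_rst = "22:51:09.344882"   # first RESET from the other laptop
real_reply = "22:51:09.417171"  # reply from the real server, via the router

print(f"bogus RESET after {ms_between(syn, bogus_rst):.3f} ms")
print(f"real reply after  {ms_between(syn, real_reply):.3f} ms")
```

The misbehaving machine sits on the same wireless segment, so its RESET wins the race by a wide margin over a reply that has to cross the Internet and come back.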
The rude machine that sent the gratuitous RESETs above (0:30:65:42:23:ff) was actually my own laptop. But by logging traffic for a while, we found that arbitrary hosts on the wireless segment were also sending RESETs. What was the common variable on all of these machines? What we have been able to determine is that if any host on a wireless network is both running the Jaguar Firewall and running a program that throws the AirPort into promiscuous mode (like tcpdump, ngrep, etherpeg, or any other network monitoring tool), then that machine will send unsolicited TCP RESETs for every packet that it sees on the wireless segment, even if the packet wasn't destined for it. Likely, this is because something in the firewall code treats the packets as having a local destination (as the card is in promiscuous mode), even though they don't. This also explains why ssh connections were unaffected: most people have an exception for ssh in their firewall rules, so packets destined for port 22 (the ssh port) would never match a deny rule and wouldn't get rejected.
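One way to spot the offending machines in a longer capture is to look for RESETs that claim to come from an off-segment IP address but carry a source MAC that is not the router's. The following is a rough Python sketch, assuming `tcpdump -en` output with each packet on a single line in the format shown above. The function name, the `/24` netmask, the sample lines, and the `203.0.113.10` server address are all made up for illustration.

```python
import ipaddress
import re

# One-line `tcpdump -en` format: time, source MAC, destination MAC,
# ethertype, length, then "src_ip.port > dst_ip.port: flags ..."
LINE = re.compile(
    r"^\S+ (?P<src_mac>[0-9a-f:]+) (?P<dst_mac>[0-9a-f:]+) \S+ \d+: "
    r"(?P<src_ip>\d+\.\d+\.\d+\.\d+)\.\d+ > \S+: (?P<flags>\S+)"
)

def rogue_reset_macs(lines, router_mac, local_net):
    """Return MACs that sent RSTs on behalf of off-segment IP addresses."""
    net = ipaddress.ip_network(local_net)
    rogues = set()
    for line in lines:
        m = LINE.match(line)
        if not m or "R" not in m["flags"]:
            continue  # unparseable line, or not a RESET
        off_segment = ipaddress.ip_address(m["src_ip"]) not in net
        # Packets from an off-segment IP should only ever arrive via the router.
        if off_segment and m["src_mac"] != router_mac:
            rogues.add(m["src_mac"])
    return rogues

capture = [
    # RESET "from" the web server, but sent by another laptop's MAC
    "22:51:09.344882 0:30:65:42:23:ff 0:30:65:aa:bb:cc 0800 54: "
    "203.0.113.10.80 > 192.168.2.234.52774: R 0:0(0) ack 4010194417 win 0 (DF)",
    # The client's own RESET to the server: local source IP, so not rogue
    "22:51:09.418044 0:30:65:aa:bb:cc 0:a0:c9:00:11:22 0800 54: "
    "192.168.2.234.52774 > 203.0.113.10.80: R 4010194417:4010194417(0) win 0",
]

print(rogue_reset_macs(capture, "0:a0:c9:00:11:22", "192.168.2.0/24"))
```

Run against these two sample lines, only the laptop at 0:30:65:42:23:ff is reported. A real capture would need the router MAC and netmask of the network being watched.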
This is an exceedingly easy thing to do, especially at this conference (where people like me are working with firewalls, monitoring tools, and wireless networks, and there are also many active wireless clients!) It is possible that this behavior would manifest itself on a wired network as well, if all of the clients involved were connected to a network hub (but not a switch). As wireless APs necessarily act as a hub, every client can see the traffic of every other, and hence can send responses to packets that weren't destined for them. Unfortunately, we don't have a hub to test with here at the conference.
Turning off firewalling immediately eliminates the problem, and turning it back on recreates it reliably. This is very odd behavior, as filtered packets should normally be dropped on the floor (and we should certainly not automatically send RESETs to addresses that aren't involved with a locally bound address).
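The suspected logic error can be illustrated with a toy model. This is only a sketch of the hypothesis described above, not the actual Jaguar firewall code: the `Packet` class, the `respond` function, the addresses, and the allowed-port set are all invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    dst_ip: str
    dst_port: int

def respond(pkt, my_ip, allowed_ports, promiscuous, buggy):
    """Return what a host does with a TCP SYN seen on its interface."""
    addressed_to_me = pkt.dst_ip == my_ip
    if not addressed_to_me:
        if not (promiscuous and buggy):
            return "ignore"   # expected: not ours, so stay silent
        # Hypothesized bug: a sniffed packet falls through to the
        # firewall rules as if it had a local destination.
    if pkt.dst_port in allowed_ports:
        return "pass"         # matches an allow rule, so nothing is rejected
    return "send RST"         # reject rule fires

web = Packet("203.0.113.10", 80)
ssh = Packet("203.0.113.10", 22)
me = "192.168.2.50"

print(respond(web, me, {22}, promiscuous=True, buggy=True))
print(respond(ssh, me, {22}, promiscuous=True, buggy=True))
print(respond(web, me, {22}, promiscuous=True, buggy=False))
```

In this model a sniffed web request draws a RESET, a sniffed ssh packet slips through the allow rule untouched, and a correctly behaving host ignores both, which matches the symptoms we saw.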
As Cliff Skolnick (who instigated tracking down the above strangeness) also points out, there is even more dementia when OS X 10.2 is set up as a router. As part of a nifty hack he's presenting that enables a Bluetooth enabled Palm to use the network through his Titanium and over its wireless network to get to the Internet, he needs to enable packet forwarding:
root@jojo~# sysctl -w net.inet.ip.forwarding=1
Now, since he's using a Titanium, he isn't using the internal AirPort card (as the wireless range is, well, not what it could be) but instead is using an external PCMCIA card for his network connection. As soon as routing is enabled and promiscuous mode is turned on (say, by simply running tcpdump), the machine suddenly attempts to send ICMP redirects for every ICMP packet it sees on the wireless segment, redirecting them to the router they were destined for in the first place. This generates a huge amount of inadvertent traffic, with very little effort:
root@caligula:~# ping 192.168.2.234
PING 192.168.2.234 (192.168.2.234): 56 data bytes
64 bytes from 192.168.2.234: icmp_seq=0 ttl=64 time=4.916 ms
64 bytes from 192.168.2.234: icmp_seq=0 ttl=64 time=7.294 ms (DUP!)
64 bytes from 192.168.2.234: icmp_seq=1 ttl=64 time=12.44 ms
64 bytes from 192.168.2.234: icmp_seq=1 ttl=64 time=15.105 ms (DUP!)
64 bytes from 192.168.2.234: icmp_seq=2 ttl=64 time=9.754 ms
64 bytes from 192.168.2.234: icmp_seq=1 ttl=64 time=1020.15 ms (DUP!)
64 bytes from 192.168.2.234: icmp_seq=2 ttl=64 time=22.165 ms (DUP!)
64 bytes from 192.168.2.234: icmp_seq=1 ttl=64 time=1026.62 ms (DUP!)
64 bytes from 192.168.2.234: icmp_seq=1 ttl=64 time=1029.02 ms (DUP!)
64 bytes from 192.168.2.234: icmp_seq=1 ttl=64 time=1032.94 ms (DUP!)
64 bytes from 192.168.2.234: icmp_seq=3 ttl=64 time=6.755 ms
64 bytes from 192.168.2.234: icmp_seq=3 ttl=64 time=23.359 ms (DUP!)
64 bytes from 192.168.2.234: icmp_seq=2 ttl=64 time=1216.67 ms (DUP!)
Quitting tcpdump immediately makes this problem go away. This is very unexpected network behavior, and certainly bears more examination...
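To get a feel for how bad the duplication is, the replies can be tallied per sequence number. Below is a small Python sketch, assuming ping output in the format shown above; the function name is made up, and the sample is just the first few lines of that output.

```python
import re
from collections import Counter

def replies_per_seq(ping_output: str) -> Counter:
    """Count how many echo replies arrived for each icmp_seq value."""
    return Counter(int(s) for s in re.findall(r"icmp_seq=(\d+)", ping_output))

sample = """\
64 bytes from 192.168.2.234: icmp_seq=0 ttl=64 time=4.916 ms
64 bytes from 192.168.2.234: icmp_seq=0 ttl=64 time=7.294 ms (DUP!)
64 bytes from 192.168.2.234: icmp_seq=1 ttl=64 time=12.44 ms
64 bytes from 192.168.2.234: icmp_seq=1 ttl=64 time=15.105 ms (DUP!)
64 bytes from 192.168.2.234: icmp_seq=1 ttl=64 time=1020.15 ms (DUP!)
"""

counts = replies_per_seq(sample)
for seq, n in sorted(counts.items()):
    print(f"icmp_seq={seq}: {n} replies ({n - 1} duplicates)")
```

In the full output above, icmp_seq=1 alone was answered six times.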
So, to sum up:
- If you're running promiscuous mode tools, do not run a firewall in OS X 10.2
- If you're acting as a router... DON'T run promiscuous mode utilities!
- If you're at OS X con and are running Etherpeg (or ethereal or ngrep or ettercap or ntop or any other utility that throws the card into promiscuous mode), and are running routing or firewalling, we will hunt you down and make you knock it off. ;)
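The checklist above can be turned into a quick self-test. This is a hedged Python sketch that works on text you paste in from `ifconfig` and `sysctl net.inet.ip.forwarding`; it assumes BSD-style `ifconfig` output that lists `PROMISC` among the interface flags. The function names and the sample strings are made up, and whether the firewall is on is left for you to supply.

```python
import re

def is_promiscuous(ifconfig_output: str) -> bool:
    """True if the flags list printed by ifconfig includes PROMISC."""
    m = re.search(r"flags=\w+<([^>]*)>", ifconfig_output)
    return bool(m) and "PROMISC" in m.group(1).split(",")

def forwarding_enabled(sysctl_output: str) -> bool:
    """True if `sysctl net.inet.ip.forwarding` reports a value of 1."""
    m = re.search(r"net\.inet\.ip\.forwarding\s*[:=]\s*(\d+)", sysctl_output)
    return bool(m) and m.group(1) == "1"

def risky(ifconfig_output: str, sysctl_output: str, firewall_on: bool) -> bool:
    """Promiscuous mode plus either firewalling or routing is the bad combo."""
    routing = forwarding_enabled(sysctl_output)
    return is_promiscuous(ifconfig_output) and (firewall_on or routing)

# Illustrative sample text, not captured from a real machine
iface = "en1: flags=8963<UP,BROADCAST,SMART,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500"
fwd = "net.inet.ip.forwarding: 1"

print(risky(iface, fwd, firewall_on=False))
```

Keeping the checks as pure functions over text means they can be tried on saved output from any machine without needing root.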
Note that the actual MAC and IP addresses have been changed to protect the innocent. Thanks to Cliff Skolnick and the bunch of people hanging out on the mezzanine that helped get to the bottom of this! If you read this, and you're at the conference, help spread the word to any other curious Etherpeg aficionados who may have their Firewall turned on...
Have you seen this strange behavior on Jaguar?
I have seen this strange behaviour
on wired ethernet. I thought it was a bug with Chimera, but I've also had problems with my mail client (getting a wrong password error that mysteriously fixes itself). I'll try running the utilities mentioned here to see if I can find a similar pattern.
yes, I have seen this on a wired network as well
First, big thanks for the explanation
Rude Firewall behavior
I had a similar problem. I have a G4 wired to a router, an Airport that just routes packets to RF, a sleeping PowerBook using wireless, and PPPoE DSL on the other side of the router. I was getting connection refused just like you, and had to try twice to access a site. Turning off the Firewall fixed this also. As far as I can see I was not running any ethernet utilities, just two browsers (Cyberdog and Mozilla).
oh, yeah, I've seen it
I posted about this on the Apple support forums a few days ago, and I just added a link to this article to prove I'm not crazy.
Underlying flaw in BSD stack
I've seen this on wired networks as well.
X 10.1.5, Airport in Pismo to WAP; Windows on Ethernet; both to Router, Asante DSL -- same thing
Same problem's been bedeviling our home setup -- Alcatel DSL modem to a Linksys router; thence by ethernet to a Win laptop and by wireless to the Airport card in the Pismo.
Seen similar things
Especially with 10.1, I saw very evil things when I put an iface in promiscuous mode. RSTs, destunreach, etc.
Resets in Jaguar Stack
I've run into this as well (spent a day trying to figure out why our Sun boxes had all of a sudden stopped accepting connections). Had the network down to the two main servers, a Dell/Linux box, and my Pismo. Looked at the MAC addresses of the resets and noticed that although the IP was from the intended target, the MAC was wrong: it was me!
Firewall hiccups even without promiscuous mode
There also seem to be conditions where the Jaguar firewall drops packets in a completely erratic manner. Don't know if this is related to the problem described here.
There's a thread in the BrickHouse support forum:
I have the same problem at home
I have an iBook and a Titanium that have the same problem. I have an old Airport base station. My browser is Mozilla. If I have the firewall turned on for the Titanium, my iBook gives me the "connection refused" and it stops when I turn it off, or if I put the Titanium to sleep. This does not happen if the firewall is turned on for the iBook however.
I don't think it's Jaguar either...
I've seen similar problems, and they predate Jaguar.