Esoteric Tip of the Day #1: Dead Man’s Check

Posted: December 12th, 2009 | Filed under: Computers, Esoteric Tips

I’m responsible for the care and feeding of way too many computers. I say “way too many” because the probability of one of your computers doing something stupid on a Friday night increases in proportion to the number of computers that can do something stupid. Most of the time, the stupidity is routine (“Oh hey, what a surprise! Firefox is using all my RAM!”) but every so often, they surprise you. I’m going to post something here whenever I manage to fix one of these more “WTF”-type errors under the heading Esoteric Tip of the Day. Chances are you won’t care, but someone Google searching might, and I want to make this sort of information easy to find so that others won’t have to endure this same issue.

I noticed a couple days ago that my Mac mini (which acts as a storage server but will probably be doing more media serving once I get a better TV) was not visible to the rest of my home network. It turns out that the mini’s network interface was having serious issues, because the mini couldn’t even get an IP address. Doing a little digging into the log, I found a lot of log segments that look like this:

kernel  AppleYukon2: error - FATAL: SkGeStopPort() does not terminate (Rx)
kernel  AppleYukon2: error - Event queued in Init Level 0
configd[14] network configuration changed.
Firewall[66]  krb5kdc is listening from :::88 proto=6
Firewall[66]  krb5kdc is listening from 0.0.0.0:88 proto=6
kernel  Ethernet [AppleYukon2]: Link up on en0, 100-Megabit, Full-duplex, Symmetric flow-control, Debug [796d,6c0c,0de1,0200,4de1,4000]
configd[14] network configuration changed.
Firewall[66]  krb5kdc is listening from :::88 proto=6
Firewall[66]  krb5kdc is listening from 0.0.0.0:88 proto=6
kernel  AppleYukon2: 00000000,00000000 sk98nif - deadmanCheck - nothing received, soft reset of chip
kernel  AppleYukon2: 00000000,00000000 sk98nif - deadmanCheck - nothing received, soft reset of chip
kernel  AppleYukon2: 00000000,00000000 sk98nif - deadmanCheck - nothing received, soft reset of chip
kernel  AppleYukon2: 00000000,00000000 sk98nif - deadmanCheck - nothing received, soft reset of chip
kernel  AppleYukon2: 00000000,00000000 sk98nif - deadmanCheck - nothing received, soft reset of chip
kernel  AppleYukon2: 00000000,00000000 sk98nif - deadmanCheck - still nothing received, hard reset of chip

There is only one group of people who know exactly what these messages mean, and they are the guys who wrote the AppleYukon2 driver. Thankfully, a little Googling and some link clicking later, I had a plausible explanation. Apparently this sort of error occurs when one of the networking-related preferences files that OS X stores gets corrupted somehow. This reportedly happened quite a bit when users upgraded to 10.5.7, and it also happened to me. The driver reads a corrupted preferences file, takes some ridiculous action and wedges the underlying hardware in some kind of bizarre state that only a reboot can correct.
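If you want to confirm you are looking at the same failure, a quick count of the deadmanCheck messages helps. This is only a minimal sketch: the function name count_deadman_resets is mine, and the default path /var/log/system.log is an assumption about where your kernel messages end up, so pass a different file if yours live elsewhere.

```shell
#!/bin/sh
# Count AppleYukon2 deadmanCheck resets in a log file.
# count_deadman_resets is a hypothetical helper name, not an Apple tool.
# The default log path is an assumption; pass another file as $1 if needed.
count_deadman_resets() {
    log_file="${1:-/var/log/system.log}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    soft=$(grep -c 'deadmanCheck - nothing received, soft reset' "$log_file" || true)
    hard=$(grep -c 'deadmanCheck - still nothing received, hard reset' "$log_file" || true)
    echo "soft=$soft hard=$hard"
}

# Usage: count_deadman_resets            (default log path)
#        count_deadman_resets /path/to/saved.log
```

A pile of soft resets followed by the occasional hard reset matches the pattern in the log excerpt above.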

Thankfully, OS X can regenerate its preferences files. To do this cleanly:

  • Reboot holding down the Shift key. This will force the computer to boot in Safe Mode and (as a side-effect) rebuild its caches, which is a good thing if they’ve been populated with garbage from your corrupted preferences files, which they probably have been.
  • Delete the following files in /Library/Preferences/SystemConfiguration: NetworkInterfaces.plist, com.apple.airport.preferences.plist, com.apple.network.identification.plist
  • Reboot and re-apply your network settings (IP address, etc.)
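If you prefer the Terminal for the second step, something like the following could stand in for dragging files to the Trash. It is a sketch, not what Apple support prescribed: it moves the three files into a backup folder instead of deleting them, so you can put them back if things get worse. The function name backup_network_prefs and the backup location are my own inventions.

```shell
#!/bin/sh
# Move the three network preference files aside instead of deleting them.
# backup_network_prefs and the default backup folder are hypothetical names.
backup_network_prefs() {
    config_dir="${1:-/Library/Preferences/SystemConfiguration}"
    backup_dir="${2:-$HOME/network-prefs-backup}"
    mkdir -p "$backup_dir" || return 1
    for f in NetworkInterfaces.plist \
             com.apple.airport.preferences.plist \
             com.apple.network.identification.plist; do
        if [ -e "$config_dir/$f" ]; then
            mv "$config_dir/$f" "$backup_dir/" || return 1
            echo "moved $f"
        else
            echo "skipped $f (not present)"
        fi
    done
}

# Usage (needs root for the real directory, e.g. via sudo sh):
#   backup_network_prefs
```

Moving rather than deleting costs nothing, and OS X should still regenerate the files on the next boot because the originals are gone from the SystemConfiguration folder.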

Special thanks to Daniel Palmer, the poor soul who sat on the phone with Apple to get this resolved. Here’s the thread containing the solution.

Update 3/3/10: This problem has reared its ugly head once again, and I’ve found out a little more information about it:

When you set the speed and duplex settings of your wired network adapter manually rather than keeping the setting at “Automatically” and you disable IPv6, the driver no longer triggers hard resets for whatever reason. In my case, it triggers soft resets at precise 6-minute intervals.

Also, as reader lafber pointed out, this problem only seems to occur when there is no traffic on the wired network. To solve this problem, I wrote a one-line AppleScript. The application starts a command that pings my router every 30 seconds. The command itself runs in a detached screen session (a virtual terminal), so I don’t have to keep an application running in the foreground. The AppleScript looks like this:

do shell script "screen -m -d ping -i 30 192.168.1.1"

To get it to start at bootup, I saved that AppleScript script as an application and added the application to my default user’s Login Items in the Accounts system preference pane.
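One thing the one-liner does not do is check whether a ping is already running, so launching the application twice leaves two of them going. Here is a hedged variation in plain shell that names the screen session and skips the launch if that session already exists. The session name keepalive and the function name start_keepalive are made up for this example; the ping and screen invocation is otherwise the same as above, with -S added to name the session.

```shell
#!/bin/sh
# Start the keep-alive ping in a detached, named screen session,
# unless a session with that name is already running.
# start_keepalive and the session name "keepalive" are hypothetical.
start_keepalive() {
    router="${1:-192.168.1.1}"   # router address from the post; change to yours
    interval="${2:-30}"          # seconds between pings
    session="keepalive"
    # screen -ls lists sessions as "<pid>.<name>", so look for ".keepalive".
    if screen -ls | grep -q "\.$session"; then
        echo "keepalive already running"
        return 0
    fi
    screen -S "$session" -d -m ping -i "$interval" "$router"
}

# Usage: start_keepalive                 (ping 192.168.1.1 every 30 seconds)
#        start_keepalive 10.0.0.1 60
```

You could save this as a script and call it from the same AppleScript application with do shell script, keeping the Login Items setup unchanged.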