The "server is down" checklist
My server was down earlier today for mysterious reasons. A switch at the data center was cycled and my box never came back on (the only one that didn't).
- Log in via serial console to verify the network device is actually alive - check
- Cycle the server for good measure - check
- Cycle the switch again - check
- Ping setup-knowledgeable person on IM (idle) - check
- Try several cell calls to various people with no good result - check
- Get a hold of a person who might help - they're too busy right now - check
- Watch the only person left who knows more than you about the setup (I know nothing) go to bed - check
- Start pinging every IP in your range in desperation - bingo!
Bang-head-on-wall *check*
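For next time, that last desperate step can be scripted. Here is a rough sketch in Python, not a polished tool. It assumes a Linux-style `ping` (where `-c` is the packet count and `-W` is the timeout in seconds), and it uses the documentation range 192.0.2.0/28 as a stand-in for the real address range. The names `ping_once` and `sweep` are made up for illustration.

```python
import ipaddress
import subprocess


def ping_once(ip, timeout_s=1):
    """Return True if `ip` answers one ICMP echo request.

    Assumes Linux iputils flags: -c is the count, -W the timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), str(ip)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sweep(cidr, probe=ping_once):
    """Probe every usable host address in `cidr`; return the ones that answer.

    `probe` is any callable taking an address and returning True or False,
    so it can be swapped out (for example, for a fake in a test).
    """
    network = ipaddress.ip_network(cidr)
    # hosts() skips the network and broadcast addresses of the range.
    return [str(ip) for ip in network.hosts() if probe(ip)]
```

Calling `sweep("192.0.2.0/28")` probes the usable addresses in that range one at a time and returns the ones that replied. Anything that answers and looks like a router is a candidate gateway worth trying.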
Comments
Things like that happen; however, gateway IP changes are normally announced via RIP, as far as I know?
Fine place. Beautiful blue things. There's at least one girl in the geek world, one and a half with Elisa. Jogging is a good idea when we're sitting most of the day. ++
Posted by: z80 | August 24, 2004 08:15 PM
You mean people still use RIP these days?
Also, aren't RIP/BGP/et al normally only used for infrastructure-level routers and so on? If the router itself changes address, then the individual client systems on the network probably wouldn't know, since at that level things are usually announced via DHCP/bootp/etc. if they're even announced at all (and by the description, I'm guessing that Kasia's box was the only one which isn't configured by DHCP).
Posted by: fluffy | August 25, 2004 07:32 PM
This turned out to be some wigged-out routers.
Kasia has a static /28 subnet, and two big Juniper routers provide a VRRP (Virtual Router Redundancy Protocol) group for her "virtual" gateway. Both the hard router gateways worked fine (one of which she found), but the virtual gateway was dorked. We rebuilt the VRRP group to fix it.
Posted by: Steve Friedl | August 26, 2004 10:19 AM