[UFO Chicago] Sad crashing of Linux server

Ian Bicking ianb@colorstudy.com
02 Aug 2002 20:42:35 -0500


On Fri, 2002-08-02 at 16:50, Peter A. Peterson II wrote:
> Quoting Ian Bicking:
> > A dedicated server of mine has been crashing in bad ways.  I don't have
> > physical access, so all I know is that the services shut down (ssh,
> > telnetd, apache)... not all of them, the POP server is still there, but
> > not fully functional (it's odd that it works at all, since both POP
> > and telnet are inetd services).  To get it back, I've had to ask for it
> > to be rebooted.
> 
> How old is this box? How up to date is the software (security patches,
> etc.)? How likely is it that the hardware is just giving up (bad RAM,
> flaky disk) rather than it's being compromised? It seems to me that if
> it were compromised, a hax0r would do something more interesting than
> kill the services and sort of cripple POP. But maybe I'm just thinking
> too highly of hax0rs.
> 
> How often does it "go down" after it's rebooted? Or how quickly?
> 
> Is the server in the Twin Cities? Or where?

It's a dedicated server that I've never seen.  I assume the hardware is
new or newish -- I doubt it is the problem.  I've looked around as Jesse
suggested, but haven't noticed anything out of the ordinary.  The one
really odd thing is I can't ssh in with normal user accounts, only with
root... hmmm... but I can telnet in.  It gives me the message "System
bootup in progress - please wait", before it even gets to the password
prompt... hmmm... and then I delete /etc/nologin and it works.  Okay, so
that wasn't anything...
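
For anyone else chasing the same symptom, here is a minimal sketch of the
check I effectively did by hand.  It assumes the file lives at the usual
/etc/nologin path; the function name nologin_message is made up for
illustration.  While that file exists, non-root logins are refused and its
contents are shown to the user, which would match what I saw.

```python
# Assumption: the conventional location; a given system may be configured otherwise.
NOLOGIN_PATH = "/etc/nologin"


def nologin_message(path=NOLOGIN_PATH):
    """Return the contents of the nologin file, or None if it does not exist."""
    try:
        with open(path, "r", errors="replace") as f:
            return f.read()
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    msg = nologin_message()
    if msg is None:
        print("no nologin file found; it is not what is blocking logins")
    else:
        print("nologin file present; users are shown this message:")
        print(msg)
```

If it reports a leftover file after boot has clearly finished, removing the
file (as I did) should let normal accounts back in.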

I almost wonder if I'm getting out-of-memory errors, and the random
process killer kicks in (it's a recent 2.4 kernel, where apparently they
still haven't figured out the malloc-returns-NULL feature :-/ ).  I tried
leaving an ssh session open with top for a while, but everything seemed
to be working, so I let it go.
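
Rather than watching top, a rough way to test the out-of-memory theory
would be to search the system log for killer messages after the next
crash.  This is only a sketch under assumptions: the log path
/var/log/messages varies by distribution, the exact wording of the kernel
message differs between versions (hence the loose pattern), and the
function names are made up for illustration.

```python
import re

# Assumption: kernel wording varies by version, so match loosely and
# case-insensitively rather than on one exact message.
OOM_PATTERN = re.compile(r"out of memory|oom[-_ ]kill", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the lines that look like out-of-memory killer messages."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/messages"):
    """Scan one log file; the default path is an assumption, not a given."""
    with open(path, "r", errors="replace") as f:
        return find_oom_lines(f)
```

Calling scan_log() as root after a reboot and getting an empty list would
not rule the theory out (the log may have rotated, or the message may be
worded differently), but any hits would be strong evidence for it.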

I've never had this kind of stability problem except with X, and I've
never been hacked, so I don't know what to make of this.  Maybe my
security luck is gone -- but I don't see any evidence of it (if they
were stealthy they wouldn't shut down my services -- if they weren't
stealthy, I'd notice something odd about the system).

Sigh...