Saturday, January 29, 2011

Server crashed unexpectedly

I have absolutely no idea why the server crashed. The only unusual thing I found is the following graph from munin:

http://i.imgur.com/Jm08n.png

Please don't tell me I need more RAM. As you can see, everything was stable before the incident occurred. I don't understand why the server suddenly crashed, or why the memory demand suddenly became so high.

  • Hello,

    First check dmesg and the system logs for a kernel panic or out-of-memory messages. It looks like you have an application that is using all your memory. Try this script, which logs your process list to a file once a minute, so you can see what caused the problem:

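    #!/bin/bash
    # Print a timestamped process list to stdout once a minute.
    # Redirect the output to a log file when starting the script.
    mkdir -p /tmp/mem_log
    while true; do
       date "+%Y-%m-%d %H:%M:%S"
       ps aux
       sleep 60
    done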
    

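    Save it as mem_log.sh, make it executable (chmod +x mem_log.sh), and run it like this. The shell opens the redirect target before the script starts, so the log directory has to exist first:

    mkdir -p /tmp/mem_log
    nohup ./mem_log.sh > /tmp/mem_log/mem_log.log &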
    

    After the next server crash, check the log to see which process used all your memory. It is a memory problem, but not because you have too little memory; a faulty process is causing it.

    TheOnly92 : Will try that. Thanks. Just one question: if I save the log file in the /tmp/ directory, wouldn't it be deleted when I reboot the server (after it crashes)?
    MihaiM : Yes, sorry, you are right. Save it in your home dir instead.
    From MihaiM
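
Once the logging script above has collected a few snapshots, reading the raw `ps aux` output by eye gets tedious. The following is a minimal sketch, not part of the original answer, that prints the largest process per snapshot. The function name `top_rss_per_snapshot` is hypothetical, and it assumes the log has exactly the layout the script produces: a `date` line followed by the `ps aux` table, with RSS (in KiB) in column 6.

```shell
#!/bin/bash
# Hypothetical helper: for each snapshot in the log, print the timestamp,
# then the RSS (KiB), PID and command of the process with the largest RSS.
# Assumes the log layout written by the logging script (date line + ps aux).
top_rss_per_snapshot() {
    awk '
        # A snapshot starts with the two-field timestamp line from date.
        NF == 2 && $1 ~ /^[0-9]+-[0-9]+-[0-9]+$/ && $2 ~ /^[0-9]+:[0-9]+:[0-9]+$/ {
            if (ts != "" && pid != "") print ts, max_rss, pid, cmd
            ts = $0; max_rss = -1; pid = ""; cmd = ""
            next
        }
        # Skip the ps aux header line.
        $1 == "USER" && $2 == "PID" { next }
        # Remember the row with the highest RSS seen in this snapshot.
        NF >= 11 && ($6 + 0) > max_rss {
            max_rss = $6 + 0; pid = $2; cmd = $11
        }
        # Flush the last snapshot.
        END { if (ts != "" && pid != "") print ts, max_rss, pid, cmd }
    ' "$1"
}
```

Calling `top_rss_per_snapshot ~/mem_log/mem_log.log` (assuming the log was saved under the home directory, as suggested in the comments) gives one line per minute, which makes it easier to spot the process whose memory use climbs just before the crash.
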
  • You may want to install psmon and have it report or kill misbehaving, memory-hungry processes. psmon logs or emails the events it reacts to, so you can easily find out which process is the culprit.
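
To illustrate the idea behind this kind of monitoring, here is a rough sketch that only reports processes above a memory limit. It is not psmon and does not use psmon's configuration; the function name `over_rss_limit` and the 500 MiB limit in the usage comment are assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical helper, not part of psmon: print processes whose resident
# memory (RSS, in KiB) exceeds a limit. Reads "PID RSS COMMAND" rows on stdin.
over_rss_limit() {
    local limit_kib="$1"
    awk -v limit="$limit_kib" '
        # Ignore the header row and any row whose RSS is not a number.
        $2 ~ /^[0-9]+$/ && ($2 + 0) > (limit + 0) { print $1, $2, $3 }
    '
}

# Example use on a live system (assumed limit of 500 MiB = 512000 KiB):
#   ps -eo pid,rss,comm | over_rss_limit 512000
```

Run from cron, a check like this could append its output to a log or mail it, but unlike psmon it takes no action against the offending process.
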
