Apache2 processes at 100% cpu!

Hello folks,

 

I have had a strange problem since this afternoon (it is now 11:45 pm here).

For a few hours now, my apache2 has been spawning processes that each climb to 100% CPU and never exit.

Attached are some screenshots of the output of `top`, `netstat`, `strace` and the extended server-status. It seems clear to me that the problem is caused by Dolphin's ./flash/XML.php script.

In the server-status screenshot you can see two processes that are each using more than 100% CPU. In the `top` screenshot you can see only one process above 100%. But after some time there are many processes using >=100% that never get killed.

 

I urgently need help getting this issue solved. There is a sizeable community behind this site, and I don't want it to fall apart.

Thanks in advance

Criz

zl.001.jpg · 313.1K · 286 views
zl.002.jpg · 154.5K · 231 views
zl.003.jpg · 7.5K · 277 views
zl.004.jpg · 121K · 267 views
Quote · 29 Dec 2010

That file is the cause of nightmares.

BoonEx Certified Host: Zarconia.net - Fully Supported Shared and Dedicated for Dolphin
Quote · 29 Dec 2010

top - 23:58:21 up 61 days,  2:54,  1 user,  load average: 2.30, 1.86, 3.21
Tasks: 120 total,   4 running, 116 sleeping,   0 stopped,   0 zombie
Cpu(s): 54.6%us,  0.7%sy,  0.0%ni, 41.8%id,  2.9%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4025424k total,  3221352k used,   804072k free,    35600k buffers
Swap:  2104496k total,    93316k used,  2011180k free,  2182880k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
18462 www-data  20   0  519m  29m 4936 R  100  0.8   7:31.27 apache2
18461 www-data  20   0  519m  29m 5036 R   99  0.8   7:52.79 apache2
18839 www-data  20   0  525m  50m  21m R   18  1.3   0:00.55 apache2
22847 mysql     20   0 1583m 326m 6392 S    3  8.3  17:08.77 mysqld
17988 www-data  20   0  512m  28m  10m S    1  0.7   0:03.37 apache2
18792 www-data  20   0  510m  28m  12m S    1  0.7   0:00.77 apache2
14461 centovac  20   0 36120 3520  772 S    0  0.1   0:07.87 sc_serv
18686 www-data  20   0  510m  28m  12m S    0  0.7   0:01.41 apache2

Quote · 29 Dec 2010

Last night I set up a cronjob that restarts apache2 every hour on the hour. But that is certainly NOT a solution!

 

Today, 40 minutes after the last apache2 restart, this is what top looks like:

 

top - 09:40:27 up 61 days, 12:36,  1 user,  load average: 9.95, 6.11, 4.44
Tasks: 129 total,  13 running, 116 sleeping,   0 stopped,   0 zombie
Cpu(s): 99.9%us,  0.1%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4025424k total,  3924488k used,   100936k free,    78020k buffers
Swap:  2104496k total,    93324k used,  2011172k free,  2705960k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
14251 www-data  20   0  516m  34m  12m R   37  0.9   1:14.82 apache2
13396 www-data  20   0  516m  27m 6332 R   37  0.7  21:36.87 apache2
14162 www-data  20   0  516m  28m 6760 R   33  0.7   2:50.63 apache2
14322 www-data  20   0  519m  48m  25m R   33  1.2   0:15.83 apache2
14286 www-data  20   0  516m  27m 6120 R   33  0.7   2:27.38 apache2
14390 www-data  20   0  516m  26m 5084 R   32  0.7   0:08.09 apache2
14111 www-data  20   0  516m  27m 6012 R   31  0.7   9:03.99 apache2
14325 www-data  20   0  523m  52m  23m R   31  1.3   0:40.93 apache2
14333 www-data  20   0  516m  27m 6016 R   31  0.7   0:52.68 apache2
14240 www-data  20   0  521m  50m  23m R   31  1.3   1:59.02 apache2
14321 www-data  20   0  521m  57m  30m R   30  1.5   1:02.18 apache2
14391 www-data  20   0  516m  27m 6140 R   29  0.7   0:01.42 apache2
14467 centovac  20   0 26328 3712 1632 S    5  0.1  22:55.85 ices
14253 www-data  20   0  511m  27m  12m S    4  0.7   0:01.59 apache2
14288 www-data  20   0  509m  26m  11m S    1  0.7   0:01.11 apache2

 

Do you see the 12 apache2 processes that are all well over 25% CPU?

This must get solved TODAY.

Quote · 30 Dec 2010

Okay... I have now written a script that kills such processes automatically, so apache no longer needs to be restarted every hour.

It runs from a cronjob every minute and calls top in batch mode five times at 2-second intervals, each time looking at the process at the head of the list (by default, the biggest CPU consumer).

If that process is using more than 50% CPU and its name is "apache2", it gets killed:

 

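#!/bin/bash
# Kill the busiest process if it is an apache2 worker using more than 50% CPU.
# Meant to run from cron every minute: five samples, 2 seconds apart.
LOGFILE="/root/kill.high.processes.log"
for i in {1..5}; do
    # First process line below the "PID USER" header of top's batch output
    TOP=$( /usr/bin/top -b -n 1 | /bin/grep -A 1 "PID USER" | /bin/grep -v "PID USER" | /usr/bin/tr -s " " )
    # Unquoted echo drops leading blanks, so cut's field numbers line up with top's columns
    PID=$( echo ${TOP} | /usr/bin/cut -d' ' -f1 )
    CPU=$( echo ${TOP} | /usr/bin/cut -d' ' -f9 )
    TIME=$( echo ${TOP} | /usr/bin/cut -d' ' -f11 )
    NAME=$( echo ${TOP} | /usr/bin/cut -d' ' -f12 )
    # Drop any fractional part (e.g. 99.7) so the integer comparison below works
    CPU=${CPU%%[.,]*}
    DATE=$( /bin/date +%Y-%m-%d )
    HOUR=$( /bin/date +%H:%M:%S )
    if [ "${CPU:-0}" -gt 50 ] && [ "$NAME" == "apache2" ]; then
        echo "[${DATE} ${HOUR}] Killing process $PID (${NAME}: ${CPU}%, ${TIME})" >> "$LOGFILE"
        /bin/kill -9 "$PID"
    fi
    sleep 2
done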
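
The script above only inspects the single busiest process per sample. A minimal sketch of a variant that looks at every apache2 process in one pass is shown here. It assumes a procps-style `ps` that supports `-C` and `-o`; the function names `find_hot_pids` and `list_apache_cpu` are made up for illustration. Note that this `ps` reports CPU time divided by the process lifetime, not a live sample, so it reacts more slowly than top.

```shell
#!/bin/bash
# Hypothetical helper: read "PID %CPU" pairs on stdin (one process per line)
# and print the PIDs whose CPU value exceeds the threshold (default 50).
find_hot_pids() {
    local threshold="${1:-50}"
    awk -v limit="$threshold" '$2 + 0 > limit { print $1 }'
}

# Hypothetical wrapper: list PID and %CPU of all apache2 processes.
# Assumes procps ps; pcpu here is a lifetime average, not an instant reading.
list_apache_cpu() {
    ps -C apache2 -o pid=,pcpu=
}
```

Running `list_apache_cpu | find_hot_pids 50` prints the candidate PIDs, which can be reviewed before anything is passed to kill.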

 

Maybe this helps other people too...

But again, this is NOT a solution. We need to fix the root cause of these processes ASAP!

Criz

Quote · 30 Dec 2010

There are many posts about Dolphin running the processor at full throttle. Hopefully the processor issues get fully solved in 7.1. I had to stop using Dolphin because of high CPU utilization and other bugs. In my opinion D7 is really not yet tuned for live production sites. I hope 7.1 brings Dolphin to a stable phase and finally lets us use Dolphin 7 on production sites...

Quote · 30 Dec 2010

The XML.php file is requested more often than others, which is why you see it in the process list more often.
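
One way to check how much of the traffic goes to XML.php is to count requests per script in the access log. This is only a rough sketch, assuming Apache's common or combined log format, where the request path is the seventh whitespace-separated field. The function name `count_requests_by_path` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read access log lines (common/combined format) on stdin
# and print "count path" pairs, most requested first. Query strings are stripped.
count_requests_by_path() {
    awk '{ path = $7; sub(/\?.*/, "", path); count[path]++ }
         END { for (p in count) print count[p], p }' | sort -rn
}
```

For example, `count_requests_by_path < /var/log/apache2/access.log | head` (the log path is an assumption and depends on your setup). A high count explains why a file shows up often in server-status, but does not by itself prove that the file is the one burning CPU.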

I suggest reproducing the situation to find out exactly which file is causing the problem. Please run the ab (Apache Benchmark) program to test it, for example:

ab -c 1 -n 500 -C memberID=123 -C memberPassword=135e2fa1b771ed59fd2c02fb04b556758df0c456 "http://www.somedomain.com/flash/XML.php?module=im&action=updateInvite&recipient=123&_t=1293753323764"

Change the memberID and memberPassword cookies, the recipient GET parameter, and the domain to your own.
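
To compare several candidate files, it can help to pull one number out of each ab report. A minimal sketch, assuming ab's usual report line of the form `Time per request: ... [ms] (mean)`; the function name `mean_request_ms` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read an ab report on stdin and print the mean time
# per request in milliseconds (taken from the first "Time per request" line).
mean_request_ms() {
    awk '/^Time per request:/ { print $4; exit }'
}
```

Run the ab command once per candidate URL with its output piped into `mean_request_ms`, then compare the numbers. A file that is far slower than the rest is a good place to start looking.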

Rules → http://www.boonex.com/terms
Quote · 31 Dec 2010
 
 