(dv) 3.5:Memory optimization and troubleshooting
- This page was last modified on September 2, 2011, at 09:04.
From (mt) Community Wiki
Symptoms
- Server crashes.
- Slow server response times.
- Out of memory errors.
- High memory usage.
- kmemsize and privvmpages alerts.
- New files and processes won't open.
Cause
If you are experiencing the above symptoms, you may be overusing your server's memory resources. Please note that some of these symptoms may be due to unrelated causes, such as a slow network connection to the server or inefficient MySQL queries. Pinpointing the issue to memory will require additional diagnosis.
Resource overuse is not a "server problem" that can be fixed by (mt) Media Temple. Your (dv) Dedicated-Virtual Server comes with a fixed amount of dedicated RAM and kernel memory, just like it comes with a fixed amount of disk space. For example, if you try to run processes that require more RAM than your plan provides, your system may overload and crash or perform poorly.
Diagnosis
- Processes running on your server require more memory than is available with the plan you have purchased.
To see if this is an accurate diagnosis of your server issue, and for help identifying specific causes of your resource over-use, please continue with the steps below.
Following this article is a great first step in identifying the cause(s) of your server resource overuse. However, a general-purpose article simply cannot replace the dedicated attention of your system administrator. (dv) Dedicated-Virtual Server customers are expected to have access to their own server administration resources. Server resource diagnosis and troubleshooting is not a service that (mt) Media Temple provides for (dv) Dedicated-Virtual Server customers.
QoS Alerts
The first thing you should check when you notice performance problems with your server is your QoS (Quality of Service) Alerts.
- Sign into Plesk as admin or root.
- Click on Virtuozzo in the column on the left.
- Click on QoS Alerts.
- Check for kmemsize and privvmpages errors. If you see other types of errors, or simply want to know more about the different types of system resource limits, please see our QoS Alerts, beancounters, and system resource limits article. Most of the steps below are equally applicable to other types of errors as well, such as tcpsndbuf or othersockbuf errors.
If you see these kinds of alerts, you are running into a memory issue.
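If you prefer the command line, the same counters that feed these alerts can be read over SSH from /proc/user_beancounters. The sketch below is one way to list only the limits that have actually been hit. It assumes the usual layout of that file (a version line, a header line, and failcnt as the last column), and the function name failed_counters is purely illustrative. The file is normally readable only by root.

```shell
#!/bin/sh
# Print every beancounter whose failcnt (last column) is greater than zero.
# The first two lines of the file are a version line and a column header.
# The first row of each container carries an extra "uid:" field, so columns
# are counted from the right: failcnt is $NF, the resource name is $(NF-5).
failed_counters() {
    awk 'NR > 2 && NF >= 6 && $NF + 0 > 0 { print $(NF-5), $NF }' \
        "${1:-/proc/user_beancounters}"
}

# Run against the live file only when it exists and is readable.
if [ -r /proc/user_beancounters ]; then
    failed_counters
fi
```

No output means that no limit has been exceeded since the container was last started.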
Quick fix - reboot
Please do NOT reboot if you are trying to do a deeper diagnosis of the problem. A reboot kills the currently running processes, and those processes may be the best clue to your server resource problem. Skip to the next section for more troubleshooting and diagnostic tips.
Rebooting your server is a quick and temporary fix that may bring some or all of your services online. If your sites are down or you have runaway processes that need to be killed, a reboot can bring immediate relief to your server. Please see (dv) How can I reboot my server? for details. You may also be able to simply restart a single service to bring your server up again more gracefully.
Quick reboot guide:
- Sign into your AccountCenter.
- Click on your primary domain.
- Click Reboot Server.
kmemsize and privvmpages
kmemsize is the kernel memory of the server. This limit is closely tied to your CPU use, and is smaller than your RAM. You can reach your kmemsize limit by:
- Running too many processes on your server at the same time.
- Running just a few CPU-intensive processes.
See the Parallels documentation for more technical details.
privvmpages is the RAM of the server. You can reach your privvmpages limit by:
- Running processes that require intensive server memory.
See the Parallels documentation for more technical details.
This article at maxgarrick.com has a good visual representation of how the two types of memory work.
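When you compare these two limits with your plan, keep in mind that they are reported in different units: kmemsize is counted in bytes, while privvmpages is counted in memory pages. The helper below is a minimal sketch of the conversion. It assumes a page size of 4096 bytes (typical on x86), and the function names and sample values are hypothetical.

```shell
#!/bin/sh
# Convert beancounter values to whole megabytes.
# kmemsize is already in bytes; privvmpages is in pages (assumed 4096 bytes).
bytes_to_mb() { echo $(( $1 / 1024 / 1024 )); }

# Dividing by "pages per megabyte" avoids overflow on very large limits.
pages_to_mb() { echo $(( $1 / (1024 * 1024 / ${2:-4096}) )); }

# Example values (hypothetical, not taken from a real plan):
pages_to_mb 65536      # 65536 pages of 4096 bytes each is 256 MB
bytes_to_mb 14372700   # roughly 13 MB of kernel memory
```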
Tracking down resource-hogging processes
Now you know that you have one or more processes running on your server that are using too much of your memory. The best way to see these processes is to use SSH.
- Log into your server as a root or sudo user via SSH.
- Run:
top
- You should see dynamic output like this:
top - 15:43:35 up 443 days, 3:53, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 34 total, 1 running, 33 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 1807436k total, 205592k used, 1601844k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached

  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
    1 root     16  0  1980  644  556 S    0  0.0 0:33.59 init
24158 root     15  0  1644  564  472 S    0  0.0 1:53.25 syslogd
24181 root     16  0  6912 1048  672 S    0  0.1 1:59.54 sshd
24196 root     16  0  2640  896  724 S    0  0.0 0:00.64 xinetd
24216 root     15  0  5768  820  560 S    0  0.0 0:00.08 couriertcpd
24218 root     16  0  4608 1064  836 S    0  0.1 0:00.02 courierlogger
24366 root     17  0  2368 1132  976 S    0  0.1 0:00.00 mysqld_safe
24426 mysql    16  0  119m  23m 5348 S    0  1.3 8:51.08 mysqld
24476 root     16  0 32396  29m 2340 S    0  1.7 3:08.59 spamd
24478 popuser  16  0 32396  27m  968 S    0  1.6 0:06.28 spamd
24501 root     16  0 43648 6804 4192 S    0  0.4 0:00.77 httpsd
24556 root     18  0  5396  688  424 S    0  0.0 0:00.00 saslauthd
24557 root     18  0  5396  432  168 S    0  0.0 0:00.00 saslauthd
21912 qmails   16  0  1636  488  396 S    0  0.0 0:00.03 qmail-send
21914 qmaill   16  0  1580  464  396 S    0  0.0 0:00.00 splogger
21916 root     17  0  1616  372  280 S    0  0.0 0:00.00 qmail-lspawn
21917 qmailr   16  0  1604  376  288 S    0  0.0 0:00.00 qmail-rspawn
21918 qmailq   16  0  1572  344  280 S    0  0.0 0:00.00 qmail-clean
10049 psaadm   16  0 47232  20m  14m S    0  1.2 0:02.95 httpsd
11518 psaadm   16  0 48164  21m  14m S    0  1.2 0:02.21 httpsd
25766 root     16  0 35824  14m 8388 S    0  0.8 0:00.15 httpd
Important columns:
- PID - Shows process ID number.
- USER - Shows owner of the process, useful for identifying hacks.
- S - Watch out for zombie processes, marked with a Z - these processes have not been properly closed by the program that started them.
- %CPU - Shows server CPU percentage used.
- %MEM - Shows server RAM percentage used.
- TIME+ - Shows the total CPU time the process has used since it started (not how long it has been running).
- COMMAND - Shows the name of the command or daemon that owns the process. Useful for identifying the general system service that is being a resource hog.
Tasks are sorted by CPU use percentage by default. Type SHIFT-M to sort by memory, and SHIFT-P to switch back to sort by CPU percentage. Memory is helpful if you are getting privvmpages errors, and CPU is helpful if you are getting kmemsize errors.
When you are done with top, type CTRL-C to exit.
- Identify which system service(s) are using a high percentage of CPU-time or memory. For example, httpd is Apache, and mysqld is MySQL.
This may be enough to identify the exact cause of your problem. For example, if MySQL is the culprit, you can now check your running MySQL queries to see if any of them are extremely inefficient.
Handy MySQL command to view live queries:
watch "mysqladmin -u admin -p`cat /etc/psa/.psa.shadow` processlist"
See the top manual for more top commands.
If you received generic results from this step, such as the result that Apache is the resource hog, you can use a few more commands to drill deeper into the processes. Continue with the following steps as appropriate.
- If you suspect you may be hacked, type c to view the command that started each current process. If you notice something suspicious, such as a process with USER apache that was not initiated by COMMAND /usr/sbin/httpd, investigate the script listed in the COMMAND section.
- Note the PID number for a process that is using a high percentage of your resources, for example, 24158 for the syslogd process shown above. Exit top with CTRL-C. Now execute this command, replacing 24158 with your own PID:
lsof -p 24158
You should see output like this:
COMMAND   PID USER   FD   TYPE     DEVICE    SIZE      NODE NAME
syslogd 24158 root  cwd    DIR      0,201    4096   8857888 /
syslogd 24158 root  rtd    DIR      0,201    4096   8857888 /
syslogd 24158 root  txt    REG      0,201   35832   8858116 /sbin/syslogd
syslogd 24158 root  mem    REG      0,201   46680  16525660 /lib/libnss_files-2.5.so
syslogd 24158 root  mem    REG      0,201 1594552  16525628 /lib/libc-2.5.so
syslogd 24158 root  mem    REG      0,201  124432  16525614 /lib/ld-2.5.so
syslogd 24158 root   0u   unix 0x26484b80         358292464 /dev/log
syslogd 24158 root   2w    REG      0,201   39610  14951126 /var/log/messages
syslogd 24158 root   3w    REG      0,201 3009832  14951128 /var/log/secure
syslogd 24158 root   4w    REG      0,201   44906  49676428 /usr/local/psa/var/log/maillog
syslogd 24158 root   5w    REG      0,201   53710  14951614 /var/log/cron
syslogd 24158 root   6w    REG      0,201       0  14951132 /var/log/spooler
syslogd 24158 root   7w    REG      0,201       0  14951612 /var/log/boot.log
This shows all of the files currently opened by this process. This kind of output is particularly helpful if you are trying to track down web scripts associated with resource-heavy Apache processes.
Run lsof -p as quickly as possible. Many processes last for only a few seconds, and if you don't catch one while it's running, you'll only see the log files as output.
- Alternately, you can make a note of the PID as described in the step above, then execute the following commands to view more information about the process:
- Exit top with CTRL-C.
- List the contents of the process folder, replacing 24158 with your own PID:
ls -la /proc/24158
- You should see output similar to the following:
total 0
dr-xr-xr-x    3 root root 0 Aug 19 14:10 .
dr-xr-xr-x 2485 root root 0 Jun  2  2009 ..
-r--------    1 root root 0 Aug 19 16:36 auxv
-r--r--r--    1 root root 0 Aug 19 16:00 cmdline
lrwxrwxrwx    1 root root 0 Aug 19 16:19 cwd -> /
-r--------    1 root root 0 Aug 19 16:36 environ
lrwxrwxrwx    1 root root 0 Aug 19 16:19 exe -> /sbin/syslogd
dr-x------    2 root root 0 Aug 19 16:19 fd
-r--r--r--    1 root root 0 Aug 19 16:19 maps
-rw-------    1 root root 0 Aug 19 16:36 mem
-r--r--r--    1 root root 0 Aug 19 16:36 mounts
-r--------    1 root root 0 Aug 19 16:36 mountstats
lrwxrwxrwx    1 root root 0 Aug 19 16:19 root -> /
-r--------    1 root root 0 Aug 19 16:36 smaps
-r--r--r--    1 root root 0 Aug 19 16:00 stat
-r--r--r--    1 root root 0 Aug 19 16:39 statm
-r--r--r--    1 root root 0 Aug 19 16:36 status
dr-xr-xr-x    3 root root 0 Aug 19 16:36 task
-r--r--r--    1 root root 0 Aug 19 16:36 wchan
Again, doing this as quickly as possible will give you the best results. Of particular use is the cwd, which shows the directory that the current process is working in.
- If these steps do not yield useful information, you will need to work with your system administrator to identify the cause of your issue in more detail.
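Because short-lived processes can disappear before you finish typing a PID, it can help to script the lookup. The following is a minimal sketch, not a supported tool: it finds the process with the largest resident memory using ps and immediately prints the exe, cwd, and command line recorded under /proc. The function names are illustrative, and you generally need to be root to read the details of processes owned by other users.

```shell
#!/bin/sh
# Print what /proc knows about one PID before the process can exit.
snapshot_pid() {
    pid=$1
    [ -d "/proc/$pid" ] || { echo "PID $pid is no longer running" >&2; return 1; }
    echo "pid:     $pid"
    echo "exe:     $(readlink "/proc/$pid/exe" 2>/dev/null)"
    echo "cwd:     $(readlink "/proc/$pid/cwd" 2>/dev/null)"
    # cmdline separates arguments with NUL bytes; turn them into spaces.
    echo "cmdline: $(tr '\0' ' ' 2>/dev/null < "/proc/$pid/cmdline")"
}

# PID of the process with the largest resident memory (RSS) right now.
top_mem_pid() {
    ps -eo pid=,rss= | sort -k2,2nr | head -n 1 | awk '{ print $1 }'
}

snapshot_pid "$(top_mem_pid)"
```

To inspect a specific process instead, pass its PID directly, for example snapshot_pid 24158.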
Alternate Plesk method
- Sign into Plesk as admin or root.
- Click on Virtuozzo in the column on the left.
- Click on System Processes.
- You will see output similar to that mentioned in the top description above. Refer to Step 3 in the previous section.
- Click on Enable Auto Refresh to make the output closer to live. Note that this is not as effective as running top in SSH, because Plesk has to generate graphics.
- Analyze the output, following instructions from Steps 4 onward from the previous section. Note that you will have to use SSH for many of these steps.
- If these steps do not yield useful information, you will need to work with your system administrator to identify the cause of your issue in more detail.
Alternate SSH method
This method is not supported by (mt) Media Temple.
- Connect to the server via SSH as root.
- Enter the following to change into the root directory.
cd /root
- Run the following command:
wget http://memtop.googlecode.com/files/memtop-0.9.py && python memtop-0.9.py
- This command will print out the top 9 memory-using processes on the server. Here is an example output:
PID | private/writ. mem |command
| current | previous |(truncated)
9486 | 186.8 MB | +++++ |/usr/sbin/named -u named -c /etc/named.conf -u named -t /var/named/run-root
19921 | 129.4 MB | +++++ |/usr/libexec/mysqld -basedir=/usr -datadir=/var/lib/mysql -user=mysql -log-error=/var/log/mysqld.log -pid-file=/var/run/mysqld/mysqld.pid -socket=/var/lib/mys
19857 | 43.5 MB | ++++ |spamd child
19701 | 43.5 MB | ++++ |/usr/bin/spamd -username=popuser -daemonize -nouser-config -helper-home-dir=/var/qmail -max-children 1 -create-prefs -virtual-config-dir=/var/qmail/mailnames
15379 | 38.4 MB | ++++ |/usr/sbin/httpd
15370 | 11.8 MB | +++ |/usr/sbin/httpd
24276 | 6.0 MB | +++ |/usr/sbin/sw-cp-serverd -f /etc/sw-cp-server/config
15374 | 5.5 MB | +++ |/usr/sbin/httpd
15757 | 2.3 MB | ++ |python memtop-0.9.py
RAM usage: ======================================= 50.7 %
- Press CTRL-C on the keyboard to end the program. If you want to run it again, simply type:
python /root/memtop-0.9.py
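If you would rather not download a script, a rough approximation can be put together from ps and awk alone. The sketch below totals resident memory (RSS) per command name and prints the nine largest totals. It is not equivalent to memtop: RSS includes pages shared between processes, so the totals for services with many children, such as httpd, will be overstated. The function name is illustrative.

```shell
#!/bin/sh
# Sum resident memory (RSS, reported by ps in kilobytes) per command name
# and print the nine largest totals in megabytes.
# Reads "rss command" pairs on stdin so it can also be fed saved data.
sum_rss_by_command() {
    awk '{ kb[$2] += $1 }
         END { for (c in kb) printf "%.1f MB\t%s\n", kb[c] / 1024, c }' |
        sort -rn | head -n 9
}

ps -eo rss=,comm= | sum_rss_by_command
```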
Common culprits
High traffic
Your server may be optimized perfectly, but it's getting hammered with high traffic. This is especially likely if you don't notice any one process using a lot of your resources, but you see dozens of processes open at once. Check your website statistics to see if you're getting unusually high traffic, then consider upgrading your server - see the Upgrade sub-section under Solutions below.
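A quick way to check for the "dozens of processes" pattern is to count processes per command name. This is a small sketch built only on ps, sort, and uniq; the function name and the default of ten lines are arbitrary choices.

```shell
#!/bin/sh
# Count running processes per command name, busiest first.
# Reads one command name per line on stdin; $1 limits the number of lines.
count_by_command() {
    sort | uniq -c | sort -k1,1nr | head -n "${1:-10}"
}

ps -eo comm= | count_by_command
```

A very large count next to httpd, with no single process standing out in top, points toward traffic rather than one misbehaving script.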
Poorly-written software
If you notice any of the following:
- Zombie processes when you run top - check for the letter Z in the S column.
- Sudden performance drop after installing or upgrading a new software package.
- MySQL queries that run for longer than 2 seconds.
- A cluttered list of themes and plugins for your content management system, such as WordPress.
You may be suffering from poorly-written software, or a bad combination of software components. Contact your software developer for further assistance with custom-written software, or check your software's help forums for help with third-party components. Note that poor software may still be the cause of your issue even if you have none of the above symptoms.
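To check for zombies without watching top, you can filter the output of ps. The sketch below lists each zombie together with its parent PID, since the parent is the program that failed to clean up and is usually the one to investigate. The function name is illustrative.

```shell
#!/bin/sh
# List zombie processes together with the parent that has not reaped them.
# Reads "pid ppid stat command" lines on stdin; a state starting with Z
# marks a zombie.
list_zombies() {
    awk '$3 ~ /^Z/ { print "zombie pid " $1 " (" $4 "), parent pid " $2 }'
}

ps -eo pid=,ppid=,stat=,comm= | list_zombies
```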
Hacks
Your server may be compromised. The high memory use could be due to sending out lots of spam, forwarding large amounts of traffic, or running rogue processes. See this collection of articles for detailed information on investigating and resolving hacked server scenarios:
Runaway processes
If this is an out-of-the-blue occurrence for your server, you may simply need to kill off a runaway process. Every once in a while, even well-written code can spawn a process that gets stuck for some reason. See the Quick fix - reboot section above for instructions on rebooting your server or restarting a process.
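If you have already identified the stuck process by PID, you can also stop just that process over SSH instead of rebooting. The function below is a hedged sketch of a common approach: it asks the process to exit with SIGTERM, waits a few seconds, and only then falls back to SIGKILL. The name stop_pid and the 10-second default are assumptions for illustration.

```shell
#!/bin/sh
# Ask a process to exit with SIGTERM, wait up to N seconds, then use SIGKILL.
# Usage: stop_pid PID [SECONDS]
stop_pid() {
    pid=$1
    wait_secs=${2:-10}
    kill -TERM "$pid" 2>/dev/null || { echo "no such process: $pid" >&2; return 1; }
    while [ "$wait_secs" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # process is gone
        sleep 1
        wait_secs=$((wait_secs - 1))
    done
    echo "PID $pid ignored SIGTERM, sending SIGKILL" >&2
    kill -KILL "$pid" 2>/dev/null
}

# Example (replace 24158 with your own PID):
# stop_pid 24158 15
```

Killing a single worker of a managed service such as Apache or MySQL may simply cause it to be respawned, so restarting the service itself is often the cleaner option.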
Solutions
Eliminate resource hogs
If you have identified a specific script or piece of software, or even a single MySQL query, that is causing your memory over-use, you should remove or optimize the offending code.
You may need to work with a professional software developer or system administrator to do this effectively.
Optimize
Here is our general article on optimizing your server:
Tune Apache and MySQL
Tune resource allocation on your server:
- (dv) 3.5 Auto-Tuning Your Server - Use this article if you've ever upgraded or downgraded your (dv) Dedicated-Virtual Server.
- (dv) HOWTO: Basic Apache performance tuning (httpd) - Tune Apache.
- Using MySQLTuner on your Dedicated-Virtual Server - Tune MySQL.
- Using MySQL Report on a Dedicated-Virtual Server - View a detailed MySQL report.
Eliminate unneeded services
If you aren't using a particular service, shut it down.
- Enable or disable named (bind) on your (dv) 3.5 or (dpv) Nitro - Turn off the unneeded nameserver service, if you are not running private nameservers.
- Do the same for other unneeded services. For example, if you use Google Apps for email, you can turn off Qmail, SpamAssassin, and the IMAP/POP services.
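On the command line, shutting a service down usually means two things: stopping it now and preventing it from starting at the next boot. The sketch below assumes a SysV-init system that provides the service and chkconfig tools (as on CentOS). Service names differ between installations, so named is used here only as an example, and the function is hypothetical. It prints the commands by default and runs them only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Stop a service now and keep it from starting at boot.
# Assumes the SysV-init "service" and "chkconfig" tools are available.
# With DRY_RUN=1 (the default here) the commands are only printed.
disable_service() {
    for cmd in "service $1 stop" "chkconfig $1 off"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: $cmd"
        else
            $cmd
        fi
    done
}

disable_service named
```

Check which services Plesk depends on before disabling anything, and prefer the linked articles where one exists for the service in question.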
Upgrade
If you're happy with how everything on your server runs - you just need more resources - you can upgrade your (dv) Dedicated-Virtual Server plan. There are a few different options, depending on your needs:
- Upgrade from a Base to a Rage, or from a Rage to an Extreme. If you need a little more of every resource, this is the way to go. This is the only way to upgrade your kernel memory and overcome a kmemsize deficiency.
- Purchase a RAM add-on. This is useful if the only resource limit you are hitting is privvmpages, and you don't need extra resources in any other area.
- Purchase a second (dv) Dedicated-Virtual Server. Many customers run MySQL from one server and Apache from another server. This is a good middle-of-the-road option if you need more than a (dv) Extreme, but less than a (dpv) Nitro.
- Upgrade to a (dpv) Nitro. This will give you your own machine which is four times as powerful as a (dv) Extreme. You must contact the (mt) Media Temple Sales department to purchase a (dpv) Nitro.
- If you are on one of the older (dv) Dedicated-Virtual Servers, like the 2.0 or 3.0, you can get additional resources by upgrading to the (dv) Dedicated-Virtual Server 3.5. To do this, you must purchase a new (dv) Dedicated-Virtual Server, then use the (dv):Plesk Migration Manager.
Please see our Upgrade (dv) Dedicated-Virtual Server article for details.
Monitor and maintain
Keep an eye on your server so you can quickly respond to small resource issues before they escalate.
- Install monitoring software on your server, such as Monit. This can alert you to resource overages as soon as they occur. Here are some other monitoring options:
- VPS Info - http://www.labradordata.ca/home/13
- Status2k - http://status2k.com/
- LoadAVG - http://www.silversoft.com/loadavg
- Memory Utilization Script - http://wiki.vpslink.com/index.php?title=Memory_Utilization_Script
- Note that Watchdog tends to be overzealous in reporting problems. You may want to use one of the other suggested programs.
- Become more familiar with SSH monitoring commands. Learn commands like top, free, cat /proc/user_beancounters, and ps auxww to view server resource usage in real time.
- top - See the section on top above.
- free - Shows memory and swap space breakdown. http://www.linfo.org/free.html
- cat /proc/user_beancounters - See QoS Alerts, beancounters, and system resource limits.
- ps auxww - View all processes on your system.
- Check your system logs. See System Paths and Checking error logs for their locations.
- Check out one (mt) Media Temple customer's monitoring solution: http://davidseah.com/blog/comments/monitoring-my-media-temple-dv-base-memory-usage/
- Track server uptime externally using Pingdom or Down for everyone or just me?.
- Maintain up-to-date backups of your server so you can revert your server to a working state in the case of a hack or irreversible configuration issue.
- Consult with other (mt) Media Temple customers on our forums.
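If you want a very small home-grown monitor before installing a full package, the commands above can be wrapped in a script and run from cron. The following is a minimal sketch under stated assumptions: it reads MemTotal and MemFree from /proc/meminfo, treats everything that is not free as used (so cache counts as used), and appends a line to a log when usage crosses a threshold. LIMIT, LOG, and the function name are hypothetical defaults.

```shell
#!/bin/sh
# Print the percentage of memory in use, computed from MemTotal and MemFree.
# Takes an optional file argument so it can be run against saved data.
mem_used_pct() {
    awk '/^MemTotal:/ { total = $2 }
         /^MemFree:/  { free = $2 }
         END { if (total > 0) printf "%d\n", (total - free) * 100 / total }' \
        "${1:-/proc/meminfo}"
}

# Append a timestamped reading to a log when usage crosses a threshold.
# LIMIT and LOG are hypothetical defaults; adjust them for your server.
LIMIT=${LIMIT:-90}
LOG=${LOG:-/tmp/mem-usage.log}
pct=$(mem_used_pct)
if [ "${pct:-0}" -ge "$LIMIT" ]; then
    echo "$(date '+%Y-%m-%d %H:%M:%S') memory at ${pct}%" >> "$LOG"
fi
```

Running it every few minutes from cron gives you a simple history to compare against the times your QoS alerts were raised.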





