Is your server running slowly? Here is how to troubleshoot it and find out what is causing the problem.
Let's start by looking at the server itself. Log in to your server and run the top command:
# top
The next cause for high load is a system that has run out of available RAM and has started to go into swap. Because swap space is usually on a hard drive that is much slower than RAM, when you use up available RAM and go into swap, each process slows down dramatically as the disk gets used. What’s tricky about swap issues is that because they hit the disk so hard, it’s easy to misdiagnose them as I/O-bound load. After all, if your disk is being used as RAM, any processes that actually want to access files on the disk are going to have to wait in line. So, if I see high I/O wait in the CPU row in top, I check RAM next and rule it out before I troubleshoot any other I/O issues.
To diagnose out of memory issues, the first place I look is the next couple of lines in the top output:
Mem: 1020076k total, 998000k used, 27000k free, 85520k buffers
Swap: 1004052k total, 4360k used, 999692k free, 280000k cached
These lines tell you the total amount of RAM and swap along with how much is used and free; however, look carefully, as these numbers can be misleading. Linux uses otherwise idle RAM as file cache and gives it back when processes need it, so to get an accurate amount of free RAM, you need to combine the value from the free column with the value from the cached column. In this example, it would be 27000k + 280000k = 307000k, or roughly 300 MB of free RAM. In this case, the system is not experiencing an out-of-RAM issue. Of course, even a system that has very little free RAM may not have gone into swap. That’s why you also must check the Swap: line and see if a high proportion of your swap is being used.
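The arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming the two-line Mem:/Swap: layout shown in this article (newer versions of top print these lines differently); the function names are made up for the example.

```python
import re

def parse_kb_fields(line):
    """Return {label: kilobytes} for every '<number>k <label>' pair on a line."""
    return {label: int(value) for value, label in re.findall(r"(\d+)k (\w+)", line)}

def memory_summary(mem_line, swap_line):
    """Return (effectively free RAM in kB, fraction of swap in use)."""
    mem = parse_kb_fields(mem_line)
    swap = parse_kb_fields(swap_line)
    # In this layout, "cached" is printed at the end of the Swap: line.
    available_kb = mem["free"] + swap["cached"]
    swap_used_fraction = swap["used"] / swap["total"] if swap["total"] else 0.0
    return available_kb, swap_used_fraction

# The two lines from the top output quoted in this article.
mem_line = "Mem: 1020076k total, 998000k used, 27000k free, 85520k buffers"
swap_line = "Swap: 1004052k total, 4360k used, 999692k free, 280000k cached"

available_kb, swap_fraction = memory_summary(mem_line, swap_line)
print(f"{available_kb}k effectively free, {swap_fraction:.1%} of swap in use")
```

With the numbers from this example, less than one percent of swap is in use, which matches the conclusion that the system is not short on RAM.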
If you do find you are low on free RAM, go back to the same process output from top, only this time, look in the %MEM column. By default, top will sort by the %CPU column, so simply type M and it will re-sort to show you which processes are using the highest percentage of RAM.
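If you would rather capture this in a script than read it interactively, the same ranking can be sketched in Python. The sample below is hypothetical data in the shape printed by `ps -eo pid,pmem,comm`; the process names and numbers are invented for illustration.

```python
def top_memory_processes(ps_output, limit=3):
    """Parse 'PID %MEM COMMAND' rows and return the biggest memory users first."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        pid, pmem, command = line.split(None, 2)
        rows.append((float(pmem), int(pid), command))
    rows.sort(reverse=True)  # highest %MEM first
    return [(pid, pmem, command) for pmem, pid, command in rows[:limit]]

# Hypothetical sample output, not taken from a real server.
sample = """
  PID %MEM COMMAND
    1  0.1 init
  812 41.5 mysqld
  990 12.3 apache2
 1044  3.0 sshd
"""

for pid, pmem, command in top_memory_processes(sample):
    print(f"{pid:>6} {pmem:5.1f}% {command}")
```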
I/O-bound load can be tricky to track down, because a system that is swapping can make the load appear to be I/O-bound. Once you rule out swapping and you still have a high I/O wait, the next step is to track down which disk and partition is getting the bulk of the I/O traffic. To do this, you need a tool like iostat.
The iostat tool provides a good overall view of your disk I/O statistics.
As with top, iostat gives you the CPU percentage output. Below that, it provides a breakdown of each drive and partition on your system, with the following statistics for each:
- tps: transactions per second.
- Blk_read/s: blocks read per second.
- Blk_wrtn/s: blocks written per second.
- Blk_read: total blocks read.
- Blk_wrtn: total blocks written.
By looking at these different values and comparing them to each other, ideally you will be able to find out first, which partition (or partitions) is getting the bulk of the I/O traffic, and second, whether the majority of that traffic is reads (Blk_read/s) or writes (Blk_wrtn/s). As I said, tracking down the cause of I/O issues can be tricky, but hopefully, those values will help you isolate what processes might be causing the load.
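That comparison can be sketched in Python as follows. The rows are hypothetical and only assume the column order listed above (device, tps, Blk_read/s, Blk_wrtn/s, Blk_read, Blk_wrtn); the device names and figures are invented.

```python
def busiest_device(device_rows):
    """Return (device, 'reads' or 'writes') for the row with the most
    blocks per second, or None if there are no rows."""
    best = None
    for line in device_rows.strip().splitlines():
        name, tps, read_s, wrtn_s, blk_read, blk_wrtn = line.split()
        read_s, wrtn_s = float(read_s), float(wrtn_s)
        total = read_s + wrtn_s
        if best is None or total > best[0]:
            best = (total, name, "reads" if read_s >= wrtn_s else "writes")
    return None if best is None else (best[1], best[2])

# Hypothetical device rows, without the header line.
sample = """
sda  9.10  120.50  310.25  1205000  3102500
sdb  1.20   45.00    5.00   450000    50000
"""

print(busiest_device(sample))
```

Here sda carries far more traffic than sdb, and most of it is writes, so that is where you would look first.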
For instance, if you have an I/O-bound load and you suspect that your remote backup job might be the culprit, compare the read and write statistics. Because you know that a remote backup job is primarily going to read from your disk, if you see that the majority of the disk I/O is writes, you reasonably can assume it’s not from the backup job. If, on the other hand, you do see a heavy amount of read I/O on a particular partition, you might run the lsof command and grep for that backup process and see whether it does in fact have some open file handles on that partition.
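The lsof-and-grep check can also be sketched in Python. This is a rough illustration only: the lsof output is hypothetical, rsync stands in for whatever your backup process is called, and the parsing assumes the command is the first column and the file name is the last column with no spaces in it.

```python
def open_files_on(lsof_output, command, mount_point):
    """Return the paths a given command has open under a mount point."""
    prefix = mount_point.rstrip("/") + "/"
    matches = []
    for line in lsof_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue
        # Assumption: first column is COMMAND, last column is NAME.
        if fields[0] == command and fields[-1].startswith(prefix):
            matches.append(fields[-1])
    return matches

# Hypothetical lsof output, not taken from a real server.
sample = """
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
rsync   4321 root    3r   REG    8,2   104857 131073 /home/site/db.sql
rsync   4321 root  cwd    DIR    8,1     4096      2 /
mysqld   812 mysql   4u   REG    8,2    65536 131080 /home/site/ibdata1
"""

print(open_files_on(sample, "rsync", "/home"))
```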
- In case of a DDoS attack, install the Apache module mod_evasive. It is worth having installed by default, as it can help prevent DDoS attacks.
- Optimize your application: profile your application’s code and try to optimize the areas that place the most load on the server.
- PHP could be a culprit. Caching would be a good option here, by adding memcached or eAccelerator.
- Consider changing to a lower-impact web server, such as nginx. This could increase the number of connections that the server can handle.
- Add page caching, using something like Varnish.
- Add another server, and load balance between them
- Increase the capabilities of the existing server (e.g. increase RAM, upgrade processor, install a faster hard disk).
All of these options have a cost, in time or money. Still, something has to be done, and the quickest wins in my opinion would probably be options 1, 2 and 3. Options 4 and 5 mean downtime while they’re being installed.