Friday, April 9, 2010

Linux Server Load


Server load: just a number? Regular monitoring of the server has always been one of a system administrator's top-priority tasks. Almost all of us use commands such as uptime, top, w, and procinfo, each of which prints a line showing the load average. In one line: the load average is the sum of the run queue length and the number of jobs currently running on the CPUs.

The Linux load average is a set of three real numbers, separated by commas, defined as the number of processes waiting in the run queue (competing for CPU time) plus the number currently executing, averaged over the last 1, 5, and 15 minutes respectively. On Linux, processes in uninterruptible sleep (typically waiting on disk I/O) are counted as well.
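As a small illustration, the same three numbers that uptime prints can be read from a script. This is only a minimal sketch using Python's standard os.getloadavg(); the helper name read_load_averages is made up for this example.

```python
import os


def read_load_averages():
    """Return the 1-, 5- and 15-minute load averages as a tuple of floats.

    read_load_averages is a hypothetical helper name; os.getloadavg() is the
    standard library call and raises OSError if the values are unavailable.
    """
    return os.getloadavg()


if __name__ == "__main__":
    one, five, fifteen = read_load_averages()
    print(f"load average: {one:.2f}, {five:.2f}, {fifteen:.2f}")
```

The order matches what uptime shows: 1 minute first, then 5, then 15.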

These numbers tell you how busy, or how CPU-bound, the Linux system might be. As long as CPU utilization does not routinely exceed 70%, the CPU can still handle CPU-bound processes, even on a busy Linux system!

A few points about the Linux load average:

The load average is intended to give some idea of how much work the system has had to handle in the recent past (1 minute), the past (5 minutes), and the distant past (15 minutes).

The load average is not about utilization but about total queue length.

The figures are point samples of three different time series.

The figures are exponentially damped moving averages. Sudden changes are smoothed out this way, so a short spike does not contribute significantly to the longer-term picture.

The figures are printed in the wrong order to be read as a trend: the most recent value comes first.
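To make the damping concrete, here is a rough floating-point sketch of an exponentially damped moving average. It assumes a sample every 5 seconds and windows of 60, 300 and 900 seconds. The kernel does its own calculation in fixed-point arithmetic, so treat this as an approximation of the idea, not as the kernel's code; the function names are made up for this example.

```python
import math

SAMPLE_INTERVAL = 5.0           # assumed seconds between samples
WINDOWS = (60.0, 300.0, 900.0)  # 1, 5 and 15 minutes, in seconds


def update_load(load, active, window, interval=SAMPLE_INTERVAL):
    """Blend the previous average with the current count of active tasks."""
    decay = math.exp(-interval / window)
    return load * decay + active * (1.0 - decay)


def simulate(active_counts, window):
    """Feed a sequence of per-sample active task counts, starting from idle."""
    load = 0.0
    for active in active_counts:
        load = update_load(load, active, window)
    return load


if __name__ == "__main__":
    # One task runnable for 60 seconds (12 samples) on an idle system.
    samples = [1] * 12
    for window in WINDOWS:
        print(int(window), round(simulate(samples, window), 2))
```

With these assumptions, one minute of a single busy task moves the 1-minute figure to about 0.63, the 5-minute figure to about 0.18, and the 15-minute figure to only about 0.06, which is why a short spike barely shows up in the longer averages.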

Knowing the value of the server load is not very important in itself, though. Knowing how to interpret the value is what counts. Load averages differ from CPU percentage in two significant ways: 1. load averages measure the trend in CPU demand, not only an instantaneous snapshot as a percentage does, and 2. load averages include all demand for the CPU, not only how much was active at the time of measurement.

Now the question arises: what is an ideal or optimum load for my server? Server load is a number in the format x.xx. And of course a load of 0.xx is always safe :-), because the server load represents the number of processes waiting to access the CPU.

So what counts as high? In one line: for ideal utilization of your CPUs, the load should not exceed the number of CPUs in the box. If the server has a single CPU (central processing unit), a server load higher than 1.00 is not good; if the server has two CPUs, a server load over 2.00 is not good, and so on.
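The rule of thumb above can be sketched as a quick check that divides the load by the number of CPUs. The function names and the "ok" / "overloaded" labels are made up for this example, and the threshold of 1.0 per CPU is simply the guideline from this post, not a hard limit.

```python
import os


def load_per_cpu(load, cpu_count):
    """Normalize a load average by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load / cpu_count


def describe(load, cpu_count):
    """Apply the rule of thumb: load above the CPU count is not good."""
    return "ok" if load_per_cpu(load, cpu_count) <= 1.0 else "overloaded"


if __name__ == "__main__":
    cpus = os.cpu_count() or 1        # fall back to 1 if the count is unknown
    one_minute = os.getloadavg()[0]
    print(f"{one_minute:.2f} on {cpus} CPU(s): {describe(one_minute, cpus)}")
```

For example, a load of 1.50 counts as overloaded on one CPU but is fine on two.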

Now, what should I do when I run into a very odd load scenario?

1/ Run the top command.

2/ Press Shift + P to sort the processes by CPU usage.

3/ Check whether the topmost processes have gone haywire somewhere.

4/ Right after starting top, press '1' to see per-CPU figures and get a more realistic picture CPU-wise.

5/ If needed, kill the offending PIDs to rescue the server.
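If you would rather do the first three steps from a script, here is a rough sketch that asks ps for PID, CPU percentage and command name, and prints the heaviest processes. The function names are made up for this example. Note one assumption that matters: the %CPU reported by ps is averaged over each process's lifetime, so it will not always agree with the live figures in top.

```python
import subprocess


def parse_ps(output, limit=5):
    """Parse 'ps -eo pid,pcpu,comm' output and return the top CPU consumers."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, pcpu, comm = parts
        rows.append((int(pid), float(pcpu), comm.strip()))
    rows.sort(key=lambda row: row[1], reverse=True)  # highest %CPU first
    return rows[:limit]


def top_cpu_processes(limit=5):
    """Run ps and return (pid, %cpu, command) tuples for the busiest processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout, limit)


if __name__ == "__main__":
    for pid, pcpu, comm in top_cpu_processes():
        print(f"{pid:>7} {pcpu:>6.1f} {comm}")
```

The script only lists processes. Whether to kill one is still a call you should make by hand.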

Nevertheless, to identify the exact cause, a review of your application design, along with the logs, is a must as a post-incident analysis.


Hope it helps.


Further reading: Unix/Linux Load average.


~Debu
