Question

What is mode? Explain the three important methods of estimating the mode of a series.

Accepted Answer

In statistics, the mode is the value that occurs most frequently in a data set or a probability distribution.

Like the mean and median, the mode is a way of capturing important information about a random variable or a population in a single quantity. The mode is in general different from the mean and median, and may be very different for strongly skewed distributions.

Mode of a sample

The mode of a sample is the element that occurs most often in the collection. For example, the mode of the sample [1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17] is 6. In the data set [1, 1, 2, 4, 4] the mode is not unique: both 1 and 4 occur twice, so the data set is said to be bimodal. A set with more than two modes is described as multimodal.
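The counting described above can be sketched in a few lines of Python. This is only an illustration: the helper name `sample_modes` is made up for this answer, and it returns every value that ties for the highest count so that bimodal and multimodal samples are visible.

```python
from collections import Counter


def sample_modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    if not counts:
        return []  # an empty sample has no mode
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


print(sample_modes([1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17]))  # [6]
print(sample_modes([1, 1, 2, 4, 4]))                        # [1, 4]
```

A result with one element means a unique mode; two elements mean the sample is bimodal.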

For a sample from a continuous distribution, such as [0.935..., 1.211..., 2.430..., 3.668..., 3.874...], the concept is unusable in its raw form, since each value will occur precisely once. The usual practice is to discretize the data by assigning frequency values to intervals of equal width, as when making a histogram, effectively replacing the values by the midpoints of the intervals they are assigned to. The mode is then the value where the histogram reaches its peak. For small or medium-sized samples the outcome of this procedure is sensitive to the choice of interval width: intervals that are too narrow or too wide both distort the result. Typically one should have a sizable fraction of the data concentrated in a relatively small number of intervals (5 to 10), while the fraction of the data falling outside these intervals is also sizable. An alternative approach is kernel density estimation, which essentially blurs the point samples to produce a continuous estimate of the probability density function, whose maximum provides an estimate of the mode.
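Both approaches can be sketched with the Python standard library alone. The function names, the sample values, the number of bins and the bandwidth below are all assumptions chosen for illustration; in practice the bin width and bandwidth need to be tuned to the data, as noted above.

```python
import math


def histogram_mode(data, bins):
    """Midpoint of the fullest of `bins` equal-width intervals spanning the data."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    if width == 0:
        return lo  # all values are identical
    counts = [0] * bins
    for x in data:
        # The maximum value would fall just past the last bin, so clamp it.
        i = min(int((x - lo) / width), bins - 1)
        counts[i] += 1
    peak = counts.index(max(counts))  # the first bin wins if counts tie
    return lo + (peak + 0.5) * width


def kde_mode(data, bandwidth, grid_points=1001):
    """Grid point at which a Gaussian kernel density estimate is highest."""
    lo, hi = min(data), max(data)
    step = (hi - lo) / (grid_points - 1)
    best_x, best_density = lo, -1.0
    for k in range(grid_points):
        x = lo + k * step
        # Unnormalised density: constant factors do not move the peak.
        density = sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
        if density > best_density:
            best_x, best_density = x, density
    return best_x


# Hypothetical sample with a cluster between 2.0 and 2.3.
sample = [1.0, 2.0, 2.1, 2.2, 2.3, 5.0]
print(histogram_mode(sample, bins=4))   # 2.5, the midpoint of the interval [2, 3)
print(kde_mode(sample, bandwidth=0.3))  # a value inside the cluster
```

The histogram estimate can only ever be an interval midpoint, while the kernel estimate can land anywhere on the grid. That is one reason the two methods may disagree on small samples.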

Example for a skewed distribution

An example of a skewed distribution is personal wealth: Few people are very rich, but among those some are extremely rich. However, many are rather poor.

Figure: Comparison of mean, median and mode of two log-normal distributions with different skewness.

A well-known class of distributions that can be arbitrarily skewed is given by the log-normal distribution. It is obtained by transforming a random variable $X$ having a normal distribution into the random variable $Y = e^{X}$. Then the logarithm of the random variable $Y$ is normally distributed, hence the name.

Taking the mean $\mu$ of $X$ to be 0, the median of $Y$ will be 1, independent of the standard deviation $\sigma$ of $X$. This is so because $X$ has a symmetric distribution, so its median is also 0. The transformation from $X$ to $Y$ is monotonic, and so we find the median $e^0 = 1$ for $Y$.
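The monotonicity argument can be checked numerically. The sketch below is a simulation, not a proof: the function name `median_check`, the sample size and the seed are arbitrary choices. It draws normal values with mean 0, exponentiates them, and compares the two sample medians.

```python
import math
import random
import statistics


def median_check(sigma, n=100_001, seed=0):
    """Return the sample medians of X ~ N(0, sigma^2) and of Y = exp(X)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    xs = [rng.gauss(0.0, sigma) for _ in range(n)]
    ys = [math.exp(x) for x in xs]
    # n is odd, so each median is a single middle element, and because exp
    # preserves order the middle y is exp of the middle x.
    return statistics.median(xs), statistics.median(ys)


for sigma in (0.25, 1.0):
    median_x, median_y = median_check(sigma)
    print(sigma, median_x, median_y)
```

For both values of $\sigma$ the median of the simulated $Y$ values should come out close to 1, and it should match $e^{\text{median of } X}$ up to rounding.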

When $X$ has standard deviation $\sigma = 0.25$, the distribution of $Y$ is weakly skewed. Using formulas for the log-normal distribution, we find:

$$\text{mean} = e^{\mu + \sigma^2 / 2} = e^{0 + 0.25^2 / 2} \approx 1.032$$

$$\text{mode} = e^{\mu - \sigma^2} = e^{0 - 0.25^2} \approx 0.939$$

$$\text{median} = e^{\mu} = e^0 = 1$$

Indeed, the median is about one third of the way from the mean to the mode, as Pearson's rule of thumb predicts for mildly skewed distributions.

When $X$ has a larger standard deviation, $\sigma = 1$, the distribution of $Y$ is strongly skewed. Now

$$\text{mean} = e^{\mu + \sigma^2 / 2} = e^{0 + 1^2 / 2} \approx 1.649$$

$$\text{mode} = e^{\mu - \sigma^2} = e^{0 - 1^2} \approx 0.368$$

$$\text{median} = e^{\mu} = e^0 = 1$$

Here, Pearson's rule of thumb fails: the median lies about halfway between the mean and the mode, not one third of the way.
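The two cases can be reproduced with a short Python sketch that applies the log-normal formulas quoted above. The function name `lognormal_summary` and the extra "fraction" value are additions made for this illustration; the fraction measures how far along the way from mean to mode the median sits.

```python
import math


def lognormal_summary(mu, sigma):
    """Mean, mode and median of exp(X) for X ~ N(mu, sigma^2), with sigma > 0."""
    mean = math.exp(mu + sigma**2 / 2)
    mode = math.exp(mu - sigma**2)
    median = math.exp(mu)
    # Position of the median on the way from the mean (0) to the mode (1).
    fraction = (mean - median) / (mean - mode)
    return mean, mode, median, fraction


for sigma in (0.25, 1.0):
    mean, mode, median, fraction = lognormal_summary(0.0, sigma)
    print(f"sigma={sigma}: mean={mean:.3f} mode={mode:.3f} "
          f"median={median:.3f} fraction={fraction:.2f}")
```

For $\sigma = 0.25$ the fraction is roughly 0.34, close to one third. For $\sigma = 1$ it is roughly 0.51, which shows where the rule of thumb breaks down.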

Question #9657
