There are three primary measures of central tendency:
- Mean
- Median
- Mode
Mean
Simple Mean
The mean is the most commonly used measure of central tendency. The mean of $n$ data points $x_1, x_2, \dots, x_n$ is defined as:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
In some datasets, the mean may not be representative of any individual datum. For example, the mean of a set such as $\{-10, -9, 9, 10\}$ is $0$, despite zero not being close to any individual value in the set.
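As a minimal sketch of the definition, the mean is the sum of the values divided by their count. The function name `simple_mean` and the sample values are illustrative assumptions, chosen so that the values are symmetric about zero.

```python
def simple_mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


# Illustrative data: symmetric about zero, so the mean is zero
# even though no individual value is near zero.
data = [-10, -9, 9, 10]
print(simple_mean(data))  # 0.0
```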
Weighted Mean
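If different values are of different importance, it may be useful to calculate a mean in which each value has its own weighting. For a set of $n$ data points $x_1, x_2, \dots, x_n$, with weights $w_1, w_2, \dots, w_n$, the weighted mean is defined as:
$$\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$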
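A short sketch of the weighted mean follows: each value is multiplied by its weight, the products are summed, and the result is divided by the total weight. The function name `weighted_mean` and the scores and weights are hypothetical, used only for illustration.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of (weight * value) divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Illustrative data: the last score counts twice as much as the others.
scores = [70, 80, 90]
weights = [1, 1, 2]
print(weighted_mean(scores, weights))  # 82.5
```

When all weights are equal, the result reduces to the simple mean.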
Median
The median is the ‘middle’ item in the dataset. The median can be calculated from a sorted set of $n$ data points:
- If $n$ is odd then the median is the middle item.
- If $n$ is even then the median is the mean of the two middle items.
The median is more robust against extreme values than the mean and is often more useful for asymmetric (skewed) data.
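The odd and even cases can be sketched as follows. This is a minimal illustration, not a definitive implementation; the function name `median_of` and the sample values are assumptions.

```python
def median_of(values):
    """Median: the middle item of the sorted data, or the mean of the two middle items."""
    if not values:
        raise ValueError("the median of an empty dataset is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle item exists.
        return ordered[mid]
    # Even count: average the two middle items.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([7, 1, 5]))     # 5
print(median_of([7, 1, 5, 3]))  # 4.0
```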
Example
A common example where the median is preferred is salaries. The mean salary of a group containing 9 people earning £20k and 1 person earning £2M is £218,000, which is not representative of what anyone in the group actually earns. The median salary is £20k, which matches what 9 of the 10 people earn.
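The salary scenario can be checked with Python's standard `statistics` module. The variable name `salaries` is an illustrative choice.

```python
from statistics import mean, median

# 9 people earning £20,000 and 1 person earning £2,000,000.
salaries = [20_000] * 9 + [2_000_000]

print(f"Mean:   £{mean(salaries):,.0f}")    # Mean:   £218,000
print(f"Median: £{median(salaries):,.0f}")  # Median: £20,000
```

The single extreme salary pulls the mean far above what most of the group earns, while the median is unaffected by it.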
Mode
The mode is the value which occurs most frequently in a dataset. Some datasets may have no mode (all values are unique), while some may be bimodal or multimodal. The mode is generally most useful when dealing with probability distributions.
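The three cases (a single mode, several modes, and no mode) can be sketched by counting occurrences. The function name `modes` is hypothetical, and treating a dataset of several all-unique values as having no mode is a convention assumed here to match the description above.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency.

    Assumed convention: if there are several values and all are unique,
    the dataset has no mode and an empty list is returned.
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        return []  # all values are unique: no mode
    return [value for value, count in counts.items() if count == top]


print(modes([1, 2, 2, 3]))     # [2]     (one mode)
print(modes([1, 1, 2, 2, 3]))  # [1, 2]  (bimodal)
print(modes([1, 2, 3]))        # []      (no mode)
```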