- Duration: 3:02
- Published: 06 Feb 2007
- Uploaded: 30 Jun 2011
- Author: EScottDygert
Many different descriptive statistics can be chosen as a measure of the central tendency of the data items. These include the arithmetic mean, the median, and the mode. Other statistical measures, such as the standard deviation and the range, are called measures of spread and describe how spread out the data are.
An average is a single value that is meant to typify a list of values. If all the numbers in the list are the same, then this number should be used. If the numbers are not the same, the average is calculated by combining the values from the set in a specific way and computing a single number as being the average of the set.
The most common method is the arithmetic mean, but there are many other measures of central tendency, such as the median, which is used most often when the distribution of the values is skewed by a small number of very high values, as seen with house prices or incomes.
:<math>\text{AM} = \frac{1}{n}\sum_{i=1}^{n} a_i = \frac{a_1 + a_2 + \cdots + a_n}{n}</math>
The arithmetic mean, often simply called the mean, of two numbers, such as 2 and 8, is obtained by finding a value A such that 2 + 8 = A + A. One may find that A = (2 + 8)/2 = 5. Switching the order of 2 and 8 to read 8 and 2 does not change the resulting value obtained for A. The mean 5 is not less than the minimum 2 nor greater than the maximum 8. If we increase the number of terms in the list for which we want an average, we get, for example, that the arithmetic mean of 2, 8, and 11 is found by solving for the value of A in the equation 2 + 8 + 11 = A + A + A. One finds that A = (2 + 8 + 11)/3 = 7.
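The worked values above can be checked with a few lines of Python. This is only a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and does not come from the text.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    values = list(values)
    if not values:
        raise ValueError("the arithmetic mean of an empty list is undefined")
    return sum(values) / len(values)


print(arithmetic_mean([2, 8]))      # 5.0
print(arithmetic_mean([8, 2]))      # 5.0 (order does not matter)
print(arithmetic_mean([2, 8, 11]))  # 7.0
```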
The geometric mean of n numbers is obtained by multiplying them all together and then taking the nth root. In algebraic terms, the geometric mean of a1, a2, ..., an is defined as
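:<math>\text{GM} = \sqrt[n]{a_1 a_2 \cdots a_n}.</math>
The geometric mean can be thought of as the antilog of the arithmetic mean of the logs of the numbers.
Example: The geometric mean of 2 and 8 is
:<math>\text{GM} = \sqrt{2 \cdot 8} = 4.</math>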
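Both descriptions of the geometric mean (the nth root of the product, and the antilog of the mean of the logs) can be compared numerically. The sketch below assumes all inputs are strictly positive; the function names are illustrative only.

```python
import math


def geometric_mean_root(values):
    """nth root of the product of the values."""
    values = list(values)
    return math.prod(values) ** (1 / len(values))


def geometric_mean_logs(values):
    """Antilog (exp) of the arithmetic mean of the natural logs."""
    values = list(values)
    if any(v <= 0 for v in values):
        raise ValueError("this sketch assumes strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Both routes should agree up to floating-point rounding.
print(geometric_mean_root([2, 8]))
print(geometric_mean_logs([2, 8]))
```

The log form is often preferred for long lists because the direct product can overflow or underflow.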
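The harmonic mean of n positive numbers a1, a2, ..., an is n divided by the sum of their reciprocals:
:<math>\text{HM} = \frac{n}{\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n}}.</math>
One example where it is useful is calculating an average speed. For example, if the speed for going from point A to B was 60 km/h, and the speed for returning from B to A over the same route was 40 km/h, then the average speed is given by
:<math>\frac{2}{\frac{1}{60} + \frac{1}{40}} = 48 \text{ km/h}.</math>
For any set of positive numbers, the arithmetic, geometric, and harmonic means satisfy
:<math>\text{AM} \ge \text{GM} \ge \text{HM}.</math>
The inequality is easy to remember by noting that the alphabetical order of the letters A, G, and H is preserved in it. See Inequality of arithmetic and geometric means.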
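The average-speed example and the ordering of the three means can be checked numerically. The sketch below takes the harmonic mean to be n divided by the sum of the reciprocals, and assumes the outbound and return legs cover the same distance; the function name is illustrative.

```python
import math


def harmonic_mean(values):
    """n divided by the sum of reciprocals (positive values assumed)."""
    values = list(values)
    return len(values) / sum(1 / v for v in values)


speeds = [60, 40]                  # km/h out and back over the same distance
am = sum(speeds) / len(speeds)     # arithmetic mean
gm = math.sqrt(speeds[0] * speeds[1])  # geometric mean of two numbers
hm = harmonic_mean(speeds)         # average speed over the round trip

print(am, gm, hm)
print(am >= gm >= hm)  # the A, G, H ordering
```

The harmonic mean comes out at about 48 km/h, lower than the arithmetic mean of 50 km/h, because more time is spent travelling at the slower speed.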
The mode is the most frequently occurring value in a list. It has the advantage that it can be used with non-numerical data (e.g., red cars are most frequent), while other averages cannot.
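As an illustration of the mode working on non-numerical data, the following sketch counts occurrences with `collections.Counter` and returns every item that ties for the highest count. The helper name `modes` and the sample colours are made up for this example.

```python
from collections import Counter


def modes(items):
    """Return all items that occur most often (there may be ties)."""
    counts = Counter(items)
    if not counts:
        return []
    top = max(counts.values())
    return [item for item, count in counts.items() if count == top]


cars = ["red", "blue", "red", "green", "red", "blue"]
print(modes(cars))  # ['red']
```

Returning a list rather than a single value is a design choice here, since a data set can have more than one most frequent item.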
The median is the middle value of a list when its values are ranked in order. (If there is an even number of values, the mean of the middle two is taken.)
Thus to find the median, order the list according to its elements' magnitude and then repeatedly remove the pair consisting of the highest and lowest values until either one or two values are left. If exactly one value is left, it is the median; if two values, the median is the arithmetic mean of these two. This method takes the list 1, 7, 3, 13 and orders it to read 1, 3, 7, 13. Then the 1 and 13 are removed to obtain the list 3, 7. Since there are two elements in this remaining list, the median is their arithmetic mean, (3 + 7)/2 = 5.
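The pair-removal procedure described above can be written out directly. This is a sketch of that specific procedure rather than the most efficient way to compute a median; the function name is illustrative.

```python
def median_by_trimming(values):
    """Sort, then repeatedly drop the lowest and highest values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("the median of an empty list is undefined")
    # Remove one (lowest, highest) pair at a time until 1 or 2 values remain.
    while len(ordered) > 2:
        ordered = ordered[1:-1]
    # One value left: it is the median. Two left: take their arithmetic mean.
    return sum(ordered) / len(ordered)


print(median_by_trimming([1, 7, 3, 13]))  # 5.0
```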
This method can be generalized to examples in which the periods are not all of one-year duration. The average percentage of a set of returns is a variation on the geometric average that provides the intensive property of a return per year corresponding to a list of percentage returns. For example, consider a period of half a year for which the return is −23% and a period of two and a half years for which the return is +13%. The average percentage return for the combined period is the single-year return, R, that is the solution of the equation <math>(1 - 0.23)^{0.5} \times (1 + 0.13)^{2.5} = (1 + R)^{0.5 + 2.5}</math>, giving an average percentage return R of 0.0600 or 6.00%.
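The quoted figure of about 6.00% can be reproduced under one assumption that should be stated plainly: each return is treated as a yearly rate that applies for the stated number of years, so the growth factors are (1 + rate) raised to the period length, and R solves (1 + R) raised to the total length equals their product. The function name and argument layout below are illustrative.

```python
def average_annual_return(periods):
    """periods: list of (yearly_rate, years) pairs; returns the rate R."""
    total_years = sum(years for _, years in periods)
    growth = 1.0
    for rate, years in periods:
        growth *= (1 + rate) ** years  # compound each period's growth
    # Solve (1 + R) ** total_years == growth for R.
    return growth ** (1 / total_years) - 1


r = average_annual_return([(-0.23, 0.5), (0.13, 2.5)])
print(round(r, 4))  # 0.06
```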
Thus standard deviation about the mean is lower than standard deviation about any other point, and the maximum deviation about the midrange is lower than the maximum deviation about any other point. The uniqueness of this characterization of mean follows from convex optimization. Indeed, for a given (fixed) data set x, the function
:<math>f_2(c) = \|x - c\|_2</math>
represents the dispersion about a constant value c relative to the L2 norm. Because the function ƒ2 is a strictly convex coercive function, the minimizer exists and is unique.
Note that the median in this sense is not in general unique, and in fact any point between the two central points of a discrete distribution minimizes average absolute deviation. The dispersion in the L1 norm, given by <math>f_1(c) = \|x - c\|_1</math>, is not strictly convex, whereas strict convexity is needed to ensure uniqueness of the minimizer. In spite of this, the minimizer is unique for the L∞ norm.
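A brute-force search over a grid of candidate centres gives a rough numerical illustration of these statements. The data set, the grid spacing of 0.01, and the helper `dispersion` are all choices made for this sketch, not part of the text.

```python
def dispersion(data, c, norm):
    """Dispersion of data about the constant c in the L1, L2 or Linf norm."""
    deviations = [abs(x - c) for x in data]
    if norm == "L1":
        return sum(deviations)
    if norm == "L2":
        return sum(d * d for d in deviations) ** 0.5
    if norm == "Linf":
        return max(deviations)
    raise ValueError("norm must be 'L1', 'L2' or 'Linf'")


data = [1, 3, 7, 13]
candidates = [i / 100 for i in range(0, 1401)]  # 0.00, 0.01, ..., 14.00

best_l2 = min(candidates, key=lambda c: dispersion(data, c, "L2"))
best_linf = min(candidates, key=lambda c: dispersion(data, c, "Linf"))

print(best_l2)    # 6.0, the arithmetic mean
print(best_linf)  # 7.0, the midrange (1 + 13) / 2
# The L1 dispersion is flat between the two central points 3 and 7:
print([dispersion(data, c, "L1") for c in (3, 5, 7)])  # [16, 16, 16]
```

The flat stretch of the L1 dispersion is what makes its minimizer non-unique, whereas the L2 and L∞ searches each settle on a single point.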
One can create one's own average metric using generalized f-mean:
:<math>y = f^{-1}\left(\frac{f(x_1) + f(x_2) + \cdots + f(x_n)}{n}\right)</math>
where f is any invertible function. The harmonic mean is an example of this using f(x) = 1/x, and the geometric mean is another, using f(x) = log x. Another example, expmean (exponential mean), is a mean using the function f(x) = e^x, and it is inherently biased towards the higher values.
However, this method for generating means is not general enough to capture all averages. A more general method for defining an average, y, takes any function of a list g(x1, x2, ..., xn), which is symmetric under permutation of the members of the list, and equates it to the same function with the value of the average replacing each member of the list: g(x1, x2, ..., xn) = g(y, y, ..., y). This most general definition still captures the important property of all averages that the average of a list of identical elements is that element itself. The function g(x1, x2, ..., xn) = x1 + x2 + ... + xn provides the arithmetic mean. The function g(x1, x2, ..., xn) = x1 · x2 · ... · xn provides the geometric mean. The function g(x1, x2, ..., xn) = x1^(−1) + x2^(−1) + ... + xn^(−1) provides the harmonic mean. (See John Bibby (1974) “Axiomatisations of the average and a further generalisation of monotonic sequences,” Glasgow Mathematical Journal, vol. 15, pp. 63–65.)
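The generalized f-mean can be sketched as a single function that takes f and its inverse as arguments. The caller is assumed to supply a correct inverse; the name `f_mean` and the sample data are illustrative.

```python
import math


def f_mean(values, f, f_inverse):
    """Apply f, take the arithmetic mean, then map back with f_inverse."""
    values = list(values)
    return f_inverse(sum(f(v) for v in values) / len(values))


data = [2, 8]
arithmetic = f_mean(data, lambda x: x, lambda y: y)        # f(x) = x
harmonic = f_mean(data, lambda x: 1 / x, lambda y: 1 / y)  # f(x) = 1/x
geometric = f_mean(data, math.log, math.exp)               # f(x) = log x
exp_mean = f_mean(data, math.exp, math.log)                # f(x) = e^x

print(harmonic, geometric, arithmetic, exp_mean)
```

For this data the exponential mean lands well above the arithmetic mean, which illustrates the bias towards higher values mentioned above.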
:Update rule for a window of size upon seeing new element :
Marine damage is either particular average, which is borne only by the owner of the damaged property, or general average, where the owner can claim a proportional contribution from all the parties to the marine venture. The type of calculations used in adjusting general average gave rise to the use of "average" to mean "arithmetic mean".
However, according to the Oxford English Dictionary, the earliest usage in English (1489 or earlier) appears to be an old legal term for a tenant's day labour obligation to a sheriff, probably anglicised from "avera" found in the English Domesday Book (1085). This pre-existing term thus lay to hand when an equivalent for avarie was wanted.
This text is licensed under the Creative Commons CC-BY-SA License. This text was originally published on Wikipedia and was developed by the Wikipedia community.