The Principle of Maximum Entropy

The Principle of Maximum Entropy states: when one has only partial information about the possible outcomes, one should choose the probabilities so as to maximize the uncertainty about the missing information, as shown by Jaynes [8]. In other words, the basic rule is: use all the information you have about the parameter, but avoid including any information that you do not have. One should therefore be as uncommitted as possible about missing information.

Entropy is also a measure of randomness. By applying the principle of maximum entropy, one obtains the most random distribution that satisfies the given constraints. Put differently, if the information about a distribution is incomplete, the best estimate is the one that is as unbiased as possible, which is the most random distribution compatible with what is known. Choosing any other distribution would mean including additional information that was not given, and so would violate the principle.

Jaynes proposed that Shannon's measure of uncertainty (entropy) could be used to assign values to the probabilities. The principle of maximum entropy implies that if there are n possible outcomes then, in the absence of additional information, the outcomes should be presumed to have equal probabilities, $p_i = 1/n$, so that no outcome is preferred over any other. The only relation available in that case is that the probabilities sum to unity:
\[ \sum_{i=1}^{n} p_i = 1 \tag{2.3} \]
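The equal-probability case can be checked numerically. The following is a minimal sketch, assuming the entropy has the form $-\sum_i p_i \ln p_i$ with the constant prefactor set to 1; the helper `shannon_entropy` and the sample distributions are hypothetical and chosen only for illustration.

```python
import math

def shannon_entropy(p):
    """Shannon entropy -sum(p_i ln p_i); the constant prefactor is taken as 1."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

n = 4
uniform = [1.0 / n] * n

# Hypothetical non-uniform distributions on the same n outcomes.
candidates = [
    [0.7, 0.1, 0.1, 0.1],
    [0.4, 0.3, 0.2, 0.1],
    [0.25, 0.25, 0.3, 0.2],
]

s_uniform = shannon_entropy(uniform)
entropies = [shannon_entropy(p) for p in candidates]

print("uniform:", s_uniform, " ln n:", math.log(n))
for p, s in zip(candidates, entropies):
    print(p, s)
```

Each non-uniform candidate has a smaller entropy than the uniform distribution, whose entropy equals ln n.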
We may also have some additional information that can be expressed as
\[ \sum_{i=1}^{n} p_i \, g_r(x_i) = \langle g_r \rangle, \qquad r = 1, \dots, m \tag{2.4} \]
In the constraint equations, each $g_r$ is a known function evaluated at the n outcome values $x_1, \dots, x_n$, and $\langle g_r \rangle$ is its prescribed mean value. Together with (2.3) this gives m+1 relations between the n probabilities $p_1, \dots, p_n$. If m+1 < n, the probabilities cannot be determined uniquely: n-m-1 of them can be given arbitrary values, and the remaining m+1 then follow from equations (2.3) and (2.4).

We thus have an infinite number of solutions for the probabilities and consequently an infinity of probability distributions. According to Jaynes, one should select the distribution that has maximum entropy: he suggested choosing $p_1, \dots, p_n$ so as to maximize the uncertainty measure $S$ subject to equations (2.3) and (2.4).
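The under-determination and the selection rule can be illustrated with a small numerical sketch. It assumes n = 3 outcomes and a single constraint (m = 1) on the mean, so that one probability is free; the outcome values `x`, the target mean and the helper names are hypothetical, and the entropy prefactor is taken as 1.

```python
import math

def shannon_entropy(p):
    """Shannon entropy -sum(p_i ln p_i); the constant prefactor is taken as 1."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

# Hypothetical data: n = 3 outcomes and m = 1 constraint (a prescribed mean).
x = [1.0, 2.0, 3.0]
target_mean = 2.3

def feasible(p3):
    """Choose p3 freely, then solve normalisation and the mean constraint for p1, p2."""
    p2 = (target_mean - x[2] * p3 - x[0] * (1.0 - p3)) / (x[1] - x[0])
    p1 = 1.0 - p3 - p2
    return [p1, p2, p3]

# For these numbers all three probabilities are positive when 0.3 < p3 < 0.65.
family = [feasible(0.3 + 0.35 * k / 1000) for k in range(1, 1000)]

# Every member satisfies both constraints, yet their entropies differ;
# the principle selects the member with the largest entropy.
best = max(family, key=shannon_entropy)
print("max-entropy member:", [round(q, 4) for q in best])
print("entropy:", round(shannon_entropy(best), 4))
```

A grid scan is used here only because the family has a single free parameter; it is not meant as a general solution method.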

To sum up, we should choose the distribution that

* says least about the information we don't have, or
* has maximum uncertainty, or
* is most random, or
* is most unbiased.


\[ S(p_1, \dots, p_n) \rightarrow \max \tag{2.5} \]

The least prejudiced (or least biased) probability distribution is the set of $p_i$ which obeys the m+1 equations (2.3) and (2.4) above and maximizes S of equation (2.1). The resulting distribution is
\[ p_i = \exp\Bigl(-\lambda_0 - \sum_{r=1}^{m} \lambda_r \, g_r(x_i)\Bigr), \qquad i = 1, \dots, n \tag{2.6} \]
where the $\lambda_r$'s are Lagrangian multipliers. Since an exponential function is never negative, $p_i \ge 0$ holds automatically for each i, so there is no need to state the non-negativity constraint separately.
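The step from the constrained maximization to the exponential form can be sketched with Lagrange multipliers. The sketch assumes $S = -\sum_i p_i \ln p_i$ (a constant prefactor only rescales the multipliers) and writes the constraints as $\sum_i p_i\, g_r(x_i) = \langle g_r \rangle$; the symbols $g_r$, $x_i$ and $\langle g_r \rangle$ are notation assumed here. Writing the multiplier of the normalization constraint as $\lambda_0 - 1$ is a conventional choice that keeps the result tidy.

```latex
\begin{align*}
\mathcal{L} &= -\sum_{i=1}^{n} p_i \ln p_i
  - (\lambda_0 - 1)\Bigl(\sum_{i=1}^{n} p_i - 1\Bigr)
  - \sum_{r=1}^{m} \lambda_r \Bigl(\sum_{i=1}^{n} p_i\, g_r(x_i) - \langle g_r \rangle\Bigr) \\
\frac{\partial \mathcal{L}}{\partial p_i} &= -\ln p_i - 1 - (\lambda_0 - 1)
  - \sum_{r=1}^{m} \lambda_r\, g_r(x_i) = 0 \\
\Rightarrow\quad p_i &= \exp\Bigl(-\lambda_0 - \sum_{r=1}^{m} \lambda_r\, g_r(x_i)\Bigr)
\end{align*}
```

Because S is concave in the $p_i$ and the constraints are linear, this stationary point is the maximum.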

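The probabilities sum to unity (eq. 2.3). Summing equation (2.6) over all i we get
\[ e^{-\lambda_0} \sum_{i=1}^{n} \exp\Bigl(-\sum_{r=1}^{m} \lambda_r \, g_r(x_i)\Bigr) = 1 \tag{2.7} \]
and solving for $\lambda_0$
\[ \lambda_0 = \ln \sum_{i=1}^{n} \exp\Bigl(-\sum_{r=1}^{m} \lambda_r \, g_r(x_i)\Bigr) \tag{2.8} \]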
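As a numerical illustration of how the multipliers might be found in practice, the sketch below treats a hypothetical case with six outcomes $x_i = 1, \dots, 6$ and a single constraint on the mean (m = 1, $g_1(x) = x$). It assumes the sign convention $p_i = \exp(-\lambda_0 - \lambda_1 x_i)$; the function names and the bisection bracket are choices made for this example only. $\lambda_0$ follows from the normalization, and $\lambda_1$ is found by bisection on the mean constraint.

```python
import math

def maxent_distribution(x, lam1):
    """Return (lam0, p) for p_i = exp(-lam0 - lam1 * x_i), normalised to unity."""
    weights = [math.exp(-lam1 * xi) for xi in x]
    lam0 = math.log(sum(weights))                  # normalisation fixes lam0
    p = [math.exp(-lam0 - lam1 * xi) for xi in x]
    return lam0, p

def mean_of(x, p):
    return sum(xi * pi for xi, pi in zip(x, p))

def solve_lam1(x, target, lo=-5.0, hi=5.0, tol=1e-12):
    """Bisection on lam1; the mean decreases monotonically as lam1 increases."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        _, p = maxent_distribution(x, mid)
        if mean_of(x, p) > target:
            lo = mid    # mean too large: lam1 must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: six outcomes with a prescribed mean of 4.5.
x = [1, 2, 3, 4, 5, 6]
target_mean = 4.5

lam1 = solve_lam1(x, target_mean)
lam0, p = maxent_distribution(x, lam1)

print("lam0 =", lam0, " lam1 =", lam1)
print("p =", [round(pi, 4) for pi in p])
print("sum =", sum(p), " mean =", mean_of(x, p))
```

With several constraints the same idea applies, but the multipliers $\lambda_1, \dots, \lambda_m$ would have to be found with a multidimensional root finder rather than by bisection.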



Heike Preußer
26 March 1997