Students in a first-year probability course learn the concept of the moment of a random variable. The moments are related to various aspects of a probability distribution. In this context, the formula for the mean or the first moment of a non-negative continuous random variable is often shown in terms of its c.d.f. (or the survival function). However, higher order moments are also important, for example, to study the variance or the skewness of a distribution. In this note, we consider the \(r\)th moment of a non-negative random variable and derive formulas in terms of the c.d.f. paralleling the existing results for the first moment (the mean) using Fubini’s theorem. These formulas may be advantageous, for example, when dealing with the moments of a transformed random variable, where it may be easier to derive its c.d.f. using the so-called c.d.f. method.

# The first moment

Moments of random variables play a key role in describing and understanding probability distributions. For a continuous non-negative random variable \(X\), it is well known that the mean or the first moment, when it exists, can be expressed as

\[E(X) = \int_0^\infty \left( 1- F_X (x) \right) dx,\]

where \(F_X (x)\) is the c.d.f. of \(X\). The function \(1-F_X(x)\), which is the probability that \(X\) exceeds \(x\), is commonly known as the **survival function** of \(X\). It has a long history in the analysis of life tables and has long been used in actuarial, biostatistical, demographic, and engineering applications, e.g. Keyfitz (1968).

Suppose we have a continuous random variable \(X\) whose range / support is \([0,\infty)\). Assume that the expectation of this random variable exists. We begin with the usual definition of expectation,

\[E(X) = \int_0^\infty xdF_X(x) ,\]

and then we integrate by parts

\[E(X) = \int_0^\infty xf_X(x)dx = -x\left(1-F_X(x)\right) \Big|_{x=0}^\infty + \int_0^\infty (1-F_X(x))dx\]

Regarding the first term, it is clear that evaluating at \(x= 0\) gives us \(0\). But what about \(x \rightarrow \infty\)? The factor \(x\) grows without bound, while \(1-F_X(x)\) approaches \(0\) since \(F_X(x) \rightarrow 1\), so the limit has the indeterminate form \(\infty \cdot 0\). As is, we can’t say anything about this limit, but we can hope that \(1-F_X(x)\) decays to zero faster than \(x\) grows to infinity.

Actually proving that

\[ \lim_{x \rightarrow \infty} x\left(1-F_X(x)\right) = 0 \]

requires a bit of analytic trickery. Since I didn’t come up with the trick, I urge you to see the second page of Muldowney, Ostaszewski, and Wojdowski’s paper. With this result in hand, we’ve completed the derivation and found that, indeed,

\[E(X) = \int_0^\infty \left( 1- F_X (x) \right) dx.\]
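For readers who want a quick justification of the limit used above, here is one common argument. It is only a sketch, assuming that \(X\) has a density \(f_X\) and that \(E(X)\) is finite, and it is not necessarily the argument given in the cited paper. For \(t \ge x\) we have \(x \le t\), so

\[0 \le x\left(1-F_X(x)\right) = x\int_x^\infty f_X(t)dt \le \int_x^\infty tf_X(t)dt \rightarrow 0 \quad \text{as } x \rightarrow \infty,\]

where the last step holds because \(\int_0^\infty tf_X(t)dt = E(X)\) converges, so its tail must vanish.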

# Higher order moments

Higher order (greater than the first) moments are also of great interest in the same context. These are needed to describe aspects of the distribution other than the location, for example, the variance, the skewness and the kurtosis of a distribution. To this end, it is known that, when it exists, the \(r\)th moment about the origin, \(E(X^r)\), of a continuous non-negative random variable is given by

\[E(X^r) = \int_0^\infty rx^{r-1}\left( 1- F_X(x) \right) dx, \quad r \ge 1.\]

We now prove the above formula.

Assuming that it exists, the \(r\)th moment of a continuous non-negative random variable \(X\), with p.d.f. \(f_X\), is given by:

\[E\left(X^r\right) = \int_0^\infty t^rf_X(t)dt, \quad r \ge 1.\]

Since \(t^r = r\int_0^t x^{r-1}dx\), the above integral can be written as

\[E\left(X^r\right) =r\int_0^\infty \left(\int_0^t x^{r-1}dx \right)f_X(t)dt\]

Since the integrand is non-negative and \(E\left(X^r \right)\) is assumed to exist, one can interchange the order of integration over the region \(0 \le x \le t < \infty\) using Fubini’s theorem. This yields

\[E\left(X^r\right) =r\int_0^\infty \int_x^\infty x^{r-1}f_X(t)dtdx = r\int_0^\infty x^{r-1}\left(1-F_X(x) \right) dx.\]

This completes the proof.
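As a numerical sanity check of the formula, the sketch below approximates \(r\int_0^\infty x^{r-1}(1-F_X(x))dx\) for a standard exponential distribution, whose \(r\)th moment is \(r!\). The helper names `simpson` and `moment_from_cdf`, the truncation point `upper=50.0`, and the choice of the exponential distribution are all assumptions made for illustration; they do not come from the derivation above.

```python
import math


def simpson(g, a, b, n=20_000):
    """Composite Simpson's rule for the integral of g over [a, b]."""
    if n % 2:  # Simpson's rule needs an even number of subintervals
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def moment_from_cdf(cdf, r, upper):
    """Approximate E(X^r) as the integral of r x^(r-1) (1 - F(x)) over [0, upper].

    `upper` truncates the infinite range, so it must be large enough
    that the survival function is negligible beyond it.
    """
    return simpson(lambda x: r * x ** (r - 1) * (1.0 - cdf(x)), 0.0, upper)


def exp_cdf(x):
    """C.d.f. of the exponential distribution with rate 1 (illustrative choice)."""
    return 1.0 - math.exp(-x)


for r in (1, 2, 3):
    approx = moment_from_cdf(exp_cdf, r, upper=50.0)
    # The exact r-th moment of a rate-1 exponential is r!
    print(r, approx, math.factorial(r))
```

Only the c.d.f. is passed in, so the same function can be reused for a transformed variable whose c.d.f. is easier to write down than its density.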

# Examples

## Example 1

Let \(X\) be a continuous random variable with the following distribution function:

\[F(x) = \begin{cases} 0 \text{ if } x < 0 \\ \sin x \text{ if } 0 \le x < \frac{\pi}{2} \\ 1 \text{ if } x \ge \frac{\pi}{2} \end{cases}\]

Calculate \(E(X)\).

### Solution 1

We first differentiate the distribution function to obtain the density function:

\[f(x) = \cos x \text{ if } 0 \le x < \frac{\pi}{2} \]

Then we calculate \(E(X)\) using integration by parts:

\[E(X) = \int_0^{\pi / 2} x\cos x dx = x\sin x \Big|_0^{\pi /2} - \int_0^{\pi / 2} \sin x dx = \frac{\pi}{2} - 1\]

### Solution 2

We can also calculate \(E(X)\) using the survival function:

\[E(X) = \int_0^{\pi /2} (1-\sin x)dx = \frac{\pi}{2} + \cos \frac{\pi}{2} - \cos 0 = \frac{\pi}{2} + 0 - 1 = \frac{\pi}{2} - 1 \]
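The two solutions can be compared numerically. The sketch below uses a plain midpoint rule (the helper `midpoint` and the number of subintervals are illustrative assumptions) to approximate both the density-based integral and the survival-function integral, and prints them next to the closed form \(\frac{\pi}{2} - 1\).

```python
import math


def midpoint(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


upper = math.pi / 2

# Solution 1: integrate x * f(x) with f(x) = cos x on [0, pi/2)
mean_from_density = midpoint(lambda x: x * math.cos(x), 0.0, upper)

# Solution 2: integrate the survival function 1 - F(x) = 1 - sin x
mean_from_survival = midpoint(lambda x: 1.0 - math.sin(x), 0.0, upper)

exact = math.pi / 2 - 1
print(mean_from_density, mean_from_survival, exact)
```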

## Example 2

Suppose that the c.d.f. of \(X\) is given by

\[F(x) = \begin{cases} 0 \text{ for } 0 \le x <1 \\ \ln x \text{ for } 1 \le x < e \\ 1 \text{ for } x \ge e \end{cases}\]

Calculate the variance of \(X\).

### Solution

We know that

\[\text{Var}(X) = E(X^2) - E(X)^2\]

where

\[ \begin{aligned} &\begin{aligned}E(X) &= \int_0^\infty \left( 1-F(x) \right)dx = \int_0^1 dx + \int_1^e (1-\ln x)dx = x \Big|_0^1 + x\Big|_1^e - \left(x\ln x -x \right) \Big|_1^e \\&= e-1\end{aligned} \\ &\begin{aligned}E(X^2) &= \int_0^\infty 2x\left( 1-F(x) \right)dx = 2\left( \int_0^1 xdx +\int_1^e x(1-\ln x) dx \right) \\&= x^2 \Big|_0^1 +x^2 \Big|_1^e -\left[x^2\ln x - \frac{x^2}{2} \right] \Big|_1^e = \frac{1}{2}(e^2-1)\end{aligned} \end{aligned}\]

Finally, \(\text{Var}(X)\) is equal to

\[\text{Var}(X) = -\frac{1}{2} e^2 + 2e -\frac{3}{2} \approx 0.2420.\]
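This result can be cross-checked by simulation. If \(U\) is uniform on \((0,1)\), then \(P(e^U \le x) = P(U \le \ln x) = \ln x\) for \(1 \le x < e\), so \(e^U\) has exactly the c.d.f. of this example. The sketch below draws samples this way and compares the sample variance with the closed form; the seed and the sample size are arbitrary choices made for illustration, and the simulated value only agrees with the exact one up to Monte Carlo error.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 200_000

# X = exp(U) with U ~ Uniform(0, 1) has c.d.f. F(x) = ln x on [1, e)
samples = [math.exp(rng.random()) for _ in range(n)]

sample_mean = sum(samples) / n
sample_var = sum((x - sample_mean) ** 2 for x in samples) / n

# Closed forms from the calculation above
exact_mean = math.e - 1
exact_var = -0.5 * math.e ** 2 + 2 * math.e - 1.5

print(sample_mean, exact_mean)
print(sample_var, exact_var)
```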

# References

- Keyfitz, N. (1968). “Introduction to the Mathematics of Population”. Reading, MA: Addison-Wesley.
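- Muldowney, P., Ostaszewski, K., and Wojdowski, W. (2012). Tatra Mountains Mathematical Publications, 52(1). http://www.degruyter.com/view/j/tmmp.2012.52.issue-1/v10127-012-0025-9/v10127-012-0025-9.xml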