# Expectation of Random Variables

The expectation (or expected value) of a random variable is a fundamental concept in probability and statistics, providing a measure of the "average" or "center" of a random variable's distribution. Here's how the expectation is defined for the three types of random variables:

**Discrete Random Variables**: A discrete random variable is one that can take on a finite or countably infinite number of distinct values. Let $X$ be a discrete random variable with possible values $x_1, x_2, \ldots$ and let $P(X = x_i)$ be the probability that $X$ takes on the value $x_i$. The expected value $E(X)$ of $X$ is defined as:

$$E(X) = \sum_i x_i \cdot P(X = x_i)$$

In other words, for each possible value $x_i$ of $X$, you multiply that value by its probability and sum up all such products.
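The sum above is easy to compute directly. A minimal sketch in Python, using a fair six-sided die as an assumed example:

```python
# Expectation of a discrete random variable: E(X) = sum_i x_i * P(X = x_i).
# The fair six-sided die here is an illustrative assumption.
values = [1, 2, 3, 4, 5, 6]   # possible values x_i
probs = [1 / 6] * 6           # probabilities P(X = x_i)

expectation = sum(x * p for x, p in zip(values, probs))
print(expectation)  # ≈ 3.5
```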

**Continuous Random Variables**: A continuous random variable is one that can take on a continuum of values over a certain interval or set of intervals. Let $X$ be a continuous random variable with probability density function (pdf) $f(x)$. The expected value $E(X)$ of $X$ is defined as:

$$E(X) = \int_{-\infty}^{\infty} x \cdot f(x) \, dx$$

Essentially, you integrate over all possible values of $X$, multiplying each value by its density, to get the average value.
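This integral can be approximated numerically. A sketch assuming a standard normal density and a midpoint rule, with the infinite range truncated to $[-8, 8]$ where the density is negligible (both the truncation and the step count are illustrative choices):

```python
import math

# pdf of the standard normal distribution
def f(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Midpoint-rule approximation of E(X) = ∫ x f(x) dx over [-8, 8].
a, b, n = -8.0, 8.0, 100_000
h = (b - a) / n
expectation = sum(x * f(x) for x in (a + (i + 0.5) * h for i in range(n))) * h
print(round(expectation, 6))  # ≈ 0.0, matching the normal's mean
```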

**General Random Variables**: The definition of expectation for a general random variable (which could be mixed, having both discrete and continuous components) is a combination of the two above. In the most general sense, if $X$ is a random variable with cumulative distribution function (cdf) $F(x)$, its expectation is given by:

$$E(X) = \int_{-\infty}^{\infty} x \, dF(x)$$

This definition encompasses both discrete and continuous cases as special instances.

It's worth noting that for a random variable to have a finite expectation, the sums or integrals above must converge absolutely. If they do not, the expectation is said to be infinite or undefined.

A continuous random variable is one that can take on an uncountably infinite number of values in an interval (or multiple intervals). This is in contrast to a discrete random variable, which takes on a finite or countably infinite number of distinct values.

Here's a precise definition:

**Continuous Random Variable**: A random variable $X$ is said to be continuous if there exists a non-negative function $f(x)$, called the probability density function (pdf), such that for any two numbers $a$ and $b$ with $a \leq b$:

$P(a \leq X \leq b) = \int_{a}^{b} f(x) \, dx$

Where:

- $P(a \leq X \leq b)$ is the probability that the random variable $X$ lies between $a$ and $b$, inclusive.
- The function $f(x)$ satisfies two main properties:
  - $f(x) \geq 0$ for all $x$.
  - $\int_{-\infty}^{\infty} f(x) \, dx = 1$ (i.e., the total area under the curve of $f$ over the entire real line is 1).
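These properties can be checked numerically. A sketch for the uniform density on $[0, 1]$ (an assumed example), using a midpoint rule:

```python
# Uniform(0, 1) density: f(x) = 1 on [0, 1], 0 elsewhere.
def f(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

# Midpoint-rule approximation of ∫_a^b g(x) dx.
def integrate(g, a, b, n=100_000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, -1.0, 2.0)   # total probability, should be ≈ 1
prob = integrate(f, 0.2, 0.5)     # P(0.2 <= X <= 0.5) = 0.3
print(total, prob)
```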

In contrast to the discrete case, the probability that a continuous random variable takes on any specific value is always zero: $P(X = a) = \int_a^a f(x) \, dx = 0$, because a single point has no width on which to accumulate probability.

A classic example of a continuous random variable is one that follows the normal (or Gaussian) distribution, which has a bell-shaped probability density function. Another example is a random variable following a uniform distribution on the interval [0, 1], where any value between 0 and 1 is equally likely.
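For the Uniform(0, 1) example, the law of large numbers lets us see the distribution's "center" empirically. A small sketch (the sample size and seed are arbitrary choices):

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
samples = [random.random() for _ in range(100_000)]  # Uniform(0, 1) draws
mean = sum(samples) / len(samples)
print(mean)  # close to 0.5, the expected value of Uniform(0, 1)
```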

Measure theory offers a rigorous foundation for integration and is essential when we talk about more general random variables, especially in the context of probability theory.

Let's delve into the measure-theoretic definition:

**Probability Space**: Let $(\Omega, \mathcal{F}, P)$ be a probability space, where:

- $\Omega$ is the sample space.
- $\mathcal{F}$ is a $\sigma$-algebra of subsets of $\Omega$ (these subsets are called "events").
- $P$ is a probability measure defined on $\mathcal{F}$, assigning to each event $A \in \mathcal{F}$ a number $P(A)$ between 0 and 1.

**Measurable Function**: Let $X: \Omega \rightarrow \mathbb{R}$ be a (real-valued) random variable. We say that $X$ is measurable with respect to $\mathcal{F}$ (or $\mathcal{F}$-measurable) if for every Borel set $B$ in $\mathbb{R}$, $X^{-1}(B) \in \mathcal{F}$. In simpler terms, this means that the pre-image of any Borel set under $X$ is an event in $\mathcal{F}$.

**Expectation**: The expectation of a random variable $X$ is defined as:

$$E(X) = \int_{\Omega} X(\omega) \, dP(\omega)$$

This is the Lebesgue integral of $X$ with respect to the probability measure $P$. Note that the random variable $X$ serves as the integrand, and the integration is over the sample space $\Omega$ with respect to the measure $P$.

The expectation is well-defined whenever the integral is absolutely convergent, meaning: $\int_{\Omega} |X(\omega)| \, dP(\omega) < \infty$
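A classic illustration of this condition failing is the standard Cauchy distribution, whose density $f(x) = \frac{1}{\pi(1 + x^2)}$ gives $\int |x| f(x) \, dx = \infty$, so $E(X)$ is undefined. A numerical sketch (truncation points and step counts are arbitrary choices) shows the truncated absolute moment growing without bound instead of converging:

```python
import math

# Truncated absolute moment ∫_{-T}^{T} |x| / (π(1 + x²)) dx of the
# standard Cauchy distribution, approximated by the midpoint rule.
# Analytically this equals ln(1 + T²)/π, which diverges as T → ∞.
def truncated_abs_moment(T, n=200_000):
    h = 2 * T / n
    return sum(
        abs(x) / (math.pi * (1 + x * x))
        for x in (-T + (i + 0.5) * h for i in range(n))
    ) * h

results = {T: truncated_abs_moment(T) for T in (10, 100, 1000)}
for T, m in results.items():
    print(T, round(m, 3))  # keeps growing with T: no finite expectation
```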

When we discuss real-valued random variables on $\Omega$, this measure-theoretic framework helps ensure all the definitions and operations are rigorously grounded. The definition of expectation given here captures both discrete and continuous random variables as special cases and also extends naturally to more general random variables.

In measure theory, the integral of a function with respect to a measure (like $P$) is defined using the concept of simple functions and then extended to more complex functions. The integral of a random variable $X$ with respect to the probability measure $P$ over a sample space $\Omega$, written as $\int_\Omega X(\omega) \, dP(\omega)$, is defined using the Lebesgue integral.

Here’s a rough sketch of how the Lebesgue integral is defined:

**Simple Functions**: Start with "simple functions." A simple function is one that takes on a finite number of distinct values. For a simple function $\phi$ that takes on values $a_1, a_2, \ldots, a_n$ over sets $E_1, E_2, \ldots, E_n$, respectively, the integral is:

$$\int_\Omega \phi \, dP = \sum_{i=1}^n a_i P(E_i)$$

**Non-negative Measurable Functions**: For a non-negative measurable function $f$ (meaning $f(\omega) \geq 0$ for all $\omega$), the integral of $f$ is the supremum (least upper bound) of the integrals of all simple functions $\phi$ that are bounded by $f$:

$$\int_\Omega f \, dP = \sup \left\{ \int_\Omega \phi \, dP : 0 \leq \phi \leq f, \phi \text{ is simple} \right\}$$

**General Measurable Functions**: For a general measurable function $f$ that can take both positive and negative values, decompose $f$ into its positive and negative parts:

$$f^+(\omega) = \max(f(\omega), 0), \qquad f^-(\omega) = \max(-f(\omega), 0)$$

Then $f$ can be written as $f = f^+ - f^-$, and the integral of $f$ is defined as:

$$\int_\Omega f \, dP = \int_\Omega f^+ \, dP - \int_\Omega f^- \, dP$$

provided at least one of the two integrals on the right-hand side is finite.
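The simple-function formula is just a finite weighted sum, which is easy to mirror in code. A sketch on a finite sample space (the six-point space and the particular $\phi$ are assumed for illustration):

```python
# Finite sample space Ω = {0, 1, ..., 5} with the uniform probability
# measure P({ω}) = 1/6 (an illustrative assumption).
P = {w: 1 / 6 for w in range(6)}

# A simple function φ taking value a_i on the event E_i:
# φ = 1 on E1 = {0, 1}, φ = 10 on E2 = {2, 3, 4}, φ = 0 on E3 = {5}.
pieces = [(1.0, {0, 1}), (10.0, {2, 3, 4}), (0.0, {5})]

# ∫_Ω φ dP = Σ_i a_i P(E_i)
integral = sum(a * sum(P[w] for w in E) for a, E in pieces)
print(round(integral, 4))  # 1·(2/6) + 10·(3/6) + 0·(1/6) = 16/3 ≈ 5.3333
```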

The beauty of the Lebesgue integral is that it can handle functions that are more "irregular" than those that can be integrated using the classical Riemann integral. In the context of probability, the Lebesgue integral allows us to integrate a broader class of random variables with respect to a probability measure.

The integral with respect to $F$ above involves the cumulative distribution function (CDF) of a random variable $X$. The CDF $F$ is defined as: $F(x) = P(X \leq x)$

Given the cumulative distribution function $F$ of a real-valued random variable $X$, the expectation can be formulated in terms of $F$ using the Lebesgue-Stieltjes integral.

For a general random variable $X$ with CDF $F$, the expectation $E(X)$ can be defined as: $E(X) = \int_{-\infty}^{\infty} x \, dF(x)$ provided the integral exists.

Breaking it down further:

**Positive Values**: Consider the part of the integral for positive values:

$$\int_0^{\infty} x \, dF(x)$$

This captures contributions from all positive outcomes.

**Negative Values**: Now consider the part of the integral for negative values:

$$\int_{-\infty}^{0} x \, dF(x)$$

This covers contributions from all negative outcomes.

Combining these parts, the expectation is:

$$E(X) = \int_{-\infty}^{0} x \, dF(x) + \int_0^{\infty} x \, dF(x)$$

The integral is the Lebesgue-Stieltjes integral, which is a generalization of the Riemann-Stieltjes integral in a way that is analogous to how the Lebesgue integral generalizes the Riemann integral. This definition of expectation allows for a unified treatment of both discrete and continuous random variables, as well as more general mixed cases.

The Riemann-Stieltjes integral provides a generalization of the standard Riemann integral. While the Riemann integral integrates a function with respect to the variable $x$, the Riemann-Stieltjes integral integrates a function with respect to another function, say $\alpha(x)$.

Here's a precise definition:

Let $f$ and $\alpha$ be real-valued functions defined on a closed interval $[a, b]$. The function $f$ is said to be Riemann-Stieltjes integrable with respect to $\alpha$ on $[a, b]$ if the limit

$$\int_a^b f(x) \, d\alpha(x) = \lim_{\lVert P \rVert \to 0} \sum_{i=0}^{n-1} f(x_i^*)(\alpha(x_{i+1}) - \alpha(x_i))$$

exists for every choice of $x_i^*$ in each subinterval $[x_i, x_{i+1}]$, where $P$ is a partition of $[a, b]$ into $n$ subintervals:

$$a = x_0 < x_1 < \ldots < x_n = b$$

and $\lVert P \rVert$ denotes the mesh, or maximum width, of the subintervals:

$$\lVert P \rVert = \max_{0 \leq i \leq n-1} (x_{i+1} - x_i)$$

If such a limit exists, then $f$ is said to be Riemann-Stieltjes integrable over $[a, b]$ with respect to $\alpha$, and the value of the integral is given by the limit above.
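The limiting sum can be approximated directly on a fine uniform partition. A sketch, using left endpoints as the tags $x_i^*$ and $\alpha(x) = x^2$ as an assumed integrator, so that $\int_0^1 x \, d\alpha(x) = \int_0^1 2x^2 \, dx = 2/3$:

```python
# Riemann-Stieltjes sum Σ f(x_i)(α(x_{i+1}) - α(x_i)) over a uniform
# partition of [a, b], with x_i* chosen as the left endpoint.
def riemann_stieltjes(f, alpha, a, b, n=100_000):
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x_left = a + i * h
        x_right = a + (i + 1) * h
        total += f(x_left) * (alpha(x_right) - alpha(x_left))
    return total

result = riemann_stieltjes(lambda x: x, lambda x: x * x, 0.0, 1.0)
print(round(result, 4))  # ≈ 0.6667, i.e. 2/3
```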

Several points worth noting:

- When $\alpha(x) = x$, the Riemann-Stieltjes integral reduces to the standard Riemann integral.
- If $\alpha$ is of bounded variation and $f$ is continuous, then $f$ is Riemann-Stieltjes integrable with respect to $\alpha$ over $[a, b]$.
- The integral $\int_a^b f(x) \, d\alpha(x)$ can be thought of as "summing" the values of $f$ in a weighted manner, where the weights are determined by the increments of the function $\alpha$.

The Riemann-Stieltjes integral provides a useful bridge between discrete sums and continuous integrals, and it plays a foundational role in the development of more general integration theories like the Lebesgue-Stieltjes integral.

The Riemann-Stieltjes integral can be used to define the expected value of many random variables, but it's not sufficient for all cases, especially in the context of more complex or irregular probability distributions.

For discrete random variables and absolutely continuous random variables (those with a probability density function), the Riemann-Stieltjes integral is adequate for defining the expected value. Specifically:

**Discrete Random Variable**: If $X$ is a discrete random variable with possible values $x_1, x_2, \ldots$ and probabilities $p_1, p_2, \ldots$ respectively, then its cumulative distribution function (CDF) $F(x)$ has "jumps" at these values. The expected value can be written as a sum:

$$E(X) = \sum_i x_i p_i$$

which can be viewed as a Riemann-Stieltjes integral with respect to $F(x)$.

**Absolutely Continuous Random Variable**: If $X$ has a probability density function $f(x)$, then its CDF $F(x)$ is differentiable almost everywhere, and $f(x) = F'(x)$. The expected value can be expressed as a Riemann integral:

$$E(X) = \int_{-\infty}^{\infty} x f(x) \, dx$$

but it can also be viewed as a Riemann-Stieltjes integral with respect to $F(x)$.
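As a concrete check of the discrete case, a Riemann-Stieltjes sum against a step CDF recovers $\sum_i x_i p_i$. A sketch for a fair six-sided die (an assumed example), using left endpoints on a fine uniform partition:

```python
import math

# CDF of a fair six-sided die: jumps of 1/6 at x = 1, 2, ..., 6.
def F(x):
    return min(max(math.floor(x), 0), 6) / 6

# Left-endpoint Riemann-Stieltjes sum Σ x_i (F(x_{i+1}) - F(x_i))
# over [0.5, 6.5]; the partition size is an arbitrary choice.
a, b, n = 0.5, 6.5, 60_000
h = (b - a) / n
ev = sum((a + i * h) * (F(a + (i + 1) * h) - F(a + i * h)) for i in range(n))
print(round(ev, 3))  # ≈ 3.5 = Σ x_i p_i for the die
```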

However, for more complex random variables, the Riemann-Stieltjes integral might not suffice. Especially problematic are random variables associated with singular distributions (neither absolutely continuous nor purely discrete, with CDFs that are continuous but not absolutely continuous). In such cases, the Lebesgue-Stieltjes integral, based on the more general Lebesgue integration theory, is required to properly define the expected value.

In conclusion, while the Riemann-Stieltjes integral is sufficient for a large class of random variables, it's not general enough to handle all cases. The Lebesgue-Stieltjes integral provides a more comprehensive framework for defining the expected value of a general random variable.