Linear Transformation of a Single Random Variable
- In probability theory and statistics, a linear transformation of a random variable lets us shift and rescale a variable, for example when changing units or adding a fixed offset, while keeping simple formulas for its mean and variance.
- For a random variable $X$, a linear transformation takes the form $aX + b$, where $a$ and $b$ are constants.
Expected Value of a Linear Transformation
The expected value of a linear transformation of a random variable $X$ is given by:
$$E(aX + b) = aE(X) + b$$
This formula is incredibly useful as it allows us to calculate the expected value of transformed data without having to recalculate the entire distribution.
For example, suppose $X$ is a person's height measured in centimeters. To express the height in inches with a fixed 2-inch offset added, we use the transformation $0.3937X + 2$ (one centimeter is $0.3937$ inches). By the formula above:
$$E(0.3937X + 2) = 0.3937E(X) + 2$$
so the expected value of the converted height follows directly from $E(X)$, with no need to work out the distribution of the transformed variable.
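As a quick numerical check, here is a minimal simulation sketch. The normal distribution and the parameters (mean 170 cm, standard deviation 10 cm) are illustrative assumptions, not values given in these notes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative heights in cm; mean 170 and sd 10 are assumed for the demo.
heights_cm = rng.normal(loc=170, scale=10, size=100_000)

a, b = 0.3937, 2                     # cm-to-inch conversion plus a 2-inch offset
transformed = a * heights_cm + b

print(transformed.mean())            # simulated E(aX + b)
print(a * heights_cm.mean() + b)     # a*E(X) + b -- matches closely
```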
Variance of a Linear Transformation
The variance of a linear transformation of a random variable $X$ is given by:
$$VAR(aX + b) = a^2 VAR(X)$$
Notice that the constant $b$ doesn't affect the variance, as it doesn't contribute to the spread of the data.
The standard deviation of a linear transformation is the absolute value of $a$ times the standard deviation of $X$: $SD(aX + b) = |a| \cdot SD(X)$
Continuing with our height example, if $VAR(X) = 100$ cm², then:
$$VAR(0.3937X + 2) = 0.3937^2 \cdot 100 \approx 15.5 \text{ inches}^2$$
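The same simulation idea verifies the variance formula. The mean of 170 cm is again an assumed, illustrative value, while $VAR(X) = 100$ cm² matches the example above:

```python
import numpy as np

rng = np.random.default_rng(1)

# sd = 10 cm gives VAR(X) = 100 cm^2, as in the example; mean 170 is assumed.
heights_cm = rng.normal(loc=170, scale=10, size=500_000)

a, b = 0.3937, 2
transformed = a * heights_cm + b

print(transformed.var())        # simulated VAR(aX + b), about 15.5
print(a**2 * heights_cm.var())  # a^2 * VAR(X) -- the constant b drops out
```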
Expected Value of Linear Combinations of n Random Variables
- When dealing with multiple random variables, we often need to consider their linear combinations.
- The expected value of a linear combination of $n$ random variables is:
$$E(a_1X_1 + a_2X_2 + ... + a_nX_n) = a_1E(X_1) + a_2E(X_2) + ... + a_nE(X_n)$$
This property is known as the linearity of expectation and is extremely useful in many applications.
Suppose we have three random variables:
- $X_1$: Number of cars sold ($E(X_1) = 10$)
- $X_2$: Number of motorcycles sold ($E(X_2) = 5$)
- $X_3$: Number of bicycles sold ($E(X_3) = 20$)
If the profit per unit is \$1000, \$500, and \$200 respectively, the expected total profit would be:
$$E(1000X_1 + 500X_2 + 200X_3) = 1000E(X_1) + 500E(X_2) + 200E(X_3)$$
$$= 1000(10) + 500(5) + 200(20) = 10000 + 2500 + 4000 = \$16{,}500$$
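Because linearity of expectation needs only the individual expected values, the computation is short; a small sketch:

```python
# Linearity of expectation: the expected total profit needs only the
# individual expected values, not the joint distribution of X1, X2, X3.
profits = [1000, 500, 200]       # profit per car, motorcycle, bicycle
expected_sales = [10, 5, 20]     # E(X1), E(X2), E(X3)

expected_profit = sum(a * ex for a, ex in zip(profits, expected_sales))
print(expected_profit)           # 16500
```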
Variance of Linear Combinations of n Independent Random Variables
For independent random variables, the variance of their linear combination is:
$$VAR(a_1X_1 + a_2X_2 + ... + a_nX_n) = a_1^2VAR(X_1) + a_2^2VAR(X_2) + ... + a_n^2VAR(X_n)$$
It's crucial to remember that this formula only applies when the random variables are independent. If they're not, we need to consider covariance terms as well.
Using the previous example, let's say:
$$
\begin{aligned}
&VAR(X_1) = 4 \\
&VAR(X_2) = 2 \\
&VAR(X_3) = 10
\end{aligned}
$$
The variance of the total profit would be:
$$VAR(1000X_1 + 500X_2 + 200X_3) = 1000^2 VAR(X_1) + 500^2 VAR(X_2) + 200^2 VAR(X_3)$$
$$= 1000000(4) + 250000(2) + 40000(10) = 4000000 + 500000 + 400000 = 4{,}900{,}000$$
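A simulation sketch can confirm this. The notes specify only the means and variances, so the normal distributions below are an illustrative assumption; what matters is that the three variables are drawn independently:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Means and variances from the example; normality is an assumed choice.
x1 = rng.normal(10, np.sqrt(4), n)    # cars:        VAR = 4
x2 = rng.normal(5, np.sqrt(2), n)     # motorcycles: VAR = 2
x3 = rng.normal(20, np.sqrt(10), n)   # bicycles:    VAR = 10

total = 1000 * x1 + 500 * x2 + 200 * x3
print(total.var())                              # close to the exact value
print(1000**2 * 4 + 500**2 * 2 + 200**2 * 10)   # exact: 4900000
```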
Unbiased Estimators
In statistics, we often use sample data to estimate population parameters. An estimator is said to be unbiased if its expected value equals the true population parameter.
Sample Mean as an Unbiased Estimator of Population Mean
The sample mean, denoted as $\bar{x}$, is an unbiased estimator of the population mean $\mu$. It is calculated as:
$$\bar{x} = \frac{\sum_{i=1}^n x_i}{n}$$
where $x_i$ are the individual observations and $n$ is the sample size.
The expected value of the sample mean is equal to the population mean: $E(\bar{X}) = \mu$
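A short simulation illustrates this unbiasedness; the population parameters and sample size below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n = 50, 8, 5   # hypothetical population mean, sd, and sample size

# Average the sample means of many independent samples of size n;
# the result approaches mu, illustrating E(X-bar) = mu.
sample_means = rng.normal(mu, sigma, size=(200_000, n)).mean(axis=1)
print(sample_means.mean())   # close to 50
```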
Sample Variance as an Unbiased Estimator of Population Variance
The sample variance, $s^2_{n-1}$, is an unbiased estimator of the population variance $\sigma^2$. It is calculated as:
$$s^2_{n-1} = \frac{n}{n-1} \cdot s^2_n = \frac{\sum f_i(x_i - \bar{x})^2}{n-1}$$
where $n = \sum f_i$ (total frequency).
The factor $\frac{n}{n-1}$ is known as Bessel's correction and is used to make the sample variance an unbiased estimator of the population variance.
The expected value of the sample variance is equal to the population variance: $E(s^2_{n-1}) = \sigma^2$
Note that Bessel's correction makes $s^2_{n-1}$ unbiased for the population variance only; the sample standard deviation $s_{n-1}$ remains a (slightly) biased estimator of $\sigma$.
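The bias that Bessel's correction removes is easy to see by simulation; the population variance of 9 and sample size of 5 below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)
sigma2, n = 9, 5   # hypothetical population variance and sample size

samples = rng.normal(0, np.sqrt(sigma2), size=(200_000, n))

# ddof=1 divides by n-1 (Bessel's correction); ddof=0 divides by n.
print(samples.var(axis=1, ddof=1).mean())  # close to 9 (unbiased)
print(samples.var(axis=1, ddof=0).mean())  # close to 9*(n-1)/n = 7.2 (biased)
```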
Suppose we have the following data set: $2,4,4,4,5,5,7,9$.
To calculate the unbiased sample variance:
- Calculate the mean: $\bar{x}=\frac{2+4+4+4+5+5+7+9}{8}=5$
- Calculate the squared deviations:
$$(2-5)^2, (4-5)^2, (4-5)^2, (4-5)^2, (5-5)^2, (5-5)^2, (7-5)^2, (9-5)^2$$
- Sum the squared deviations: $9+1+1+1+0+0+4+16=32$
- Divide by $(n-1)=7$: $s_{n-1}^2=\frac{32}{7} \approx 4.57$
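The same result comes out of Python's standard library, whose `statistics.variance` uses the $n-1$ denominator:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

# statistics.variance divides by n-1, i.e., the unbiased estimator.
print(statistics.mean(data))      # 5
print(statistics.variance(data))  # 32/7 ≈ 4.5714
```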
Remember that while understanding the formulas is important, being able to interpret and apply them in context is equally crucial. Always consider the practical implications of your statistical calculations.