- IB
- Question Type 4: Calculating correlation coefficients with and without outliers
Calculate and compare the Spearman correlation coefficient before and after removing the outlier. Quantify the change and discuss the impact of the outlier on rank-based association.
Calculate and compare the Pearson correlation coefficient before and after removing the outlier. Quantify the change and discuss the impact of the outlier on linear association.
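The two comparisons described above can be sketched in a few lines of Python. This is a minimal illustration, not the method prescribed by the question bank: the dataset `points` and the choice of `outlier` are hypothetical values invented for the example, and Spearman's coefficient is computed here as the Pearson coefficient of the ranks, with tied values sharing their mean rank.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical data, invented for illustration; the last point is the outlier.
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6), (6, 7), (7, 40)]
outlier = (7, 40)

def split(pts):
    """Separate a list of (x, y) pairs into an x-list and a y-list."""
    return [p[0] for p in pts], [p[1] for p in pts]

xs_all, ys_all = split(points)
xs_clean, ys_clean = split([p for p in points if p != outlier])

r_with, r_without = pearson(xs_all, ys_all), pearson(xs_clean, ys_clean)
rs_with, rs_without = spearman(xs_all, ys_all), spearman(xs_clean, ys_clean)

print(f"Pearson  r  : with outlier {r_with:.3f}, without {r_without:.3f}")
print(f"Spearman r_s: with outlier {rs_with:.3f}, without {rs_without:.3f}")
```

In this made-up dataset the outlier continues the increasing trend, so its rank is simply the largest and Spearman's coefficient barely moves, while Pearson's coefficient shifts noticeably because the raw distance of the point from the means enters the sums directly.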
For the dataset , which includes an outlier:
(a) calculate the Pearson product-moment correlation coefficient, $r$;
(b) calculate Spearman's rank correlation coefficient, $r_s$;
(c) compare the two values and comment on the effect of the outlier .
[6]
Calculate the Pearson correlation coefficient for the following dataset, which includes an extreme outlier:
[5]
Calculate the percentage change in the Pearson correlation coefficient when the outlier is removed, using and .
[3]
For the modified dataset where the outlier’s -value is reduced to 20, i.e.
calculate the Pearson correlation coefficient $r$.
[4]
For the dataset without the outlier , calculate both the Pearson product-moment correlation coefficient $r$ and Spearman's rank correlation coefficient $r_s$. Compare and comment on the two values.
[4]
Explain why Spearman's rank correlation coefficient is less sensitive to an extreme outlier than the Pearson correlation coefficient, using mathematical reasoning based on ranks versus raw values.
[4]
Calculate the Pearson correlation coefficient for the dataset after removing the outlier:
Give your answer to three significant figures.
[4]
Calculate the Spearman rank correlation coefficient for the dataset after removing the outlier:
[4]
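The percentage-change and ranks-versus-raw-values questions above can be explored with a short sketch. Everything numeric here is an assumption made for illustration: the lists `xs`, `ys_extreme` and `ys_reduced` are hypothetical, and `percent_change` uses the convention (new − old) / |old| × 100, which may differ from the convention a mark scheme expects.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks from 1 (smallest); this simple version assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def percent_change(old, new):
    """Percentage change from old to new, relative to |old| (assumed convention)."""
    return (new - old) / abs(old) * 100

# Hypothetical data: the same x-values, with the last y-value either
# extreme (40) or pulled in to a milder value (20).
xs = [1, 2, 3, 4, 5, 6, 7]
ys_extreme = [2, 3, 5, 4, 6, 7, 40]
ys_reduced = [2, 3, 5, 4, 6, 7, 20]

r_extreme = pearson(xs, ys_extreme)
r_reduced = pearson(xs, ys_reduced)

print(f"r with extreme outlier: {r_extreme:.3f}")
print(f"r with reduced outlier: {r_reduced:.3f}")
print(f"percentage change in r: {percent_change(r_extreme, r_reduced):.1f}%")

# The ranks do not change, because 20 and 40 are both the largest y-value,
# so any statistic computed from ranks alone is identical in the two cases.
print("ranks unchanged:", ranks(ys_extreme) == ranks(ys_reduced))
```

The sketch shows the core of the argument: Pearson's coefficient depends on how far each raw value lies from the mean, so moving the outlier changes it, whereas a rank can be at most the number of data points, so the size of the outlier beyond its neighbours has no further effect.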
A researcher calculates the Pearson correlation coefficient, $r$, for three datasets based on the same nine base observations, but with different treatments of a tenth point .
The calculated values of $r$ for each scenario are: (a) including the point , (b) including the point , (c) excluding the tenth point entirely,
Compare the Pearson correlation coefficients for the three datasets and comment on how each treatment of the outlier affects the value of $r$.
[4]
This question assesses the calculation of Spearman's rank correlation coefficient and the understanding of its robustness against outliers compared to the Pearson product-moment correlation coefficient.
A student collects bivariate data to investigate the relationship between two variables, and . The following dataset is obtained:
Calculate Spearman's rank correlation coefficient, $r_s$, for this dataset.
[4]
Comment on the effect of the outlier on the value of $r_s$ and explain why $r_s$ might be a more appropriate measure of correlation for this dataset than the Pearson product-moment correlation coefficient, $r$.
[2]