- SL 4.4—Pearsons, scatter diagrams, eqn of y on x
A librarian records the number of books borrowed, $x$, and the number of library visits, $y$, by eight members over a month. The data are shown below.
| Books borrowed | Library visits |
|---|---|
| 2 | 1 |
| 4 | 2 |
| 6 | 3 |
| 8 | 4 |
| 10 | 5 |
| 12 | 6 |
| 14 | 7 |
| 16 | 8 |
Find Pearson's product-moment correlation coefficient, $r$, and interpret its value in context.
Find the equation of the regression line of $y$ on $x$.
Estimate the number of library visits for a member who borrows 9 books.
Draw a scatter diagram of the data with the regression line.
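The calculation behind the first three parts can be sketched in a few lines of Python. This is a minimal illustration rather than an official solution: it assumes $x$ is books borrowed and $y$ is library visits, and the helper name `fit_line` is hypothetical.

```python
import math

# Data from the table: books borrowed (x) and library visits (y).
books = [2, 4, 6, 8, 10, 12, 14, 16]
visits = [1, 2, 3, 4, 5, 6, 7, 8]


def fit_line(xs, ys):
    """Return (slope, intercept, r) for the least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = fit_line(books, visits)
estimate_at_9 = slope * 9 + intercept  # 9 books lies inside the data range

print(f"r = {r:.4f}")
print(f"y = {slope:.4f}x + {intercept:.4f}")
print(f"estimated visits for 9 books: {estimate_at_9:.2f}")
```

Every row of the table satisfies $y = x/2$, so the points lie exactly on a line and the correlation is perfect and positive.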
A researcher studies the relationship between the number of hours, $x$, spent studying per week and the average test score, $y$, out of 100, for eight randomly selected students. The data are shown in the following table.
| Hours studying | Test score |
|---|---|
| 2 | 55 |
| 4 | 60 |
| 6 | 65 |
| 8 | 70 |
| 10 | 75 |
| 12 | 80 |
| 14 | 85 |
| 16 | 90 |
The relationship is modeled by the regression equation $y = ax + b$.
Write down the values of $a$ and $b$.
Use the regression equation to estimate the test score for a student who studies for 9 hours per week.
Draw a scatter diagram of the data, including the regression line.
A nutrition researcher investigates the relationship between the amount of protein, $x$ (in grams), in a breakfast meal and the time, $y$ (in minutes), after which a person begins to feel hungry again. The following data were obtained from eight participants.
| Protein (g) | 10 | 14 | 18 | 21 | 25 | 28 | 32 | 36 |
|---|---|---|---|---|---|---|---|---|
| Hunger time (min) | 60 | 75 | 82 | 90 | 110 | 115 | 124 | 130 |
Using technology, find
(i) the mean and standard deviation of $x$ and $y$;
(ii) the value of the Pearson product-moment correlation coefficient, $r$.
Find the equation of the regression line of $y$ on $x$ in the form $y = ax + b$.
A new breakfast bar contains 30 g of protein. Estimate, using your regression model, how long it will take for an average person to feel hungry again.
The researcher claims that hunger time increases by about 2.5 minutes for each additional gram of protein. Test this claim against your model and comment on whether it is supported.
Calculate the coefficient of determination, $R^2$, and interpret its meaning in context.
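The "using technology" steps for this question might look like the following Python sketch. It assumes $x$ is protein and $y$ is hunger time, and it uses the population standard deviation (dividing by $n$); a calculator set to the sample standard deviation would report slightly larger values. All variable names are illustrative.

```python
import math
import statistics

protein = [10, 14, 18, 21, 25, 28, 32, 36]          # x, grams
hunger_time = [60, 75, 82, 90, 110, 115, 124, 130]  # y, minutes

mean_x = statistics.mean(protein)
mean_y = statistics.mean(hunger_time)
# Population standard deviations (assumption: divide by n, not n - 1).
sd_x = statistics.pstdev(protein)
sd_y = statistics.pstdev(hunger_time)

sxx = sum((x - mean_x) ** 2 for x in protein)
syy = sum((y - mean_y) ** 2 for y in hunger_time)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(protein, hunger_time))

r = sxy / math.sqrt(sxx * syy)
slope = sxy / sxx                    # minutes of hunger time per gram of protein
intercept = mean_y - slope * mean_x
r_squared = r ** 2                   # coefficient of determination

estimate_30g = slope * 30 + intercept  # 30 g is inside the observed range
claimed_slope = 2.5
slope_gap = slope - claimed_slope      # how far the fitted slope is from the claim

print(f"means: {mean_x:.2f}, {mean_y:.2f}; sds: {sd_x:.2f}, {sd_y:.2f}")
print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
print(f"y = {slope:.3f}x + {intercept:.3f}")
print(f"estimate at 30 g: {estimate_30g:.1f} min; slope gap: {slope_gap:.3f}")
```

Comparing `slope` with the claimed 2.5 minutes per gram gives a direct way to comment on the researcher's claim.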
A teacher records the number of pages read, $x$, and the time taken, $y$, in minutes, for six students completing a reading task. The data are shown below.
| Pages read | Time taken |
|---|---|
| 10 | 15 |
| 15 | 22 |
| 20 | 28 |
| 25 | 34 |
| 30 | 40 |
| 35 | 45 |
Calculate Pearson's product-moment correlation coefficient, $r$, and interpret its value in context.
Find the equation of the regression line of $y$ on $x$.
Estimate the time taken to read 18 pages.
Interpret the slope of the regression line in context.
A dataset records study hours, $x$, and test scores, $y$, for eight students:
(a) Using technology, find (i) $r$; (ii) the regression line of $y$ on $x$; (iii) the regression line of $x$ on $y$. [5]
Predict at and find the residual for .
Using on , estimate for ; then invert to estimate when . Explain the difference.
Compute and interpret; comment on extrapolating to .
A farmer records the number of seeds planted, $x$ (in thousands), and the crop yield, $y$ (in kg), for 10 fields. The data for $\ln x$ and $\ln y$ are shown below.
| $\ln x$ | $\ln y$ |
|---|---|
| 2.30 | 4.61 |
| 2.71 | 5.01 |
| 3.00 | 5.30 |
| 3.22 | 5.52 |
| 3.40 | 5.70 |
| 3.50 | 5.80 |
| 3.69 | 5.99 |
| 3.91 | 6.21 |
| 4.09 | 6.39 |
| 4.20 | 6.50 |
The relationship between $\ln x$ and $\ln y$ can be modeled by a linear regression equation. The relationship between $x$ and $y$ can then be modeled as a power function.
Find the equation of the regression line of $\ln y$ on $\ln x$.
Use the regression equation to estimate the crop yield when 15,000 seeds are planted.
Find the values of the constants in the power model for $y$ in terms of $x$.
Calculate Pearson's product-moment correlation coefficient for $\ln x$ and $\ln y$, and interpret its value in context.
If the farmer increases the number of seeds by in a field with 20,000 seeds, estimate the expected percentage increase in crop yield.
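A log-log fit of this kind can be sketched as below. The sketch rests on two assumptions that are not stated in the extracted text: that the tabulated columns are $\ln x$ and $\ln y$ (the values are consistent with natural logarithms of seed counts in thousands), and that the power model has the form $y = kx^n$, where the names $k$ and $n$ are hypothetical.

```python
import math

# Tabulated values, ASSUMED here to be ln(x) and ln(y).
ln_x = [2.30, 2.71, 3.00, 3.22, 3.40, 3.50, 3.69, 3.91, 4.09, 4.20]
ln_y = [4.61, 5.01, 5.30, 5.52, 5.70, 5.80, 5.99, 6.21, 6.39, 6.50]

count = len(ln_x)
mean_u = sum(ln_x) / count
mean_v = sum(ln_y) / count
suu = sum((u - mean_u) ** 2 for u in ln_x)
suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(ln_x, ln_y))

# Regression of ln y on ln x: ln y = n * ln x + ln k
n = suv / suu
ln_k = mean_v - n * mean_u
k = math.exp(ln_k)  # back-transform the intercept to get y = k * x**n


def predict_yield(seeds_thousands):
    """Predicted yield in kg under the assumed power model."""
    return k * seeds_thousands ** n


def yield_change_percent(seed_change_percent):
    """Percentage change in y when x changes by the given percentage."""
    return ((1 + seed_change_percent / 100) ** n - 1) * 100


yield_15 = predict_yield(15)  # 15,000 seeds means x = 15

print(f"ln y = {n:.4f} ln x + {ln_k:.4f}")
print(f"k = {k:.3f}, n = {n:.4f}")
print(f"predicted yield at x = 15: {yield_15:.1f} kg")
```

Under a power model the percentage change in $y$ depends only on the percentage change in $x$ and the exponent, not on the starting value of $x$, which is why `yield_change_percent` takes no seed count.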
A store manager records the daily advertising budget, $x$, in dollars, and the number of customers, $y$, visiting the store over seven days. The data are shown below.
| Advertising budget | Customers |
|---|---|
| 50 | 20 |
| 100 | 25 |
| 150 | 30 |
| 200 | 35 |
| 250 | 40 |
| 300 | 45 |
| 350 | 50 |
Find the equation of the regression line of $y$ on $x$.
Write down the mean values $\bar{x}$ and $\bar{y}$.
Draw a scatter diagram of the data, including the regression line and the point $(\bar{x}, \bar{y})$.
Estimate the number of customers if the advertising budget is 400 dollars, and explain why this estimate may not be reliable.
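The sketch below illustrates two ideas from this question: the least-squares line of $y$ on $x$ always passes through the mean point $(\bar{x}, \bar{y})$, and a budget of 400 dollars lies outside the observed data. It assumes $x$ is the budget and $y$ is the number of customers; the variable names are illustrative.

```python
budgets = [50, 100, 150, 200, 250, 300, 350]  # x, dollars
customers = [20, 25, 30, 35, 40, 45, 50]      # y

n = len(budgets)
mean_x = sum(budgets) / n
mean_y = sum(customers) / n

sxx = sum((x - mean_x) ** 2 for x in budgets)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(budgets, customers))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# The fitted line evaluated at the mean of x returns the mean of y.
passes_through_mean = abs(slope * mean_x + intercept - mean_y) < 1e-9

new_budget = 400
estimate_400 = slope * new_budget + intercept
# Extrapolation: the new budget is outside the range of recorded budgets.
is_extrapolation = not (min(budgets) <= new_budget <= max(budgets))

print(f"mean point: ({mean_x:.1f}, {mean_y:.1f})")
print(f"y = {slope:.3f}x + {intercept:.3f}")
print(f"estimate at {new_budget}: {estimate_400:.1f}")
print(f"extrapolation: {is_extrapolation}")
```

Because the largest recorded budget is 350 dollars, the estimate at 400 relies on the linear trend continuing beyond the data, which the data alone cannot confirm.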
A scientist studies the effect of temperature, $x$, in degrees Celsius, on the reaction time, $y$, in seconds, of a chemical process. The data for six experiments are shown below.
| Temperature | Reaction time |
|---|---|
| 10 | 8.0 |
| 15 | 7.5 |
| 20 | 6.8 |
| 25 | 6.2 |
| 30 | 5.5 |
| 35 | 5.0 |
Calculate Pearson's product-moment correlation coefficient, $r$.
Find the equation of the regression line of $y$ on $x$.
Estimate the reaction time at .
State one reason why the regression line may not be suitable for predicting the reaction time at .
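One way to see the limits of the linear model here is to compute where the fitted line would predict a reaction time of zero, which is physically impossible. The following is a rough sketch under the assumption that $x$ is temperature and $y$ is reaction time; `zero_crossing` is an illustrative name, not part of the question.

```python
import math

temperature = [10, 15, 20, 25, 30, 35]          # x, degrees Celsius
reaction_time = [8.0, 7.5, 6.8, 6.2, 5.5, 5.0]  # y, seconds

n = len(temperature)
mean_x = sum(temperature) / n
mean_y = sum(reaction_time) / n

sxx = sum((x - mean_x) ** 2 for x in temperature)
syy = sum((y - mean_y) ** 2 for y in reaction_time)
sxy = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(temperature, reaction_time))

r = sxy / math.sqrt(sxx * syy)       # negative: time falls as temperature rises
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Temperature at which the fitted line reaches zero seconds.
zero_crossing = -intercept / slope

print(f"r = {r:.4f}")
print(f"y = {slope:.4f}x + {intercept:.4f}")
print(f"line predicts zero reaction time at about {zero_crossing:.1f} degrees")
```

Any temperature beyond `zero_crossing` would give a negative predicted time, and any temperature outside the 10 to 35 degree range is an extrapolation in any case.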
An environmental scientist studies the relationship between the average daily temperature, $x$ (°C), in a city and the number of ice creams sold, $y$, in a local park.
The data collected over eight days are shown below.
| $x$ (Temperature, °C) | 15 | 18 | 21 | 24 | 27 | 30 | 33 | 36 |
|---|---|---|---|---|---|---|---|---|
| $y$ (Ice creams sold) | 120 | 145 | 170 | 195 | 220 | 250 | 275 | 300 |
Using technology, find the mean and standard deviation of $x$ and $y$, and the Pearson product-moment correlation coefficient, $r$.
Find the equation of the regression line of $y$ on $x$.
The scientist suspects an exponential relation between $x$ and $y$. Transform the relation into a linear form and use technology to determine the regression line.
Using both models from parts 2 and 3, estimate when .
State both predicted values (linear and exponential) and decide which model you prefer, justifying your choice with appropriate evidence (e.g., $R^2$, residual plots, or interpretability).
Using the linear regression model from part 2, predict when .
Comment on whether this prediction is valid and why.
Suppose that for , the actual response time (in the same units as ) is approximately normally distributed with mean equal to your preferred model’s prediction from part 4 and a given standard deviation of (same units as ).
Compute and interpret your result in the context of the problem.
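A comparison of the two models could be sketched as follows. The exponential form $y = ae^{bx}$ and the names `a`, `b`, `m`, `c` are assumptions, since the original equation did not survive extraction. The exponential model is fitted by regressing $\ln y$ on $x$, and the two fits are compared by their sum of squared residuals on the original scale, which is one reasonable criterion among several.

```python
import math

temps = [15, 18, 21, 24, 27, 30, 33, 36]          # x, degrees Celsius
sales = [120, 145, 170, 195, 220, 250, 275, 300]  # y, ice creams sold


def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Linear model: y = m*x + c
m, c = least_squares(temps, sales)

# Assumed exponential model y = a*exp(b*x), linearised as ln y = ln a + b*x
b, ln_a = least_squares(temps, [math.log(y) for y in sales])
a = math.exp(ln_a)


def linear_model(x):
    return m * x + c


def exponential_model(x):
    return a * math.exp(b * x)


def sum_squared_residuals(model):
    """Sum of squared residuals on the original (untransformed) scale."""
    return sum((y - model(x)) ** 2 for x, y in zip(temps, sales))


sse_linear = sum_squared_residuals(linear_model)
sse_exponential = sum_squared_residuals(exponential_model)

print(f"linear: y = {m:.3f}x + {c:.3f}, SSE = {sse_linear:.1f}")
print(f"exponential: y = {a:.2f} exp({b:.4f}x), SSE = {sse_exponential:.1f}")
```

The model with the smaller sum of squared residuals fits the observed points more closely, though this says nothing about how either model behaves outside the 15 to 36 degree range.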
A fitness coach records the number of hours, $x$, spent training per week and the maximum heart rate, $y$, in beats per minute (bpm), during a workout for seven athletes. The data are shown below.
| Training hours | Heart rate |
|---|---|
| 3 | 140 |
| 5 | 145 |
| 7 | 150 |
| 9 | 152 |
| 11 | 155 |
| 13 | 158 |
| 15 | 160 |
It is assumed that $x$ and $y$ follow a bivariate normal distribution with product-moment correlation coefficient $\rho$.
(i) State suitable hypotheses $H_0$ and $H_1$ to test whether there is a correlation between training hours and heart rate, using a two-tailed test.
(ii) Calculate Pearson's product-moment correlation coefficient, $r$, and interpret its value in context.
(iii) Using a significance level, state your conclusion in the context of the coach's study.
The regression line of on is given by . Estimate the heart rate for an athlete training 10 hours per week.
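The test statistic for part (iii) can be sketched as below, assuming the null hypothesis is that the population correlation is zero. Under that hypothesis and the stated bivariate normal assumption, $t = r\sqrt{(n-2)/(1-r^2)}$ follows a $t$-distribution with $n-2$ degrees of freedom. The significance level is not given in the text above, so the sketch stops at the statistic and leaves the comparison with a critical value or $p$-value to the reader.

```python
import math

training_hours = [3, 5, 7, 9, 11, 13, 15]          # x, hours per week
heart_rate = [140, 145, 150, 152, 155, 158, 160]   # y, bpm

n = len(training_hours)
mean_x = sum(training_hours) / n
mean_y = sum(heart_rate) / n

sxx = sum((x - mean_x) ** 2 for x in training_hours)
syy = sum((y - mean_y) ** 2 for y in heart_rate)
sxy = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(training_hours, heart_rate))

r = sxy / math.sqrt(sxx * syy)

# Test statistic for H0: population correlation is zero (two-tailed test).
degrees_of_freedom = n - 2
t_statistic = r * math.sqrt(degrees_of_freedom / (1 - r ** 2))

print(f"r = {r:.4f}")
print(f"t = {t_statistic:.2f} with {degrees_of_freedom} degrees of freedom")
```

A value of `t_statistic` far from zero, relative to the critical value for the chosen significance level, would lead to rejecting the null hypothesis of no correlation.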