Linear Regression & Correlation
1. The thermal expansion of a steel rod was investigated by measuring the increase in length $\Delta l$
(mm) as a function of the temperature $T$ (°C) between 30 °C and 400 °C. The data were used in a
linear regression and the following equation was found: $\Delta l = 0.009T - 0.25$
(a) Find the increase in length at 300 °C.
(b) Find the increase in length at 530 °C.
(c) Explain why the answer to part (b) might be less reliable than the answer to part (a).
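For parts (a) and (b), a minimal Python sketch of evaluating the fitted line and checking whether a temperature lies inside the range the regression was fitted on (30 °C to 400 °C). The names `predict_increase` and `in_fitted_range` are illustrative and not part of the question.

```python
# Fitted line from the question: delta_l = 0.009*T - 0.25 (mm, T in deg C).
SLOPE = 0.009
INTERCEPT = -0.25
T_MIN, T_MAX = 30.0, 400.0  # temperature range covered by the data


def predict_increase(temp_c):
    """Predicted increase in length (mm) at temperature temp_c (deg C)."""
    return SLOPE * temp_c + INTERCEPT


def in_fitted_range(temp_c):
    """True for interpolation (inside the data range), False for extrapolation."""
    return T_MIN <= temp_c <= T_MAX


for temp in (300.0, 530.0):
    kind = "interpolation" if in_fitted_range(temp) else "extrapolation"
    print(f"T = {temp:.0f} C: delta_l = {predict_increase(temp):.2f} mm ({kind})")
```

The range check is the point of part (c): the line is only supported by data inside the measured interval, so a prediction outside it relies on the linear trend continuing.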
2. * Two variables, s and t, are thought to be linearly related by the equation $t = a + bs$, where a
and b are constants. Using a set of data for {s,t}, the following statistics were obtained:
$\sum s = 553$; $\sum t = 549$; $\sum st = 31185$; $n = 12$
$\bar{s} = 46.0833$; $\bar{t} = 45.75$; $S_{ss} = 6193$
(a) Use these data to find the equation of the regression line. [3]
(b) Use the fitted regression equation to estimate the value of t when s is 50. [1]
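One possible way to organise the calculation from summary statistics is sketched below in Python. It assumes the value 6193 is the corrected sum of squares of s (written `S_ss` here) and uses the standard least-squares formulas b = S_st / S_ss and a = mean(t) - b * mean(s); the variable names are illustrative.

```python
# Summary statistics as given in the question.
n = 12
sum_s, sum_t, sum_st = 553.0, 549.0, 31185.0
S_ss = 6193.0  # assumption: corrected sum of squares of s

# Corrected sum of products: S_st = sum(st) - sum(s) * sum(t) / n
S_st = sum_st - sum_s * sum_t / n

# Least-squares gradient and intercept for t = a + b*s
b = S_st / S_ss
a = sum_t / n - b * sum_s / n


def predict_t(s):
    """Estimate t from the fitted line at a given s."""
    return a + b * s


print(f"t = {a:.4f} + {b:.4f} s")
print(f"estimate at s = 50: {predict_t(50):.2f}")
```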
3. * For the data in the table below
x    2    4    5    8   10
y    3    7    8   13   17
(a) Use a calculator or software such as Excel or Desmos to find the equation of the regression
line y = a + bx and the value of $r^2$. [2]
(b) Estimate the value of y at x = 7.0 and comment on the reliability of this estimate. [4]
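As an alternative to a calculator, Excel or Desmos, the fit can be sketched in plain Python using the corrected sums of squares and products. This is a minimal illustration, not a required method; the variable names are chosen here for readability.

```python
# Data from the table.
x = [2, 4, 5, 8, 10]
y = [3, 7, 8, 13, 17]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and products.
S_xx = sum((xi - x_bar) ** 2 for xi in x)
S_yy = sum((yi - y_bar) ** 2 for yi in y)
S_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Regression line y = a + b*x and coefficient of determination.
b = S_xy / S_xx
a = y_bar - b * x_bar
r_squared = S_xy ** 2 / (S_xx * S_yy)

y_at_7 = a + b * 7.0

print(f"y = {a:.4f} + {b:.4f} x, r^2 = {r_squared:.4f}")
print(f"estimate at x = 7.0: {y_at_7:.2f}")
```

When commenting on reliability, two things are worth reading off: whether x = 7.0 lies inside the range of the x data, and how close $r^2$ is to 1.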
Hypothesis Testing
4. A D5 (5-sided die) seems to have a lower-than-expected probability of scoring a 5. In an
experiment to check this, the die is rolled 25 times and only 2 scores of 5 are obtained.
(a) Construct a relevant hypothesis and use, e.g., the $X \sim B(25, 0.2)$ cumulative distribution table on
Canvas (see Week 7 materials) to check, at the 5% significance level, whether this die may be
biased.
(b) What is the actual significance level of this test?
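The table lookup can be cross-checked with a short Python sketch of a lower-tailed binomial test. It assumes the hypotheses H0: p = 0.2 against H1: p < 0.2, and takes the critical region to be the largest set {0, ..., c} whose probability under H0 does not exceed the nominal level. The helper names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


n, p0, alpha = 25, 0.2, 0.05
observed = 2  # number of 5s scored in 25 rolls

# Probability, under H0, of a result at least as extreme as the one observed.
p_value = binom_cdf(observed, n, p0)

# Largest c with P(X <= c) <= alpha; the critical region is X <= c.
candidates = [c for c in range(n + 1) if binom_cdf(c, n, p0) <= alpha]
critical_value = max(candidates) if candidates else None

# The actual significance level is the probability of the critical region.
actual_significance = (
    binom_cdf(critical_value, n, p0) if critical_value is not None else 0.0
)
reject_h0 = p_value <= alpha

print(f"P(X <= {observed}) = {p_value:.4f}, reject H0: {reject_h0}")
print(f"critical region: X <= {critical_value}, actual level = {actual_significance:.4f}")
```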
2020). The advertisement for foundation claims that 75% of 103 people agree that the product is
good. 25 people are randomly selected from a list of users of the foundation and asked what they
think of it. The alternative hypothesis is that the probability they agree is less than 75%.
(a) Find the critical region in the binomial cumulative distribution that tests this hypothesis at the
1% level of significance and explain what this critical region means. [3]
(b) What is the actual significance level of this test? [1]
6. A report from May 2020 in the British Medical Journal (Watson et al., 2020) suggests that the
sensitivity of the Covid-19 test is approximately 70%. A fictional new test is developed which
claims to be half the cost but as sensitive as the original. Using the new test, a clinical trial
was conducted on 20 patients known to be Covid-positive at the time of testing. It correctly
diagnosed 12 of them. Use a one-tailed test, at the 5% level of significance, of the hypothesis that
the new test is as effective as the original test. Try to answer this question in two
ways: firstly, use a binomial distribution calculator such as the Casio FX991-EX or one of the
many free online sites (here is a link to a good one from the University of Iowa (Bognar, 2021):
https://homepage.divms.uiowa.edu/~mbognar/applets/bin.html).
Secondly, try to use only the binomial cumulative distribution tables provided on Canvas.
Note: "sensitivity" here means the correct diagnosis of someone who is positive for Covid-19.
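The two routes can be compared with a small Python sketch. It assumes H0: p = 0.7 against H1: p < 0.7, and it assumes (this may not match the Canvas tables) that printed tables only list p up to 0.5, in which case the count of missed diagnoses Y ~ B(20, 0.3) is used instead of the count of correct ones. The helper name `binom_cdf` is illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p); returns 0.0 when k < 0."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


n, p0, alpha = 20, 0.7, 0.05
correct = 12  # correct diagnoses out of 20

# Way 1 (calculator or applet): X ~ B(20, 0.7), lower tail P(X <= 12).
p_direct = binom_cdf(correct, n, p0)

# Way 2 (tables limited to p <= 0.5): count missed diagnoses instead.
# Y = 20 - X ~ B(20, 0.3), and P(X <= 12) = P(Y >= 8) = 1 - P(Y <= 7).
missed = n - correct
p_via_tables = 1 - binom_cdf(missed - 1, n, 1 - p0)

reject_h0 = p_direct <= alpha

print(f"direct: {p_direct:.4f}, via complement: {p_via_tables:.4f}")
print(f"reject H0 at the 5% level: {reject_h0}")
```

Both routes give the same tail probability, so the conclusion of the test does not depend on which one is used.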
7. Based on past records, a manufacturer of glass vases estimates that 15% of the vases produced
have defects. To check this, a random sample of 20 vases was checked and 6 were found to have
defects.
(a) Create a suitable hypothesis and use a two-tailed test at the 5% significance level to find the
critical regions and check this claim.
(b) What is the actual significance level for this test?
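A hedged Python sketch of finding two-tailed critical regions for a binomial test follows. It assumes H0: p = 0.15 against H1: p ≠ 0.15 and the common convention of allowing at most half the nominal level (2.5%) in each tail; other conventions exist. The function `two_tailed_critical_region` is a hypothetical helper written for this illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def two_tailed_critical_region(n, p, alpha):
    """Return (lower, upper, actual_level) with at most alpha/2 in each tail.

    lower: largest c with P(X <= c) <= alpha/2, or None if no such c exists.
    upper: smallest c with P(X >= c) <= alpha/2, or None if no such c exists.
    actual_level: total probability of the two critical regions under H0.
    """
    tail = alpha / 2
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

    lower, lower_prob, cum = None, 0.0, 0.0
    for k in range(n + 1):  # accumulate from the bottom
        cum += pmf[k]
        if cum > tail:
            break
        lower, lower_prob = k, cum

    upper, upper_prob, cum = None, 0.0, 0.0
    for k in range(n, -1, -1):  # accumulate from the top
        cum += pmf[k]
        if cum > tail:
            break
        upper, upper_prob = k, cum

    return lower, upper, lower_prob + upper_prob


lower, upper, actual_level = two_tailed_critical_region(20, 0.15, 0.05)
observed = 6  # defective vases in the sample
in_critical_region = (lower is not None and observed <= lower) or (
    upper is not None and observed >= upper
)

print(f"lower region: X <= {lower}, upper region: X >= {upper}")
print(f"actual level = {actual_level:.4f}, reject H0: {in_critical_region}")
```

A `None` for one tail means that even the most extreme value on that side has probability above 2.5%, so under this convention that tail contributes nothing to the critical region.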
8. * A stationery company claims that 30% of the pencils sent to it by its supplier are faulty.
To check this, a random sample of 20 pencils was checked and 4 were found to be faulty.
(a) Create a suitable hypothesis and use a two-tailed test at the 5% significance level to find the
critical regions and check this claim. [5]
(b) What is the actual significance level for this test?