The following data show the age (in weeks) and height (in cm) of a soybean plant.
Age (week):   1   2   3   4   5   6   7
Height (cm):  5  13  16  23  33  38  40
Find the least squares regression line of height on age.
Consider pairs of points "(x,y)", where "x" is the age and "y" is the height. The least squares regression line is the line "\\hat{y}=mx+b" that minimizes the error "E(m,b)=\\sum_{i=1}^n(y_i-\\hat{y}_i)^2", where "\\hat{y}_i=mx_i+b". After substitution we obtain: "E(m,b)=\\sum_{i=1}^n(y_i-(mx_i+b))^2".
Taking the partial derivatives with respect to "m" and "b" gives: "E_m=2\\sum_{i=1}^nx_i((mx_i+b)-y_i)=2(m\\sum_{i=1}^nx_i^2+b\\sum_{i=1}^nx_i-\\sum_{i=1}^nx_iy_i)" and "E_b=2\\sum_{i=1}^n((mx_i+b)-y_i)=2(m\\sum_{i=1}^nx_i+nb-\\sum_{i=1}^ny_i)".
Denote "\\alpha=\\sum_{i=1}^nx_i^2", "\\beta=\\sum_{i=1}^nx_iy_i", "\\gamma=\\sum_{i=1}^nx_i" and "\\delta=\\sum_{i=1}^ny_i". The necessary conditions for an extremum are "E_m=0, E_b=0", which are equivalent to: "m\\alpha+b\\gamma-\\beta=0", "m\\gamma+nb-\\delta=0".
From the first equation we get: "m=\\frac{\\beta-b\\gamma}{\\alpha}". Substituting this into the second equation gives: "\\frac{\\beta\\gamma}{\\alpha}-\\frac{\\gamma^2}{\\alpha}b+nb-\\delta=0". Solving for "b" and substituting back into the expression for "m", we obtain: "b=\\frac{\\delta\\alpha-\\beta\\gamma}{n\\alpha-\\gamma^2}", "m=\\frac{n\\beta-\\delta\\gamma}{n\\alpha-\\gamma^2}".
These expressions for "m" and "b", together with the verification of the sufficient conditions for a minimum, can also be found in the standard literature on least squares regression.
Set "n=7" and compute the sums from the table: "\\alpha=140", "\\beta=844", "\\gamma=28", "\\delta=168". We obtain: "b=\\frac{168\\cdot140-844\\cdot28}{7\\cdot 140-(28)^2}=\\frac{-112}{196}=-\\frac47", "m=\\frac{7\\cdot844-168\\cdot28}{7\\cdot 140-(28)^2}=\\frac{1204}{196}=\\frac{43}{7}".
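The closed-form expressions for "m" and "b" can be checked numerically. The following is a minimal Python sketch, not part of the original solution: the function name `least_squares_line` and the variable names are illustrative assumptions, and exact rational arithmetic (`fractions.Fraction`) is used so that the results can be compared directly with the fractions above.

```python
from fractions import Fraction

# Data from the table: age in weeks (x) and height in cm (y).
ages = [1, 2, 3, 4, 5, 6, 7]
heights = [5, 13, 16, 23, 33, 38, 40]

def least_squares_line(xs, ys):
    """Return (m, b) for the line y = m*x + b using the closed-form
    solution of the normal equations (hypothetical helper name)."""
    n = len(xs)
    alpha = sum(x * x for x in xs)             # sum of x_i^2
    beta = sum(x * y for x, y in zip(xs, ys))  # sum of x_i * y_i
    gamma = sum(xs)                            # sum of x_i
    delta = sum(ys)                            # sum of y_i
    denom = n * alpha - gamma ** 2
    if denom == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    m = Fraction(n * beta - delta * gamma, denom)
    b = Fraction(delta * alpha - beta * gamma, denom)
    return m, b

m, b = least_squares_line(ages, heights)
print(m, b)  # 43/7 -4/7
```

Using `Fraction` instead of floating-point numbers is a design choice for this small example: it avoids rounding and reproduces the reduced fractions exactly.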
Answer: the least squares regression line is "\\hat{y}=\\frac{43}{7}x-\\frac47", where "x" is the age in weeks and "\\hat{y}" is the predicted height in cm.
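As a sanity check on the answer, the conditions "E_m=0" and "E_b=0" say that the residuals "y_i-\\hat{y}_i" sum to zero, both unweighted and weighted by "x_i". The sketch below, an illustration rather than part of the original solution, verifies this for the fitted line; the helper `predict` and the variable names are assumptions introduced here.

```python
from fractions import Fraction

# Data from the table and the coefficients from the answer.
ages = [1, 2, 3, 4, 5, 6, 7]
heights = [5, 13, 16, 23, 33, 38, 40]
m, b = Fraction(43, 7), Fraction(-4, 7)

def predict(x):
    """Height predicted by the fitted line for age x (hypothetical helper)."""
    return m * x + b

# Residuals y_i - (m*x_i + b) for every data point.
residuals = [y - predict(x) for x, y in zip(ages, heights)]

# E_b = 0 is equivalent to the residuals summing to zero;
# E_m = 0 is equivalent to the x-weighted residuals summing to zero.
sum_res = sum(residuals)
sum_xres = sum(x * r for x, r in zip(ages, residuals))
print(sum_res, sum_xres)  # 0 0
```

Both sums being exactly zero confirms that "m=\\frac{43}{7}" and "b=-\\frac47" satisfy the two necessary conditions for this data set.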