solution a)
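The regression equation is written as:
"y = mx + c"
where "m" (the slope) and "c" (the intercept) are the coefficients, "x" is the data size and "y" is the number of processed requests.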
Sum both sides of the equation:
"\sum y = m\sum x + c" ........(equation 1)
Multiply the regression equation by "x" and sum both sides to get equation 2:
"\sum xy = m\sum x^2 + c\sum x" ........(equation 2)
From data size "(x)" we get:
"\sum x = 63"
"\sum x^2 = 623"
From processed requests "(y)" we get:
"\sum y = 245"
Multiplying each "x" by its matching "y" and summing we get:
"\sum xy = 1973"
Substituting these values in equations 1 and 2 we get:
"245 =63m+c" ...... equation 1
"1973=623m+63c" ....... equation 2
Solve the simultaneous equations:
From equation 1
"c=245-63m"
Substitute in equation 2
"1973=623m+63(245-63m)"
"13462=3346m"
"m=-4.02"
Hence
"1973= 623*(-4.02)+63c"
"63c=4477.46"
"c=71.07"
The regression equation is:
answer: "y=-4.02x +71.07"
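The substitution and elimination steps above can also be carried out in code. The following is a minimal Python sketch, not the method used in this solution: `solve_normal_equations` is a hypothetical helper name, and it uses the general least-squares form of equation 1, in which the intercept term is multiplied by the number of observations `n` ("\sum y = m\sum x + nc"). The number of observations is not stated in this solution, so `n` is left as a parameter, and the demonstration uses three made-up points lying on y = 2x + 1 rather than the exercise data.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_xx, sum_xy):
    """Solve the two normal equations for slope m and intercept c.

        sum_y  = m * sum_x  + n * c
        sum_xy = m * sum_xx + c * sum_x

    n is the number of (x, y) observations.
    """
    det = n * sum_xx - sum_x ** 2
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    # Eliminate c between the two equations to get m, then back-substitute.
    m = (n * sum_xy - sum_x * sum_y) / det
    c = (sum_y - m * sum_x) / n
    return m, c


# Illustrative data only (NOT the exercise data): (0, 1), (1, 3), (2, 5)
# n = 3, sum x = 3, sum y = 9, sum x^2 = 5, sum xy = 13
m, c = solve_normal_equations(n=3, sum_x=3, sum_y=9, sum_xx=5, sum_xy=13)
print(m, c)  # 2.0 1.0
```

For the sums in this exercise the call would be `solve_normal_equations(n, 63, 245, 623, 1973)`, with `n` set to the number of (x, y) pairs in the data table.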
solution b)
Intercept: 71.07 is the expected number of processed requests when data size is 0 megabytes. The model predicts that the computer processes about 71 requests in a given hour when no data is provided.
Coefficient: -4.02 is the expected change in processed requests for a 1 gigabyte increase in data size. Each additional gigabyte of data is expected to reduce the number of requests the computer processes in a given hour by about 4.
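The two interpretations can be checked by evaluating the fitted line. This is a small Python sketch that takes the coefficients reported in solution a) as given; `predicted_requests` is a hypothetical helper name, and the data sizes 5 and 6 are arbitrary inputs chosen only to show a one-unit step.

```python
M = -4.02   # slope reported in solution a)
C = 71.07   # intercept reported in solution a)


def predicted_requests(data_size):
    """Expected processed requests per hour for a given data size."""
    return M * data_size + C


at_zero = predicted_requests(0)                       # value at data size 0
step = predicted_requests(6) - predicted_requests(5)  # effect of one more unit

print(round(at_zero, 2))  # 71.07
print(round(step, 2))     # -4.02
```

The value at zero is the intercept, and the difference between any two predictions one unit apart is the slope, whichever starting size is used.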
solution c)
YES. To calculate the model parameters, let data size be the "y" variable and processed requests be the "x" variable, then fit the regression in the same way. This model may answer the question: how much data should be expected for a given number of requests per hour?
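Swapping the roles of the variables means refitting the line, not rearranging the first equation algebraically. The Python sketch below illustrates this under stated assumptions: `fit_line` is a hypothetical helper, and the `sizes` and `requests` lists are made-up values, not the exercise data.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    c = (sum_y - m * sum_x) / n
    return m, c


# Made-up illustrative data (NOT the exercise data)
sizes = [1, 2, 3, 4, 5]
requests = [60, 58, 50, 49, 40]

# Requests predicted from data size
m_yx, c_yx = fit_line(sizes, requests)
# Data size predicted from requests (roles of x and y swapped)
m_xy, c_xy = fit_line(requests, sizes)

print("requests on size:", m_yx, c_yx)
print("size on requests:", m_xy, c_xy)
print("inverse of first slope:", 1 / m_yx)
```

Unless the points lie exactly on a line, the slope of the swapped fit differs from the inverse of the original slope, so the second model needs its own calculation.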
solution d)
NO. Correlation alone is not sufficient to answer the manager's question.
It only shows the direction and strength of the linear association between the two variables. It gives no slope or intercept, so it cannot be used to predict one variable from the other.
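One way to see the limitation is that the correlation coefficient does not change when one variable is rescaled, while the regression slope does. The Python sketch below computes Pearson's r under stated assumptions: `pearson_r` is a hypothetical helper, and the data are made-up values, not the exercise data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative data (NOT the exercise data)
sizes = [1, 2, 3, 4, 5]
requests = [60, 58, 50, 49, 40]
scaled_requests = [10 * y for y in requests]  # same pattern, ten times larger

r_original = pearson_r(sizes, requests)
r_scaled = pearson_r(sizes, scaled_requests)

print(r_original)
print(r_scaled)
```

Both calls give the same r even though the second data set changes ten times as fast per unit of data size, so r by itself cannot say how many requests to expect at a given data size.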