Tuesday, May 5, 2020
Quantity Analysis Regression and Correlation Parameters
Question: Describe the Quantity Analysis for Regression and Correlation Parameters.

Answer:

1. The scatter plot of the age of the cars and the advertised prices is given below:

Figure 1: Scatter plot of the age of the cars and the advertised prices (Source: created by author)

The correlation between the age and the price of a used Corolla was found to be -0.945346303. The association is negative: as the age of the car increases, its price decreases. The association is also strong, since the absolute value of the correlation, 0.945346303, is close to 1 (Cohen et al., 2013).

A linear model is appropriate because the value of an asset usually falls as the asset ages. The relationship between the age and the price of the car follows this pattern, and the data show a strong negative relationship between the two variables.

The value of r square between the age and the price of a used Corolla was found to be 0.894. R square, known as the coefficient of determination, measures how closely the data fit the regression line. Here it means that 89.4% of the variation in price is explained by the age of the car (Al-Rawwash & Pourahmadi, 2013), so the model is a good fit to the data.

The model does not explain 100% of the variability in the price of a used Corolla, because r square is 0.894 rather than 1. The price of the car decreased heavily from the 3rd year to the 4th year. This sharp one-year drop departs from the straight line, which contributes to r square being 89.4% and not higher.

The slope of the line is -924.
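The regression figures for the used Corolla can be checked with a short script. This is only a sketch under stated assumptions: it takes the correlation and slope quoted above, together with the y-intercept of 12319.6 reported in the interpretation of the fitted line, and the function names are illustrative rather than part of the original analysis.

```python
# Values quoted in the text. The intercept 12319.6 is the y-intercept
# reported in the interpretation of the fitted line.
R = -0.945346303
SLOPE = -924.0
INTERCEPT = 12319.6


def r_squared(r: float) -> float:
    """Coefficient of determination for a simple linear regression."""
    return r * r


def predicted_price(age_years: float) -> float:
    """Predicted price from the fitted line: intercept + slope * age."""
    return INTERCEPT + SLOPE * age_years


def residual(observed: float, age_years: float) -> float:
    """Observed price minus predicted price."""
    return observed - predicted_price(age_years)


if __name__ == "__main__":
    print(f"r squared: {r_squared(R):.3f}")
    for age in (7, 10, 20):
        print(f"predicted price at age {age}: {predicted_price(age):.1f}")
    print(f"residual for a 10-year-old car listed at 1500: {residual(1500, 10):.1f}")
```

A negative prediction at age 20 is the reason the line should not be used far outside the range of ages in the data.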
The slope of -924 means that the predicted price of the car decreases by $924 for each additional year of age. The y-intercept is 12319.6, which means that the predicted price of a car of age zero is $12319.6. If a person wishes to sell a 7-year-old car, the appropriate price is 12319.6 - 924 * 7 = $5851.6.

iii. Residuals are defined as the difference between the observed value and the predicted value (Laudański, 2013). It is better to buy a car whose observed price is lower than its predicted price, that is, a car with a negative residual. The price of a 10-year-old car is predicted to be 12319.6 - 924 * 10 = $3079.6. The observed price of the car is $1500, so the residual is $1500 - $3079.6 = -$1579.6.

The regression model cannot be used to establish a fair price for a 20-year-old car because the price predicted at 20 years is negative: 12319.6 - 924 * 20 = -$6160.4 (Barr et al., 2013). A negative value is not possible, as the price of a car cannot be negative at any point in time.

2. The Tennant Creek town's daily water demand was found to be normally distributed with mean 5ml and standard deviation 1.25ml. The number of days in a year at each level of daily consumption is estimated as follows.

50% of the mean = 0.5 * 5 = 2.5ml, so the value of X is 5 + 2.5 = 7.5ml. The z-score with mean 5 and standard deviation 1.25 is (7.5 - 5) / 1.25 = 2. The probability of X being 50% or more above the mean is P(X ≥ 7.5) = P(Z ≥ 2) = 1 - P(Z < 2) = 0.02275 (Struben et al., 2015). The number of days on which daily consumption is 50% or more above the mean is 0.02275 * 365 = 8.30 ≈ 8 days.

The lower and upper bounds lie two standard deviations either side of the mean. The lower bound is mean - 2(standard deviation) = 5 - 2(1.25) = 2.5ml.
The upper bound is mean + 2(standard deviation) = 5 + 2(1.25) = 7.5ml. The z-score of the lower bound is (2.5 - 5) / 1.25 = -2. The z-score of the upper bound is (7.5 - 5) / 1.25 = 2. The probability that X lies within two standard deviations of the mean is P(2.5 ≤ X ≤ 7.5) = P(-2 ≤ Z ≤ 2) = P(Z ≤ 2) - P(Z ≤ -2) = 0.97725 - 0.02275 = 0.9545. The number of days which lie within two standard deviations of the mean = 0.9545 * 365 = 348.39 ≈ 348 days.

25% of the area under the distribution lies below the first quartile. The number of days on which daily consumption is below the first quartile of demand is 0.25 * 365 = 91.25 ≈ 91 days.

The z-score below which 95% of the distribution lies is 1.645 from the table of the normal distribution (Kisbu-Sakarya et al., 2014). The corresponding value of X when the mean is 5ml and the standard deviation is 1.25ml is given by
1.645 = (X - 5) / 1.25
Or, X = (1.645 * 1.25) + 5
Or, X = 2.05625 + 5 = 7.05625ml.
Therefore, the water supply authority should set its capacity at 7.05625ml, a level that daily demand exceeds on only 5% of days, in order to save money.

3. A random sample of 25 evening calls is selected and tested at the 0.05 level of significance. The mean of the sample is 17.2 and the variance of the sample is 4, so the sample standard deviation is 2. The null hypothesis and the alternative hypothesis of the problem are as follows:
H0: the average length of the evening long distance telephone call is equal to 18.1.
H1: the average length of the evening long distance telephone call is not equal to 18.1.

The critical value of the two-tailed test at the 0.05 level of significance and 24 degrees of freedom is ±2.064 (Efron, 2012). The null hypothesis is rejected if the p value of the test is less than 0.05, i.e. if the test is statistically significant. The value of the test statistic is (17.2 - 18.1) / (2 / sqrt(25)) = (-0.9) * 5 / 2 = -2.25. The p value of the test at 24 degrees of freedom is 0.0339. The p value is less than 0.05, which shows that the test is significant (Koch, 2013).
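The normal-distribution figures in part 2 and the test statistic and p value in part 3 can be reproduced with Python's standard library. This is a sketch, not the method used in the original answer. It assumes Python 3.8 or later for `statistics.NormalDist`, and it obtains the t-distribution p value by numerically integrating the density with Simpson's rule, a choice made here to avoid external libraries. The exact 95th-percentile z-score is about 1.6449, so the computed capacity differs slightly from the 7.05625 obtained with the rounded table value 1.645.

```python
import math
from statistics import NormalDist

# Part 2: daily water demand, mean and standard deviation as given in the text.
demand = NormalDist(mu=5.0, sigma=1.25)
DAYS = 365

p_high = 1 - demand.cdf(7.5)                   # P(X >= 7.5)
p_within = demand.cdf(7.5) - demand.cdf(2.5)   # P(2.5 <= X <= 7.5)
capacity = demand.inv_cdf(0.95)                # 95th percentile of demand


def t_pdf(t: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p value; integrates the density over [0, |t|] by Simpson's rule."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t|)
    return 2 * (0.5 - central_area)


# Part 3: one-sample t test with the sample figures from the text.
n, sample_mean, sample_var, mu0 = 25, 17.2, 4.0, 18.1
t_stat = (sample_mean - mu0) / (math.sqrt(sample_var) / math.sqrt(n))
p_value = two_tailed_p(t_stat, df=n - 1)

if __name__ == "__main__":
    print(f"days at 50% or more above the mean: {p_high * DAYS:.2f}")
    print(f"days within two standard deviations: {p_within * DAYS:.2f}")
    print(f"days below the first quartile: {0.25 * DAYS:.2f}")
    print(f"capacity at the 95th percentile: {capacity:.4f}")
    print(f"t statistic: {t_stat:.2f}, two-tailed p value: {p_value:.4f}")
```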
Because the p value is less than 0.05, the null hypothesis is rejected, and it can be concluded that the average length of the evening long distance telephone call is not equal to 18.1.

References

Al-Rawwash, M., & Pourahmadi, M. (2013). Gaussian estimation of regression and correlation parameters in longitudinal data. Journal of the Association of Arab Universities for Basic and Applied Sciences, 13(1), 28-34.

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255-278.

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2013). Applied multiple regression/correlation analysis for the behavioral sciences. Routledge.

Efron, B. (2012). Large-scale simultaneous hypothesis testing. Journal of the American Statistical Association.

Kisbu-Sakarya, Y., MacKinnon, D. P., & Miočević, M. (2014). The distribution of the product explains normal theory mediation confidence interval estimation. Multivariate Behavioral Research, 49(3), 261-268.

Koch, K. R. (2013). Parameter estimation and hypothesis testing in linear models. Springer Science & Business Media.

Laudański, L. M. (2013). Regression versus Correlation. In Between Certainty and Uncertainty (pp. 67-85). Springer Berlin Heidelberg.

Struben, J., Sterman, J., & Keith, D. (2015). Parameter and confidence interval estimation in dynamic models: maximum likelihood and bootstrapping methods. Analytical Handbook for Dynamic Modelers.