STATISTICS
Module Three Assignment
Student Name
University Name
Course Code: Course Name
Instructor's Name
Due Date
Module Two Notes
Data Analysis
The generated sample of 30 observations is expected to reflect the national market. Comparing the sample statistics with the population parameters helps determine whether the sample is truly random and accurately reflects the characteristics of the entire population. The comparison shows that the sample statistics are not close to the national statistics: the sample descriptive statistics for both the median listing price and the median square feet differ noticeably from the national values for the same variables.
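As a rough illustration of this comparison, the sketch below expresses each sample statistic as a percentage difference from the corresponding national statistic, using the summary values reported in the appendix tables. The dictionary layout and the function name percent_difference are illustrative choices, not part of the original analysis.

```python
# Summary statistics copied from the appendix tables, stored as (sample, national).
SUMMARY = {
    "median listing price ($)": {
        "mean": (197622, 288407),
        "median": (189608, 256936),
        "standard deviation": (76545, 163986),
    },
    "median square feet": {
        "mean": (1813, 1944),
        "median": (1707, 1901),
        "standard deviation": (420, 367),
    },
}


def percent_difference(sample, national):
    """Return how far the sample value is from the national value, in percent."""
    return (sample - national) / national * 100


for variable, stats in SUMMARY.items():
    for name, (sample, national) in stats.items():
        diff = percent_difference(sample, national)
        print(f"{variable}, {name}: {diff:+.1f}% relative to national")
```

A negative value means the sample statistic lies below the national figure; for example, the sample mean listing price is roughly 31% below the national mean.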
Scatterplot
Pattern: From the scatterplot, the regression equation used to model the relationship between the median square feet (x) and the median listing price (y) is y = 133.33x - 44133. The median listing price is the numerical target variable, while the median square feet is the predictor variable. Predictor variables are used to forecast or predict the dependent variable (Warner, 2012). In this case, the median square feet variable is used to predict the median listing price. The two variables are positively and linearly related: as the median square feet increases, the median listing price also appears to increase.
Module Three Assignment
Regression Equation
The regression equation for the line of best fit, based on the scatterplot, is y = 133.33x - 44133. The regression equation is used to model the relationship between the median square feet (x) and the median listing price (y). It can be rewritten as: Median listing price = 133.33 * Median square footage - 44133. The median listing price is the numerical target variable, while the median square feet is the predictor variable.
Determine r
The correlation coefficient, r, for the given scenario is 0.730889869. It is determined by taking the square root of the coefficient of determination shown on the scatterplot and then intuitively inspecting the monotonic relationship between the two variables to choose the sign of r. That is, √0.5342 = ±0.730889869; in simple linear regression, the magnitude of the correlation coefficient equals the square root of the coefficient of determination. From the scatterplot, the values of the response variable appear to increase as the values of the predictor variable increase, so the positive value is chosen and r = 0.730889869. Alternatively, r can be computed using the CORREL function in Excel, which returns the correlation coefficient of two cell ranges.
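The calculation of r from the coefficient of determination can be sketched as follows. The function name is hypothetical, and taking the sign from the regression slope is a shortcut assumed here in place of reading the direction off the scatterplot; in simple linear regression the slope and r always share the same sign.

```python
import math


def correlation_from_r_squared(r_squared, slope):
    """Return r for a simple linear regression given R^2 and the fitted slope.

    The magnitude of r is sqrt(R^2); its sign matches the sign of the slope.
    """
    magnitude = math.sqrt(r_squared)
    return math.copysign(magnitude, slope)


# Values reported in the text: R^2 = 0.5342 and slope = 133.33.
r = correlation_from_r_squared(0.5342, 133.33)
print(round(r, 4))  # 0.7309
```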
The correlation coefficient is a statistical measure of the strength and direction of the linear association between two continuous variables on a scatterplot (Ratner, 2009). The correlation coefficient, r, takes values between -1 and 1 and rests on the assumption that the variables are linearly related. In the scatterplot above, the observed linear pattern between the median square feet and the median listing price warrants the use of the correlation coefficient as a measure of the strength of linear association. By a firm linear rule, correlation coefficient values between 0.7 and 1.0 (or between -0.7 and -1.0) indicate a strong linear association (Ratner, 2009). Therefore, there is a strong linear association between the median square footage and the listing prices of the properties, because r = 0.730889869 falls between 0.7 and 1.0.
Importantly, the strength of the linear association is independent of the direction, or sign, of the correlation coefficient. The sign of r corresponds to the direction of the monotonic relationship (Warner, 2012): a positive correlation coefficient indicates a direct relationship, while a negative correlation coefficient indicates an inverse relationship. In this case, the correlation coefficient, 0.730889869, is positive, so there is a positive, or direct, relationship between the median square feet and the median listing prices of the homes. In short, there is a strong positive linear relationship between the median square footage and the listing prices of the properties.
Examine the Slope and Intercepts
When there appears to be a linear association between the explanatory variable and the target variable, it is useful to fit the data with the best-fitting line, which is found by minimizing the sum of the squared vertical deviations from each data point to the line (Sekaran & Bougie, 2016). The linear model for the line of best fit is usually of the form Y = b0 + b1X, where b1 is the slope and b0 is the intercept (the value of Y when X = 0). In the model above, y = 133.33x - 44133, the slope is 133.33 and the y-intercept is -44133. The response variable (y) is the median listing price and the predictor variable (x) is the median square feet. The y-intercept of -44,133 does not make sense in context: when the square footage of the house is zero, the model gives a price for the land of 133.33 * 0 - 44133 = -$44,133, and the price of land cannot be negative. The slope represents the constant rate of change of y with respect to x (Sekaran & Bougie, 2016). In this case, the slope of 133.33 means that an increase of one square foot in the size of the house is associated with an increase of $133.33 in the listing price.
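The least-squares idea described above can be sketched with the standard closed-form formulas for the slope and intercept. The function name fit_line and the toy data are hypothetical and chosen only so that the result is easy to check by hand; they are not the housing sample.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    slope     = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical toy data lying exactly on y = 2x + 1 (not the housing sample).
toy_x = [1, 2, 3, 4]
toy_y = [3, 5, 7, 9]
slope, intercept = fit_line(toy_x, toy_y)
print(slope, intercept)  # 2.0 1.0
```

Applied to the 30 sampled pairs of median square feet and median listing price, the same formulas would be expected to give a line like the one reported by the scatterplot trendline.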
R-squared Coefficient
The r-squared coefficient, or coefficient of determination, is a statistical measure of the amount of variance in the outcome that is explained in a linear regression setting (Sekaran & Bougie, 2016). In particular, it is the proportion of the variation in the outcome variable that is explained by the linear model and the predictor variable. In this context, the r-squared coefficient of 0.5342 tells us that 53.42% of the variation in the median listing prices of the houses is explained by the size of the house in square feet. The r-squared coefficient can also be described as a statistic that indicates how well the regression equation fits the observed data (Warner, 2012). Here, the coefficient of determination of 0.5342 indicates that the linear regression model accounts for 53.42% of the variability in the observed listing prices, leaving 46.58% unexplained. Therefore, the regression model y = 133.33x - 44133 can be said to be a reasonably good fit, because it explains more than half of the variation in the data. A higher r-squared coefficient indicates a better fit of the model to the observations (Warner, 2012).
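To make the "variation explained" reading concrete, the sketch below computes the coefficient of determination from its usual definition, one minus the ratio of the residual sum of squares to the total sum of squares. The function name and the toy values are hypothetical and are not taken from the housing sample.

```python
def r_squared(observed, predicted):
    """Return R^2 = 1 - SSE / SST for observed values and model predictions."""
    mean_y = sum(observed) / len(observed)
    # SSE: variation left unexplained by the model (squared residuals).
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # SST: total variation of the observed values around their mean.
    sst = sum((y - mean_y) ** 2 for y in observed)
    return 1 - sse / sst


# Hypothetical toy values (not the housing sample).
toy_observed = [3, 5, 7, 9]
perfect = r_squared(toy_observed, [3, 5, 7, 9])   # predictions match exactly
baseline = r_squared(toy_observed, [6, 6, 6, 6])  # always predict the mean
print(perfect, baseline)  # 1.0 0.0
```

On this scale, the reported value of 0.5342 sits between predicting the mean every time (0) and a perfect fit (1).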
Conclusion
The regression equation for the line of best fit, y = 133.33x - 44133, can be considered a good model and can be employed in appraising the values of homes. Based on the model, the listing price of a house goes up by $13,333 for every additional 100 square feet; that is, 133.33 * 100 = 13,333. Also, based on the linear model, a 1,200 square foot house should be listed at $115,863; that is, listing price = 133.33 * 1200 - 44133 = $115,863. However, the model cannot be employed to predict national listing prices, because it is based on sample data from the East North Central region, which differs from the national market. For instance, the square footage of homes in the East North Central region is different from that of homes overall in the United States. The histogram shows the range of square footage for homes in the East North Central region. The linear model can be utilized to predict listing prices for homes with sizes between 1440.21428575 and 3700.4940477 square feet, so the 1,200 square foot estimate above is an extrapolation and should be treated with caution. The boxplots, on the other hand, aid in comparing the range of square footage for homes in the East North Central region with that of the national market.
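The predictions in this section can be reproduced with a short sketch. The constants are the slope, intercept, and square footage range stated in the text; the function name and the extrapolation flag are illustrative assumptions rather than part of the original analysis.

```python
# Model coefficients and sample range as reported in the text.
SLOPE = 133.33
INTERCEPT = -44133.0
SAMPLE_MIN_SQFT = 1440.21428575
SAMPLE_MAX_SQFT = 3700.4940477


def predict_listing_price(square_feet):
    """Return (predicted listing price, is_extrapolation).

    is_extrapolation is True when square_feet lies outside the range of
    home sizes observed in the sample, where the fitted line is untested.
    """
    price = SLOPE * square_feet + INTERCEPT
    is_extrapolation = not (SAMPLE_MIN_SQFT <= square_feet <= SAMPLE_MAX_SQFT)
    return price, is_extrapolation


price, flag = predict_listing_price(1200)
print(f"${price:,.0f}", flag)  # $115,863 True
```

The flag is True for the 1,200 square foot house because that size falls below the smallest home in the sample.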
References
Ratner, B. (2009). The correlation coefficient: Its values range between +1/−1, or do they?
Journal of Targeting, Measurement and Analysis for Marketing, 17(2), 139-142. https://doi.org/10.1057/jt.2009.5
Sekaran, U., & Bougie, R. (2016). Research methods for business: A skill building approach.
John Wiley & Sons.
Warner, R. M. (2012). Applied statistics: From bivariate through multivariate techniques.
SAGE.
APPENDIX
Graph for informing square footage ranges
BOXPLOTS
Scatterplot
[Figure: scatterplot of median listing price (y) against median square feet (x)]
Median square feet (x): 1696.2261904999998 1704.3571427500001 1577.2738095000002 1596.6130953333334 1504.5892858333334 2126.7499999166666 1643.2083334166666 1876.21428575 1744.9047618333334 1651.4166666666667 1743.9583334166666 1588.2738094166668 1983.66071425 1719.6547618333334 1897.9404762500001 1555.5833333333333 2412.8511904999996 1709.2083333333333 1591.0654761666699 1497.75 1685.9642858333334 1865.5654761666667 2067.2380952500002 1442.3273809166667 1722.1547619166668 1876.6309523333332 2097.8809523333334 3700.494047666667 1675.8749999166666 1440.21428575
Median listing price (y): 186989.88094999999 106560.71428333333 78820.833333333328 130292.47023333334 123631.54761666666 265857.14285 242460.71428333331 249307.14286666666 156276.19047499998 130832.14285833335 224954.82142499997 202147.02381666668 290111.90476666664 226137.32143333333 251713.01190833331 145082.26189999998 253513.69048333334 105964.880945 128070.83333333333 197664.46429166666 192225.69641666664 179426.78571666669 247848.21429166663 314629.42262500001 167180.95238333332 166764.88094999999 232319.04762500001 448567.51190833334 161705.35714166667 121609.52380833332
Median Listing Price
Statistics            Sample      National
Mean                  $197,622    $288,407
Median                $189,608    $256,936
Standard deviation    $76,545     $163,986

Median Square Feet
Statistics            Sample      National
Mean                  1,813       1,944
Median                1,707       1,901
Standard deviation    420         367