Pam & Sue Case Study

1. How would you describe the type of location sites that are likely to have higher sales?

To describe the type of location sites that are likely to have higher sales, I build a multiple regression model using the stepwise function. I complete the model in a four-step process.

Step 1: To choose which variables to consider, I first look at the correlations between the independent variables and sales. To assess the correlation between sales and each x-variable, I create scatter plots to look for outliers and use the Pearson correlation coefficient function in Excel to measure the strength of each relationship (see Appendix Tables 1 & 2). For a multiple regression to be an effective tool, each variable should have a linear relationship with sales, either positive or negative.
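The screening in Step 1 can be sketched in a few lines of Python. This is only an illustration of the idea, not the Excel workflow used for the case: the data values, the variable named "noise", and the 0.5 cutoff are all hypothetical assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data, NOT the case dataset.
sales = [10.0, 12.0, 15.0, 19.0, 24.0]
candidates = {
    "sqrft": [1.0, 2.0, 3.0, 4.0, 5.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}

# Correlate every candidate x-variable with sales.
correlations = {name: pearson_r(values, sales) for name, values in candidates.items()}

# Assumed cutoff for "significantly correlated"; the essay does not state one.
THRESHOLD = 0.5
kept = [name for name, r in correlations.items() if abs(r) >= THRESHOLD]

for name, r in correlations.items():
    print(f"{name}: r = {r:.3f}")
print("kept for regression:", kept)
```

The absolute value is used so that strongly negative relationships are kept as well as strongly positive ones.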

Step 2: To complete the regression, I consider only those variables that are significantly correlated with sales. This immediately eliminates the independent (x) variables that do not have a high correlation with sales: %inc50-100, %inc100+, medianhome, %1car, %tvs, %sch9-11, and perhard. A stepwise regression on the remaining variables is then used to describe the type of location sites that are likely to have higher sales. Additionally, I exclude competitive type as an independent variable. This variable is a dummy variable based on the categorical data of the other independent variables, and it may therefore cause multicollinearity unless one of the seven types is left out to create a baseline. To simplify the analysis, I exclude competitive type from the list of independent variables, both because of its high Pearson correlation coefficient and to ensure that the dummy variable does not affect the analysis, as either would cause multicollinearity (the competitive type variable is used in question two).
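The baseline idea mentioned above can be illustrated with a short sketch. It assumes the seven competitive types are coded 1 through 7 and that type 1 is chosen as the baseline; both the coding and the sample values are hypothetical and are not taken from the case data.

```python
def dummy_code(values, categories, baseline):
    """Turn a categorical column into 0/1 indicator columns,
    omitting the baseline category to avoid perfect multicollinearity."""
    kept = [c for c in categories if c != baseline]
    return [[1 if v == c else 0 for c in kept] for v in values]

# Hypothetical competitive-type codes for five stores.
comtype = [1, 3, 7, 2, 1]

# Seven types, type 1 as the assumed baseline -> six indicator columns.
rows = dummy_code(comtype, categories=range(1, 8), baseline=1)

for code, row in zip(comtype, rows):
    print(code, row)
```

A store of the baseline type is represented by a row of all zeros, so the six indicator columns never sum to the constant column used for the intercept.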

Step 3: Next, I enter the appropriate data into the Regression Worksheet included with the book, excluding the data for those independent variables that would cause multicollinearity. The diagram below is the end result:

Step 4: To check the regression assumptions, I create a graph of residuals versus fitted values to ensure that the forecasting errors follow a normal distribution across the entire dataset.
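The quantities plotted in a residuals-versus-fitted graph can be computed as in the sketch below. It uses a one-variable least-squares fit on hypothetical numbers purely for illustration; the case itself uses the multi-variable model from the Regression Worksheet.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data, NOT the case dataset.
sqrft = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [10.0, 12.0, 15.0, 19.0, 24.0]

intercept, slope = fit_line(sqrft, sales)
fitted = [intercept + slope * x for x in sqrft]
residuals = [y - f for y, f in zip(sales, fitted)]

# Each (fitted, residual) pair is one point on the residual plot.
for f, r in zip(fitted, residuals):
    print(f"fitted = {f:6.2f}   residual = {r:6.2f}")
```

In a well-behaved model the points scatter evenly around zero with no curve or funnel shape; a visible pattern suggests that an assumption is violated.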

Now that I am confident that my analysis is accurate, I can interpret the equation to answer the question at hand. The locations likely to have higher sales are those with greater square footage of store space and a customer base with a high population density, a lower percentage of home ownership, a higher percentage of Spanish-speaking individuals, a high percentage of individuals with twelve or more years of education, a low percentage of homeowners who use freezers or air conditioners, and a low percentage of families with an income of twenty to thirty thousand dollars annually.

2. A group within the planning department had previously developed a subjective approach in which potential sites are classified according to an assessment of the “competitive type” of the trading zone. Below in Table A, the 7 “competitive types” are defined. How good is this classification method at predicting sales? How can you quantify this? Can you improve on this method?

Following the same steps completed above, I plug the dataset into the Regression Worksheet to compute the function below:

Function: Sales = 19200.047 – 1939.566(comtype)

This classification method is not as strong in predicting sales as the original equation, as demonstrated by its lower R-squared value: .435 in this model compared to .643 in the original model. In other words, variation in the competitive type can explain about 43.5% of the variation in sales, while variation in the independent variables listed in question one can explain about 64.3% of the variation in sales. Also, the standard error of 4103.48 shows that forecasts using this regression equation are generally around $4103.48 from the actual sales figures, a greater standard error than that of the previous model. To improve the model, we must combine the original model with the competitive type model.
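To make the competitive-type equation concrete, the sketch below evaluates it for each of the seven types. It assumes, as the single coefficient in the reported equation implies, that comtype enters as a numeric code from 1 to 7; the function name is hypothetical.

```python
# Coefficients copied from the reported equation.
INTERCEPT = 19200.047
COMTYPE_COEF = -1939.566

def forecast_from_comtype(comtype):
    """Forecast sales from the competitive-type code alone."""
    return INTERCEPT + COMTYPE_COEF * comtype

# Assumed coding: competitive types numbered 1 through 7.
for code in range(1, 8):
    print(f"comtype {code}: forecast sales = {forecast_from_comtype(code):,.3f}")
```

Each step up in the competitive-type code lowers the forecast by the same 1939.566, which is one reason a single numeric code is a coarse summary of seven distinct categories.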

Function: Sales = 15031.190 – 954.171(comtype) + 0.004(populat) – 26.332(%owners) + 252.274(%spanishsp) + 9.406(sqrft) + 72.978(%sch12+) – 131.258(%freezer) + 82.840(%sch12) – 56.876(%aircond) – 44.824(%inc20-30)
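The combined equation can be applied to a candidate site as in the following sketch. The coefficients are copied from the function above; the example site values are entirely hypothetical and are not the Table B figures, and the function name is an assumption made for illustration.

```python
# Coefficients copied from the combined model.
INTERCEPT = 15031.190
COEFFICIENTS = {
    "comtype": -954.171,
    "populat": 0.004,
    "%owners": -26.332,
    "%spanishsp": 252.274,
    "sqrft": 9.406,
    "%sch12+": 72.978,
    "%freezer": -131.258,
    "%sch12": 82.840,
    "%aircond": -56.876,
    "%inc20-30": -44.824,
}

def forecast_sales(site):
    """Intercept plus coefficient * value for every variable in the model.
    Variables missing from `site` are treated as zero."""
    return INTERCEPT + sum(
        coef * site.get(name, 0.0) for name, coef in COEFFICIENTS.items()
    )

# Hypothetical site characteristics, NOT taken from Table B.
example_site = {"comtype": 2, "sqrft": 100.0, "%owners": 50.0}
print(f"forecast sales: {forecast_sales(example_site):,.3f}")
```

Keeping the coefficients in a dictionary keyed by variable name makes it easy to check each term against the written equation before comparing two sites.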

The R-squared value is 0.701, and there still isn't a problem with multicollinearity (the same variables are excluded as before), which was validated by an additional residual analysis (see question five for more details). The above model supports the assertion that adding the comtype variable to the original model improves accuracy: the R-squared value rises from .643 for the original model, and from .435 for the comtype-only model, to .701.

3. Two sites, A and B, are currently under consideration for the next new store opening. Characteristics of the two sites are provided below in Table B. Which site would you recommend? Justify your choice and give the best sales forecasts you can. You may use the subjective classifications from Question 2 along with any other variables you think will give the best forecast. Give some estimate of the accuracy of the forecasting method you use and any other limitations of the forecasting method.

To answer the question, I plug the applicable information into the model from question two.

I would select Site A, with forecasted sales of $1,195,902.55, compared to forecasted sales of $1,140,510.88 for Site B. In considering the forecasted sales figures, one should note not only the standard error inherent in the model, but also the reason for selecting the model from question two as opposed to the model from question one. The standard error of the model from question one is 3041.10, which indicates the slight inaccuracy that the forecast may carry, as the standard error is an estimate of the standard deviation of a sample. The 3041.10 shows that forecasts using this regression equation are generally around $3041.10 from the actual sales figures, and almost all are within

$0.001, from these estimated sales. That is, the model from question one, which is not directly based on the actual quantities of the units sold, fails to adequately account for that $3041.10. When one looks at both the production and sales figures, the actual value of the units sold should be closer to $40,000, which is less than the actual sales of approximately $400000. Moreover, the $40,000 figure of what is expected to be an average $60 is about a little higher than the actual $60, which is less than what is expected to be an estimated $10,000, which is more than what is expected to be an estimated $20,000. When one retype the value of the units sold and the estimated values of the production and sales, one may also see that the model cannot accurately account for this fact. This is why the $3041.10 has come to be seen as a more reasonable estimate of the actual cost of purchasing that kind of product or service. It is important to understand that the model from which these projections of sales are derived does not include any sales taxes or other surcharges. In addition, it does not take into account the additional expense that a sales tax would incur relating to a different method in obtaining sales data. Therefore, it is important to consider any possible impacts of changing the tax rates to reduce the projected surtax revenue if the tax rates remain the same after tax adjustment and if any potential new fees and charges apply. In addition, it will depend on how the results are calculated and how such revenues differ from the actual revenues. Also, the estimates from the Model 3 have not been directly compared to the actual sales data. The results are based on sales data from the retail business which is available to the public in the mail. That is, the retail sales data are in the form of paper receipts and may be used to generate taxable income where a sales tax is necessary. 
It is also possible that a percentage of the sales tax may also be used to pay taxes. Such scenarios are not supported by future research on real estate and real estate taxes. We are interested in seeing the results of future real estate tax estimates as the market is still in its nascent stages. It must be kept in mind that although this model has some residual features of the original forecasts, they are not complete. This could potentially present a large amount of assumptions which may not be consistent with real estate tax assumptions. The information from this model does not represent accurate predictions of real estate tax revenues derived from real estate transactions. Consequently, it might be best to keep in mind the following potential limitations. Our models are only as good as the projections from question 1 and the current forecast. All estimates, for all other scenarios, are approximate and thus subject to change. There are no prior experience using real property taxes so one would not expect to be confident in their results or that certain property types will remain in use. Any assumptions that we make using the sales tax may not always be correct which may cause the assumptions to be incorrect. Therefore, as an


Higher Sales And Regression Model. (August 9, 2021). Retrieved from https://www.freeessays.education/higher-sales-and-regression-model-essay/