Student Name: ________________________________
Instructions:
Please answer each of the following questions and show work/calculations. Open up sufficient space in the Word document. Insert your answer using a highlight color (blue, green, red).
Please show your calculations and copy and paste outputs from SPSS and/or Excel into this document.
To consult a Chi-Square (χ²) distribution table, use Appendix B in the electronic textbook (p. 308) or search online for a table.
Good luck.
Question 1: Key Definitions and Concepts Part I (10 points)
- Explain and define the differences between independent and dependent variables. Give examples. (2 points)
- Explain and define the differences between the statistical relationships “association” [correlation] and “causation” [causality]. Give examples. (2 points)
- Explain the difference between an experimental research design and a quasi-experimental research design. Which is the gold standard? (2 points)
- What is sampling error? (2 points)
- What is the Central Limit Theorem and what is its significance in the field of statistics? (2 points)
Question 2: True False (10 points)
- Are mean, median, mode, and frequency distribution (or count) statistics of central tendency? True___ False_____ (2 points)
- Are variance and standard deviation statistics of dispersion? True___ False___ (2 points)
- Will the sampling distribution of an infinite number of relatively large samples be normally distributed? True____ False____ (2 points)
- Pearson’s Correlation Coefficient is the same as the Spearman Correlation Coefficient. True____ False______ (2 points)
- Ethics in statistics is not as important as ethics in law and medicine. The latter areas deal with people and the former just with numbers. True_____ False _____ (2 points)
Question 3: Key Definitions and Concepts Part II (10 points).
- Explain the Kolmogorov-Smirnov test. What does it test for, and how do you interpret the computer output?
- What is R-Square?
- What are the assumptions that must hold for a t-test?
- What are the assumptions that must hold for ANOVA?
- What are the Chi-square assumptions?
- How do you interpret the F Statistic in ANOVA analysis? If the F statistic is significant what does it mean? What does the F Statistic in regression analysis mean?
- With what types of variables are t-tests used, and with what types of variables are Chi-Square tests used?
- What are the five steps of hypothesis testing?
- When do you use the Student's t distribution?
- What is the difference between the observed and predicted values of the dependent variable? What is the key assumption made about the error term in regression?
Question 4: Chi-Square (10 points)
Assume that a citizen survey yielding 1,034 responses has been completed. We as statistical analysts want to check for over-sampling or under-sampling with respect to US Census data. We want to test whether the age distribution of the survey respondents is consistent with the age distribution in the decennial US Census at the 5 percent level of significance.
H0: The age distribution of the sample is consistent with (the same as) that of the population.
HA: The age distribution of the sample is inconsistent with (different from) that of the population.
Table: US Census Response by Age Groups
Age Groups | US Census (Percent) | Survey Sample (Percent)
18-45 | 62.3 | 62.8
46-65 | 24.1 | 26.8
66+ | 13.6 | 10.4
Write the formula, show calculations, determine the Chi-Square test value, identify the degrees of freedom, identify the critical value, and state your conclusion.
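The goodness-of-fit calculation requested above can be cross-checked with a short script. The following is a minimal sketch, not a required method: it assumes the census percentages are the expected proportions and the survey percentages are the observed ones, converts both to counts using n = 1,034, and sums (observed - expected)² / expected over the three age groups. The variable names are illustrative. The closed-form critical value and p-value used here hold only because there are 2 degrees of freedom; for other cases, consult the table.

```python
import math

n = 1034  # number of survey responses, from the question

# Percentages from the table: census = expected, survey = observed
census_pct = {"18-45": 62.3, "46-65": 24.1, "66+": 13.6}
survey_pct = {"18-45": 62.8, "46-65": 26.8, "66+": 10.4}

chi_square = 0.0
for group in census_pct:
    expected = n * census_pct[group] / 100  # expected count under H0
    observed = n * survey_pct[group] / 100  # observed count in the survey
    chi_square += (observed - expected) ** 2 / expected

df = len(census_pct) - 1  # number of categories minus one

# For df = 2 only, P(X > x) = exp(-x / 2), so both values have closed forms.
alpha = 0.05
critical_value = -2 * math.log(alpha)
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.3f}, df = {df}")
print(f"critical value = {critical_value:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if chi_square > critical_value else "Fail to reject H0")
```

The hand calculation and the table lookup should agree with the printed values to rounding.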
Question 5: Confidence Interval (10 points)
In a sample of 1,000 persons, 15.4 percent of the respondents report personal income at or below the poverty line, whereas 84.6 percent of the respondents report personal income above the poverty line. Please calculate confidence intervals at the 95% and 99% levels for the proportion of people who live in poverty. Write the appropriate formula, apply the formula for proportions, and calculate the upper and lower bounds. Show step-by-step calculations.
- 95% confidence interval
- 99% confidence interval
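As a check on the hand calculation, the two intervals can be sketched in a few lines. This assumes the usual large-sample (normal approximation) interval for a proportion, p ± z · sqrt(p(1 - p)/n); the function name `proportion_ci` is illustrative and not part of the assignment.

```python
from math import sqrt
from statistics import NormalDist

n = 1000       # sample size, from the question
p_hat = 0.154  # sample proportion at or below the poverty line

# Standard error of a sample proportion
standard_error = sqrt(p_hat * (1 - p_hat) / n)

def proportion_ci(level):
    """Return (lower, upper) bounds of the normal-approximation interval."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%, 2.576 for 99%
    margin = z * standard_error
    return p_hat - margin, p_hat + margin

ci_95 = proportion_ci(0.95)
ci_99 = proportion_ci(0.99)

print(f"standard error = {standard_error:.5f}")
print(f"95% CI: {ci_95[0]:.4f} to {ci_95[1]:.4f}")
print(f"99% CI: {ci_99[0]:.4f} to {ci_99[1]:.4f}")
```

The 99% interval is wider than the 95% interval because a higher confidence level requires a larger z multiplier.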
Question 6: One Sample T-Test (10 points)
A psychosocial functional score (PFS) is used to assess school-age children’s psychosocial behaviors. A score of 25 points or above is considered normal. A sample of 15 students is tested, and their PFS scores are as follows:
Case ID | Psychosocial Functional Score (PFS)
1 | 29
2 | 32
3 | 18
4 | 23
5 | 27
6 | 19
7 | 34
8 | 32
9 | 27
10 | 23
11 | 26
12 | 32
13 | 29
14 |
15 |
- Input the data into SPSS (create new dataset, label Variable PFS_Score).
- Evaluate whether the students’ average PFS score is greater than 25 using a one-sample t-test. Write a brief explanation of the results of your t-test for a non-technical program manager.
What is the mean? ___
What is the standard deviation? ___
What are the t statistic, the 2-tailed significance value (is it less than .05?), and the mean difference?
Clip and paste SPSS output
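The arithmetic behind the one-sample t-test can be sketched outside SPSS as follows. Important assumption: the scores for cases 14 and 15 are blank in the table, so this sketch uses only the 13 scores that are listed, and its numbers will differ from SPSS output based on all 15 students. The critical value is the one-tailed 5% value for 12 degrees of freedom taken from a t table.

```python
from math import sqrt
from statistics import mean, stdev

# Only the 13 scores listed in the table; cases 14 and 15 are blank there.
pfs_scores = [29, 32, 18, 23, 27, 19, 34, 32, 27, 23, 26, 32, 29]
test_value = 25  # the "normal" threshold from the question

n = len(pfs_scores)
sample_mean = mean(pfs_scores)
sample_sd = stdev(pfs_scores)          # sample standard deviation (n - 1)
standard_error = sample_sd / sqrt(n)
mean_difference = sample_mean - test_value
t_statistic = mean_difference / standard_error
df = n - 1

# One-tailed 5% critical value for df = 12, from a t table.
# It must be looked up again if the two missing scores are added.
t_critical = 1.782

print(f"mean = {sample_mean}, sd = {sample_sd:.3f}, df = {df}")
print(f"mean difference = {mean_difference}, t = {t_statistic:.3f}")
print("Mean is significantly greater than 25" if t_statistic > t_critical
      else "No significant evidence that the mean exceeds 25")
```

SPSS reports a 2-tailed significance value; for the one-sided question asked here, halve it when the sample mean is above 25.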
Question 7: Paired T-Test (10 points)
Students at a school are given a test before beginning a special program of instruction and then a test after. Use a paired-samples t-test to determine whether there is evidence of improvement (95% level of confidence).
Student | Before Test Score | After Test Score
1 | 4.5 | 6.9
2 | 3.2 | 4.8
3 | 5.8 | 5.2
4 | 3.9 | 4.3
5 | 4.2 | 5.0
6 | 3.9 | 4.8
7 | 2.6 | 3.2
8 | 5.2 | 4.8
9 | 4.5 | 4.5
10 | 3.9 | 4.1
11 | 3.8 | 3.6
12 | 4.2 | 5.9
Clip and paste SPSS output
Report the mean difference and its 95% confidence interval.
What is test statistic ____
What is p value ____
How does the p value compare to .05 level of significance?
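The paired-samples calculation can be verified by hand or with a short script. The sketch below assumes differences are taken as after minus before, so a positive mean difference indicates improvement (SPSS may report the opposite sign depending on the order in which the variables are paired). The two-tailed 5% critical value for 11 degrees of freedom is taken from a t table.

```python
from math import sqrt
from statistics import mean, stdev

before = [4.5, 3.2, 5.8, 3.9, 4.2, 3.9, 2.6, 5.2, 4.5, 3.9, 3.8, 4.2]
after = [6.9, 4.8, 5.2, 4.3, 5.0, 4.8, 3.2, 4.8, 4.5, 4.1, 3.6, 5.9]

# Paired test: analyze the per-student differences (after - before)
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_difference = mean(differences)
standard_error = stdev(differences) / sqrt(n)
t_statistic = mean_difference / standard_error
df = n - 1

# Two-tailed 5% critical value for df = 11, from a t table
t_critical = 2.201
ci_lower = mean_difference - t_critical * standard_error
ci_upper = mean_difference + t_critical * standard_error

print(f"mean difference = {mean_difference:.4f}, t = {t_statistic:.3f}, df = {df}")
print(f"95% CI: {ci_lower:.4f} to {ci_upper:.4f}")
```

If the 95% interval excludes zero, the two-tailed p-value is below .05, which gives a quick consistency check against the SPSS output.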
Question 8: Descriptive Statistics and Graphical Exercise (10 points)
- For the graduation data presented in the table below please calculate mean, variance, and standard deviation for each of the schools (3 points)
- Produce a line chart for graduation rates for each school. Label the graph with a title, axes, and a legend. (3 points)
- What is your interpretation of this data and chart? If you were the School Superintendent for the district that includes these two schools, what sorts of questions would this chart trigger in your mind? What additional investigation would you like to undertake? (4 points)
Graduation Rates (Number of Graduates per Teacher) at Two Different Schools
Year | School A | School B
1 | 19 | 33
2 | 43 | 25
3 | 26 | 32
4 | 47 | 32
5 | 19 | 32
Mean | |
Variance | |
Standard Deviation | |
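The three summary statistics can be cross-checked with Python's standard library. This sketch assumes the sample (n - 1) versions of variance and standard deviation, which match Excel's VAR.S and STDEV.S and the SPSS defaults; if the population versions are intended, use `pvariance` and `pstdev` instead.

```python
from statistics import mean, variance, stdev

# Graduates per teacher by year, from the table
school_a = [19, 43, 26, 47, 19]
school_b = [33, 25, 32, 32, 32]

summary = {}
for name, rates in (("School A", school_a), ("School B", school_b)):
    summary[name] = {
        "mean": mean(rates),
        "variance": variance(rates),  # sample variance, divides by n - 1
        "sd": stdev(rates),           # square root of the sample variance
    }

for name, stats in summary.items():
    print(f"{name}: mean = {stats['mean']:.1f}, "
          f"variance = {stats['variance']:.1f}, sd = {stats['sd']:.2f}")
```

Comparing the means alongside the standard deviations is a useful starting point for the interpretation question.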
Question 9: Simple Linear Regression —Use Excel
[Data–Data Analysis-Select Regression from Dialogue box—filling in boxes with appropriate cell ranges]
- Please enter the following data into an Excel spreadsheet. Make sure that you have the Data Analysis Tool Pak installed (for Microsoft 365 subscription versions: Tools - Insert - Excel Add-ins - click to select Data Analysis Tool Pak). If you have an older version of Excel, see page 108 of Chapter 19 of the Excel User’s Guide, section "Loading the Data Analysis Tool Pak"; if that does not help, try Help and search for how to install add-ins.
- Run a simple linear regression with DV= EnvironSpend and IV= PopDensity
- Interpret results.
Data: Environmental Spending
City ID | EnvironSpend | PopDensity
1 | 0.11 | 149
2 | 0.04 | 44
3 | 0.26 | 459
4 | 0.07 | 97
5 | 0.17 | 345
6 | 0.13 | 523
7 | 0.08 | 24
8 | 0.22 | 275
9 | 0.11 | 183
10 | 0.1 | 287
11 | 0.2 | 137
12 | 0.11 | 86
13 | 0.18 | 300
14 | 0.2 | 260
15 | 0.15 | 380
Variable definitions:
City ID = identification code for city
EnvironSpend = Environmental Spending = percent of annual total spending on environmental protection/concerns
PopDensity = population density = number of people per square mile
Paste output here:
R-Square____ Interpretation______________
Which coefficients are statistically significant and different from zero at the 5% level?
Intercept___________________________
Coefficient for PopDensity______________________
What report would you give a non-technical policy maker about the relationship you discovered with your regression analysis? What is the sign of the relationship between population density and environmental spending as a share of total spending? If population density were to increase in a particular jurisdiction, what would you tell policy makers to prepare for in future budgets?
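The Excel regression output can be cross-checked against the ordinary least squares formulas. The following is a minimal sketch using the 15 cities in the data table: the slope is Sxy / Sxx, the intercept is ȳ - slope · x̄, and R-Square is Sxy² / (Sxx · Syy). Variable names are illustrative. The t statistic for the slope should be compared with the two-tailed 5% critical value for n - 2 = 13 degrees of freedom (2.160 from a t table).

```python
from math import sqrt

# Data from the Environmental Spending table
pop_density = [149, 44, 459, 97, 345, 523, 24, 275, 183, 287, 137, 86, 300, 260, 380]
environ_spend = [0.11, 0.04, 0.26, 0.07, 0.17, 0.13, 0.08, 0.22, 0.11, 0.10,
                 0.20, 0.11, 0.18, 0.20, 0.15]

n = len(pop_density)
x_bar = sum(pop_density) / n
y_bar = sum(environ_spend) / n

# Sums of squares and cross-products around the means
s_xx = sum((x - x_bar) ** 2 for x in pop_density)
s_yy = sum((y - y_bar) ** 2 for y in environ_spend)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(pop_density, environ_spend))

slope = s_xy / s_xx                  # coefficient for PopDensity
intercept = y_bar - slope * x_bar
r_squared = s_xy ** 2 / (s_xx * s_yy)

# Standard error and t statistic of the slope (df = n - 2)
residual_ss = s_yy * (1 - r_squared)
slope_se = sqrt(residual_ss / (n - 2) / s_xx)
t_slope = slope / slope_se

print(f"intercept = {intercept:.4f}, slope = {slope:.6f}")
print(f"R-Square = {r_squared:.4f}, t(slope) = {t_slope:.3f}, df = {n - 2}")
```

Because the slope is expressed per additional person per square mile, it is small in absolute terms; multiplying it by 100 gives the predicted change in spending share for an increase of 100 people per square mile.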
Question 10: Multiple Linear Regression - Use SPSS (10 points)
Use the SPSS Public Perceptions formatted database (.sav)
- Run a Linear Regression with an intercept
- Interpret the sign and meaning of each of the coefficients significantly different from zero. What “economic theory or narrative” could you develop based on the regression findings to explain why people live for long periods in Orange County (Hint: Explain the possible contribution of each significant coefficient/variable)?
Dependent Variable: Yearsorc (Years Lived in Orange County)
Independent Variables:
About what is your household income?
Age
Race/ethnicity
Gender
Zipcode of residence
Do you rent or own
How much formal education do you have?
Do you think property taxes are too….
Report R-Square and the coefficient and significance level for each of the independent variables.
(Clip and Paste SPSS output).