Hypothesis testing I and II; Testing differences between means, variances and proportions

by

All posts must 100% original work. no plagiarism. Post results must be provided using Excel.  MAKE SURE YOU INTERPRET YOUR RESULTS ON A WORD DOCUMENT LISTING EACH CALCULATION BY STEPS FOR PEERS TO UNDERSTAND THE CALCULATION. Attach is the Forum 2 excel for the calculation.
A town official claims that the average vehicle in their area sells for more than the 40th percentile of your data set. Using the data, you obtained in Forum 1, as well as the summary statistics you found for the original data set (excluding the super car outlier), run a hypothesis test to determine if the claim can be supported. Make sure you state all the important values, so your fellow classmates can use them to run a hypothesis test as well. Use the descriptive statistics you found during Forum 2 NOT the new SD you found during Forum 4. Because again, we are using the original 10 sample data set NOT a new smaller sample size. Use alpha = .05 to test your claim.
(Note: You will want to use the function =PERCENTILE.INC in Excel to find the 40th percentile of your data set. Hopefully this Excel function looks familiar to you from Forum 2.)
First determine if you are using a z or t-test and explain why. Then conduct a four-step hypothesis test including a sentence at the end justifying the support or lack of support for the claim and why you made that choice.
I encourage you to review the Week 6 Hypothesis Testing PDF at the bottom of the discussion. This will give you a step by step example on how to calculate and run a hypothesis test using Excel.

Sheet1

Never use plagiarized sources. Get Your Original Essay on
Hypothesis testing I and II; Testing differences between means, variances and proportions
Hire Professionals Just from $11/Page
Order Now Click here

Type Year Make Model Price MPG(CITY) MPG (HIGHWAY) Weight

variable type: qualitative variable type: quantitative variable type: qualitative variable type: qualitative variable type: quantitative variable type: quantitative variable type: quantitative variable type: quantitative

SUV 2021 Mazda CX-30 $22,795 25 33 3232

compact crossover 2021 Toyota rav4prime $29,458 36 40 5,530

SUV 2021 Chrysler Voyager $28,730 19 28 4,330

minivan 2020 kia Sedona $28,720 18 24 6,085

minivan 2020 Dodge grand caravan $29,025 17 25 4,510

passenger wagon 2020 Ford transit connect $28,315 24 26 3,689

SUV 2020 Volkswagen Tigwan $25,965 22 29 3,847

SUV 2019 kia Sorento $28,110 22 29 3,810

SUV 2020 Honda Odyssey $32,110 19 28 4,593

SUV 2021 Hyundai Palisade $33,665 19 24 4,284

sports car 2020 Bugatti La Voiture $3,250,000 9 14 4,400

SUMMARY BEFORE ADDING OUTLIER

mean $28,689 $22 $29 $4,391

standard deviation 2977.7926966873 5.5467708324 4.8579831206 863.2415652643

median $28,725 $21 $28 $4,307

SUMMARY AFTER ADDING OUTLIER

mean $321,536 $21 $27 $4,392

standard deviation $971,266 $7 $6 $819

median $28,730 $19 $28 $4,330

DISCUSSION

From the initial analysis, the following can be established:
The average MPG stands at 22 in the city and at 14 on the highway. This defines the average of all the vehicles selected in the list. Therefore, this figure can be used as a substitute for all the MPG figures respectively. However, the MPG figure for the highway deviates from the central measure of tendency with a certain value. This measure of dispersion is determined by the standard deviation. In the MPG in the city, the standard deviation is given by 5.55. this figure is used in determining the range within which most of the data points are found, defined from the central point of tendency (Mean). This is given by 16.55 and 2755. Therefore, the city MPG for all the data points is within the given range at a confidence interval of 95%. the same concept can be applied in the subsequent means and standard deviation. The median point defines the central position of the data points, when they are arranged in a chronological manner.

After addition of the outlier [chosen one is Bugatti. The figures change tremendously. The figures chosen for descriptive statistics, hike up for the standard deviation, but reduces for the mean and the media. An increase in the measures of dispersion is attributed with a decrease in the values for measures of central tendency. This is because larger outliers are attributed to minimizing the measures of central tendency, and hence the deviation from these points is increased.

Without supercar

Without supercar

Mean 28689.3

Standard Error 941.6607321347

Median 28725

Mode ERROR:#N/A

Standard Deviation 2977.7926966873

Sample Variance 8867249.34444445

Kurtosis 1.2469446572

Skewness -0.3237289483

Range 10870

Minimum 22795

Maximum 33665

Sum 286893

Count 10

Confidence Level(95.0%) 2130.1845701243

With Supercar

With Supercar

Mean 321535.727272727

Standard Error 292847.665977757

Median 28730

Mode ERROR:#N/A

Standard Deviation 971265.828779546

Sample Variance 943357310154.818

Kurtosis 10.9997518649

Skewness 3.3165733432

Range 3227205

Minimum 22795

Maximum 3250000

Sum 3536893

Count 11

Confidence Level(95.0%) 652505.262278539

Hypothesis Testing is a decision-making process called a Test of Significance.

Below are the 4 unique parts to Hypothesis Testing.

1) The Hypothesis Scenario. This includes the Null and Alternative scenarios.

a. Ho: Null Hypothesis

Ha or H1: Alternative Hypothesis
2) T- Test Statistic

=
�̅� −


Where �̅� is the sample mean, is the sample standard deviation, is the sample size,
and is the population mean defined in the null hypothesis of step 1

3) P-value. The p-value tells you if there is evidence to suggest that your null hypothesis is
incorrect. You will either reject your null hypothesis, or you fail to reject the null
hypothesis. *NOTE: we NEVER Accept null hypotheses, rather find that there was not
enough evidence to reject a null hypothesis. To find the p-value we will use =T.DIST(),
T.DIST.RT() or T.DIST.2T depending on the direction of the alternative hypothesis. The
degrees of freedom (DF) is n – 1.

4) Conclusion:

a. State whether your p-value is greater than alpha ( ) or less than alpha ( ).
*NOTE, the value of alpha ( ) is defined prior to conducting a hypothesis test.
The most common choices for the value of alpha ( ) are 0.05, 0.01, 0.10 and
0.005. Alpha ( ) is the acceptable Type 1 error. Type 1 error occurs when a null
hypothesis is rejected, though in reality the null hypothesis should not have been
rejected because the null hypothesis is true. Type 1 error is also called the “false
positive” rate.

b. Decide to Reject the null hypothesis (Ho) if the p-value is less than alpha ( ), or
Fail to reject the null hypothesis (Ho) if the p-value is more than alpha ( )

c. If you reject Ho, state that there is evidence for the alternative hypothesis. If you
fail to reject Ho, state that there is not enough evidence to support the
alternative hypothesis. *NOTE: we NEVER Accept or Reject alternative
hypotheses, rather we say that our sample provide evidence to support the
alternative hypothesis, or our sample does not provide enough evidence to
support the alternative hypothesis. The alternative hypothesis is not directly
being tested and therefore can not directly be accepted or rejected!

There are 3 different directions of the alternative hypothesis

1) A Left sided, or a Left Tailed Test. We use a left sided test to examine if the population
mean is lower than the mean defined in the null hypothesis (Ho). A left sided test uses
the following hypothesis scenario:

0: =

: < Where is the number we are testing whether the population mean, , is less than. For example if we wanted to know if the population mean, , is less than 15, then would be 15 in both the null and alternative hypothesis. To find the p-value for a left sided test, we use =T.DIST() in excel. 2) A Right sided, or a Right Tailed Test. We use a right sided test to examine if the population mean is higher than the mean defined in the null hypothesis (Ho). A right sided test uses the following hypothesis scenario: 0: = : >

Where is the number we are testing whether the population mean, , is more than.
For example if we wanted to know if the population mean, , is more than 15, then
would be 15 in both the null and alternative hypothesis. To find the p-value for a right
sided test, we use =T.DIST.RT() in excel.

3) A Two sided, or a Two Tailed Test. We use a two sided test to examine if the population
mean is different than the mean defined in the null hypothesis (Ho). A two sided test
uses the following hypothesis scenario:

0: =

: ≠

Where is the number we are testing whether the population mean, , is different
than. For example if we wanted to know if the population mean, , is not 15, then
would be 15 in both the null and alternative hypothesis. To find the p-value for a two
sided test, we use =T.DIST.2T() in excel.

Example 1 – A Left Sided Test

Now let’s continue to look at our car price data from Week 1. We know the sample mean, �̅� =

25,650, the sample standard deviation, = 3,488 and n = 10. A friend claims that the average

vehicle from the type of car you chose during week 1 sells for less than $27,000. Let us test

our friend’s hypothesis using alpha = 0.05. We will write out the 4 steps to do this.

1) State the null and alternative hypotheses

0: = 27,000

: < 27,000 2) Calculate the test statistic = �̅� − √ ⁄ You may notice that the denominator (bottom of the fraction) looks familiar. √ ⁄ is the standard error, SE. I start solving for my test statistic, TS, by finding the standard error, SE: When I press enter, I find that the standard error, SE, is 1,103. Now I can simplify my equation for the test statistic: = �̅� − = 25,650 − 27,000 1,103 = −1.2238 *Notice that I am plugging in = 27,000 while calculating my test statistic. I am using the population mean, , from the null hypothesis, Ho. 3) P-value. To find the p-value for a left sided test we use =T.DIST() in Excel. The degrees of freedom (DF) is n – 1. When I press enter, I find that the p-value is 0.1261 4) Conclusion: a. Recall from the problem statement that we set alpha = 0.05. 0.1261 > 0.05, therefore our p-value is greater than alpha.

b. We fail to reject the null hypothesis (Ho) because our p-value is more than alpha

c. There is not enough evidence to suggest that the average vehicle from the type of
car you chose during week 1 sells for less than $27,000.

Example 2 – A Right Sided Test

Next, a friend claims that the average vehicle from the type of car you chose during week 1

sells for higher than $23,500. Let us test our friend’s hypothesis using alpha = 0.05. We will

write out the 4 steps to do this.

1) State the null and alternative hypotheses

0: = 23,500

: < 23,500 2) Calculate the test statistic = �̅� − √ ⁄ You may notice that the denominator (bottom of the fraction) looks familiar. √ ⁄ is the standard error, SE. I start solving for my test statistic, TS, by finding the standard error, SE: When I press enter, I find that the standard error, SE, is 1,103. Now I can simplify my equation for the test statistic: = �̅� − = 25,650 − 23,500 1,103 = 1.9490 *Notice that I am plugging in = 23,500 while calculating my test statistic. I am using the population mean, , from the null hypothesis, Ho. 3) P-value. To find the p-value for a right sided test we use =T.DIST.RT() in Excel. The degrees of freedom (DF) is n – 1. When I press enter, I find that the p-value is 0.0416 4) Conclusion: a. Recall from the problem statement that we set alpha = 0.05. 0.0416 < 0.05, therefore our p-value is less than alpha. b. We reject the null hypothesis (Ho) because our p-value is less than alpha c. There is enough evidence to suggest that the average vehicle from the type of car you chose during week 1 sells for more than $23,500. Example 3 – A Two Sided Test Lastly, a friend claims that the average vehicle from the type of car you chose during week 1 will not sell for $23,500. Let us test our friend’s hypothesis using alpha = 0.05. We will write out the 4 steps to do this. 1) State the null and alternative hypotheses 0: = 23,500 : ≠ 23,500 2) Calculate the test statistic = �̅� − √ ⁄ You may notice that the denominator (bottom of the fraction) looks familiar. √ ⁄ is the standard error, SE. I start solving for my test statistic, TS, by finding the standard error, SE: When I press enter, I find that the standard error, SE, is 1,103. Now I can simplify my equation for the test statistic: = �̅� − = 25,650 − 23,500 1,103 = 1.9490 *Notice that I am plugging in = 23,500 while calculating my test statistic. I am using the population mean, , from the null hypothesis, Ho. 3) P-value. To find the p-value for a two sided test we use =T.DIST.2T() in Excel. The degrees of freedom (DF) is n – 1. When I press enter, I find that the p-value is 0.0831 4) Conclusion: a. Recall from the problem statement that we set alpha = 0.05. 0.0831 > 0.05, therefore our p-value is greater than alpha.

b. We fail to reject the null hypothesis (Ho) because our p-value is more than alpha

c. There is not enough evidence to suggest that the average vehicle from the type of
car you chose during week 1 sells a price different than $23,500.

*Important note when using =T.DIST.2T() in Excel

When your test statistics (TS) is negative, and you have a two sided test, you need to remove

the negative sign before using T.DIST.2T(). One way to remove the negative sign is to use the

absolute value function, ABS(). For example, if I wanted to test whether the average car price

is different than 27,000, the test statistics would be -1.2238. The p-value for this test can be

calculated in Excel as follows:

When I press enter, I find that the p-value is 0.2521. If you forget to take the absolute value

Excel will give you #NUM!. It is important to use the absolute value only for the two sided

t-test.

In this document we will discuss 2 – sample T- Paired or Matched hypothesis

testing and confidence intervals that used a mean and sample SD.

There are still 3 different hypothesis scenarios with a 2 – Sample Paired

Hypothesis Test.

Lower Tail Test (1 tail):

Ho: �̅� = 0

Ha: �̅� < 0 Upper Tailed Test (1 tail): Ho: �̅� = 0 Ha: �̅� > 0

Two Tailed Test:

Ho: �̅� = 0

Ha: �̅� ≠ 0

The hypothesized value is 0 and the same key words apply from a 1 – sample

hypothesis test to determine which scenario to use. �̅� is the average of the

difference column. Paired samples are samples that share something in common.

They are dependent on one another. Since they share something in common the

samples go from 2 to 1. This will be very similar to the 1-sample T hypothesis test

in the discussion forum.

The T – Test Statistic =
�̅� −0

Where this is SD of the difference column

We can use =T.DIST, =T.DIST.RT and =T.DIST.2T to find the p-values. These should

look familiar from the discussion forum.

Example:

Blood plasma cancer is characterized by increased blood vessel formulation in the

bone marrow that is a predictive factor in survival. One treatment approach used

for blood plasma cancer is stem cell transplantation with the patient’s own stem

cells. Measurements were taken immediately prior to the stem cell transplant

and at the time complete response was determined. The estimate of the mean

difference, in bone marrow microvessel density before and after the stem cell

transplant. There are 7 patients that are sampled. Is the blood vessel formulation

in the bone marrow different after the stem cell transplant? Use alpha = .05.

Before After

187 168
199 183

175 155
177 160

193 180
188 175

179 166

Here we see that these sample “share” something in common because it is the

same patient and measurements were taken before and after treatment.

First step is to state the hypothesis scenario. Because the key word says different

this means it is a two tailed test.

Ho: �̅� = 0

Ha: �̅� ≠ 0

Before we start calculating anything by hand and because we are given the raw

data set, we can actually run this hypothesis test in Excel. And since you installed

the Data Analysis Toolpak it is easy to do.

Go to Data -> Data Analysis -> and scroll to where it says t-Test Paired Two

Samples for Means and click OK

Under Input:

Variable 1 Range: you will highlight the Before column and make sure you include

the top row where the Label is located.

Variable 2 Range: you will highlight the After column and make sure you include

the top row where the Label is located.

Check the “Labels” box because we did include the first row of labels. For Alpha

out 0.05 but this can be change depending on what significance level you use.

Then make sure the bubble for New Workbook Ply: highlight and click OK. It

should look similar to the screenshot below.

Once you click OK in a new Worksheet this should populate.

t-Test: Paired Two Sample
for Means

Before After
Mean 185.4285714 169.5714286

Variance 78.61904762 106.2857143
Observations 7 7

Pearson Correlation 0.963199569

Hypothesized Mean
Difference

0

df 6

t Stat 14.13506279

P(T<=t) one-tail 3.91511E-06 t Critical one-tail 1.943180281 P(T<=t) two-tail 7.83022E-06 t Critical two-tail 2.446911851 Here we have all the values we need to state a conclusion. We see the T- Test Statistic = 14.135 and because we ran a two tailed test the p-value = 7.83022E-06. The “E-06” means scientific notation. We can rewrite this as 7.83 x 10-6, then to convert this to decimal form we need to move the decimal 6 places to the left. p -value = .00000783 < .05. This p-value is less than .05 which means we Reject Ho. Yes, there is statistical evidence that blood vessel formulation in the bone marrow different after the stem cell transplant. If we were running a 1-tailed test, we are given the p-value which is 3.91511E-06. Converting this to a decimal we get .000003915. The T-Test Statistic is the same and so is the conclusion for a 1-tailed test. Now that we ran a hypothesis test, let calculate a confidence interval and draw the same conclusion. The equation for a 2 – paired confidence interval: �̅� ± 2 ∗ ∗ √ Where Standard Error (SE) = √ Margin of Error (ME) = 2 ∗ ∗ √ Before we can start plugging values into our equation we need to find the Difference Column. We will subtract Before – After to get the Difference Column and then calculate the average and SD based on that column. Before After Diff Column 187 168 19 199 183 16 175 155 20 177 160 17 193 180 13 188 175 13 179 166 13 Average 15.85714286 SD 2.968084199 Plugging in what we know: 15.85714± 2 ∗ ∗ 2.96808 √7 The last thing we need to find is the T- Critical Value. We will use the =T.INV.2T function to find this. This function should look familiar from Week 4. If we want to find a 95% confidence interval, then alpha = .05. We will use this value in our Excel function and the DF is n – 1 = 7 – 1 = 6. =T.INV.2T(.05,6) We see the T – Critical Value is 2.44691. We will plug this into the equation and solve. BUT if we look at our output when we ran the Test, the T-Critical Value was also given to us as well. t-Test: Paired Two Sample for Means Before After Mean 185.4285714 169.5714286 Variance 78.61904762 106.2857143 Observations 7 7 Pearson Correlation 0.963199569 Hypothesized Mean Difference 0 df 6 t Stat 14.13506279 P(T<=t) one-tail 3.91511E-06 t Critical one-tail 1.943180281 P(T<=t) two-tail 7.83022E-06 t Critical two-tail 2.446911851 15.85714± 2 ∗ ∗ 2.96808 √7 15.85714±2.446911 ∗ 2.96808 √7 15.85714±2.446911 ∗ 1.121828 15.85714±2.74501 (13.112, 18.60215) The confidence interval goes from 13.112 to 18.60215. This interval goes from a positive value to a positive value. This means that 0 is NOT in this interval. Because 0 is NOT in the interval it is Significant, and we Reject Ho. This is the same conclusion that we got with the hypothesis test. In this document we will discuss 2 – sample Z- hypothesis testing and confidence intervals that uses a mean’s and known population S’s. This PDF discusses Z-Critical Value and you are discussing a sample mean and a population S. There are still 3 different hypothesis scenarios with a 2 – Sample Z Hypothesis Test. Lower Tail Test (1 tail): Ho: �̅�1 − �̅�2 = 0 Ha: �̅�1 − �̅�2 < 0 Upper Tailed Test (1 tail): Ho: �̅�1 − �̅�2 = 0 Ha: �̅�1 − �̅�2 > 0

Two Tailed Test:

Ho:�̅�1 − �̅�2 = 0

Ha: �̅�1 − �̅�2 ≠ 0

The hypothesized value is 0 and the same key words apply from a 1 – sample

hypothesis test to determine which scenario to use. 1 − 2 is the difference

between the average in the first sample and the average in the second sample.

The Z – Test Statistic =
�̅�1− �̅�2−0


1

2

1
+

2
2

2

Where S is the population standard deviation, 1 2 are averages and n1
and n2 are the sample sizes.

We can use =NORM.S.DIST to find the p-values. These should look familiar from

the discussion forum.

Example:

A dietitian has developed a diet that is low in fats, carbs, and cholesterol. The

dietitian wishes to examine the effects this diet has on the weights of obese

people. Two random samples of 30 obese each are selected, and one group of 30

people is places on the low-fat diet. The other 30 people are places on a diet that

contains approximately the same quantity of food, but has is not low in fats,

carbs, and cholesterol. For each person the amount of weight lost (or gained) in a

three-week period is recorded. There is a difference in the population mean

weight losses for the two diets? The population S1 = 4.67 and the population S2 =

4.04. Use alpha = .05. Here we see we are given the Raw Data set.

WL Low Diet WL Regular
Diet

8 6

21 14
13 4

8 6
11 13

4 11

3 11
6 8

16 14
5 8

10 6
8 4

8 12

12 2
7 1

3 2
12 6

14 1

16 0
11 9

10 5
9 10

10 6
8 6

14 9

3 8
7 3

14 1
11 7

14 8

First step is to state the hypothesis scenario. Because the key word says

difference this means it is a two tailed test.

Ho: �̅�1 − �̅�2 = 0

Ha: �̅�1 − �̅�2 ≠ 0

Before we start calculating anything by hand and because we are given the raw

data set, we can actually run this hypothesis test in Excel. And since you installed

the Data Analysis Toolpak it is easy to do. First you need to input this Raw Data

into Excel.

Then go to Data -> Data Analysis -> and scroll to where it says z-Test Two Sample

for Means and click OK

Under Input:

Variable 1 Range: you will highlight the WL Low Diet column and make sure you

include the top row where the Label is located.

Variable 2 Range: you will highlight the WL Regular Diet column and make sure

you include the top row where the Label is located.

Hypothesize Mean Difference: can be left as 0

Variance 1 Variance (known): Here is where you will put the Known Variance for

the First Sample. In the problem we are given the Known Standard Deviation. To

find the Variance all we did was Square it. Input that value in the box.

Variance 2 Variance (known): Here is where you will put the Known Variance for

the Second Sample. In the problem we are given the Known Standard Deviation.

To find the Variance all we did was Square it. Input that value in the box.

Check the “Labels” box because we did include the first row of labels. For Alpha

out 0.05 but this can be change depending on what significance level you use.

Then make sure the bubble for New Workbook Ply: highlight and click OK. It

should look similar to the screenshot below.

Once you click OK in a new Worksheet this should populate.

z-Test: Two Sample for
Means

WL Low Diet WL Regular
Diet

Mean 9.866666667 6.7
Known Variance 21.8089 16.3216

Observations 30 30

Hypothesized Mean
Difference

0

z 2.808838232

P(Z<=z) one-tail 0.002486031 z Critical one-tail 1.644853627 P(Z<=z) two-tail 0.004972062 z Critical two-tail 1.959963985 Here we have all the values we need to state a conclusion. We see the Z - Test Statistic = 2.8088 and because we ran a two tailed test the p-value = .00497. p -value = .00497 < .05. This p-value is less than .05 which means we Reject Ho. Yes, there is statistical evidence that there is a difference in the population mean weight losses for the two diets. If we were running a 1-tailed test, we are given the p-value which is .002486. Z- Test Statistic is the same and so is the conclusion for a 1-tailed test. Using Excel to run a hypothesis test when we are given the Raw Data is very convenient. But if we aren’t given the Raw Data and we are given the averages and known S’s we will need to compute the Z-Test Stat by hand and then use the Excel function to find the p-value. To find the Z-Test Stat we will use this equation and plug in what we know. You should know by now how to calculate the average and SD using Excel. Which is what I did here. Z – Test Statistic = �̅�1− �̅�2−0 √ 1 2 1 + 2 2 2 Z – Test Statistic = 9.86667− 6.7−0 √4.67 2 30 + 4.042 30 Z – Test Statistic = 3.16667 √.72696333+.54405333 Z – Test Statistic = 3.16667 √1.27101666 Z – Test Statistic = 3.16667 1.1273937 Z – Test Statistic = 2.80884 When we calculate the Test Stat by hand using algebra we get the same value. Next, we need to find the p-value. We will use the =NORM.S.DIST function to find the p-value. In Excel input =NORM.S.DIST(2.80884,TRUE) and hit Enter. We will type True because this is a cumulative test. We see this p-value = .997514 BUT remember when we use this function in Excel, this function is in the less than form. This means if we were running a Lower Tailed test, this would be our p-value. If we were running an Upper Tailed Test we need to take 1 - .997514 to get the p-value for our test. P-value = 1 - .997514 = .002486. This is the p-value for an upper tailed test. But since we are running a Two Tailed, we take whichever p-value is smaller and multiply it by 2. p-value = .002486*2 = .004972. This is the p-value for a two tailed test. And if we compare these to the Excel output that should be the same and draw the same conclusion. This is how you would run a 2 – sample Z hypothesis test using averages and population S’s when we don’t have the raw data and can’t use Excel. Now that we ran a hypothesis test, let calculate a confidence interval and draw the same conclusion. The equation for a 2 – sample Z confidence interval: �̅�1 − �̅�2 ± 2 ∗ ∗ √ 1 2 1 + 2 2 2 Where Standard Error (SE) = √ 1 2 1 + 2 2 2 Margin of Error = 2 ∗ ∗ √ 1 2 1 + 2 2 2 We have all the values we need let’s plug them into our equation. 9.86667 − 6.7 ± 2 ∗ ∗ √ 4.672 30 + 4.042 30 The last thing we need to find is a Z-Critical Value. We will use the =NORM.S.INV in Excel to find the Z-Critical Value. If we want to find a 95% confidence interval, then alpha = 1 - .95 = .05. But because this is a confidence interval and we need to take into account the plus AND minus on both sides if the bell-shaped curve we will divide alpha be 2. .05/2 = .025. Then we take 1 - .025 = .975. We will use this value in our Excel function. =NORM.S.INV(.975) We see the Z – Critical Value is 1.96. We will plug this into the equation and solve. But if you compare this Critical Value to the Excel output we got when we ran the hypothesis test it is the same because we used Alpha = .05 in the output. But this value will change depending on what you input for Alpha. z-Test: Two Sample for Means WL Low Diet WL Regular Diet Mean 9.866666667 6.7 Known Variance 21.8089 16.3216 Observations 30 30 Hypothesized Mean Difference 0 z 2.808838232 P(Z<=z) one-tail 0.002486031 z Critical one-tail 1.644853627 P(Z<=z) two-tail 0.004972062 z Critical two-tail 1.959963985 9.86667 − 6.7 ± 2 ∗ ∗ √ 4.672 30 + 4.042 30 9.86667 − 6.7 ± 1.96 ∗ √ 4.672 30 + 4.042 30 3.16667 ± 1.96 ∗ 1.1273937 3.16667 ± 2.209697 The confidence interval goes from .95697 to 5.376367. This interval goes from a positive value to a positive value. This means that 0 is NOT in this interval. Because 0 is NOT in the interval, Yes, it is Significant, and we Reject Ho. This is the same conclusion that we got with the hypothesis test. This week will continue to discuss hypothesis testing and confidence interval, but now we will discuss 2 samples. Just like with 1 – sample hypothesis testing there are 4 steps we will follow. To review those 4 steps please review the Week 6 Hypothesis Testing PDF. But the conclusion will still be the same. If the p-value is < alpha, you Reject Ho and state this test is significant. If the p-value is > alpha, you Do Not Reject Ho and state this test is not significant.

In this document we will discuss 2 – sample proportion hypothesis testing and

confidence intervals.

There are still 3 different hypothesis scenarios with a 2 – sample proportion

hypothesis test.

Lower Tail Test (1 tail):

Ho: 1 − 2 = 0

Ha: 1 − 2 < 0 Upper Tailed Test (1 tail): Ho: 1 − 2 = 0 Ha: 1 − 2 > 0

Two Tailed Test:

Ho: 1 − 2 = 0

Ha: 1 − 2 ≠ 0

The hypothesized value is 0 and the same key words apply from a 1 – sample

hypothesis test to determine which scenario to use.

The Z – Test Statistic =
1− 2

√ ∗ (
1

1
+

1

2
)

Where =
1+ 2

1+ 2

= 1 −

We will then use =NORM.S.DIST function in Excel to find the p-value. This Excel

function should look familiar from Week 4.

Example:

In a developing section of a district 50 people were surveyed and 38 were in favor

of the new proposal. For the rest of the district 100 people were surveyed and

only 65 people were in favor of the new proposal. Is there evidence that the

number of people favoring the new proposal is greater in the developing section

than the rest of the district? Use alpha = .05

First step is to state the hypothesis scenario. Because the key word says greater

this means it is an upper tailed test.

Ho: 1 − 2 = 0

Ha: 1 − 2 > 0

The first proportion is favoring the new proposal in the developing district and the

second proportion is favoring the new proposal in the rest of the district.

1 =
38

50
= .76

1 = 1 − .76 = .24

2 =
65

100
= .65

2 = 1 − .65 = .35

=
38 + 65

50 + 100
= .68667

= 1 − .68667 = .31333

Now that we have these values we can plug them in to find the Test Statistic.

Z – Test Statistic =
.76−.65

√.68667∗.31333(
1

50
+

1

100
)

= 1.369

Now that we have the Z-Test Statistics we can use the =NORM.S.DIST function to

find the p-value.

And yes, we can have a negative Z- Test Statistic, if we do that is fine. You DO

NOT have to take the absolute value of anything. Use the Test Stat. as is in the

Excel function.

In Excel input =NORM.S.DIST(1.369,TRUE)

We will write out TRUE because this test is cumulative.

We see this p-value = .9145 BUT remember when we use this function in Excel,

this function is in the less than form. This means if we were running a Lower

Tailed test, this would be our p-value. BUT since we are running an Upper Tailed

Test we need to take 1 – .9145 to get the p-value for our test.

P-value = 1 – .9145 = .0855.

We see the p-value for our upper tailed test is .0855. If we compare this to .05,

we see that:

.0855 > .05. Since the p-value is greater than alpha, We Do Not Reject Ho. This

test is not significant and No, there is no evidence that the proportion of people

favoring the new proposal is greater in the developing section than the rest of the

district at alpha = .05.

What if we were running a two tailed test? To find this p-value we would take

whichever p-value is smaller and multiple it by 2.

.0855*2 = .171. The p-value for a two tailed test would be .171.

Now that we ran a hypothesis test, let calculate a confidence interval and draw

the same conclusion.

The equation for a 2 – sample proportion is:

1 − 2 ±
2


1 1
1

+
2 2

2

Where Standard Error (SE) = √
1 1

1
+

2 2

2

Margin of Error (ME) =
2


1 1

1
+

2 2

2

Plugging in what we know:

. 76 − .65 ±
2

∗√
. 76 ∗ .24

50
+

. 65 ∗ .35

100

The last thing we need to find is the Z- Critical Value. We will use the

=NORM.S.INV function to find this. This function should look familiar from Week

4.

If we want to find a 95% confidence interval, then alpha = 1 – .95 = .05. But

because this is a confidence interval and we need to take into account the plus

AND minus on both sides if the bell-shaped curve we will divide alpha be 2. .05/2

= .025. Then we take 1 – .025 = .975. We will use this value in our Excel function.

=NORM.S.INV(.975)

We see the Z – Critical Value is 1.96. We will plug this into the equation and

solve.

. 76 − .65 ±
2

∗√
. 76 ∗ .24

50
+

. 65 ∗ .35

100

. 76 − .65 ± 1.96√
. 76 ∗ .24

50
+

. 65 ∗ .35

100

. 76 − .65 ±
2

∗√
. 76 ∗ .24

50
+

. 65 ∗ .35

100

. 76 − .65 ± 1.98(.076961)

. 11 ± .1508

(-.0408, .2608)

The confidence interval goes from -4.08% to 26.08%. This interval goes from a

negative value to a positive value. This means that 0 is in fact in this interval.

Because 0 is in the interval it is Not Significant, and we Do Not Reject Ho. This is

the same conclusion that we got with the hypothesis test.

In this document we will discuss 2 – sample T- hypothesis testing and confidence

intervals that uses a mean’s and unequal unknown Sample SD’s.

This PDF discusses T-Critical Value and you are assuming unequal variances and

discussing a sample mean and sample SD.

There are still 3 different hypothesis scenarios with a 2 – Sample T Hypothesis

Test.

Lower Tail Test (1 tail):

Ho: �̅�1 − �̅�2 = 0

Ha: �̅�1 − �̅�2 < 0 Upper Tailed Test (1 tail): Ho: �̅�1 − �̅�2 = 0 Ha: �̅�1 − �̅�2 > 0

Two Tailed Test:

Ho: �̅�1 − �̅�2 = 0

Ha: �̅�1 − �̅�2 ≠ 0

The hypothesized value is 0 and the same key words apply from a 1 – sample

hypothesis test to determine which scenario to use. 1 − 2 is the difference

between the average in the first sample and the average in the second sample.

The T – Test Statistic =
�̅�1− �̅�2−0


1

2

1
+

2
2

2

Where SD is the sample standard deviation, 1 2 are averages and n1 and
n2 are the sample sizes.

We can use =T.DIST, =T.DIST.RT and =T.DIST.2T to find the p-values. These should

look familiar from the discussion forum.

Example:

Suppose you wish to compare a new method of teaching reading to “slower

learners” with the current standard method. You decide to base your comparison

on the results of a reading test given at the end of a learning period of 6 months.

A random sample of 23 “slower learners”, 11 are taught by the new method and

12 are taught by the standard method. All 23 children are taught by qualified

instructors under similar conditions for 6 months. Does the new reading method

increase test scores when compared to the standard reading method? Use Alpha

= .05.

New
Method

Standard
Method

80 79

76 73

70 72
80 62

79 76
66 68

85 70
71 86

81 75

76 68
75 73

66

First step is to state the hypothesis scenario. Because the key word says increase

this means it is an upper tailed test.

Ho: �̅�1 − �̅�2 = 0

Ha:�̅�1 − �̅�2 > 0

Before we start calculating anything by hand and because we are given the raw

data set, we can actually run this hypothesis test in Excel. And since you installed

the Data Analysis Toolpak it is easy to do. First you need to input this Raw Data

into Excel.

Then go to Data -> Data Analysis -> and scroll to where it says t-Test: Two-Sample

Assuming Unequal Variances and click OK

Under Input:

Variable 1 Range: you will highlight the New Method column and make sure you

include the top row where the Label is located.

Variable 2 Range: you will highlight the Standard Method column and make sure

you include the top row where the Label is located.

Hypothesize Mean Difference: 0

Check the “Labels” box because we did include the first row of labels. For Alpha

out 0.05 but this can be change depending on what significance level you use.

Then make sure the bubble for New Workbook Ply: highlight and click OK. It

should look similar to the screenshot below.

Once you click OK in a new Worksheet this should populate.

t-Test: Two-Sample Assuming
Unequal Variances

New Method Standard
Method

Mean 76.27272727 72.33333333
Variance 30.81818182 40.24242424

Observations 11 12
Hypothesized Mean Difference 0

df 21

t Stat 1.587847328

P(T<=t) one-tail 0.063632175 t Critical one-tail 1.720742903 P(T<=t) two-tail 0.12726435 t Critical two-tail 2.079613845 Here we have all the values we need to state a conclusion. We see the T - Test Statistic = 1.587847 and because we ran an upper tailed test the p-value = .063632. p -value = .063632 > .05. This p-value is greater than .05 which means we Do Not

Reject Ho. No, there is not statistical evidence that the new reading method

increase test scores when compared to the standard reading method.

If we were running a 2-tailed test, we are given the p-value which is .12727. T-

Test Statistic is the same and so is the conclusion for a 2-tailed test.

Using Excel to run a hypothesis test when we are given the Raw Data is very

convenient. But if we aren’t given the Raw Data and we are given the averages

and unknown sample SD’s we will need to compute the T-Test Stat by hand and

then use the Excel function to find the p-value.

To find the T-Test Stat we will use this equation and plug in what we know. You

should know by now how to calculate the average and SD using Excel. Which is

what I did here

T – Test Statistic =
�̅�1− �̅�2−0

√ 1
2

1
+

2
2

2

T – Test Statistic =
76.2727− 72.3333−0

√5.551413
2

11
+

6.3436922

12

T – Test Statistic =
3.9394

√2.8016533+3.3535356

T – Test Statistic =
3.9394

√6.1551889

T – Test Statistic =
3.9394

2.48096

T – Test Statistic = 1.58785

When we calculate the Test Stat by hand using algebra we get the same value.

Next, we need to find the p-value. We will use the =T.DIST.RT function to find the

p-value, because this is an upper tailed test.

In Excel input =T.DIST.RT(1.58785, 21) and hit Enter.

We used 21 as the degrees of freedom because when you have a 2-sample T Test

DF = n1 + n2 – 2 = 11 + 12 – 2 = 21.

We see this p-value = .06363. This is the same p-value we got using Excel and it is

still greater than .05, which means we get the same conclusion as above.

If we were running a Lower Tailed Test we would use =T.DIST(1.58785,21) to get

the p-value. And if we were running an Upper Tailed Test we would use

=T.DIST.2T(1.58785,21) to get the p-value. Remember if we are running a two

tailed test and the Test Stat is NEGATIVE, that is fine it can be. But if it is, then

you need to use the absolute value of the Test Stat in the function.

=T.DIST.2T(ABS(1.58785),21). Once you hit Enter you should see that this p-value

is also the same as the Excel output above.

This is how you would run a 2 – sample T hypothesis test using averages and

unequal unknown SD’s when we don’t have the raw data and can’t use Excel.

Now that we ran a hypothesis test, let calculate a confidence interval and draw

the same conclusion.

The equation for a 2 – sample T confidence interval:

�̅�1 − �̅�2 ±
2

∗ ∗ √
1

2

1
+

2
2

2

Where Standard Error (SE) = √
1

2

1
+

2
2

2

Margin of Error =
2

∗ ∗ √
1

2

1
+

2
2

2

We have the values we need let’s plug them into our equation.

76.2727 − 72.3333 ±
2

∗ ∗ √
5.551413

2

11
+

6.3436922

12

The last thing we need to find is a T-Critical Value. We will use the =T.INV in Excel

to find the T-Critical Value.

If alpha = .05 then 1 – .05 = .95. We will use this value in your Excel function.

=T.INV(.95,21)

We see the T – Critical Value is 1.720742. We will plug this into the equation and

solve. But if you compare this Critical Value to the Excel output we got when we

ran the hypothesis test it is the same because we used Alpha = .05 in the output.

But this value will change depending on what you input for Alpha.

t-Test: Two-Sample Assuming
Unequal Variances

New Method Standard
Method

Mean 76.27272727 72.33333333

Variance 30.81818182 40.24242424
Observations 11 12

Hypothesized Mean Difference 0

df 21

t Stat 1.587847328

P(T<=t) one-tail 0.063632175 t Critical one-tail 1.720742903 P(T<=t) two-tail 0.12726435 t Critical two-tail 2.079613845 Note: If we were running a two tailed test we would use =T.INV.2T(.05, 21) to get the critical value for a two tailed test. Try this out and you should get the same value as the Excel output. 76.2727 − 72.3333 ± 2 ∗ ∗ √ 5.551413 2 11 + 6.3436922 12 76.2727 − 72.3333 ± 1.72074 ∗ √ 5.551413 2 11 + 6.3436922 12 3.9394 ± 1.72074 ∗ 2.48096 3.96667 ± 4.269087 The confidence interval goes from -1.1024 to 7.435757. This interval goes from a negative value to a positive value. This means that 0 is in this interval. Because 0 is in the interval, No, it is Not Significant, and we Do Not Reject Ho. This is the same conclusion that we got with the hypothesis test. Hypothesis Testing is a decision-making process called a Test of Significance. There are 4 unique parts to Hypothesis Testing. 1) The Hypothesis Scenario. This includes the Null and Alternative scenarios. a. Ho: Null Hypothesis Ha or H1: Alternative Hypothesis 2) Z- Test Statistic Z- Test Stat = �̂�− 0 (√ 0∗ 0 ) Where “ 0” is the hypothesized value and 0 = 1 − 0. 3) P- value. The p-value tells you if something will be significant or not and if you can Accept or Reject the claim. You will use the p-value to draw a conclusion regarding the hypothesis test. a. We will use =NORM.S.DIST function to find the p-value. It should look familiar from Week 4. 4) Conclusion: a. If the p-value is less than alpha (< α) then Reject Ho/Accept Ha. b. If the p-value is greater than alpha (> α) then We Do Not Reject Ho.

c. The most common alpha value is .05. If no, alpha value is given it will

default to .05 but do note that alpha can also be, .10, .01, and .005 to

name a few. Essentially alpha can be any value the statistician

deems fit, but the most common values are .05, .01 and .10.

One last thing before we get to an example. There are 3 different scenarios that

are associated with the Hypothesis Scenario.

1) There is a Lower tailed (one tailed) Test or a Left Tailed Test. If the problem

asks if there a significant decrease or less than or lower than or fewer than,

then the problem is a lower tailed test. The “<” sign corresponds with the Ha. The hypothesis scenario will look like: a. Ho: �̂� = 0 Ha: �̂� < 0 (Here we see that “ 0” is the hypothesized value and the Less Than Sign “<” lines up with the Ha) 2) There is an Upper tailed (one tailed) Test or a Right Tailed Test. If the problem asks is there a significant increase or more than or greater than or higher than, then the problem is an upper tailed test. The “>” sign

corresponds with the Ha. The hypothesis scenario will look like:

a. Ho: �̂� = 0

Ha: �̂� > 0

(Here we see that “ 0” is the hypothesized value and the
Greater Than Sign “>” lines up with the Ha)

3) There is a Two tailed Test. If the problem asks is there a significant

difference or statistical evidence or asks if it is not the same, then the

problem is a two-tailed test. The “≠” sign corresponds with the Ha. The

hypothesis scenario will look like:

a. Ho: �̂� = 0

Ha: �̂� ≠ 0

(Here we see that “ 0” is the hypothesized value and the
Greater Than Sign “≠” lines up with the Ha)

The hypothesized value is what we think should happen or what has been found

to be true in the past.

Now let’s continue to look at our car price data from Week 3. In Week 3, I asked

you to calculate the average and then find how many data points fell below the

average. We called this value p and then we found q. If we look back at my data

set, we see that p = .70 and q = .30.

We will call this �̂� = .70 and �̂� = .30.

We want to run a test to see how close our data set is to a 50/50 spread? 50% of

the data would fall above the mean and 50% of the data would fall below the

mean, in a perfect world.

In other words, is there a difference between your data set and 50%? We will

calculate a 95% hypothesis to test this claim.

(Note: YES! I realize that some of you did see in your Week 3 forum that you did

get p = .50 and q = .50. If this is the case, your Test Statistic will be 0 and the p-

value will come out to be 1. That is fine, BUT it is still a good idea to go through

this example and make sure you can run a hypothesis test to get the correct

results. Extra practice never hurt anyone.)

Getting back to our test, this tells us that the hypothesized value is .50. The

hypothesis scenario will look like this:

1) Ho: �̂� = .50

Ha: �̂� ≠ .50

2) Z-Stat =
�̂�− 0

(√
�̂�∗�̂�

)

=
.70−.50

(√
.50∗.50

10
)

= 1.264911

Note: If your Z-Stat is negative that is fine. That does not mean the

problem is incorrect. And if your �̂� = .50, your Z-Stat would be 0 here
and that is fine also.

3) To find the p-value we will use the =NORM.S.DIST function. In Excel

type in =NORM.S.DIST(1.264911,TRUE) and hit Enter. We type in TRUE

because the hypothesis test is cumulative.

We see that the p-value = .897048. But remember this is in the Less Than form. If

we were running a Lower Tailed Test this would be our p-value. To find the p-

value for an Upper Tailed Test we would take p-value = 1 – .897048 = .102952.

Since we are running a Two Tailed Test, to get the correct p-value we would

multiply whichever p-value is smaller by 2. It will be different depending on the

test, so you need to make sure you use whichever one is smaller. Remember, p-

values CANNOT be greater than 1. If you get a p-value greater than 1, you did

something wrong.

p-value = .102952*2 = .205904. This is the p-value we will use for our conclusion.

If your Z-Stat is 0 then your p-value in this test will be 1. That is fine. Your p-value

can be 1 but it CANNOT be greater than 1.

4) Lastly, we need to state the conclusion. We will use alpha as .05. When

alpha is not stated it is defaulted to .05. .205904 > .05. The p-value is

greater than alpha. We DO NOT Reject Ho. Since we Do NOT Reject Ho,

this which means we CANNOT reject the claim that the distribution is

different from a 50/50 spread.