in class activity 10_24 - xenon garcia.pdf
School: Norco College
Course: 70A
Subject: Mathematics
Date: Jan 9, 2024
Xenon Garcia
Mat 70A
Fall 2023
Lab #7 A/B Testing
One special kind of hypothesis test we do in this class is called an A/B test. The steps
used to run an A/B test are the same as a general hypothesis test, but A/B tests have a
specific null hypothesis (that two samples were drawn from the same distribution). We
carry out this test by performing a permutation of our data.
Mid-Semester Check In:
What has been your favorite topic/assignment/lecture/anything so far?
A/B Testing and Error Probabilities
Kevin, a museum curator, has recently been given specimens of caddisflies collected
from various parts of Northern California. The scientists who collected the caddisflies
think that caddisflies collected at higher altitudes tend to be bigger. They tell him that the
average length of the 560 caddisflies collected at high elevation is 14 mm, while the
average length of the 450 caddisflies collected at a slightly lower elevation is 12 mm.
He’s not sure that this difference really matters, and thinks that this could just be the
result of chance in sampling.
1. Warmup: When should you use an A/B test versus another kind of hypothesis test?
You should use an A/B test when determining whether two samples, also
known as an A group and a B group, were drawn from the same underlying
distribution/population.
2. What’s an appropriate null hypothesis that Kevin can simulate under?
The distribution of specimen lengths is the same for caddisflies sampled
from high elevation as for those sampled from low elevation. Any difference
between the two sample averages is due to chance.
3. How could you test the null hypothesis in the A/B test from above? What
assumption would you make to test the hypothesis, and how would you
simulate under that assumption?
If the null hypothesis is true, the lengths at both elevations come from the
same distribution, so it shouldn't matter whether a specimen was labeled
high elevation or low elevation. Based on this idea, you could randomly
shuffle the caddisflies' labels and compute the test statistic on this
"relabelled" data.
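The relabelling idea above can be sketched in plain Python. This is only an illustration: the six values are made up, and the helper name shuffle_labels is hypothetical, not part of the lab or of the datascience library.

```python
import random

# Hypothetical toy data, made up for illustration (not the real specimens).
labels = ["High", "Low", "High", "Low", "High", "Low"]
lengths = [12.3, 13.1, 12.0, 11.8, 14.2, 12.5]

def shuffle_labels(labels, rng):
    """Return a new list holding the same labels in a random order."""
    shuffled = list(labels)   # copy so the original order is untouched
    rng.shuffle(shuffled)     # a permutation: every label is used exactly once
    return shuffled

rng = random.Random(0)        # fixed seed only so the sketch is repeatable
relabelled = shuffle_labels(labels, rng)

# Each length keeps its position; only the label attached to it may change.
relabelled_data = list(zip(relabelled, lengths))
```

Because the labels are permuted rather than resampled, the relabelled data still has the same number of "High" and "Low" specimens as the original.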
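4. What would be a useful test statistic for the A/B test? Remember that the
direction of your test statistic should come from the initial setting.
The difference in mean lengths between the two groups: mean length at high
elevation minus mean length at low elevation. Large positive values support
the scientists' claim that caddisflies at higher altitudes tend to be bigger.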
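As a small worked example of this statistic, the sketch below computes it in plain Python. The observed value uses the two averages given in the problem; the four-specimen data set is made up, and the function name difference_in_means is an assumption for illustration.

```python
from statistics import mean

def difference_in_means(labels, lengths):
    """Mean length of the 'High' group minus mean length of the 'Low' group."""
    high = [x for lab, x in zip(labels, lengths) if lab == "High"]
    low = [x for lab, x in zip(labels, lengths) if lab == "Low"]
    return mean(high) - mean(low)

# Observed statistic from the averages reported in the problem, in mm.
observed_statistic = 14 - 12

# Hypothetical toy data, made up for illustration.
toy_labels = ["High", "Low", "High", "Low"]
toy_lengths = [14.0, 12.0, 15.0, 11.0]
toy_statistic = difference_in_means(toy_labels, toy_lengths)
```

Subtracting in the order high minus low means that positive values point in the direction the scientists expect.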
5. Assume flies refers to the following table:
Elevation    Specimen Length
High         12.3
Low          13.1
High         12.0
(1007 rows omitted)
Fill in the blanks in this code to generate one value of the test statistic under the null
hypothesis.
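import numpy as np
from datascience import *

def one_simulation():
    # Sampling every row without replacement is a random permutation of the labels.
    shuffled_labels = flies.sample(with_replacement=False).column('Elevation')
    # Attach the shuffled labels to the original, unshuffled lengths.
    shuffled_flies = flies.drop('Elevation').with_columns(
        'Elevation', shuffled_labels
    )
    # Mean length per shuffled group; group sorts the labels, so 'High' comes before 'Low'.
    grouped = shuffled_flies.group('Elevation', np.mean)
    means = grouped.column('Specimen Length mean')
    # High-elevation mean minus low-elevation mean.
    statistic = means.item(0) - means.item(1)
    return statistic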
6. Fill in the code below to simulate 10000 trials of our permutation test.
test_stats = make_array()
repetitions = 10000
for i in np.arange(repetitions):
    one_stat = one_simulation()
    test_stats = np.append(test_stats, one_stat)
test_stats
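Once the simulated statistics are collected, one common next step (not shown in this worksheet) is to compare them with the observed difference of 14 mm - 12 mm = 2 mm. The sketch below assumes that step and uses plain Python; the ten simulated values are made up, and the helper name empirical_p_value is hypothetical.

```python
def empirical_p_value(simulated_stats, observed):
    """Proportion of simulated statistics at least as large as the observed one."""
    at_least_as_large = sum(1 for s in simulated_stats if s >= observed)
    return at_least_as_large / len(simulated_stats)

# Hypothetical simulated statistics, made up for illustration.
toy_stats = [-1.5, -0.5, 0.0, 0.4, 0.9, 1.2, 2.0, 2.3, -2.1, 0.1]

# Observed high-minus-low difference from the reported averages, in mm.
observed = 14 - 12

p_value = empirical_p_value(toy_stats, observed)
```

The comparison uses "at least as large" because the statistic is high minus low and the alternative says high-elevation caddisflies are bigger.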