import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import ttest_1samp
Quantifying uncertainty
Suggested answers
An article in The Tucson Citizen-Times published in the summer of 2020 claims that the average price per guest (ppg) for properties in Tucson is $100 on Airbnb. To evaluate this claim we will use a dataset of 50 randomly selected Tucson Airbnb listings from July 2024. These data can be found in data/tucson.csv.
Let’s load the packages we’ll use first.
And then the data.
tucson = pd.read_csv("data/tucson.csv")
Hypotheses
- Write out the correct null and alternative hypothesis. Do this in both words and in proper notation.
Null Hypothesis \(H_0\): The true mean price per guest of Airbnb listings in Tucson is $100.
Alternative Hypothesis \(H_1\): The true mean price per guest of Airbnb listings in Tucson is not $100.
In proper notation:
\(H_0: \mu = 100\)
\(H_1: \mu \ne 100\)
Observed data
Our goal is to calculate the probability of a sample statistic at least as extreme as the one observed in our data, if in fact the null hypothesis is true.
- Calculate and report the sample statistic below using proper notation.
# Mean of the price-per-guest column
sample_mean = np.mean(tucson['ppg'])
print(f"Sample Mean (x̄) = {sample_mean}")
Sample Mean (x̄) = 116.24
\[\bar{x} = 116.24\]
The null distribution
Let’s use simulation-based methods to conduct the hypothesis test specified above.
Generate
We’ll start by generating the null distribution.
- Generate the null distribution and call it null_dist.
np.random.seed(4321)
null_dist = np.random.normal(loc=100, scale=tucson['ppg'].std(), size=1000)
- Take a look at null_dist. What does each element in this distribution represent?
null_dist
array([ 50.03477537, 162.65487326, 194.93106499, 65.46579584,
82.74130219, 62.70337221, 51.03913965, 245.24489938,
142.84833267, 26.18667883, 121.52265421, 108.54001302,
152.24213198, 4.65396831, 7.99945009, -7.75783147,
90.65062465, 133.66708015, 183.44169565, 20.14010862,
49.21399216, 91.58607486, 136.20392946, 127.27325708,
121.46585262, 174.31174926, 99.47730758, 150.18901071,
11.8026206 , 57.47695089, -7.57612584, 164.31051874,
213.83596689, 80.60786117, 123.79650421, 4.02649307,
134.75867521, -5.89264938, 11.50108587, 49.97103992,
118.5506322 , 12.38859311, 35.64476738, 31.43368303,
76.72419991, 16.26035182, 228.14269044, 110.62569413,
127.87323838, 111.28257104, 40.76076578, 134.85515917,
267.50017303, 184.828656 , 162.05471212, 51.64389619,
19.71374339, 97.50338125, 174.09883061, 17.18417508,
46.06683466, -26.09261409, 195.74295681, 122.84642852,
43.18323498, 172.21573446, 107.78222391, 81.59440132,
124.91730229, 189.82708578, 43.80534897, 29.3806034 ,
156.47789448, -77.87403112, 169.16026359, 144.36725648,
119.5981452 , 133.98092805, 209.06125973, 88.70135948,
117.69170449, 73.81173556, 159.74481661, 174.71030438,
-11.66276834, 32.85956222, 93.33251141, 197.98806749,
105.76650384, 45.21975612, 89.41257315, 139.59734251,
63.65595664, 140.84097732, 109.02315651, 38.55153372,
179.72332537, 29.21506949, -44.50213291, 82.90718337,
141.34780875, 86.9060758 , 37.02845199, 86.43741944,
56.27832012, -13.18165335, 85.35981833, 82.77992955,
158.49727934, 131.77775737, 174.47820502, -2.97219417,
164.92103389, 78.71476133, 131.3153663 , 69.15511409,
172.94061275, 102.80175097, 165.76724604, -25.92430498,
61.15786494, 79.10043594, 127.42064928, -32.36565577,
147.14067022, 87.15616093, 51.79222151, 223.72790513,
177.16746666, 37.51361428, 93.42769412, 67.56451882,
89.30774363, 87.1367526 , 146.95063271, 132.39549426,
236.04539625, 83.87180643, 181.30906025, 129.52649907,
149.73703062, 88.5063165 , 202.41890758, 90.71484082,
223.20965695, -42.96660578, 66.51272242, 81.76844948,
174.59128854, 28.46072012, -34.34408263, 26.50853357,
57.47881921, 196.36039416, 89.85212574, 202.39710527,
39.44006389, 26.20748416, 152.93419099, 65.73306136,
76.15160163, 110.17581183, 85.01689034, 2.11688942,
69.91651262, 110.46185883, 121.31333953, -18.29999849,
94.85352447, 78.84853585, 149.81572107, 132.61847733,
203.14346401, 69.58899544, 79.39913017, 98.03201902,
187.66433139, 164.97035709, -8.90458706, 91.70685827,
162.15748114, 20.82123933, 109.19986866, 137.73186335,
145.65968199, 105.81663209, 85.70626059, 98.73920164,
35.50367189, 179.10643713, 30.85771642, 70.47482547,
191.0488096 , -39.57463069, 167.4731793 , 86.83301089,
80.59868979, 36.66864536, 125.18761125, 45.18938202,
-43.36100845, 106.57612465, 100.57068298, -52.494337 ,
46.7514217 , 175.34713677, 124.97788937, 167.95041601,
39.91041093, 53.44331349, 90.25055209, 22.85310816,
131.66131385, 199.33591465, 210.41069548, 128.02471404,
96.53704018, 141.49976281, 65.24161702, 213.92567555,
190.6353054 , 198.3140315 , 99.41053935, 78.90655367,
-3.49133821, 38.37484546, 41.02726262, -12.8268799 ,
31.59170686, 80.53614899, 3.87269836, 133.49269003,
89.31375895, 117.01179148, 6.58379686, 102.99658827,
211.33324252, 101.00076715, 48.20206965, 119.37396474,
116.10165816, 59.89927992, 152.07471801, 51.0745799 ,
90.66543854, -8.64560307, 14.93422678, 108.6231274 ,
86.90878462, 100.13816433, 134.3666715 , 83.10956739,
86.66882805, 99.57203663, 110.70916352, 98.29057436,
43.28270013, 156.33266087, 61.40125244, 74.1455775 ,
216.31490751, 187.42522799, 62.51091613, 206.70907811,
-17.06567909, 144.60340096, 72.38409907, 136.72934156,
124.42333655, 208.87529176, 151.4755531 , 43.67291462,
23.36929675, 123.83349443, 27.23303186, 30.05703205,
-11.31270716, 85.62782564, 48.26224309, 39.17121767,
99.96222567, 278.11522916, 100.82030179, 168.8927298 ,
87.89074577, 275.04897954, 131.48766206, 93.44432843,
106.07919479, 64.17541289, 165.64195502, 65.5742781 ,
117.29443561, 113.90627944, 31.3969706 , 97.84244564,
101.334517 , 123.68758695, 193.15530499, 22.48339778,
88.84196308, 48.78782986, 86.6445352 , 65.55026344,
102.58490892, 58.65826137, 25.32552571, 193.43617362,
155.95369616, 32.47891225, 5.71804076, 31.61106213,
110.19373216, 128.62369924, 95.56078661, 48.01804111,
74.47577163, 80.40757825, 170.68864975, 68.03900094,
158.8422413 , 143.23896841, 88.52817827, 77.56306375,
117.8467025 , 117.28056555, 130.41957323, 32.86492458,
136.42513328, 66.02730555, 200.25293066, 149.08539361,
43.72803874, 157.40046199, 123.78666082, 31.96265279,
172.72485542, 25.93192901, 150.6308481 , 98.2712681 ,
124.83190047, 127.07962357, -35.52653712, 142.80969401,
122.21466583, 105.14016077, 233.44644143, 118.36945528,
61.29304433, 83.64753028, 30.9128471 , 198.05904837,
137.42548023, 69.4718388 , 143.05174323, 146.73863655,
138.72374026, 115.81737411, 90.45291363, 145.71051425,
41.20311271, 94.46941847, -20.85079215, -41.47632443,
3.10570214, 166.46391219, 101.8542853 , 234.85516864,
43.30501759, 52.08126949, 177.75474292, 176.71356109,
217.51329628, 38.39256573, 24.41474428, 112.615036 ,
177.50573083, 108.81856212, 90.6318923 , 133.2719098 ,
-29.929483 , 116.24176422, 152.54242865, 291.45160614,
129.42847224, 198.3137503 , 68.09383314, -20.59555998,
270.28893202, 200.33457336, 212.77407594, 116.96194532,
30.00139798, 62.67213666, 49.13354278, 119.04787936,
-46.97854604, 16.21655963, 96.56573161, 131.93326492,
53.34076595, 150.26661473, 19.38815062, 25.90059663,
92.66619765, 115.93991423, 196.94295444, 160.13560498,
109.03785752, 7.28965331, 186.63549917, 37.90613997,
189.2021799 , 157.42513292, 179.95133968, -5.77006256,
260.87012072, 154.51698335, -8.41322528, 5.26891891,
133.59358158, 198.2062956 , 31.7572955 , 138.09565829,
-7.68504696, 122.00149444, 132.99939165, 127.74987029,
8.93047564, 178.58029168, 130.71733298, 179.94840842,
49.57583168, -22.75865631, 110.67777638, 235.39192231,
158.22733683, 89.93216619, 12.43560813, 65.33497757,
57.69195717, 39.47019198, 107.64137403, 146.58757115,
94.10436073, 164.97795405, 79.61158445, 114.73599938,
201.5150552 , 110.36574725, 135.34416344, 121.22243811,
69.34390429, -5.11608558, 188.81754578, 217.03677753,
-22.0996553 , 175.22875008, 42.43590492, -32.44512902,
17.88189511, 58.46764944, 52.4198987 , 134.20034263,
33.47173165, 37.24022915, 157.69505801, -23.73722151,
87.1140624 , 183.29636684, 46.1379621 , 159.26386891,
161.20613825, 45.14862188, 16.0487506 , 95.94162518,
134.73712373, 169.11140365, 20.48836696, 132.66538363,
91.36483052, 10.84498753, 77.60531639, 110.97190111,
109.06567206, 247.46999783, 86.09116702, 105.52675235,
69.91120231, 36.17594383, 159.68320934, 79.40300067,
265.66607899, 125.86884091, 170.45525917, 116.91882114,
122.8187129 , 175.56890559, 18.50856925, 73.47955237,
139.03476665, 45.56103297, 7.24820456, 91.46922022,
36.39206812, 112.58977368, 62.94104308, 26.83956316,
103.9416371 , 95.69756212, 177.85825144, 152.34371351,
202.97497898, 32.72002818, 133.36479191, 173.96950517,
58.97750925, 153.59813041, 51.95969862, 164.00557661,
218.40662049, 149.38959918, 108.35488311, 136.52133245,
117.20174876, 105.76114601, 80.92480655, 161.39572507,
158.22807951, 142.70416329, 60.45988109, 98.99976515,
127.92924561, 93.92355622, 242.38277125, 111.23135704,
167.73167973, 132.77100957, 171.60347096, 118.82480072,
33.47958101, 78.4365225 , 150.05396124, 151.04940056,
86.25553608, 182.73527874, 126.26225316, 122.61339718,
74.58896451, 128.0175515 , 60.88288226, 78.44899522,
186.41969118, 119.18307356, 153.18190205, 98.03931489,
209.1383879 , 158.04898855, 144.0202827 , 133.50632511,
138.30666307, 237.40351383, 121.56707727, 109.66608483,
77.8470543 , -7.60293646, -12.32960326, 112.0408156 ,
65.81992191, 112.99106499, 88.04151642, 77.69303622,
119.23033016, 54.5953792 , 141.9323225 , 80.65855389,
210.779073 , 192.66645026, 82.10695781, 177.91073788,
120.17950195, 104.85750109, 47.58952969, 145.07614208,
160.55601209, 139.04864833, 145.94601569, 130.76173562,
91.72467301, -47.86360162, 182.811149 , 223.25454533,
80.14854246, 47.14023982, 4.64939869, 3.69944265,
123.68580107, 99.96686455, 23.92270615, 141.23723309,
125.72868489, 138.61018087, 86.46938655, -6.09543628,
209.63189481, 48.78625572, -19.77081485, 10.0466482 ,
154.56405452, 140.15028138, 52.11965768, 110.22257002,
99.43793133, 79.33266239, 11.10075585, 39.66061776,
101.05790292, 75.67210489, 85.67878568, 117.56131287,
-44.92581639, 163.72894567, 30.51432303, 287.05274177,
77.18479367, 92.03055246, 131.09192823, 14.36024811,
46.24184697, 199.56150758, 22.60876933, 52.85689941,
61.40021674, 198.74138913, 129.3538796 , 145.41361522,
95.65404806, 121.60533516, 140.08020563, 190.94028775,
98.04756976, 97.04919582, 171.66583254, 121.85377048,
161.9292013 , 123.85621667, -86.34246407, 168.64028814,
39.63822726, 146.40181145, 79.99975457, 91.1729282 ,
165.45501495, -9.31084539, 112.96713139, 188.45579206,
141.49563689, 194.60985876, 47.97512043, 56.49394837,
211.94771074, 154.01587344, 116.67889898, 10.53061668,
119.88697374, 69.07145634, 126.92990266, 219.71016004,
92.76915171, 115.99014122, 146.61942439, -5.41921302,
204.3604502 , 99.29667241, 1.3753202 , 52.41065343,
126.8209805 , 98.56736021, 153.22987457, 222.29932116,
102.87214273, 149.02023916, 21.64149149, 97.5264277 ,
67.89017443, -56.98246263, 37.87858447, 87.36334955,
193.03197207, 119.50264443, 225.52403809, 163.6907018 ,
111.34084682, 82.47209095, 198.2606687 , 44.21108783,
8.83954279, 64.25305435, -13.54524455, 44.02396654,
100.6614043 , 92.54436655, 30.36927832, 89.16700162,
21.29793016, 228.05251142, 138.14357519, 183.41468256,
170.01971047, 150.77410514, 132.24862745, 52.48777558,
107.20575994, 154.8257222 , 115.09865453, 52.87316414,
148.6365115 , 56.40870255, 154.33407766, 47.77764549,
84.39443538, 168.62837302, 154.03621847, 123.1307913 ,
152.81495759, 29.69343183, 151.55996635, 160.66190487,
192.13319232, 117.26433434, 153.57880414, 172.58636579,
51.68457862, 162.06681773, 205.64076059, 92.95599937,
37.5516246 , 136.13685013, 58.37281338, 115.50333122,
46.88615283, -92.88525262, 184.03533564, 80.22977381,
89.90543017, 149.09038527, 127.94080305, 84.73919225,
33.09411859, 65.7363112 , 27.54835241, 166.96508045,
174.9431656 , 83.93113337, 132.58893323, 46.26679525,
85.4289971 , 58.91669165, 90.24579269, 161.03287282,
79.32345616, 125.90835937, 87.54497916, 10.5447459 ,
229.53869337, 131.99298622, 76.71954507, 226.19497054,
60.0459713 , 91.74096382, 231.35454867, -43.78219432,
139.95032452, 83.34817373, 77.84157847, 62.75422266,
97.63543707, 60.1483067 , 80.81659112, 146.09645649,
89.63486314, 114.77963205, 145.18882543, 157.26272551,
213.0383413 , 117.27511248, 194.95828888, 122.56402944,
94.15334215, 9.19722467, 165.54627326, 62.26886157,
66.16969734, 149.1126258 , 85.4417487 , 42.68864231,
161.28995141, 31.57898596, 166.23798717, 162.65247464,
46.07511404, 111.41667369, 56.68838668, 92.92761897,
44.75340845, 218.77920338, 76.64268612, 62.36082462,
142.60048317, 139.05504233, 97.02256758, 155.29692527,
161.41577911, 79.21820276, 31.13981384, 166.27669283,
82.89332738, 96.28340898, 198.62387325, 106.5692619 ,
192.58304601, 123.10009631, 79.63411191, 120.48473844,
62.08372307, 249.85715311, 136.79427319, 13.29827742,
-10.57803885, 49.72492207, -4.15713336, 297.54973298,
86.00102375, 113.11036453, 51.6036414 , 108.25362924,
178.01559651, 232.92963646, 64.6619643 , 137.22871104,
-2.18550518, 57.99695096, 205.88235894, 62.67314752,
79.78822045, 113.99573347, 93.48519209, 88.80112176,
86.9241504 , 126.8065952 , 118.94290879, 120.51990244,
164.3930736 , 174.23977326, 96.20838572, 60.20928801,
16.52112725, 67.06582105, 210.13216131, 104.67267484,
-12.03442876, 37.09508665, 70.77199552, 165.11617635,
219.07980779, 142.82759531, 51.27202576, 164.78036426,
171.07621271, 154.41378671, 111.58150677, 101.87427717,
140.13342838, 210.43886018, 44.80694817, 154.21119467,
115.49700414, 187.81019085, -12.10242778, 165.00484795,
120.77217759, 239.19180246, -1.20825092, 110.40101739,
166.40260651, -99.76291722, 93.71349591, 109.97991264,
77.65466238, 165.11858273, 149.88391579, 90.62321645,
72.37884081, 139.24090685, 140.28683694, 206.28330594,
264.95636459, 98.21565728, 105.73742748, 181.19738883,
115.74809147, 204.62703977, 80.13909664, 48.99109267,
215.17723324, 55.04057618, -6.5407207 , 109.73988255,
131.22313984, 239.56384429, 81.41347579, 191.00363375,
95.46691658, 215.76492035, 116.92735202, 61.9363502 ,
102.8914134 , 75.16653501, 244.31287003, -43.79812565,
63.34672358, 145.6404274 , 67.77330133, 196.01786617,
132.55337106, 179.59494808, 100.82465785, 86.8274616 ,
106.17454612, 69.70623925, 119.35071763, 4.54387517,
153.03310978, 166.62688634, 31.4415788 , 65.45358125,
94.61227476, 159.05595392, 54.2484859 , 10.0904782 ,
-13.51147304, 95.942916 , 136.85399226, 71.49770242,
124.35786213, 103.99998955, 45.00331983, 163.16714273,
169.40416695, 159.88402451, 88.35847288, 69.96860958,
76.22091419, 170.17409284, 197.14733411, -48.17085397,
84.04900777, 34.67567656, 78.52271043, 143.4256792 ,
130.44794273, 158.88849751, 201.48585198, 68.5977069 ,
114.13311534, 123.7190078 , 69.62342198, 205.57850237,
21.32812133, 129.90334648, 67.88402524, 21.23596491,
75.22661864, 90.15695264, 56.05598149, 3.05778085,
171.60250908, 183.51526687, 16.72896952, 54.11277909,
66.58364056, 180.8671317 , 254.68221636, 40.67707914,
104.83667629, 111.75032197, -16.26297054, 132.85242288,
138.6726092 , 99.82635096, 87.96752679, 118.06593876,
-18.79223914, 133.8235922 , 95.5207079 , 91.91877248,
187.02998049, 35.02670179, 92.27980414, 144.56018346])
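Each element of null_dist above is one draw from a normal distribution centered at 100 with the sample's standard deviation. Another common way to build a null distribution is to simulate sample means directly, by shifting the sample so its mean equals the null value and resampling with replacement. The following is a minimal sketch of that idea, not the method used in this walkthrough: the prices are made up (drawn from a gamma distribution), and the names ppg, mu_0, shifted, and boot_means are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4321)

# Hypothetical stand-in for tucson['ppg']: 50 made-up prices, NOT the real data
ppg = rng.gamma(shape=2.0, scale=58.0, size=50)

mu_0 = 100  # value claimed under the null hypothesis
n = len(ppg)

# Shift the sample so that its mean is exactly mu_0
shifted = ppg - ppg.mean() + mu_0

# Resample with replacement and record the mean of each resample
boot_means = np.array([
    rng.choice(shifted, size=n, replace=True).mean()
    for _ in range(1000)
])

print(f"Center of simulated means:   {boot_means.mean():.2f}")
print(f"Spread of simulated means:   {boot_means.std(ddof=1):.2f}")
print(f"Spread of individual prices: {ppg.std(ddof=1):.2f}")
```

In this sketch the simulated means are centered at 100 but vary much less than individual prices do, by roughly a factor of \(\sqrt{n}\).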
Visualize
- Question: Before you visualize the distribution of null_dist, at what value would you expect this distribution to be centered? Why?
At 100, since we created this distribution assuming \(\mu=100\).
- Create an appropriate visualization for your null distribution. Does the center of the distribution match what you guessed in the previous question?
sns.histplot(null_dist, kde=True, palette="colorblind")
plt.xlabel('Sample Mean')
plt.ylabel('Frequency')
plt.title('Null Distribution with Sample Mean')
plt.show()
- Now, add a vertical red line on your null distribution that represents your sample statistic.
sns.histplot(null_dist, kde=True)
plt.axvline(116.24, color='red', linestyle='solid', linewidth=2)
plt.xlabel('Sample Mean')
plt.ylabel('Frequency')
plt.title('Null Distribution with Sample Mean')
plt.show()
- Question: Based on the position of this line, does your observed sample mean appear to be an unusual observation under the assumption of the null hypothesis?
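No. The line at 116.24 falls well within the bulk of the null distribution rather than in its tails, so the observed sample mean does not look unusual under the null hypothesis.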
p-value
Above, we eyeballed how likely/unlikely our observed mean is. Now, let’s actually quantify it using a p-value.
- Question: What is a p-value?
The probability of observing a sample statistic at least as extreme as the one we observed, in the direction of the alternative hypothesis, if in fact the null hypothesis is true.
- Visualize the p-value. Note that the two-sided approach draws two lines: one at the sample mean \(\bar{x}\) and another at its mirror image on the other side of the null value, \(\mu_0 - (\bar{x} - \mu_0)\), here \(100 - (116.24 - 100) = 83.76\).
point_estimate = tucson['ppg'].mean()

# Calculate two-sided p-value
p_value_two_sided = 2 * min((null_dist >= point_estimate).mean(),
                            (null_dist <= point_estimate).mean())
print(f"Two-sided P-value = {p_value_two_sided}")

# Visualize the p-value
sns.histplot(null_dist, kde=True)
plt.axvline(116.24, color='red', linestyle='solid', linewidth=2)
plt.axvline(83.76, color='red', linestyle='dashed', linewidth=2)
plt.xlabel('Sample Mean')
plt.ylabel('Frequency')
plt.title('Null Distribution with P-value')
plt.show()
Two-sided P-value = 0.862
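The doubled-tail calculation and the mirrored line at 83.76 describe the same idea from two angles. The sketch below compares the doubled-tail p-value with a distance-based one, which counts simulated values at least as far from the null value as the observed mean. It is an illustration under stated assumptions: SD = 70 is a hypothetical spread rather than the standard deviation of the real data, and null_sim, p_double, and p_distance are names introduced here.

```python
import numpy as np

rng = np.random.default_rng(4321)

MU_0 = 100         # mean under the null hypothesis
OBSERVED = 116.24  # sample mean reported in the walkthrough
SD = 70.0          # hypothetical spread, NOT computed from the real data

null_sim = rng.normal(MU_0, SD, size=10_000)

# Mirror image of the observed mean on the other side of MU_0
mirror = MU_0 - (OBSERVED - MU_0)

# Definition 1: twice the smaller tail proportion
p_double = 2 * min((null_sim >= OBSERVED).mean(),
                   (null_sim <= OBSERVED).mean())

# Definition 2: proportion of simulated values at least as far
# from MU_0 as the observed mean, in either direction
p_distance = (np.abs(null_sim - MU_0) >= abs(OBSERVED - MU_0)).mean()

print(f"Mirror value:           {mirror:.2f}")
print(f"Doubled-tail p-value:   {p_double:.3f}")
print(f"Distance-based p-value: {p_distance:.3f}")
```

For a symmetric null distribution such as this one, the two definitions should agree up to simulation noise.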
- Your turn: What is the p-value?
\(\text{P-value}=0.862\)
Conclusion
- What is the conclusion of the hypothesis test based on the p-value you calculated? Make sure to frame it in context of the data and the research question. Use a significance level of 5% to make your conclusion.
Since the p-value is larger than the significance level, we fail to reject the null hypothesis. The data do not provide convincing evidence that the average price per guest of properties on Airbnb in Tucson is different from $100.
- Interpret the p-value in context of the data and the research question.
If in fact the true average price per guest of properties on Airbnb in Tucson is $100, the probability of observing a random sample of 50 Tucson Airbnb listings where the average price per guest is $116.24 or higher or $83.76 or lower is 0.862.
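The imports at the top include ttest_1samp from scipy.stats, which the walkthrough does not go on to use. As a theory-based counterpart to the simulation, a one-sample t-test could look like the sketch below. The data here are made up, so the printed values say nothing about the Tucson listings, and ppg and t_manual are hypothetical names.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(4321)

# Hypothetical stand-in for tucson['ppg']: 50 made-up prices, NOT the real data
ppg = rng.gamma(shape=2.0, scale=58.0, size=50)

# Two-sided one-sample t-test of H0: mu = 100
result = ttest_1samp(ppg, popmean=100)

# The same t statistic by hand: (x̄ - mu_0) / (s / sqrt(n))
t_manual = (ppg.mean() - 100) / (ppg.std(ddof=1) / np.sqrt(len(ppg)))

print(f"t statistic (scipy)  = {result.statistic:.3f}")
print(f"t statistic (manual) = {t_manual:.3f}")
print(f"two-sided p-value    = {result.pvalue:.3f}")
```

The test statistic divides by the standard error \(s/\sqrt{n}\), so it measures how unusual the sample mean is relative to the variability of sample means, not of individual listings.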
Get real…
- Question: What we did above was a “toy example” to illustrate a hypothesis test. What would you change to make this a real, more robust analysis?
Increase the number of simulated replicates, to somewhere around 10,000.
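To see why more replicates help, one can repeat the whole simulation many times and look at how much the estimated p-value moves from run to run. The sketch below does this for 1,000 and 10,000 replicates. It rests on assumptions: SD = 70 is a hypothetical spread rather than tucson['ppg'].std(), and the helper names two_sided_p and p_value_spread are introduced here for illustration.

```python
import numpy as np

MU_0 = 100         # mean under the null hypothesis
OBSERVED = 116.24  # sample mean reported in the walkthrough
SD = 70.0          # hypothetical spread, NOT computed from the real data


def two_sided_p(null_dist, observed):
    """Twice the smaller tail proportion, as in the walkthrough."""
    upper = (null_dist >= observed).mean()
    lower = (null_dist <= observed).mean()
    return 2 * min(upper, lower)


def p_value_spread(n_reps, n_repeats=200, seed=0):
    """Standard deviation of the p-value estimate across repeated simulations."""
    rng = np.random.default_rng(seed)
    estimates = [
        two_sided_p(rng.normal(MU_0, SD, size=n_reps), OBSERVED)
        for _ in range(n_repeats)
    ]
    return np.std(estimates)


spread_1k = p_value_spread(1_000)
spread_10k = p_value_spread(10_000)
print(f"Run-to-run spread with 1,000 replicates:  {spread_1k:.4f}")
print(f"Run-to-run spread with 10,000 replicates: {spread_10k:.4f}")
```

The spread shrinks as the number of replicates grows, which is why p-values from small simulations can differ noticeably between runs.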
- Work through the analysis again with these changes.
null_dist_large = np.random.normal(loc=100, scale=tucson['ppg'].std(), size=10000)

point_estimate = tucson['ppg'].mean()
print(f"Point Estimate (Sample Mean) = {point_estimate}")

# One-sided p-value in the direction of the observed difference from the null value of 100
p_value = (null_dist_large >= point_estimate).mean() if point_estimate > 100 else (null_dist_large <= point_estimate).mean()
# Two-sided p-value: twice the smaller tail proportion
p_value_two_sided = 2 * min((null_dist_large >= point_estimate).mean(), (null_dist_large <= point_estimate).mean())
print(f"Two-sided P-value = {p_value_two_sided}")
Point Estimate (Sample Mean) = 116.24
Two-sided P-value = 0.804
# Visualize the new p-value
ax = sns.histplot(null_dist_large, kde=True)
# Height of the KDE curve's peak, used as the top of the shaded region
y_max = ax.get_lines()[0].get_ydata().max()
# Mark the observed sample mean (not the mean of the null distribution)
plt.axvline(point_estimate, color='red', linestyle='dashed', linewidth=2)
# Shade the upper tail at or beyond the observed sample mean
plt.fill_betweenx(y=[0, y_max], x1=point_estimate, x2=null_dist_large.max(), color='red', alpha=0.3)
plt.xlabel('Sample Mean')
plt.ylabel('Frequency')
plt.title('Null Distribution with P-value (Larger Sample)')
plt.show()