8 Distributions and Descriptive Statistics in Python
This tutorial introduces distributions and descriptive statistics in Python using pandas and helper functions that mirror R’s syntax.
8.1 Our Data
## 0 4.5
## 1 5.0
## 2 5.5
## 3 5.0
## 4 5.5
## 5 6.5
## 6 6.5
## 7 6.0
## 8 5.0
## 9 4.0
## dtype: float64
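The printed values above can be reproduced with a small pandas Series. What follows is a minimal sketch, not necessarily the original code: the name `sw` is borrowed from section 8.9, where it labels the observed seawall heights, and the `pandas as p` alias is assumed from the `p.concat` calls used there. The summary variable names are illustrative.

```python
import pandas as p  # alias assumed from the p.concat / p.DataFrame calls in section 8.9

# Ten values copied from the printed output above
sw = p.Series([4.5, 5.0, 5.5, 5.0, 5.5, 6.5, 6.5, 6.0, 5.0, 4.0])

# Basic descriptive statistics using built-in pandas methods
sw_mean = sw.mean()              # arithmetic mean
sw_median = sw.median()          # middle value of the sorted data
sw_sd = sw.std()                 # sample standard deviation (ddof=1 by default)
sw_range = sw.max() - sw.min()   # distance between the extremes

print(sw)
print(sw_mean, sw_median, sw_sd, sw_range)
```

Note that `Series.std()` divides by n - 1, which matches the default of R's `sd()`.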
8.5 Spread (2)
8.6 Shape
8.8 Common Distributions
8.9 Comparing Distributions
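The comparison in this section relies on the vectors `sw`, `mynorm`, `mypois`, `mygamma`, `myexp`, and `myweibull`. As a rough sketch of how such vectors might be produced, the block below simulates each distribution with NumPy, using method-of-moments parameters derived from the observed data. Everything here is an assumption made for illustration: the sample size `n`, the seed, the helper `weibull_shape`, and the choice of method of moments itself may differ from what the tutorial actually uses.

```python
import math
import numpy as np
import pandas as p  # alias assumed from the p.concat calls in this section

# Observed values, copied from the output printed in section 8.1
sw = p.Series([4.5, 5.0, 5.5, 5.0, 5.5, 6.5, 6.5, 6.0, 5.0, 4.0])
m, s = sw.mean(), sw.std()        # sample mean and standard deviation
n = 1000                          # number of simulated draws (arbitrary choice)
rng = np.random.default_rng(42)   # fixed seed so the sketch is reproducible

# Method-of-moments parameters for each candidate distribution
mynorm = rng.normal(loc=m, scale=s, size=n)
mypois = rng.poisson(lam=m, size=n)
myexp = rng.exponential(scale=m, size=n)   # scale is the mean, so rate = 1 / mean
mygamma = rng.gamma(shape=m**2 / s**2, scale=s**2 / m, size=n)

def weibull_shape(cv2, lo=0.1, hi=50.0):
    """Bisect for the Weibull shape k whose squared coefficient of variation is cv2."""
    def gap(k):
        return math.gamma(1 + 2 / k) / math.gamma(1 + 1 / k) ** 2 - 1 - cv2
    for _ in range(100):
        mid = (lo + hi) / 2
        if gap(mid) > 0:   # variation still too large at mid, so k must be larger
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

k = weibull_shape((s / m) ** 2)
scale = m / math.gamma(1 + 1 / k)            # scale that reproduces the observed mean
myweibull = scale * rng.weibull(k, size=n)   # NumPy draws with scale 1, so rescale
```

Matching moments keeps the sketch dependency-light; a maximum-likelihood fit would be a reasonable alternative and may give slightly different parameters.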
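import pandas as p  # alias assumed from the p. prefix used below
from plotnine import ggplot, aes, geom_density, labs

# Stack the observed and simulated values into one long data frame,
# labelling each row with the distribution it came from
mysim = p.concat([
    p.DataFrame({'x': sw, 'type': "Observed"}),
    p.DataFrame({'x': mynorm, 'type': "Normal"}),
    p.DataFrame({'x': mypois, 'type': "Poisson"}),
    p.DataFrame({'x': mygamma, 'type': "Gamma"}),
    p.DataFrame({'x': myexp, 'type': "Exponential"}),
    p.DataFrame({'x': myweibull, 'type': "Weibull"})
])

# Overlay one semi-transparent density curve per distribution
g1 = (ggplot(mysim, aes(x='x', fill='type')) +
      geom_density(alpha=0.5) +
      labs(x='Seawall Height (m)', y='Density (Frequency)',
           subtitle='Which distribution fits best?', fill='Type'))
g1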
## <plotnine.ggplot.ggplot object at 0x000002382AB1F680>