stat
The stat package provides descriptive-statistics functions for numeric lists, covering central tendency, spread, correlation, frequency distribution, and normalization.
Package Functions
Basic Measures
| Function | Returns | Description |
|---|---|---|
| sum(data) | integer\|float | Sum of all values |
| min(data) | integer\|float | Minimum value |
| max(data) | integer\|float | Maximum value |
| range(data) | integer\|float | max - min |
| mean(data) | float | Arithmetic mean |
| median(data) | integer\|float | Middle value (or average of two middle values) |
| mode(data) | list | List of most-frequent values |
| sorted(data) | list | Sorted copy (does not mutate the original) |
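
The median and mode rules in the table can be illustrated with a short Python sketch. This is not the package's implementation: the function bodies, the first-seen ordering of tied modes, and the lack of empty-list handling are all assumptions made for illustration.

```python
from collections import Counter

def median(data):
    """Middle value, or the average of the two middle values for even lengths."""
    ordered = sorted(data)  # sorted() returns a copy; data is left untouched
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def mode(data):
    """All values tied for the highest count (assumed order: first seen)."""
    counts = Counter(data)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

data = [4, 7, 13, 2, 1, 9, 7, 3]
print(median(data))  # 5.5
print(mode(data))    # [7]
```

Returning a list from mode keeps ties representable: a list such as [1, 1, 2, 2, 3] has two modes.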
Spread
| Function | Returns | Description |
|---|---|---|
| variance(data, population?) | float | Sample variance by default; pass true for population variance |
| stdev(data, population?) | float | Sample standard deviation by default; pass true for population standard deviation |
| percentile(data, p) | float | p-th percentile, with p in [0, 100], via linear interpolation |
| zscore(data) | list | List of z-scores (standard scores) |
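
A z-score expresses each value as its distance from the mean in units of standard deviation. The Python sketch below shows the calculation; it is illustrative only, and the use of the sample standard deviation (n - 1 divisor) is an assumption, since the table does not say which divisor zscore uses.

```python
import math

def zscore(data):
    """Standard scores: (x - mean) / standard deviation."""
    n = len(data)
    mean = sum(data) / n
    # Assumption: sample standard deviation (n - 1 divisor).
    stdev = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return [(x - mean) / stdev for x in data]

scores = zscore([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print([round(s, 3) for s in scores])
```

Values equal to the mean score exactly 0, and the scores always sum to zero up to rounding error.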
Bivariate
| Function | Returns | Description |
|---|---|---|
| covariance(x_data, y_data) | float | Sample covariance of two lists |
| correlation(x_data, y_data) | float | Pearson correlation coefficient, in [-1, 1] |
Distribution
| Function | Returns | Description |
|---|---|---|
| frequency(data) | hashmap | Maps each unique value (as string) to its count |
| normalize(data) | list | Rescales values to [0.0, 1.0] range |
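
Both distribution helpers are simple to reason about. The Python sketch below mirrors the behaviour described in the table: counts keyed by the string form of each value, and min-max rescaling. It is not the package's code, and the handling of a constant list (where max equals min) is an assumption, since the source does not specify it.

```python
from collections import Counter

def frequency(data):
    """Count occurrences, keyed by the string form of each value."""
    return dict(Counter(str(value) for value in data))

def normalize(data):
    """Min-max rescaling: (x - min) / (max - min)."""
    low, high = min(data), max(data)
    span = high - low
    if span == 0:
        # Assumption: a constant list maps to all zeros; the source does not say.
        return [0.0 for _ in data]
    return [(x - low) / span for x in data]

print(frequency([1, 2, 2, 3, 3, 3, 4]))  # {'1': 1, '2': 2, '3': 3, '4': 1}
print(normalize([10, 20, 30]))           # [0.0, 0.5, 1.0]
```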
Function Details
percentile(data, p)
Parameters
| Type | Name | Description |
|---|---|---|
| list | data | List of numeric values |
| integer\|float | p | Percentile in [0, 100] |
Returns float
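
Several linear-interpolation conventions exist for percentiles, and the source does not say which one the package uses. The Python sketch below assumes the common one that places the p-th percentile at fractional index (p / 100) * (n - 1) of the sorted data and interpolates between the two neighbouring values. The error raised for an out-of-range p is also an assumption.

```python
import math

def percentile(data, p):
    """p-th percentile via linear interpolation between closest ranks."""
    if not 0 <= p <= 100:
        raise ValueError("p must be in [0, 100]")
    ordered = sorted(data)
    rank = (p / 100) * (len(ordered) - 1)  # fractional index into ordered
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return float(ordered[lower] + fraction * (ordered[upper] - ordered[lower]))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(percentile(data, 50))   # 4.5
print(percentile(data, 75))   # 5.5
print(percentile(data, 100))  # 9.0
```

Under this convention, p = 0 and p = 100 return the minimum and maximum, and p = 50 agrees with the median.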
variance(data, population?) / stdev(data, population?)
Parameters
| Type | Name | Description | Default |
|---|---|---|---|
| list | data | List of numeric values | (none) |
| boolean | population | true for population statistic, false for sample | false |
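
The population flag only changes the divisor: n for the population statistic, n - 1 for the sample statistic. A minimal Python sketch of that rule follows. The function bodies are illustrative, not the package's source.

```python
import math

def variance(data, population=False):
    """Mean squared deviation; divisor is n (population) or n - 1 (sample)."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    divisor = n if population else n - 1
    return squared_deviations / divisor

def stdev(data, population=False):
    """Square root of the matching variance."""
    return math.sqrt(variance(data, population))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(variance(data))        # 4.571... (32 / 7)
print(variance(data, True))  # 4.0      (32 / 8)
print(stdev(data, True))     # 2.0
```

The sample form needs at least two values, because a single-element list makes the n - 1 divisor zero.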
covariance(x_data, y_data) / correlation(x_data, y_data)
Both lists must have the same length.
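
The two bivariate measures are closely related: the Pearson coefficient is the sample covariance divided by the product of the two sample standard deviations. The Python sketch below illustrates this, including the same-length requirement. Raising ValueError on a length mismatch is an assumption, since the source does not say how the package reports it.

```python
import math

def covariance(x_data, y_data):
    """Sample covariance (n - 1 divisor) of two equal-length lists."""
    if len(x_data) != len(y_data):
        raise ValueError("x_data and y_data must have the same length")
    n = len(x_data)
    mean_x = sum(x_data) / n
    mean_y = sum(y_data) / n
    products = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_data, y_data))
    return products / (n - 1)

def correlation(x_data, y_data):
    """Pearson r = cov(x, y) / (s_x * s_y)."""
    # cov(x, x) is the sample variance of x, so its square root is s_x.
    s_x = math.sqrt(covariance(x_data, x_data))
    s_y = math.sqrt(covariance(y_data, y_data))
    return covariance(x_data, y_data) / (s_x * s_y)

print(covariance([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))  # 1.5
print(correlation([1, 2, 3], [2, 4, 6]))             # 1.0
print(correlation([1, 2, 3], [6, 4, 2]))             # -1.0
```

A constant list has zero standard deviation, so the correlation is undefined for it and this sketch would raise ZeroDivisionError.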
Examples
Basic measures
import "stat"
data = [4, 7, 13, 2, 1, 9, 7, 3]
println stat::sum(data) # 46
println stat::min(data) # 1
println stat::max(data) # 13
println stat::range(data) # 12
println stat::mean(data) # 5.75
println stat::median(data) # 5.5
println stat::mode(data) # [7]
Spread
import "stat"
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
println stat::variance(data) # 4.571... (sample)
println stat::variance(data, true) # 4.0 (population)
println stat::stdev(data) # 2.138...
println stat::percentile(data, 25) # 4.0
println stat::percentile(data, 75) # 5.5
scores = stat::zscore(data)
println scores # [-1.403..., -0.467..., ...]
Correlation
import "stat"
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
println stat::covariance(x, y) # 1.5
println stat::correlation(x, y) # 0.774...
Frequency and normalization
import "stat"
data = [1, 2, 2, 3, 3, 3, 4]
freq = stat::frequency(data)
println freq # { "1": 1, "2": 2, "3": 3, "4": 1 }
normalized = stat::normalize(data)
println normalized # [0.0, 0.333..., 0.333..., 0.666..., 0.666..., 0.666..., 1.0]