`stat`

The stat package provides descriptive statistics functions for numeric lists: measures of central tendency, spread, distribution, correlation, and normalization.

Package Functions

Basic Measures

| Function | Returns | Description |
| --- | --- | --- |
| `sum(data)` | `integer\|float` | Sum of all values |
| `min(data)` | `integer\|float` | Minimum value |
| `max(data)` | `integer\|float` | Maximum value |
| `range(data)` | `integer\|float` | `max - min` |
| `mean(data)` | `float` | Arithmetic mean |
| `median(data)` | `integer\|float` | Middle value (or average of the two middle values) |
| `mode(data)` | `list` | List of most-frequent values |
| `sorted(data)` | `list` | Sorted copy (does not mutate the original) |
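
The `median` and `mode` rows are easiest to follow with a concrete case. The Python sketch below is an illustration only, not the package's implementation; `median_of` and `modes_of` are hypothetical helper names, and the ordering of tied modes is an assumption. It mirrors the table: the median averages the two middle values of an even-length list, and the mode comes back as a list so ties can be reported.

```python
from collections import Counter

def median_of(data):
    """Middle value, or the average of the two middle values."""
    ordered = sorted(data)   # sorted() returns a copy; data is left untouched
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def modes_of(data):
    """All values sharing the highest count, in order of first appearance."""
    counts = Counter(data)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

data = [4, 7, 13, 2, 1, 9, 7, 3]
print(median_of(data))          # 5.5
print(modes_of(data))           # [7]
print(modes_of([1, 1, 2, 2]))   # [1, 2]  (a tie yields several modes)
```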

Spread

| Function | Returns | Description |
| --- | --- | --- |
| `variance(data, population?)` | `float` | Sample variance by default; pass `true` for population variance |
| `stdev(data, population?)` | `float` | Sample standard deviation by default; pass `true` for population standard deviation |
| `percentile(data, p)` | `float` | `p`-th percentile, with `p` in `[0, 100]`, computed by linear interpolation |
| `zscore(data)` | `list` | List of z-scores (standard scores) |
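
A z-score is `(x - mean) / stdev`. The table does not say whether `zscore` divides by the sample or the population standard deviation, so the Python sketch below takes that choice as a parameter. It is an illustration only; `zscores` is a hypothetical helper, not the package's code.

```python
import math

def zscores(data, population=False):
    """Standard scores (x - mean) / stdev for each x in data."""
    n = len(data)
    mean = sum(data) / n
    squared = sum((x - mean) ** 2 for x in data)
    # Sample statistics divide by n - 1, population statistics by n.
    stdev = math.sqrt(squared / (n if population else n - 1))
    return [(x - mean) / stdev for x in data]

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print([round(z, 2) for z in zscores(data)])
# [-1.4, -0.47, -0.47, -0.47, 0.0, 0.0, 0.94, 1.87]
print(zscores(data, population=True))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

Either way, the resulting scores have a mean of zero; only their scale differs between the two conventions.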

Bivariate

| Function | Returns | Description |
| --- | --- | --- |
| `covariance(x_data, y_data)` | `float` | Sample covariance of two lists |
| `correlation(x_data, y_data)` | `float` | Pearson correlation coefficient, in `[-1, 1]` |

Distribution

| Function | Returns | Description |
| --- | --- | --- |
| `frequency(data)` | `hashmap` | Maps each unique value (as a string) to its count |
| `normalize(data)` | `list` | Rescales values to the `[0.0, 1.0]` range |

Function Details

`percentile(data, p)`

Parameters

| Type | Name | Description |
| --- | --- | --- |
| `list` | `data` | List of numeric values |
| `integer\|float` | `p` | Percentile in `[0, 100]` |

Returns `float`
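
"Linear interpolation" can be defined in several ways. The Python sketch below assumes one common convention: the fractional rank is `p / 100 * (n - 1)` over the sorted data, and the result is interpolated between the two neighboring values. Whether `stat::percentile` uses exactly this convention is an assumption; the function here is an illustration, not the package's code.

```python
def percentile(data, p):
    """p-th percentile (0 <= p <= 100) by linear interpolation between ranks."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be in [0, 100]")
    ordered = sorted(data)
    rank = (p / 100) * (len(ordered) - 1)    # fractional index into ordered
    lower = int(rank)                        # index at or just below the rank
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(percentile(data, 75))                         # 5.5
print(percentile(data, 50))                         # 4.5
print(percentile(data, 0), percentile(data, 100))   # 2.0 9.0
```

Under this convention, `p = 0` and `p = 100` return the minimum and maximum, and `p = 50` agrees with the median.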

`variance(data, population?)` / `stdev(data, population?)`

Parameters

| Type | Name | Description | Default |
| --- | --- | --- | --- |
| `list` | `data` | List of numeric values | none (required) |
| `boolean` | `population` | `true` for population statistic, `false` for sample | `false` |
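
The `population` flag changes only the divisor: the sum of squared deviations is divided by `n - 1` for the sample statistic and by `n` for the population statistic. A minimal Python sketch of that difference follows; it is an illustration of the standard formulas, not the package's implementation, and the error handling is an assumption.

```python
import math

def variance(data, population=False):
    """Mean squared deviation; divisor is n (population) or n - 1 (sample)."""
    n = len(data)
    if n < (1 if population else 2):
        raise ValueError("not enough data points")
    mean = sum(data) / n
    squared = sum((x - mean) ** 2 for x in data)
    return squared / (n if population else n - 1)

def stdev(data, population=False):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(variance(data, population))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(variance(data))          # 4.571...  (32 / 7)
print(variance(data, True))    # 4.0       (32 / 8)
print(stdev(data, True))       # 2.0
```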

`covariance(x_data, y_data)` / `correlation(x_data, y_data)`

Both lists must have the same length.
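
To show how the two functions relate, here is a Python sketch of the textbook definitions: sample covariance divides the sum of deviation products by `n - 1`, and the Pearson coefficient scales that covariance by both standard deviations. This is an illustration rather than the package's code, and raising an error on mismatched lengths is an assumption about how the length requirement is enforced.

```python
import math

def covariance(xs, ys):
    """Sample covariance: sum of deviation products divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("both lists must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return products / (n - 1)

def correlation(xs, ys):
    """Pearson r: covariance scaled by both sample standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs))   # covariance of a list with itself is its variance
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(covariance(x, y))              # 1.5
print(round(correlation(x, y), 3))   # 0.775
```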

Examples

import "stat"

data = [4, 7, 13, 2, 1, 9, 7, 3]

println stat::sum(data)        # 46
println stat::min(data)        # 1
println stat::max(data)        # 13
println stat::range(data)      # 12
println stat::mean(data)       # 5.75
println stat::median(data)     # 5.5
println stat::mode(data)       # [7]

Spread

import "stat"

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

println stat::variance(data)              # 4.571... (sample)
println stat::variance(data, true)        # 4.0      (population)
println stat::stdev(data)                 # 2.138...

println stat::percentile(data, 25)        # 4.0
println stat::percentile(data, 75)        # 5.5

scores = stat::zscore(data)
println scores   # approx. [-1.40, -0.47, ...] if the sample standard deviation is used

Correlation

import "stat"

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

println stat::covariance(x, y)    # 1.5
println stat::correlation(x, y)   # 0.774...

Frequency and normalization

import "stat"

data = [1, 2, 2, 3, 3, 3, 4]

freq = stat::frequency(data)
println freq    # { "1": 1, "2": 2, "3": 3, "4": 1 }

normalized = stat::normalize(data)
println normalized  # [0.0, 0.333..., 0.333..., 0.666..., 0.666..., 0.666..., 1.0]
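
For readers who want to check these numbers by hand, the Python sketch below shows min-max rescaling, `(x - min) / (max - min)`, and a count keyed by each value's string form. It is an illustration only, not the package's implementation; in particular, mapping a constant list to all zeros is an assumption, since the behavior for `max == min` is not documented here.

```python
def normalize(data):
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    low, high = min(data), max(data)
    span = high - low
    if span == 0:
        # All values are identical; returning zeros is one possible convention.
        return [0.0 for _ in data]
    return [(x - low) / span for x in data]

def frequency(data):
    """Count occurrences, keyed by each value's string form."""
    counts = {}
    for x in data:
        key = str(x)
        counts[key] = counts.get(key, 0) + 1
    return counts

data = [1, 2, 2, 3, 3, 3, 4]
print([round(v, 3) for v in normalize(data)])
# [0.0, 0.333, 0.333, 0.667, 0.667, 0.667, 1.0]
print(frequency(data))
# {'1': 1, '2': 2, '3': 3, '4': 1}
```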
