csv
The csv package provides functions for parsing CSV (comma-separated value) data.
The algorithm conforms to RFC 4180.
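For readers unfamiliar with RFC 4180, its quoting rules can be seen with Python's standard csv module. This is only an illustration of the format, not of this package's implementation, and the sample data is made up for the example.

```python
import csv
import io

# One sample covering the three RFC 4180 quoting rules:
#   - a field containing the delimiter is wrapped in double quotes
#   - a double quote inside a quoted field is written as two double quotes
#   - a quoted field may contain a line break
sample = 'name,comment\r\n"Smith, Jo","said ""hi"""\r\nLee,"line one\nline two"\r\n'

# newline="" keeps the line endings untouched so the reader sees them as written
rows = list(csv.reader(io.StringIO(sample, newline="")))
for row in rows:
    print(row)
```

The quoted comma and the embedded line break stay inside their fields instead of starting a new field or row.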
Package Functions
parse(input, delimiter)
Parses a CSV string into a list of rows (each row is a list of string fields).
Parameters
| Type | Name | Description |
|---|---|---|
| string | input | The CSV data. |
| string | delimiter | The delimiter to use. Optional; defaults to ",". |
Returns
| Type | Description |
|---|---|
| list | A list of lists representing rows and columns. |
Example
import "csv"
var data = "name,age\nAlice,30\nBob,25"
var rows = csv::parse(data)
# rows[0] is the header row
println rows[0]
# ["name", "age"]
# iterate over data rows, skipping the header
for row in rows[1:] do
  println row[0] + " is " + row[1] + " years old"
end
# Alice is 30 years old
# Bob is 25 years old
parse_file(file_path, delimiter)
Parses a CSV file into a list of rows.
Parameters
| Type | Name | Description |
|---|---|---|
| string | file_path | The path to a file. |
| string | delimiter | The delimiter to use. Optional; defaults to ",". |
Returns
| Type | Description |
|---|---|
| list | A list of lists representing rows and columns. |
Example
import "csv"
# Parse a comma-delimited file
var rows = csv::parse_file("/data/employees.csv")
var headers = rows[0]
println "Columns: " + headers.join(", ")
for row in rows[1:] do
  println row.join(" | ")
end
# Parse a tab-delimited file
var tsv_rows = csv::parse_file("/data/report.tsv", "\t")
println "Row count: " + (tsv_rows.size() - 1).to_string()
to_maps(input, delimiter)
Parses a CSV string into a list of hashmaps, using the first row as header keys.
Parameters
| Type | Name | Description |
|---|---|---|
| string | input | The CSV data. |
| string | delimiter | The delimiter to use. Optional; defaults to ",". |
Returns
| Type | Description |
|---|---|
| list | A list of hashmaps mapping header names to field values. |
Example
import "csv"
var data = "name,age\nAlice,30\nBob,25"
var records = csv::to_maps(data)
for r in records do
  println r["name"] + " is " + r["age"] + " years old"
end
# Alice is 30 years old
# Bob is 25 years old
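The header-to-hashmap mapping described for to_maps can be modelled in a few lines of Python. This is a sketch of the documented behaviour, not the package's code; the Python function name and the handling of rows shorter or longer than the header (zip simply truncates) are assumptions.

```python
import csv
import io

def to_maps(text, delimiter=","):
    """Model: the first row supplies the keys, every later row becomes a dict."""
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    if not rows:
        return []
    headers, data_rows = rows[0], rows[1:]
    # Assumption: ragged rows are truncated to the shorter of header/row.
    return [dict(zip(headers, row)) for row in data_rows]

records = to_maps("name,age\nAlice,30\nBob,25")
for record in records:
    print(record["name"], record["age"])
```

Field values stay strings in this model, which matches the string concatenation in the example above.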
file_to_maps(file_path, delimiter)
Parses a CSV file into a list of hashmaps, using the first row as header keys.
Parameters
| Type | Name | Description |
|---|---|---|
| string | file_path | The path to a file. |
| string | delimiter | The delimiter to use. Optional; defaults to ",". |
Returns
| Type | Description |
|---|---|
| list | A list of hashmaps mapping header names to field values. |
Throws
string: if the file does not exist.
Example
import "csv"
var records = csv::file_to_maps("/data/employees.csv")
for r in records do
  println r["name"] + " earns " + r["salary"]
end
pipeline()
Creates a CsvPipeline builder for fluent, declarative CSV parsing. Chain configuration methods to describe your sources, then call .parse() to execute.
Returns
| Type | Description |
|---|---|
| CsvPipeline | A new pipeline builder instance. |
See CsvPipeline below for the full builder API and examples.
CsvPipeline
A fluent builder returned by csv::pipeline(). Configure one or more CSV sources, then call .parse() to get structured data.
All builder methods return self so calls can be chained.
Schema types
Pass schema types as strings to .with_schema():
| String | Coercion |
|---|---|
| "string" | left as-is; empty fields stay "", never null |
| "string?" / "nullable_string" | non-empty fields left as-is; empty fields become null |
| "integer" / "int" | parsed as integer; empty fields become null |
| "float" / "double" / "number" | parsed as float; empty fields become null |
| "boolean" / "bool" | true for "true", "1", "yes"; false otherwise |
| "date" | parsed via DateTime.TryParse; empty fields become null |
| "null" / "none" | always null |
| anything else | left as-is |
"string" vs "string?": All non-string typed columns return null for empty CSV fields. String columns are the exception: "string" preserves empty fields as "", so r["col"] != null is always true. Use "string?" when you need to distinguish a present value from a missing one and want ?? or != null checks to work:
# "string": empty field stays "", so ?? never fires
schema = ["string", "integer", "string"]
# r["notes"] ?? "none" → always returns r["notes"], even when it's ""
# "string?": empty field becomes null, so ?? fires
schema = ["string?", "integer", "string?"]
# r["notes"] ?? "none" → returns "none" when the field was empty
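The coercion table can be read as a single function from (field, type name) to a value. The Python sketch below models that reading; it is not the package's implementation. Several details are assumptions: boolean matching is exact and case-sensitive and follows the table literally (so an empty field gives false), unparseable numbers raise an error, and datetime.fromisoformat stands in for DateTime.TryParse.

```python
from datetime import datetime

def coerce(field, type_name):
    """Model of the schema-type table; None plays the role of null."""
    if type_name == "string":
        return field                      # empty stays ""
    if type_name in ("string?", "nullable_string"):
        return field if field != "" else None
    if type_name in ("null", "none"):
        return None
    if type_name in ("boolean", "bool"):
        # Assumption: exact match, following the table's "false otherwise".
        return field in ("true", "1", "yes")
    if type_name in ("integer", "int"):
        return int(field) if field != "" else None
    if type_name in ("float", "double", "number"):
        return float(field) if field != "" else None
    if type_name == "date":
        # Assumption: ISO dates only; the real parser accepts more formats.
        return datetime.fromisoformat(field) if field != "" else None
    return field                          # anything else: left as-is

print(coerce("", "string") is None)   # False: "" is preserved
print(coerce("", "string?") is None)  # True: empty becomes null
```

The last two lines show the practical difference between "string" and "string?": only the nullable form lets a null check detect an empty field.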
.with_headers(headers)
Supplies custom column headers. When set, every row in the file is treated as a data row (no header row is skipped).
| Type | Name | Description |
|---|---|---|
| list | headers | List of column name strings. |
.with_schema(schema)
Sets per-column type coercion for the next source registered with .from_file() or .from_string(). Each element applies to the column at the same index; extra entries are ignored, and columns without an entry are left as-is.
| Type | Name | Description |
|---|---|---|
| list | schema | List of type-name strings. |
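The index-based matching of schema entries to columns can be sketched in Python as follows. This models one reading of the rule that extra or missing entries are ignored: extra schema entries have no effect, and columns beyond the end of the schema keep their raw string. The helper names are hypothetical, and only the "integer" type is modelled to keep the example short.

```python
def to_int_or_keep(field, type_name):
    """Tiny stand-in coercer: only "integer" is modelled here."""
    if type_name == "integer":
        return int(field) if field != "" else None
    return field

def apply_schema(row, schema):
    """Coerce each field with the schema entry at the same index, if any."""
    coerced = []
    for index, field in enumerate(row):
        if index < len(schema):
            coerced.append(to_int_or_keep(field, schema[index]))
        else:
            # Assumption: no schema entry means the field is left as-is.
            coerced.append(field)
    return coerced

# Schema shorter than the row: the third column is untouched.
short_schema = apply_schema(["Alice", "30", "x"], ["string", "integer"])
# Schema longer than the row: the extra "float" entry is ignored.
long_schema = apply_schema(["Bob", "25"], ["string", "integer", "float"])
print(short_schema, long_schema)
```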
.with_delimiter(delimiter)
Overrides the field delimiter for the next source registered with .from_file() or .from_string() (default ",").
| Type | Name | Description |
|---|---|---|
| string | delimiter | The delimiter character. |
.from_file(path)
Registers a CSV file using the current pending settings, then resets them for the next file. Call .as_dataset() after this to name the result.
| Type | Name | Description |
|---|---|---|
| string | path | Path to the CSV file. |
.from_string(data)
Registers an in-memory CSV string as a source using the current pending settings. It behaves like .from_file() but needs no file, which is useful for tests, fixtures, or data received from an API.
| Type | Name | Description |
|---|---|---|
| string | data | The CSV-formatted string to parse. |
.as_dataset(name)
Names the most recently registered file. The name becomes the key in the hashmap returned by .parse(). Must be called after .from_file().
| Type | Name | Description |
|---|---|---|
| string | name | Dataset key name. |
Throws if called before any .from_file().
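The way pending settings are consumed by .from_file() and the naming rule of .as_dataset() can be modelled with a small Python builder. This is a sketch of the behaviour described above, not the real CsvPipeline; the class name, the dictionary layout, the error type, and the file names are all hypothetical.

```python
class PipelineModel:
    """Toy model: settings apply to the next source only, then reset."""

    def __init__(self):
        self._pending = self._defaults()
        self.sources = []

    @staticmethod
    def _defaults():
        return {"headers": None, "schema": None, "delimiter": ","}

    def with_schema(self, schema):
        self._pending["schema"] = schema
        return self  # every builder method returns self for chaining

    def with_delimiter(self, delimiter):
        self._pending["delimiter"] = delimiter
        return self

    def from_file(self, path):
        # Snapshot the pending settings for this source, then reset them.
        self.sources.append({"path": path, "name": None, **self._pending})
        self._pending = self._defaults()
        return self

    def as_dataset(self, name):
        if not self.sources:
            raise ValueError("as_dataset() called before any from_file()")
        self.sources[-1]["name"] = name  # names the most recent source
        return self

p = (PipelineModel()
     .with_schema(["string", "integer"])
     .with_delimiter("\t")
     .from_file("a.tsv").as_dataset("first")
     .from_file("b.csv"))

print(p.sources[0]["delimiter"] == "\t")  # settings applied to a.tsv
print(p.sources[1]["schema"] is None)     # b.csv starts from defaults
```

In this model the second file does not inherit the schema or delimiter of the first, so each source needs its own .with_schema() and .with_delimiter() calls.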
.parse()
Executes the pipeline and returns the parsed results.
| Condition | Return type |
|---|---|
| Single unnamed file | list of hashmaps |
| Multiple unnamed files | list of lists of hashmaps |
| All files named as datasets | hashmap keyed by dataset name |
| Mixed named/unnamed | throws a pipeline error |
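The four cases in the table amount to a small decision rule, sketched here in Python. This models the documented return shapes only; the function name, the (name, rows) tuple representation, and the error type are assumptions, and the case of a pipeline with no sources is not covered by the table.

```python
def shape_results(sources):
    """sources: list of (name_or_None, rows) pairs in registration order."""
    if not sources:
        # Not covered by the table; this sketch refuses to guess.
        raise ValueError("no sources registered")
    named = [name is not None for name, _ in sources]
    if all(named):
        return {name: rows for name, rows in sources}   # hashmap by name
    if any(named):
        raise ValueError("pipeline error: mixed named and unnamed sources")
    if len(sources) == 1:
        return sources[0][1]                            # list of hashmaps
    return [rows for _, rows in sources]                # one list per file

single = shape_results([(None, [{"a": 1}])])
several = shape_results([(None, [{"a": 1}]), (None, [{"b": 2}])])
named = shape_results([("orders", [{"a": 1}]), ("products", [{"b": 2}])])
print(type(single).__name__, type(several).__name__, type(named).__name__)
```

Because the return type depends on how the sources were registered, calling code should use one style consistently within a pipeline.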
Pipeline Examples
Single file with schema
import "csv"
rows = csv::pipeline()
  .with_schema(["string", "integer", "boolean", "float"])
  .from_file("users.csv")
  .parse()
for r in rows do
  name = r["name"]
  id = r["id"] # integer
  active = r["active"] # boolean
  score = r["score"] # float
  println "${name} (${id}): active=${active}, score=${score}"
end
Nullable string columns with "string?"
Use "string?" for columns that may be empty in the CSV and where you want null rather than "":
import "csv"
# Weapon Desc is blank when no weapon was used.
# "string?" makes it null so != null and ?? work correctly.
schema = ["integer", "string", "string?", "float"]
rows = csv::pipeline()
  .with_schema(schema)
  .from_file("incidents.csv")
  .parse()
for r in rows do
  weapon = r["weapon_desc"] ?? "none"
  println "${r["id"]}: ${weapon}"
end
In-memory CSV with from_string
Parse a CSV string directly, with no temporary file:
import "csv"
data = "name,score\nAlice,95\nBob,82\nCarol,91"
rows = csv::pipeline()
  .with_schema(["string", "integer"])
  .from_string(data)
  .parse()
for r in rows do
  println "${r["name"]}: ${r["score"]}"
end
Custom headers (no header row in file)
import "csv"
rows = csv::pipeline()
  .with_headers(["name", "id", "active", "score"])
  .with_schema(["string", "integer", "boolean", "float"])
  .from_file("users_noheader.csv")
  .parse()
Tab-delimited file
Call .with_delimiter("\t") before .from_file() to parse a tab-delimited file.
Multiple unnamed files
When no datasets are named, multiple files return a list of results, one per file.
import "csv"
all = csv::pipeline()
  .from_file("q1.csv")
  .from_file("q2.csv")
  .from_file("q3.csv")
  .parse()
q1 = all[0]
q2 = all[1]
q3 = all[2]
Named datasets
Name every source with .as_dataset() and .parse() returns a single hashmap keyed by those names.
import "csv"
data = csv::pipeline()
  .with_schema(["string", "integer", "float"])
  .from_file("orders.csv").as_dataset("orders")
  .with_schema(["string", "float"])
  .from_file("products.csv").as_dataset("products")
  .parse()
orders = data["orders"]
products = data["products"]
for o in orders do
  println "Order: ${o["product"]}, qty=${o["qty"]}"
end
Error: mixing named and unnamed sources
If at least one source is named, all must be named; otherwise .parse() throws a pipeline error.