日本語 | Русский | 中文 | 한국어 | Español | Français
filesql is a Go SQL driver that enables you to query CSV, TSV, and LTSV files using SQLite3 SQL syntax. Query your data files directly without any imports or transformations!
This library was born from the experience of maintaining two separate CLI tools - sqly and sqluv. Both tools shared a common feature: executing SQL queries against CSV, TSV, and other file formats.
Rather than maintaining duplicate code across both projects, we extracted the core functionality into this reusable SQL driver. Now, any Go developer can leverage this capability in their own applications!
- 🔍 SQLite3 SQL Interface - Use SQLite3's powerful SQL dialect to query your files
- 📁 Multiple File Formats - Support for CSV, TSV, and LTSV files
- 🗜️ Compression Support - Automatically handles .gz, .bz2, .xz, and .zst compressed files
- 🚀 Zero Setup - No database server required, everything runs in-memory
- 🌍 Cross-Platform - Works seamlessly on Linux, macOS, and Windows
- 💾 SQLite3 Powered - Built on the robust SQLite3 engine for reliable SQL processing
Extension | Format | Description |
---|---|---|
.csv |
CSV | Comma-separated values |
.tsv |
TSV | Tab-separated values |
.ltsv |
LTSV | Labeled Tab-separated Values |
.csv.gz , .tsv.gz , .ltsv.gz |
Gzip compressed | Gzip compressed files |
.csv.bz2 , .tsv.bz2 , .ltsv.bz2 |
Bzip2 compressed | Bzip2 compressed files |
.csv.xz , .tsv.xz , .ltsv.xz |
XZ compressed | XZ compressed files |
.csv.zst , .tsv.zst , .ltsv.zst |
Zstandard compressed | Zstandard compressed files |
go get github.com/nao1215/filesql
For simple file access, use the convenient Open
or OpenContext
functions:
package main
import (
"context"
"fmt"
"log"
"time"
"github.com/nao1215/filesql"
)
func main() {
// Open a CSV file as a database with context
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "data.csv")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Execute SQL query (table name is derived from filename without extension)
rows, err := db.QueryContext(ctx, "SELECT * FROM data WHERE age > 25 ORDER BY name")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
// Process results
for rows.Next() {
var name string
var age int
if err := rows.Scan(&name, &age); err != nil {
log.Fatal(err)
}
fmt.Printf("Name: %s, Age: %d\n", name, age)
}
}
For advanced use cases like embedded files (go:embed
) or custom filesystems, use the Builder pattern:
package main
import (
"context"
"embed"
"io/fs"
"log"
"github.com/nao1215/filesql"
)
//go:embed data/*.csv data/*.tsv
var dataFS embed.FS
func main() {
ctx := context.Background()
// Use Builder pattern for embedded filesystem
subFS, _ := fs.Sub(dataFS, "data")
db, err := filesql.NewBuilder().
AddPath("local_file.csv"). // Regular file
AddFS(subFS). // Embedded filesystem
Build(ctx)
if err != nil {
log.Fatal(err)
}
connection, err := db.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer connection.Close()
defer db.Cleanup() // Clean up temporary files from FS
// Query across files from different sources
rows, err := connection.Query("SELECT name FROM sqlite_master WHERE type='table'")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
// Process results...
}
// Open files with timeout control
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "large_dataset.csv")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query with context for cancellation support
rows, err := db.QueryContext(ctx, "SELECT * FROM large_dataset WHERE status = 'active'")
// Open multiple files in a single database
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "users.csv", "orders.tsv", "products.ltsv")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Join data across different file formats!
rows, err := db.QueryContext(ctx, `
SELECT u.name, o.order_date, p.product_name
FROM users u
JOIN orders o ON u.id = o.user_id
JOIN products p ON o.product_id = p.id
WHERE o.order_date > '2024-01-01'
`)
// Open all supported files in a directory (recursive)
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "/path/to/data/directory")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query all loaded tables
rows, err := db.QueryContext(ctx, "SELECT name FROM sqlite_master WHERE type='table'")
// Automatically handles compressed files
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "large_dataset.csv.gz", "archive.tsv.bz2")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query compressed data seamlessly
rows, err := db.QueryContext(ctx, "SELECT COUNT(*) FROM large_dataset")
filesql automatically derives table names from file paths:
// Table naming examples:
// "users.csv" -> table name: "users"
// "data.tsv" -> table name: "data"
// "logs.ltsv" -> table name: "logs"
// "archive.csv.gz" -> table name: "archive"
// "backup.tsv.bz2" -> table name: "backup"
// "/path/to/sales.csv" -> table name: "sales"
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "employees.csv", "departments.tsv.gz")
if err != nil {
log.Fatal(err)
}
// Use the derived table names in queries
rows, err := db.QueryContext(ctx, `
SELECT * FROM employees
JOIN departments ON employees.dept_id = departments.id
`)
Since filesql uses SQLite3 as its underlying engine, all SQL syntax follows SQLite3's SQL dialect. This includes:
- Functions (e.g.,
date()
,substr()
,json_extract()
) - Window functions
- Common Table Expressions (CTEs)
- And much more!
INSERT
,UPDATE
, andDELETE
operations affect the in-memory database- Original files remain unchanged by default - filesql doesn't modify your source files unless you use auto-save
- You can use auto-save to automatically persist changes to files on close or commit
- This makes it safe to experiment with data transformations while providing optional persistence
Since filesql uses SQLite3, you can leverage its full power:
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "employees.csv", "departments.csv")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Use window functions, CTEs, and complex queries
query := `
WITH dept_stats AS (
SELECT
department_id,
AVG(salary) as avg_salary,
COUNT(*) as emp_count
FROM employees
GROUP BY department_id
)
SELECT
e.name,
e.salary,
d.name as department,
ds.avg_salary as dept_avg,
RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary DESC) as rank
FROM employees e
JOIN departments d ON e.department_id = d.id
JOIN dept_stats ds ON e.department_id = ds.department_id
WHERE e.salary > ds.avg_salary * 0.8
`
rows, err := db.QueryContext(ctx, query)
filesql provides auto-save functionality to automatically persist database changes to files. You can choose between two timing options:
Automatically save changes when the database connection is closed (recommended for most use cases):
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// Enable auto-save on close
builder := filesql.NewBuilder().
AddPath("data.csv").
EnableAutoSave("./backup") // Save to backup directory
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
defer validatedBuilder.Cleanup()
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close() // Auto-save triggered here
// Make modifications - they will be automatically saved on close
_, err = db.ExecContext(ctx, "UPDATE data SET status = 'processed' WHERE status = 'pending'")
_, err = db.ExecContext(ctx, "INSERT INTO data (name, status) VALUES ('New Record', 'active')")
Automatically save changes after each transaction commit (for frequent persistence):
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// Enable auto-save on commit - empty string means overwrite original files
builder := filesql.NewBuilder().
AddPath("data.csv").
EnableAutoSaveOnCommit("") // Overwrite original files
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
defer validatedBuilder.Cleanup()
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Each commit will automatically save to files
tx, err := db.BeginTx(ctx, nil)
if err != nil {
log.Fatal(err)
}
_, err = tx.ExecContext(ctx, "UPDATE data SET status = 'processed' WHERE id = 1")
if err != nil {
tx.Rollback()
log.Fatal(err)
}
err = tx.Commit() // Auto-save triggered here
if err != nil {
log.Fatal(err)
}
If you prefer manual control over when to save changes to files instead of using auto-save:
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "data.csv")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Make modifications
_, err = db.ExecContext(ctx, "UPDATE data SET status = 'processed' WHERE status = 'pending'")
if err != nil {
log.Fatal(err)
}
// Export the modified data to a new directory
err = filesql.DumpDatabase(db, "/path/to/output/directory")
if err != nil {
log.Fatal(err)
}
Contributions are welcome! Please see the Contributing Guide for more details.
If you find this project useful, please consider:
- ⭐ Giving it a star on GitHub - it helps others discover the project
- 💝 Becoming a sponsor - your support keeps the project alive and motivates continued development
Your support, whether through stars, sponsorships, or contributions, is what drives this project forward. Thank you!
This project is licensed under the MIT License - see the LICENSE file for details.