A lightweight, web-based viewer for exploring audio data embedded in Parquet files. Built with Rust, Axum, and Polars, this tool allows you to browse, search, and play audio clips directly in your browser. Perfect for data scientists, audio researchers, or anyone working with large-scale audio datasets in columnar formats.
Part of the RustedBytes organization, focusing on high-performance data tools in Rust.
- Parquet Integration: Seamlessly read and query audio metadata from Parquet files using Polars.
- Audio Playback: Embedded HTML5 audio players for in-browser listening, with progress bars and duration display.
- Pagination & Search: Efficiently navigate large datasets with paginated results (configurable page size) and basic filtering.
- Responsive UI: Clean, dark-mode-friendly interface built with vanilla HTML/CSS/JS—no heavy frontend frameworks.
- Concurrent Handling: Leverages Tokio for scalable, async web serving to handle multiple requests efficiently.
- Error-Resilient: Robust error handling with `anyhow` for production-grade reliability.
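
The pagination feature listed above can be illustrated with a small, std-only sketch. The function names `total_pages` and `page_bounds` are hypothetical and are not taken from the project's source; the sketch assumes 1-based page numbers and clamps out-of-range requests to the nearest valid page.

```rust
/// Number of pages needed to show `total` rows (never less than 1,
/// so an empty dataset still has a single, empty page).
/// Hypothetical helper, not part of the project's actual API.
pub fn total_pages(total: usize, page_size: usize) -> usize {
    // Guard against a zero page size to avoid division by zero.
    let size = page_size.max(1);
    ((total + size - 1) / size).max(1)
}

/// Half-open row range `[start, end)` for a 1-based `page`.
/// Pages below 1 or past the last page are clamped into range.
pub fn page_bounds(total: usize, page: usize, page_size: usize) -> (usize, usize) {
    let size = page_size.max(1);
    let page = page.clamp(1, total_pages(total, size));
    let start = (page - 1) * size;
    let end = (start + size).min(total);
    (start, end)
}
```

With the returned bounds, the rows for a page could be taken as a slice such as `&rows[start..end]`, which is empty when the dataset has no rows.
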
The interface displays audio entries with thumbnails, durations, and transcript snippets. Use the pagination controls to browse through results.
- Rust 1.75+ (stable channel)
- A Parquet file containing audio data (e.g., columns for `audio` as bytes/binary, `duration` as f64, `transcript` as String).
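
As a rough illustration of the expected columns, one row could be modeled in memory as below. The `AudioRow` type and its `new` constructor are hypothetical names for this sketch; the checks it performs (non-empty audio, finite non-negative duration) are assumptions, not rules enforced by the project.

```rust
/// Hypothetical in-memory form of one Parquet row, mirroring the
/// example columns: `audio`, `duration`, and `transcript`.
#[derive(Debug, Clone, PartialEq)]
pub struct AudioRow {
    /// Raw encoded audio bytes (binary column).
    pub audio: Vec<u8>,
    /// Clip length in seconds (f64 column).
    pub duration: f64,
    /// Text transcript (String column).
    pub transcript: String,
}

impl AudioRow {
    /// Builds a row, rejecting values a viewer could not display.
    /// These validation rules are assumptions made for the sketch.
    pub fn new(
        audio: Vec<u8>,
        duration: f64,
        transcript: impl Into<String>,
    ) -> Result<Self, String> {
        if audio.is_empty() {
            return Err("audio column is empty".to_string());
        }
        if !duration.is_finite() || duration < 0.0 {
            return Err(format!("invalid duration: {duration}"));
        }
        Ok(Self {
            audio,
            duration,
            transcript: transcript.into(),
        })
    }
}
```
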
- Clone the repository: `git clone https://github.com/RustedBytes/data-viewer-audio.git`, then `cd data-viewer-audio`
- Build the project: `cargo build --release`

The server will start at http://localhost:3000. Open it in your browser to start exploring your audio data.
- Browse Data: The root route (`/`) serves a paginated table of audio entries. Each row includes:
  - An audio player (play/pause, seek, volume).
  - Duration (e.g., "00:06.020").
  - Transcript snippet (truncated for preview).
- Navigation:
  - Use "Prev/Next" buttons or page numbers for large datasets.
  - Adjust the "Page size" dropdown (default: 10) for more or fewer results per page.
- Back to List: From individual audio views, return to the full list.

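
The duration and transcript columns described above can be sketched with two std-only helpers. Both `format_duration` and `snippet` are hypothetical names; the `MM:SS.mmm` layout is inferred from the "00:06.020" example, and the character-based truncation with a trailing ellipsis is an assumption about how the preview might be produced.

```rust
/// Formats a duration in seconds as `MM:SS.mmm`.
/// Negative or NaN inputs are treated as zero (an assumption of this sketch).
pub fn format_duration(seconds: f64) -> String {
    // Work in whole milliseconds to avoid floating-point drift in the parts.
    let total_ms = (seconds.max(0.0) * 1000.0).round() as u64;
    let minutes = total_ms / 60_000;
    let secs = (total_ms % 60_000) / 1000;
    let millis = total_ms % 1000;
    format!("{:02}:{:02}.{:03}", minutes, secs, millis)
}

/// Truncates `text` to at most `max_chars` characters for preview,
/// appending an ellipsis only when something was cut off.
/// Counting `char`s keeps the cut on a valid UTF-8 boundary.
pub fn snippet(text: &str, max_chars: usize) -> String {
    if text.chars().count() <= max_chars {
        return text.to_string();
    }
    let mut out: String = text.chars().take(max_chars).collect();
    out.push('…');
    out
}
```

For example, `format_duration(6.02)` yields `"00:06.020"`, and `snippet("hello world", 5)` yields `"hello…"`.
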
Example CLI output on startup:
Server listening on http://0.0.0.0:3000
This project prioritizes the Rust standard library where possible, with minimal, battle-tested crates:
| Crate | Purpose | Version | 
|---|---|---|
| axum | Async web framework | 0.8.6 | 
| polars | Parquet reading & querying | 0.51.0 (with `parquet`, `dtype-struct`) | 
| tokio | Async runtime | 1.48.0 (full features) | 
| serde | JSON serialization | 1.0.228 (derive) | 
| clap | CLI argument parsing | 4.5.49 (derive) | 
| anyhow | Error handling | 1.0.100 | 
| tokio-util | Async utilities | 0.7.16 (full) | 
See Cargo.toml for full details.
- Testing: Run unit tests with `cargo test`. Coverage includes data loading, pagination, and error paths.
- Formatting: Use `cargo fmt` for code style.
- Linting: Use `cargo clippy` for warnings.
- Building Docs: Use `cargo doc --open` for Rustdoc.
Contributions welcome! See CONTRIBUTING.md (create if needed) for guidelines. Fork from RustedBytes/data-viewer-audio.
This project is licensed under the MIT License - see the LICENSE file for details.
