- Building Unum Cloud since 2015.
- Computer Science & AI researcher (unpublished, by choice).
- Twice an Astrophysics dropout, lifelong Bioinformatics fan.
- Investing in deep-tech, cloud, & semiconductors.
- Fluent in English, Russian & Armenian.
- Lived in 🇺🇸🇬🇧🇷🇺🇦🇲 & 🇲🇽🇵🇦🇦🇷🇩🇪🇦🇪🇹🇭🇲🇾🇻🇳🇮🇩.
- Frequent host of "Systems" meetups in Armenia, and beyond.
For ~20 years, I've been coding in C++, CUDA, and Python, optimizing Assembly on x86 & ARM. Prefer spaces over tabs, east-const, and procedural code over OOP or functional abstractions.
Want to chat?
I'm @ashvardanian on GitHub, LinkedIn, Twitter, Facebook, and YouTube.
For venture, reach me at [email protected]
- USearch - a universal search engine powering many databases, AI labs, and experiments in Natural Sciences. Compact C++ core with 10+ language bindings, 10–100× faster than Meta FAISS for vector search and far beyond Apache Lucene.
- StringZilla - SIMD, SWAR, and CUDA-accelerated string algorithms for search, matching, hashing, and sorting at Web and Bioinformatics scale. Hundreds of hand-tuned kernels with manual multi-versioning, exposed to C, C++, Rust, Python, Swift, and JavaScript, up to 10× faster on CPUs and 100× faster on GPUs.
- SimSIMD - a large collection of mixed-precision vector math kernels for C, Python, Rust, and JavaScript. Designed for linear algebra, scientific computing, statistics, information retrieval, and image processing, delivering consistent SIMD speedups over BLAS and NumPy on both x86 and ARM architectures.
- UCall - a kernel-bypass web server backend for C and Python built on io_uring. Achieves 70× higher throughput and 50× lower latency than FastAPI for real-time workloads, including serving compact AI models.
- UForm - tiny multimodal AI models with state-of-the-art parameter and data efficiency. Compatible with Python, JS, and Swift, serving as a lightweight alternative to OpenAI CLIP for on-device and server inference.
- ForkUnion - ultra-low-latency parallelism library for Rust and C++. Avoids allocations, mutexes, and even Compare-And-Swap atomics, achieving up to 10× speedups over Rayon and TaskFlow.
Some of those are used in ClickHouse, DuckDB, TiDB, ScyllaDB, YugabyteDB, LangChain, Semantic Kernel, Memgraph, Vald, and many other less "open" systems: competitive AI labs, Cloud companies, Fortune 500 firms, iOS and Android apps with 100M-1B MAU, and government agencies. There are also some cool scientific datasets and HPC tutorials you can borrow:
- usearch-molecules - 28 billion fingerprints for drug discovery, published with AWS
- less_slow.cpp - teaches a performance-oriented mindset for C++, CUDA, PTX, and ASM
- less_slow.rs - Rust adaptation with a focus on higher-level abstractions
- less_slow.py - Python adaptation with a focus on scripting & data-management
- SpaceV - 1 billion vectors from Microsoft SpaceV extended for usability
And more demos, benchmarks, and fun hackathon projects:
- UStore - multimodal embedded database for C, C++, and Python designed around key-value stores
- StringWars - micro-benchmarking StringZilla against the best Rust tools
- HashEvals - testing avalanche effect & differential patterns of string hash functions
- ScalingElections - parallel combinatorial voting in CUDA and Mojo for H100 GPUs
- TinySemVer - semantic versioning GitHub CI tool that doesn't take 300K lines of JavaScript
- SwiftSemanticSearch - example of on-device real-time AI using UForm and USearch on iOS
- ParallelReductionsBenchmark - GPGPU benchmarks for SYCL, CUDA, OpenCL, Vulkan, etc.
- LibSee - non-intrusively profiling LibC calls with LD_PRELOAD tricks
- PyBindToGPUs - C++ and CUDA starter kit for Python developers avoiding CMake
- StringTape - Apache Arrow compatible tapes for space-efficient string arrays
- JaccardIndex - exploring CPU port utilization with Carry-Save Adders & Lookups
- USearchBench.py - Billion-scale search benchmarks against FAISS, Weaviate, and Qdrant
- USearchBench.java - Billion-scale search scaling benchmarks against Lucene, using Spark
- ucsb - parallel benchmarks for ACID-compliant key-value stores, like RocksDB
- affine-gaps - "less wrong" local and global Gotoh sequence alignments in one Numba Python file







