vhash is a C++ reimplementation of videohash for detecting near-duplicate videos. It takes a video or image file as input and generates a 64-bit hash value.
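
To illustrate what "near-duplicate" can mean for 64-bit hashes, here is a minimal sketch that compares two hash values by Hamming distance, a common metric for perceptual hashes. Whether vhash uses exactly this metric is an assumption; the function names and the threshold of 4 bits are hypothetical and not taken from the project.

```cpp
#include <bitset>
#include <cstdint>

// Number of bit positions at which two 64-bit hashes differ.
inline int hamming_distance(std::uint64_t a, std::uint64_t b) {
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}

// Hypothetical helper: treat two files as near-duplicates when their
// hashes differ in at most `max_bits` positions. The default of 4 is an
// arbitrary illustration, not a value taken from vhash.
inline bool is_near_duplicate(std::uint64_t a, std::uint64_t b,
                              int max_bits = 4) {
    return hamming_distance(a, b) <= max_bits;
}
```

A smaller threshold reduces false matches but may miss re-encoded or resized copies; a suitable value would have to be chosen empirically.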
- A C++ compiler that supports C++14
- CMake >= 3.11
 
- opencv for image decoding and resizing
- ffmpeg for video decoding and frame extraction
- fftw for the discrete cosine transform (DCT)
- sqlite3 for caching file hash values
- spdlog for logging
 
CentOS

sudo yum install opencv-devel ffmpeg-devel fftw-devel sqlite-devel spdlog-devel

Ubuntu

sudo apt install libopencv-dev libavformat-dev libavcodec-dev libavdevice-dev libavutil-dev libswscale-dev  
sudo apt install libfftw3-dev libsqlite3-dev libspdlog-dev

macOS

brew install opencv@4 ffmpeg@5 fftw sqlite spdlog  
brew link ffmpeg@5

- wavelib for wavelet decomposition
- sqlite_orm for file hash value caching
- cpptqdm for tqdm like progress bar
- CLI11 for command line parsing
 
git clone https://github.com/helloall1900/vhash.git  
cd vhash  
make

bin/vhash hash tests/testdata/lena.png

- googletest for unit testing
- google benchmark for benchmarking
 
CentOS

sudo yum install gtest-devel google-benchmark-devel

Ubuntu

sudo apt install libgtest-dev libbenchmark-dev

macOS

brew install googletest google-benchmark

- Generate the hash value of a single file or of all files in a directory.
- Store each file's hash value in a database cache to speed up hash generation.
- Find duplicate video or image files in a directory.
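
The caching feature implies that stored hashes eventually go stale; the `vhash cache` help lists a `-P,--pure-period` option with a default of 604800 seconds (7 days). The following is a minimal sketch of how such an expiry check might look. The `CacheEntry` struct, its field names, and the `is_expired` function are hypothetical and do not reflect vhash's actual schema or code.

```cpp
#include <cstdint>

// Hypothetical cache record; field names are illustrative only.
struct CacheEntry {
    std::uint64_t hash;       // 64-bit hash of the file
    std::int64_t  stored_at;  // Unix time (seconds) when the entry was written
};

// Mirrors the 604800-second default shown for --pure-period (7 days).
constexpr std::int64_t kDefaultPeriodSeconds = 604800;

// An entry counts as expired once it is older than `period` seconds.
inline bool is_expired(const CacheEntry& entry, std::int64_t now,
                       std::int64_t period = kDefaultPeriodSeconds) {
    return now - entry.stored_at > period;
}
```

Passing the current time as a parameter, rather than reading the clock inside the function, keeps the check easy to unit test.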
 
Generating hash for video or image files
Usage: vhash hash [OPTIONS] path  
Positionals:  
path TEXT:PATH(existing) REQUIRED file or directory path  
Options:  
-h,--help                   Print this help message and exit  
-e,--ext TEXT ...           file extension filter (i.e. -e mp4,mkv)  
-c,--cache TEXT             cache file or url  
-o,--output TEXT            output file  
-C,--use-cache              use cache  
-r,--recursive              recursively find files  
-P,--no-progress            not print progress bar

bin/vhash hash -C -o hash.txt some_dir_path

Operating on hash cache
Usage: vhash cache [OPTIONS] [path]  
Positionals:  
path TEXT                     full file path  
Options:  
-h,--help                     Print this help message and exit  
-c,--cache TEXT               cache file or url  
-f,--find                     find cache item  
-d,--del                      delete cache item  
-C,--clear                    clear all hash cache  
-p,--pure                     pure expired hash cache  
-P,--pure-period INT [604800] pure period in seconds

bin/vhash cache -f some_file_path

Finding duplicate video or image files
Usage: vhash dup [OPTIONS] [path]  
Positionals:  
path TEXT:PATH(existing)    file or directory path  
Options:  
-h,--help                   Print this help message and exit  
-e,--ext TEXT ...           file extension filter (i.e. -e mp4,mkv)  
-c,--cache TEXT             cache file or url  
-o,--output TEXT            output file  
-C,--use-cache              use cache  
-r,--recursive              recursively find files  
-P,--no-progress            not print progress bar

bin/vhash dup -C -o dup.txt some_dir_path

- videohash for video hash.
- imagehash for image hash.
- fastimagehash for C++ implementation of image hash.
 
Copyright (c) 2023 Leo. See LICENSE for details.
