
zhenglab/LPAH


LPAH: Language-guided Patch Aggregation Hashing for Fine-grained Image Retrieval

Official PyTorch implementation of the paper "LPAH: Language-guided Patch Aggregation Hashing for Fine-grained Image Retrieval" (PRCV 2025).


Overview

LPAH uses automatically synthesized textual semantics to guide token aggregation in ViT, preserving discriminative visual patterns for better hashing performance. The framework introduces three key modules (CLA, CPA, and PWA) that improve the model's ability to capture the subtle visual distinctions that fine-grained image retrieval depends on. The method is trained and evaluated on common fine-grained datasets, including CUB-200-2011, Stanford Dogs, FGVC-Aircraft, and VegFru.
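For readers new to hashing-based retrieval, the sketch below shows how hash codes are typically used at query time: real-valued outputs are thresholded into binary codes, and database items are ranked by Hamming distance to the query. This is a generic illustration, not LPAH's code; the function names and the zero threshold are assumptions made for the example.

```python
# Generic illustration of hash-based retrieval; not taken from the LPAH code.


def binarize(features):
    """Map real-valued outputs to a {0, 1} hash code by thresholding at zero."""
    return [1 if value > 0 else 0 for value in features]


def hamming_distance(code_a, code_b):
    """Count the positions where two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


def rank_database(query_code, database_codes):
    """Return database indices sorted by Hamming distance to the query."""
    distances = [hamming_distance(query_code, code) for code in database_codes]
    return sorted(range(len(database_codes)), key=lambda i: distances[i])


if __name__ == "__main__":
    query = binarize([0.9, -0.3, 0.1, -0.7])    # [1, 0, 1, 0]
    database = [
        binarize([-0.5, 0.2, -0.1, 0.8]),       # [0, 1, 0, 1], distance 4
        binarize([0.4, -0.6, 0.3, -0.2]),       # [1, 0, 1, 0], distance 0
        binarize([0.7, -0.1, -0.9, -0.4]),      # [1, 0, 0, 0], distance 1
    ]
    print(rank_database(query, database))       # [1, 2, 0]
```

Longer codes (for example 64 bits instead of 16) can separate more items at the cost of storage and comparison time, which is why hashing methods are usually reported at several code lengths.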



How to run

Requirements

Please refer to requirements.txt for the required dependencies.

Step 1: Prepare Datasets

Place the datasets under a datasets directory with the following layout:

- datasets
    - CUB_200_2011
    - Stanford_Dogs
    - Aircraft
    - vegfru
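Before moving on, it can help to confirm that the folders are where the code expects them. The following is a minimal sketch, assuming the datasets directory sits in the current working directory; `missing_datasets` is a hypothetical helper, not part of the repository, and it checks only the top-level folder names listed above, not their contents.

```python
from pathlib import Path

# Folder names taken from the layout listed in Step 1.
EXPECTED_DATASETS = ("CUB_200_2011", "Stanford_Dogs", "Aircraft", "vegfru")


def missing_datasets(root="datasets"):
    """Return the expected dataset folders that do not exist under root."""
    root_path = Path(root)
    return [name for name in EXPECTED_DATASETS if not (root_path / name).is_dir()]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders were found.")
```

You only need the folders for the datasets you plan to train on, so a non-empty result is not necessarily an error.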

Step 2: Synthesize Captions

Navigate to the text_generation directory:

    cd text_generation

Then run the script that matches your dataset:

  • For CUB or Stanford Dogs, run

    python cub_dogs.py

  • For Aircraft, run

    python aircraft.py

  • For VegFru, run

    python vegfru.py
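If you script the pipeline, the dataset-to-script mapping above can be captured in a small lookup. This is a sketch only: `caption_command` is a hypothetical helper, the dataset keys reuse the dataset_name values from Step 4, and it assumes the scripts take no command-line arguments, which this README does not confirm.

```python
import sys

# Mapping derived from the list above; keys follow the dataset_name values in Step 4.
CAPTION_SCRIPTS = {
    "CUB_200_2011": "cub_dogs.py",
    "Stanford_Dogs": "cub_dogs.py",
    "Aircraft": "aircraft.py",
    "vegfru": "vegfru.py",
}


def caption_command(dataset_name):
    """Build the command list that runs the caption script for one dataset."""
    try:
        script = CAPTION_SCRIPTS[dataset_name]
    except KeyError:
        known = ", ".join(sorted(CAPTION_SCRIPTS))
        raise ValueError(f"unknown dataset {dataset_name!r}; expected one of: {known}") from None
    # sys.executable reuses the interpreter that is running this helper.
    return [sys.executable, script]


if __name__ == "__main__":
    print(caption_command("Aircraft"))
```

The returned list can be passed to `subprocess.run(..., cwd="text_generation", check=True)` so that the script runs from the directory named above.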

Step 3: Download Pre-trained ViT models

Please visit the following links to download the pre-trained ViT models:

During training, these models serve as the backbone for LPAH.

Step 4: Training and Evaluation

Run the following command to start training and evaluation:

python train.py --dataset [dataset_name] --epoch 100 --eval_every 5 --warmup_epochs 20 --name [logs_dir_name] --train_batch_size 64 --hash_bit_list 16,32,48,64 --learning_rate 0.02

Supported values for dataset_name:

  • CUB_200_2011 for CUB-200-2011.

  • Stanford_Dogs for Stanford Dogs.

  • vegfru for VegFru.

  • Aircraft for FGVC-Aircraft.

The above command runs training of LPAH, with evaluation conducted every 5 epochs.
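To make the flags concrete, here is a hedged sketch of how a command line like the one above could be parsed and how an "every 5 epochs" evaluation schedule plays out. It is not the repository's train.py: the defaults simply copy the values from the example command, and the 1-based epoch numbering and the helper names `parse_hash_bits`, `build_parser`, and `evaluation_epochs` are assumptions.

```python
import argparse


def parse_hash_bits(text):
    """Turn a comma-separated string such as '16,32,48,64' into a list of ints."""
    bits = [int(item) for item in text.split(",") if item.strip()]
    if not bits or any(bit <= 0 for bit in bits):
        raise argparse.ArgumentTypeError("expected positive integers, e.g. 16,32,48,64")
    return bits


def build_parser():
    """Hypothetical mirror of the flags shown in the training command."""
    parser = argparse.ArgumentParser(description="Illustrative flag parser, not train.py")
    parser.add_argument("--dataset", required=True)
    parser.add_argument("--name", required=True)
    parser.add_argument("--epoch", type=int, default=100)
    parser.add_argument("--eval_every", type=int, default=5)
    parser.add_argument("--warmup_epochs", type=int, default=20)
    parser.add_argument("--train_batch_size", type=int, default=64)
    parser.add_argument("--hash_bit_list", type=parse_hash_bits, default=[16, 32, 48, 64])
    parser.add_argument("--learning_rate", type=float, default=0.02)
    return parser


def evaluation_epochs(total_epochs, eval_every):
    """List the epochs (counted from 1, an assumption) at which evaluation runs."""
    return [epoch for epoch in range(1, total_epochs + 1) if epoch % eval_every == 0]


if __name__ == "__main__":
    args = build_parser().parse_args(["--dataset", "CUB_200_2011", "--name", "demo_run"])
    print(args.hash_bit_list)
    print(evaluation_epochs(args.epoch, args.eval_every)[:4])
```

Under these assumptions, a 100-epoch run with --eval_every 5 evaluates 20 times.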
