Add customizable pytorch-huggingface VLM model fine-tuning pipeline example #4

juk1329 · 2025-08-28T07:00:54Z

In the general_vlm_pytorch_huggingface branch, I added a fine-tuning and evaluation pipeline for vision-language models (VLMs). Previously, there was only a pipeline for language models, which handled text data only. With this VLM pipeline, it is now possible to work with vision data (images and video) as well.

Similar to the language model pipeline, the VLM pipeline runs within the FastTrack environment, and all paths are properly redirected to FastTrack’s vFolder.
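As a rough illustration of what this kind of path redirection can look like (a minimal sketch, not the code in this branch), the snippet below resolves relative paths from a config under a vFolder base directory. The environment variable `VFOLDER_ROOT`, the default location, and the helper names are all hypothetical and chosen only for the example.

```python
import os
from pathlib import Path

# Hypothetical environment variable and default location; the actual
# pipeline may locate the vFolder in a different way.
VFOLDER_ENV = "VFOLDER_ROOT"
DEFAULT_VFOLDER = "/tmp/vfolder"


def vfolder_root() -> Path:
    """Return the vFolder base directory (assumed to come from an env var)."""
    return Path(os.environ.get(VFOLDER_ENV, DEFAULT_VFOLDER))


def redirect(path: str) -> Path:
    """Map a configured path into the vFolder.

    Relative paths are placed under the vFolder root. Absolute paths are
    returned unchanged so that an explicit location is still respected.
    """
    p = Path(path)
    return p if p.is_absolute() else vfolder_root() / p
```

With a helper like this, dataset, checkpoint, and output paths in the YAML files could stay relative while still landing inside the vFolder at run time.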

Because VLMs require more user configuration than language models, the setup now uses four YAML files. See README.md for details.

If you have time, I would greatly appreciate any feedback, questions, or suggestions regarding the code or commit history.

Thank you very much for your time and review.

juk1329 and others added 30 commits July 30, 2025 14:06

Fix setup_env.sh with more flexible logic

0c6e8d5

Fix the .gitignore and comments in yaml files

44f1066

Add initial vlm fine-tuning pipeline copied from language model pipeline

ca18e95

Add the pipeline code for vlm

f7b7915

Simplify code and fix data download logic

baec079

Add video data process and fix image data process

540c0f3

Fix evaluate_base_model in vlm_cli.py

2ab2eab

Fix configs/README.md

6a626ca

Fix args and path

547716f

Fix loading model and processor

f9fad93

Fix identify special token logic and loading processor code

a21f106

Fix data_collator and evaluation logic

4cc6ea6

Fix identify special token logic

cb718c3

Fix model config example and print code in collate_fn.py

7057d41

Simplify special token logic and fix SFTTrainer parameter

ec66044

Fix evaluation and fine-tuning logic

92a031c

Fix fine-tuning and evaluate code

d963f35

Fix vlm_collator_config.yaml

1277ee9

Fix fine-tuning setting

d369792

Fix fine-tuning setting and simplify log print

bcbee6f

Fix train_config.yaml

fb76303

Simplify log print code

02cfd23

Fix wandb run name

e73bae9

Fix default model loader

3861490

Fix requirements.txt path in setup_env.sh

64c6d10

Fix image data processing logic and simplify code in collate_fn.py

0bf9512

Fix image processing logic and simplify code in evaluation.py

5255361

Fix evaluate logic and generalize data collator

b8e9039

Simplify evaluation logic in language model and refine code

b506b3f

Fix collate_fn.py to generalize VLM pipeline

f8b4035

rapsealk self-requested a review August 28, 2025 07:06

rapsealk assigned juk1329 Aug 28, 2025

juk1329 added 2 commits August 29, 2025 10:13

Updated README.md and vlm_collator_config example

c6d9bfd

Fix chmod of setup_env.sh

cdb0b26
