This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Conversation

Satrat

@Satrat Satrat commented Jan 4, 2024

This PR updates the one-shot modifiers SparseGPT, Wanda, SmoothQuant, and Quantization to be compatible with FSDP. This enables us to run alternating one-shot/finetuning flows with FSDP.

**NOTE:** #1912 should be merged first; it covers the initial alternating flow implementation.

Summary of Changes

  • Removed all references to specific devices from the one-shot modifiers; device placement is now handled by SparseCausalLM. (Update: device_map="auto" turned out not to be compatible with quantization :( so the default stays "cuda:0". "auto" can still be passed through the CLI for a non-quantized one-shot to split the model across multiple GPUs. This isn't FSDP-related; we can split the model even outside of FSDP.)
  • Refactored the SparseGPT class to be a module wrapper, so that we can update weights using module.apply, as required for FSDP compatibility
  • Refactored Wanda in the same way, and cleaned up the code sharing between SparseGPT and Wanda (@rahul-tuli, I would like your input on this specifically)
  • Bug fixes related to quantizing FSDP models
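
The module-wrapper refactor above can be sketched as follows. This is a hedged illustration, not the PR's actual code: the names SparseGptWrapper, compress, and apply_compression are hypothetical, and simple magnitude pruning stands in for the real SparseGPT solve. What it shows is the pattern the PR describes: wrap a layer in an nn.Module so that weight updates can be driven through Module.apply, the per-module hook style FSDP-wrapped models expect (under FSDP the apply call would run while the full parameters are gathered, e.g. inside summon_full_params).

```python
# Hedged sketch of the "module wrapper + module.apply" pattern described in
# the PR. All names here are illustrative, not the repo's real API.
import torch
import torch.nn as nn


class SparseGptWrapper(nn.Module):
    """Wraps a layer and stores a pruning mask computed during calibration."""

    def __init__(self, layer: nn.Linear):
        super().__init__()
        self.layer = layer
        self.mask = torch.ones_like(layer.weight)

    def compress(self, sparsity: float) -> None:
        # Magnitude pruning stands in for the real SparseGPT weight solve.
        w = self.layer.weight.detach().abs()
        k = int(w.numel() * sparsity)
        threshold = w.flatten().kthvalue(k).values
        self.mask = (w > threshold).float()


def apply_compression(module: nn.Module) -> None:
    # Called via model.apply(): visits every submodule, acts only on wrappers.
    if isinstance(module, SparseGptWrapper):
        with torch.no_grad():
            module.layer.weight.mul_(module.mask)


model = nn.Sequential(SparseGptWrapper(nn.Linear(8, 8)))
model[0].compress(sparsity=0.5)
# Under FSDP this call would happen with full parameters summoned.
model.apply(apply_compression)
```

The point of the indirection is that FSDP shards parameters, so mutating weights directly by reference is unsafe; funneling the update through a per-module callback keeps the mutation local to each (gathered) module.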

@Satrat Satrat requested review from bfineran and rahul-tuli January 9, 2024 19:14
bfineran
bfineran previously approved these changes Jan 9, 2024
@Satrat
Author

Satrat commented Jan 9, 2024

Remove any references of specific devices from the one-shot modifiers, device is now handled by SparseCausalLM, and defaults to "auto" for splitting the model across multiple GPUs (this isn't FSDP related, we can split the model even outside of FSDP)

It doesn't seem like device defaults to "auto", if that is the intended change. The current obcq.py arg:

    parser.add_argument("--device", type=str, default="cuda:0")

See the updated PR comment :( device_map="auto" doesn't seem to be compatible with quantization, so I'm leaving it out of the default. It can still be specified on the CLI for non-quantized one-shot runs.
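
The resulting CLI behavior can be sketched like this. The helper name parse_device_kwargs and the kwargs plumbing are hypothetical; only the --device flag and its "cuda:0" default come from the obcq.py snippet above.

```python
# Hedged sketch of the --device flag behavior discussed above.
# parse_device_kwargs is an illustrative helper, not the repo's real API.
import argparse


def parse_device_kwargs(argv: list) -> dict:
    """Turn the --device flag into model-loading kwargs.

    "auto" maps to device_map="auto" (layer-wise splitting across GPUs,
    incompatible with quantization per the PR discussion); anything else
    is treated as a single target device.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--device", type=str, default="cuda:0")
    args = parser.parse_args(argv)
    return {"device_map": args.device}


default_kwargs = parse_device_kwargs([])                  # {"device_map": "cuda:0"}
auto_kwargs = parse_device_kwargs(["--device", "auto"])   # {"device_map": "auto"}
```

Keeping "cuda:0" as the default preserves quantized one-shot runs out of the box, while "auto" remains an explicit opt-in for multi-GPU, non-quantized flows.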

@Satrat Satrat requested review from bfineran and mgoin January 9, 2024 22:56
bfineran
bfineran previously approved these changes Jan 10, 2024
@bfineran bfineran merged commit 5007b8c into main Jan 11, 2024
@bfineran bfineran deleted the sgpt_fsdp branch January 11, 2024 16:37