This repository was archived by the owner on Jun 4, 2025. It is now read-only.

Conversation

@bfineran commented Jul 30, 2022

test_plan:
tested manually via CLI

@bfineran requested a review from dbogunowicz July 30, 2022 00:46
@bfineran self-assigned this Jul 30, 2022

@KSGulin left a comment


Passed local commit integration tests. LGTM!

@dbogunowicz

Manually tested by running:

sparseml.transformers.question_answering \
    --model_name_or_path zoo:nlp/masked_language_modeling/bert-base/pytorch/huggingface/wikipedia_bookcorpus/12layer_pruned80_quant-none-vnni \
    --dataset_name squad \
    --do_train \
    --do_eval \
    --output_dir './output' \
    --distill_teacher disable \
    --recipe zoo:nlp/masked_language_modeling/bert-base/pytorch/huggingface/wikipedia_bookcorpus/12layer_pruned80_quant-none-vnni?recipe_type=transfer-question_answering \
    --overwrite_output_dir

The training starts as expected.
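For reference, the stub in the command above is resolved to a local SparseZoo cache directory (visible in the logs below) before anything is handed to transformers. A minimal sketch of that resolution against the refactored SparseZoo 1.1 `Model` API; the attribute names used here (`training.path`) are assumptions for illustration, not code taken from this PR's diff:

```python
# Illustrative sketch only: resolve a SparseZoo stub to its local cache directory
# using the post-refactor sparsezoo.Model API. The .training.path attribute chain
# is assumed here and is not taken from this PR's changes.
from sparsezoo import Model

stub = (
    "zoo:nlp/masked_language_modeling/bert-base/pytorch/huggingface/"
    "wikipedia_bookcorpus/12layer_pruned80_quant-none-vnni"
)

zoo_model = Model(stub)                  # downloads into ~/.cache/sparsezoo/<uuid>/ if not already cached
training_dir = zoo_model.training.path   # local directory with config.json, pytorch_model.bin, vocab.txt, ...
print(training_dir)
```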

Some logs to confirm that the manual test ran properly:

[INFO|configuration_utils.py:646] 2022-08-01 12:18:32,204 >> loading configuration file /home/damian/.cache/sparsezoo/03af0342-d7d0-470b-8958-80230ff0af10/training/config.json
[INFO|configuration_utils.py:684] 2022-08-01 12:18:32,205 >> Model config BertConfig {
  "_name_or_path": "/home/damian/.cache/sparsezoo/03af0342-d7d0-470b-8958-80230ff0af10/training",
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.18.0.dev0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

[INFO|tokenization_utils_base.py:1703] 2022-08-01 12:18:32,205 >> Didn't find file /home/damian/.cache/sparsezoo/03af0342-d7d0-470b-8958-80230ff0af10/training/tokenizer.json. We won't load it.
[INFO|tokenization_utils_base.py:1703] 2022-08-01 12:18:32,205 >> Didn't find file /home/damian/.cache/sparsezoo/03af0342-d7d0-470b-8958-80230ff0af10/training/added_tokens.json. We won't load it.
[INFO|tokenization_utils_base.py:1784] 2022-08-01 12:18:32,205 >> loading file /home/damian/.cache/sparsezoo/03af0342-d7d0-470b-8958-80230ff0af10/training/vocab.txt
[INFO|tokenization_utils_base.py:1784] 2022-08-01 12:18:32,205 >> loading file None
[INFO|tokenization_utils_base.py:1784] 2022-08-01 12:18:32,205 >> loading file None
[INFO|tokenization_utils_base.py:1784] 2022-08-01 12:18:32,205 >> loading file /home/damian/.cache/sparsezoo/03af0342-d7d0-470b-8958-80230ff0af10/training/special_tokens_map.json
[INFO|tokenization_utils_base.py:1784] 2022-08-01 12:18:32,205 >> loading file /home/damian/.cache/sparsezoo/03af0342-d7d0-470b-8958-80230ff0af10/training/tokenizer_config.json
[INFO|configuration_utils.py:646] 2022-08-01 12:18:32,205 >> loading configuration file /home/damian/.cache/sparsezoo/03af0342-d7d0-470b-8958-80230ff0af10/training/config.json

while the cached training directory contains:

(sparseml_venv) damian@lambdaquad:~/.cache/sparsezoo/03af0342-d7d0-470b-8958-80230ff0af10/training$ ls
config.json  pytorch_model.bin  special_tokens_map.json  tokenizer_config.json  trainer_state.json  training_args.bin  vocab.txt
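
The "Didn't find file ... We won't load it." lines above are expected: `tokenizer.json` and `added_tokens.json` are optional for a BERT WordPiece tokenizer, and the listing shows everything the run actually needs. A small standard-library sanity check to make that explicit (`check_training_dir` is a hypothetical helper written for this comment, and the cache path is just this run's example):

```python
# Illustrative check that a cached SparseZoo training directory holds the files the
# question-answering run needs. tokenizer.json / added_tokens.json may be absent;
# transformers logs "Didn't find file ... We won't load it." and carries on.
from pathlib import Path

REQUIRED = {
    "config.json",
    "pytorch_model.bin",
    "vocab.txt",
    "tokenizer_config.json",
    "special_tokens_map.json",
}
OPTIONAL = {"tokenizer.json", "added_tokens.json"}

def check_training_dir(path: str) -> None:
    present = {p.name for p in Path(path).expanduser().iterdir()}
    missing = REQUIRED - present
    if missing:
        raise FileNotFoundError(f"{path} is missing required files: {sorted(missing)}")
    print("optional files not present (fine):", sorted(OPTIONAL - present))

check_training_dir("~/.cache/sparsezoo/03af0342-d7d0-470b-8958-80230ff0af10/training")
```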

@bfineran merged commit 78694d9 into master Aug 1, 2022
@bfineran deleted the sparsezoo-1.1-refactor branch August 1, 2022 18:23
KSGulin pushed a commit that referenced this pull request Oct 14, 2022
update Zoo stub loading for SparseZoo 1.1 refactor (#54)

add flag to signal NM integration is active (#32)

Add recipe_name to file names
bfineran added a commit that referenced this pull request Oct 18, 2022
* Update trainer and model flows to accommodate sparseml

Disable FP16 on QAT start (#12)

* Override LRScheduler when using LRModifiers

* Disable FP16 on QAT start

* keep wrapped scaler object for training after disabling

Using QATMatMul in DistilBERT model class (#41)

Removed double quantization of output of context layer. (#45)

Fix DataParallel validation forward signatures (#47)

* Fix: DataParallel validation forward signatures

* Update: generalize forward_fn selection

Best model after epoch (#46)

fix scaler check for non fp16 mode in trainer (#38)

Mobilebert QAT (#55)

* Remove duplicate quantization of vocabulary.

enable a QATWrapper for non-parameterized matmuls in BERT self attention (#9)

* Utils and auxiliary changes

update Zoo stub loading for SparseZoo 1.1 refactor (#54)

add flag to signal NM integration is active (#32)

Add recipe_name to file names

* Fix errors introduced in manual cherry-pick upgrade

Co-authored-by: Benjamin Fineran <[email protected]>
KSGulin added a commit that referenced this pull request Jun 19, 2023
* Add recipe_name to default file names

* Upgrade to transformers release V4.30.2 (#62)

* Update trainer and model flows to accommodate sparseml

Disable FP16 on QAT start (#12)

* Override LRScheduler when using LRModifiers

* Disable FP16 on QAT start

* keep wrapped scaler object for training after disabling

Using QATMatMul in DistilBERT model class (#41)

Removed double quantization of output of context layer. (#45)

Fix DataParallel validation forward signatures (#47)

* Fix: DataParallel validation forward signatures

* Update: generalize forward_fn selection

Best model after epoch (#46)

fix scaler check for non fp16 mode in trainer (#38)

Mobilebert QAT (#55)

* Remove duplicate quantization of vocabulary.

enable a QATWrapper for non-parameterized matmuls in BERT self attention (#9)

* Utils and auxiliary changes

update Zoo stub loading for SparseZoo 1.1 refactor (#54)

add flag to signal NM integration is active (#32)

Add recipe_name to file names

* Fix errors introduced in manual cherry-pick upgrade

Co-authored-by: Benjamin Fineran <[email protected]>

* update build versions for NM fork pypi push (#74)

* fix nightly package name (#75)

* add make build command (#76)

* add GHA workflow files to build nightly and release packages (#77)

* add GHA workflow files to build nightly and release packages

* fix name

---------

Co-authored-by: dhuang <[email protected]>

* bump up version to 1.6.0 (#79)

Co-authored-by: dhuang <[email protected]>

---------

Co-authored-by: Konstantin <[email protected]>
Co-authored-by: Konstantin Gulin <[email protected]>
Co-authored-by: dhuangnm <[email protected]>
Co-authored-by: dhuang <[email protected]>
dsikka pushed a commit that referenced this pull request Aug 17, 2023
dsikka pushed a commit that referenced this pull request Aug 17, 2023
bfineran added a commit that referenced this pull request Oct 26, 2023
bfineran added a commit that referenced this pull request Oct 27, 2023
(previous commits)

minor improvements for build workflow files (#83)

Co-authored-by: dhuang <[email protected]>

fix minor issue (#84)

Co-authored-by: dhuang <[email protected]>

OPT with quantizable MatMuls (#85)

fix a minor issue for release build (#86)

Co-authored-by: dhuang <[email protected]>

update version in version.py

Testmo (#91)

* improve GHA workflow files to build nightly and release, and report status to testmo

* clean up

* report exit code

* Assign value to exit_code

---------

Co-authored-by: dhuang <[email protected]>

Update trainer.py - fix DistributedSampler import (#93)

DistributedSampler is used but not imported in `trainer.py`
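
Presumably the fix adds the missing import to `trainer.py`; a one-line sketch (an assumption, not the verbatim diff):

```python
# trainer.py references DistributedSampler in its multi-GPU code paths; the missing
# import would look like this (assumed form, not the actual commit contents):
from torch.utils.data.distributed import DistributedSampler
```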

Research/llama/bmm quantization (#94)

* Quantize attention matmuls

* Quantize attention matmuls

bump base transformers version