This repository was archived by the owner on Dec 1, 2024. It is now read-only.

Support for LLaMA #104

@ustcwhy

Description


Thanks for your wonderful work!
Meta released its newest LLM, LLaMA. The checkpoint is available on Hugging Face [1], and zphang has presented code to use LLaMA based on the transformers repo [2]. For FlexGen, could I directly replace the OPT model with LLaMA to run inference on a local card? Do you have any plans to support LLaMA in the future?

[1] https://huggingface.co/decapoda-research
[2] huggingface/transformers#21955
