Depends on: #5214
The llamax library will wrap llama and expose common high-level functionality. The main goal is to ease the integration of llama.cpp into 3rd party projects. Ideally, most projects would interface through the llamax API for all common use cases, while still having the option to use the low-level llama API for less common applications that require finer control of the state.
A simple way to think about llamax is that it will simplify all of the existing examples in llama.cpp by hiding the low-level stuff, such as managing the KV cache and batching requests.
Roughly, llamax will require its own state object and a run-loop function.
The specifics of the API are yet to be determined - suggestions are welcome.
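As one possible starting point for discussion, here is a minimal C++ sketch of the "state object plus run-loop function" shape. Every name in it (`llamax_state`, `llamax_request`, `llamax_submit`, `llamax_run_once`, `next_piece_fn`) is hypothetical and not part of llama.cpp. The backend is a stub callback standing in for the low-level llama API, so no KV cache handling or real batched decoding is shown.

```cpp
// Hypothetical sketch only: none of these names exist in llama.cpp.
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the low-level layer (assumption): given the text so far,
// return the next piece, or an empty string when generation is finished.
using next_piece_fn = std::function<std::string(const std::string &)>;

struct llamax_request {
    int         id;
    std::string text;        // prompt plus everything generated so far
    int         n_remaining; // how many more pieces this request may produce
    bool        done;
};

// The state object: owns the backend hook and all submitted requests.
struct llamax_state {
    next_piece_fn               backend;
    std::vector<llamax_request> requests;
    int                         next_id = 0;
};

// Queue a new request and return its id.
int llamax_submit(llamax_state & st, std::string prompt, int max_pieces) {
    assert(max_pieces >= 0);
    const int id = st.next_id++;
    st.requests.push_back({id, std::move(prompt), max_pieces, max_pieces == 0});
    return id;
}

// One iteration of the run loop: advance every unfinished request by one
// piece. Returns the number of requests that are still active afterwards.
std::size_t llamax_run_once(llamax_state & st) {
    std::size_t active = 0;
    for (auto & r : st.requests) {
        if (r.done) {
            continue;
        }
        const std::string piece = st.backend(r.text);
        if (piece.empty()) {
            r.done = true; // backend signalled end of generation
            continue;
        }
        r.text += piece;
        if (--r.n_remaining == 0) {
            r.done = true; // piece budget exhausted
        } else {
            ++active;
        }
    }
    return active;
}
```

A caller would fill in `backend`, submit prompts with `llamax_submit`, and call `llamax_run_once` in a loop until it returns 0. Keeping the loop body as a separate function, rather than a blocking `run()`, is one way to let the host application decide where and how often generation advances.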