server: Allow for longer prompts in q URL parameter #16862
Merged
Resolves the larger part of #16830.
The current limit is 8192 characters, which is not enough to summarize most articles worth summarizing. Doubling the limit handles many of them; quadrupling it would arguably handle most.
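To make the trade-off concrete, here is a minimal sketch that checks a prompt against the three limits discussed above. It is illustrative only and is not code from llama-server: the helper name `fits_limits` and the `LIMITS` table are hypothetical. Whether the limit counts raw or percent-encoded characters is an assumption, so the sketch lets you check either.

```python
from urllib.parse import quote

# 8192 is the limit quoted above; the other two are the doubled and
# quadrupled alternatives under discussion. The table itself is hypothetical.
LIMITS = {"current": 8192, "doubled": 2 * 8192, "quadrupled": 4 * 8192}


def fits_limits(prompt: str, percent_encode: bool = False) -> dict[str, bool]:
    """Report which of the candidate limits the prompt stays within.

    Assumption: it is not stated whether the limit applies to the raw
    prompt or to its percent-encoded form, so both can be checked.
    """
    text = quote(prompt, safe="") if percent_encode else prompt
    return {name: len(text) <= limit for name, limit in LIMITS.items()}
```

For example, a 10,000-character article extract exceeds the current limit but fits once it is doubled. Text that expands heavily under percent-encoding, such as spaces or non-ASCII characters, uses up the budget faster if the encoded form is what gets counted.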
Supporting web browser integration introduces security risks. If an exploitable flaw were found in llama-server, this change would give an attacker four times the space for a payload. Such an exploit would not necessarily require the user to have inadvisably exposed the server to a public network: if a prompt injection attack on a website is mishandled anywhere on the way to or from the LLM, and the code that mishandles it is exploitable, this change could presumably make the exploit easier, or possible at all.
It could be sufficient to include a warning somewhere, along with advice on good practices, which might include using a strict content blocker in combination with only allowing prompts from trusted websites.