@chansikpark
Contributor

@chansikpark chansikpark commented Oct 30, 2025

Resolves the larger part of #16830.

The current limit is 8192 characters. This is not enough to summarize most articles worth summarizing. Doubling it (16384) handles many. Quadrupling it (32768) would arguably handle most.
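To make the numbers concrete, here is a minimal sketch of the length check being discussed. All names (`OLD_LIMIT`, `fitsLimit`, `buildPromptUrl`) and the base address are hypothetical and chosen for illustration; they are not taken from the llama-server code, and the real check may count characters differently.

```typescript
// Hypothetical constants: only the 8192 figure comes from the description above.
const OLD_LIMIT = 8192;           // current limit, in characters
const DOUBLED = OLD_LIMIT * 2;    // 16384
const QUADRUPLED = OLD_LIMIT * 4; // 32768

// Returns true when the prompt fits within the given limit.
// Assumption: length is measured as String.length (UTF-16 code units).
function fitsLimit(prompt: string, limit: number): boolean {
  return prompt.length <= limit;
}

// Builds a URL that carries the prompt in the q parameter.
// Assumption: the base address points at a locally running server.
function buildPromptUrl(base: string, prompt: string): string {
  return base + "/?q=" + encodeURIComponent(prompt);
}

// Stand-in for a 20,000-character article.
const article = new Array(20000 + 1).join("x");

console.log(fitsLimit(article, OLD_LIMIT));  // false
console.log(fitsLimit(article, DOUBLED));    // false
console.log(fitsLimit(article, QUADRUPLED)); // true
console.log(buildPromptUrl("http://localhost:8080", "hello world"));
```

Note that percent-encoding makes the URL itself longer than the prompt, so a limit on the decoded prompt and a limit on the raw URL are not the same thing.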

Supporting web browser integration introduces security risks. If an exploitable bug were found in llama-server, this change would give an attacker four times as much room for a payload. Furthermore, an exploit might not require the user to have inadvisably exposed the server to a public network. If a prompt injection planted on a website is mishandled anywhere on its way to or from the LLM, and the code that mishandles it is exploitable, this change could presumably make the attack easier, or possible where it was not before.

It could be sufficient to include a warning somewhere, along with advice on good practices, such as combining a strict content blocker with only allowing prompts from trusted websites.

@chansikpark chansikpark changed the title from "Allow for longer prompts in q URL parameter" to "server: Allow for longer prompts in q URL parameter" Oct 30, 2025
@ggerganov ggerganov merged commit 16724b5 into ggml-org:master Oct 30, 2025
64 checks passed
