Conversation

dummdidumm
Member

Implements query.batch to address the n+1 problem


Please don't delete this checklist! Before submitting the PR, please make sure you do the following:

  • It's really useful if your PR references an issue where it is discussed ahead of time. In many cases, features are absent for a reason. For large changes, please create an RFC: https://github.com/sveltejs/rfcs
  • This message body should clearly illustrate what problems it solves.
  • Ideally, include a test that fails without this PR but passes with it.

Tests

  • Run the tests with pnpm test and lint the project with pnpm lint and pnpm check

Changesets

  • If your PR makes a change that should be noted in one or more packages' changelogs, generate a changeset by running pnpm changeset and following the prompts. Changesets that add features should be minor and those that fix bugs should be patch. Please prefix changeset messages with feat:, fix:, or chore:.


changeset-bot bot commented Aug 19, 2025

🦋 Changeset detected

Latest commit: 4e81119

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name: @sveltejs/kit
Type: Minor



Comment on lines 191 to 197
export const getPost = query.batch(v.string(), async (slugs) => {
  const posts = await db.sql`
    SELECT * FROM post
    WHERE slug = ANY(${slugs})
  `;
  return posts;
});


How does SvelteKit split the array result back out to the n functions? Like if I have functions that call getPost('a'), getPost('b'), getPost('c'), I'd get ['a', 'b', 'c'] on the server, and return the corresponding posts in an array from my query function, but how will SvelteKit know which item in that array to give back to my functions?

Member Author


Is that a question that you want answered in the docs or a "curious about the implementation" question?


yes 😆

Member Author


added something to the docs, let me know if that helps / is enough
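For anyone curious about the mechanics rather than the docs: below is a minimal, self-contained sketch of how index-based batching can work. It is purely illustrative (`createBatcher` and everything else here is hypothetical, not SvelteKit's implementation): calls made in the same microtask are collected, the batch function receives all the arguments at once, and each caller's promise resolves with the result at its own index.

```javascript
// Hypothetical sketch of index-based batching (not SvelteKit's real code).
// Calls made in the same tick are collected; the batch callback receives
// all arguments at once and must return results in the same order.
function createBatcher(batchFn) {
  let pending = null;
  return function call(arg) {
    if (!pending) {
      pending = { args: [], resolvers: [] };
      // Flush once the current microtask queue drains
      queueMicrotask(async () => {
        const batch = pending;
        pending = null;
        try {
          const results = await batchFn(batch.args);
          // Each caller gets the result at its own index
          batch.resolvers.forEach((r, i) => r.resolve(results[i]));
        } catch (error) {
          batch.resolvers.forEach((r) => r.reject(error));
        }
      });
    }
    pending.args.push(arg);
    return new Promise((resolve, reject) => {
      pending.resolvers.push({ resolve, reject });
    });
  };
}

// Usage: three calls, one underlying "query"
const getPost = createBatcher(async (slugs) => slugs.map((s) => ({ slug: s })));
Promise.all([getPost('a'), getPost('b'), getPost('c')]).then(([a, b, c]) => {
  console.log(a.slug, b.slug, c.slug); // a b c
});
```

This relies on result order matching argument order, which is exactly the fragility the lookup-function signature discussed later in this thread removes.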

@Thiagolino8

Shouldn't this be the default behavior of query rather than a new type of remote function?

@firatciftci

Comparing this upcoming feature with what tRPC has, it appears that this PR is equivalent to their httpBatchLink; however, tRPC also has httpBatchStreamLink for the following purpose/use-case:

When batching requests together, the behavior of a regular httpBatchLink is to wait for all requests to finish before sending the response. If you want to send responses as soon as they are ready, you can use httpBatchStreamLink instead. This is useful for long-running requests.

Will/does SvelteKit have similar functionality incorporated into query.batch, or perhaps via an equivalent remote function like query.streamBatch?

@Rich-Harris
Member

Shouldn't this be the default behavior of query rather than a new type of remote function?

At an API level, no — it would be very weird and annoying if I always had to deal with arrays, when most queries are singular. I think what you're asking though is whether we should batch all simultaneous queries into a single request.

We've talked about this a lot, and we've been careful to leave enough design space for us to switch to a batched implementation if it seems like the right approach. So far we've decided against it though, partly because it's a heck of a lot simpler as-is (make it work, make it right, then make it fast), partly because it would likely constrain our options when it comes to caching, when we get round to that.

Will/Does SvelteKit have a similar functionality incorporated into query.batch

No. There's no point. The use case for query.batch is that you have a bunch of IDs and want to get the corresponding objects for those IDs in a single external API call or database query. In that scenario you'll get all the data back at once, so there's nothing to be gained by streaming them back.

If we were to offer something that is like httpBatchLink (which IIUC is about coalescing queries of different types into a single request, rather than turning queries of the same type into a single database call) then yes, the results would be individually streamed. You wouldn't have to configure anything though, it would just be How It Worked. See above for why we may not do that though.

Separately, there will be a primitive for streaming: #14292

Member

@Rich-Harris Rich-Harris left a comment


recapping offline conversation: we want to change the signature to be like this, so that there are no shenanigans around unpredictably-sorted data, and so that we can error on individual items:

export const getStuff = query.batch(v.string(), async (ids) => {
  const stuff = await db.sql`SELECT * FROM stuff WHERE id in (${ids.join(', ')})`;
  const lookup = new Map(stuff.map((thing) => [thing.id, thing]));

  return (id) => lookup.get(id) ?? error(404);
});
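Sketched with an in-memory stand-in for the database (all names here are illustrative, not SvelteKit API), the benefit of this shape is that rows can come back in any order and a missing row fails only the caller that asked for it:

```javascript
// Sketch of the lookup-function signature with an in-memory "database".
// The batch callback returns a per-id resolver, so the order of rows
// returned by the query is irrelevant and missing ids can error individually.
const rows = [
  { id: 'b', title: 'Second' },
  { id: 'a', title: 'First' } // note: not in request order
];

async function getStuffBatch(ids) {
  // Stand-in for: await db.sql`SELECT * FROM stuff WHERE id in (...)`
  const stuff = rows.filter((row) => ids.includes(row.id));
  const lookup = new Map(stuff.map((thing) => [thing.id, thing]));

  // Each caller is resolved by id, not by position in the result array
  return (id) => {
    const item = lookup.get(id);
    if (!item) throw new Error(`404: no item with id ${id}`);
    return item;
  };
}

getStuffBatch(['a', 'b', 'c']).then((resolve) => {
  console.log(resolve('a').title); // First
  console.log(resolve('b').title); // Second
  // resolve('c') would throw, failing only that caller
});
```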

Comment on lines 207 to 209
for (let i = 0; i < batched.resolvers.length; i++) {
  batched.resolvers[i].resolve(get_result(batched.args[i], i));
}
Member


I don't think this is quite right — if you have 10 items, you might resolve the first 3 and then the 4th errors; when the catch clause then calls resolver.reject on all of them, the first 3 rejections will be no-ops but the remaining 7 will all reject.

If the call to run_remote_function throws then we need to reject everything, but otherwise surely it needs to be this?

for (let i = 0; i < batched.resolvers.length; i += 1) {
  const resolver = batched.resolvers[i];
  try {
    resolver.resolve(get_result(batched.args[i], i));
  } catch (error) {
    resolver.reject(error);
  }
}
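A runnable illustration of this point (the deferred helper is hypothetical; the key fact is that settling an already-settled promise is a no-op):

```javascript
// Demonstrates: (1) rejecting an already-resolved promise is a no-op,
// (2) with a per-item try/catch, one failing item doesn't reject its siblings.
function deferred() {
  const d = {};
  d.promise = new Promise((resolve, reject) => {
    d.resolve = resolve;
    d.reject = reject;
  });
  return d;
}

const resolvers = [deferred(), deferred(), deferred()];
const get_result = (arg, i) => {
  if (i === 1) throw new Error(`item ${i} failed`);
  return `result ${i}`;
};

for (let i = 0; i < resolvers.length; i += 1) {
  const resolver = resolvers[i];
  try {
    resolver.resolve(get_result(null, i));
  } catch (error) {
    resolver.reject(error); // only item 1 rejects; 0 and 2 stay resolved
  }
}

Promise.allSettled(resolvers.map((r) => r.promise)).then((outcomes) => {
  console.log(outcomes.map((o) => o.status)); // ['fulfilled', 'rejected', 'fulfilled']
});
```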

Member Author

@dummdidumm dummdidumm Aug 23, 2025


Started implementing this but stumbled into an error-handling question: right now we call the handleError hook on client->server invocations but not on server->server ones. I'll gloss over it for now, but there may be a more general mismatch between error handling and remote functions.

Update: The mismatch AFAICT ends up being:

  • on calls on the server the error is just thrown as-is, no "handleError" hook handling in place
  • on remote calls from the client the error is routed through handleError directly

Right now this wouldn't make a difference, because on the server the error bubbles up all the way to the root and is handled there by our root handleError catcher. But once we have async SSR in place, and a boundary catches an error on the server (if we want that to happen; maybe a separate discussion), then it gets the raw error on the server but a transformed-to-HttpError error on the client.

Update: We concluded that we won't change the behavior of boundaries in SSR, i.e. they still won't catch errors, so the behavior will be the same, so it's fine as is.

@Thiagolino8

I think what you're asking though is whether we should batch all simultaneous queries into a single request

Yes, that's what I thought this new remote function would do
After reading the PR examples a little more, I better understand what it actually does, but not what it's for.

I understand the intention: the part of the front end that only cares about one piece of data uses the query that fetches just that piece, without causing multiple calls when several similar pieces have to be fetched.

But I feel like in 90% of cases, you won't know in advance which pieces of a list to fetch, so an aggregate call will be necessary anyway, especially if it involves things like pagination and filters

But even when that's not the case, I feel like a much simpler solution would be the ability to add an item to the cache that hasn't been requested yet

Here's a snippet of actual code from a React + TanStack Query app that I'm currently converting to Svelte:

export const usersOptions = ({
	page = 1,
	size = 10,
	email,
	name,
	sort = "name,ASC",
}: NonNullable<UsersRequest["params"]>["query"] = {}) =>
	queryOptions({
		queryKey: ["users", page, size, email, name, sort],
		queryFn: ({ signal }) =>
			apiClient.get["/v1/users"]({
				params: {
					query: {
						page,
						size,
						sort,
						...(email && { email }),
						...(name && { name }),
					},
				},
				signal,
			}),
	});

export const userOptions = (id: string) =>
	queryOptions({
		queryKey: ["user", id],
		queryFn: ({ signal }) =>
			apiClient.get["/v1/users/{id}"]({
				params: { path: { id } },
				signal,
			}),
	});

export const useUsers = ({
	page = 1,
	size = 10,
	email,
	name,
	sort = "name,ASC",
}: NonNullable<UsersRequest["params"]>["query"] = {}) => {
	const queryClient = useQueryClient();
	const query = useSuspenseQuery(
		usersOptions({ page, size, email, name, sort }),
	);
	query.data.content?.forEach(user => {
		queryClient.setQueryData(userOptions(user.id ?? "").queryKey, user);
	});
	return query;
};

export const useUser = (id: string) => useSuspenseQuery(userOptions(id));

So when I render, for example, a table of users, the table uses the hook that fetches the list, but the rows render the data from the hook that fetches the individual item

But the request for the individual items never happens, because their values have already been cached by the hook that fetches the list.

Currently, if you try to add an item to the cache that hasn't been requested yet, the request is made anyway.

@Rich-Harris
Member

Clearly it's better to fetch the data with /v1/users if you can and populate the cache such that the client never needs to hit /v1/users/${id}. The API for doing so with remote functions would be something like this (i.e. not using withOverride, since this is about ground truth rather than optimistic updates)...

getUser(id).set(userData);

...though it doesn't currently work as expected, because the query isn't kept in the cache after a set, meaning the data gets refetched anyway. We should probably change the caching behaviour in these cases.

But that doesn't help in the case where you do need to get some user data that wasn't yet fetched for whatever reason. In this case, if you're getting multiple users simultaneously it's obviously better to do it in one database query or external API call.

There are lots of reasons you might find yourself in that state. Suppose you're building something like the Netflix homepage, and you have a bunch of different rows containing movies, some of which are personalized recommendations and some of which are cached trending/top/whatever lists. It might not make sense to get all the lists in a single call (no need to keep regenerating the trending list for every user), so you have several calls to different endpoints. But since those lists will likely contain overlapping movies, it would be a reasonable choice to not include the actual movie data, but rather just a list of IDs. The client can then fetch the minimal amount of movie data, skipping anything it already has in its cache.

Of course, then you'll likely have clients making dozens of getMovie(id) calls. If each one is handled separately, that will create extra database load or use up API rate limits. query.batch is the solution.

Obviously this approach creates a waterfall! You could certainly come up with a more optimal solution. But that might require very significant refactoring, whereas turning getMovie from a query to query.batch is trivial.
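The cache-plus-batch combination described above can be sketched in isolation (all names hypothetical; this illustrates the pattern, not SvelteKit code): cached ids resolve immediately with no request at all, while uncached ids requested in the same tick are coalesced into a single backend call.

```javascript
// Sketch: a client-side cache in front of a batched fetcher.
const cache = new Map(); // id -> Promise<movie>
let pending = null;
let batchCalls = 0; // counts backend round-trips, for illustration

function fetchMovies(ids) {
  batchCalls += 1;
  // Stand-in for one database query or external API call
  return Promise.resolve(ids.map((id) => ({ id, title: `Movie ${id}` })));
}

function getMovie(id) {
  if (cache.has(id)) return cache.get(id); // cache hit: no request at all

  if (!pending) {
    pending = new Map(); // id -> { resolve, reject }
    queueMicrotask(async () => {
      const batch = pending;
      pending = null;
      try {
        const movies = await fetchMovies([...batch.keys()]);
        for (const movie of movies) batch.get(movie.id).resolve(movie);
      } catch (error) {
        for (const d of batch.values()) d.reject(error);
      }
    });
  }

  const promise = new Promise((resolve, reject) => {
    pending.set(id, { resolve, reject });
  });
  cache.set(id, promise);
  return promise;
}

// Four calls from different components, one round-trip
Promise.all([getMovie(1), getMovie(2), getMovie(1), getMovie(3)]).then(() => {
  console.log(batchCalls); // 1
});
```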

if (result.type === 'redirect') {
	// TODO double-check this
	await goto(result.location);
	await new Promise((r) => setTimeout(r, 100));
Member


what's this about?

Member Author


Copied from the existing query code. Last time I specifically tested this, I had to give the async function a bit of time before rejecting it to avoid problems with async Svelte. Not sure if that's still the case, hence the TODO.

Member


Just tried deleting it; no tests failed. Not sure if that means it's safe to discard, but my inclination is to do so. At most we should be using tick(), though it might not be necessary at all.

Member

@Rich-Harris Rich-Harris left a comment


I think this is good to go, aside from that last question about the setTimeout. The docs won't deploy, because they reference query.batch which is unreleased. (Ideally we would update the docs preview workflow to install from the pkg.pr.new link, but that brings its own challenges)

5 participants