feat: add new remote function `query.batch` (#14272)
Conversation
Implements `query.batch` to address the n+1 problem
🦋 Changeset detected. Latest commit: 4e81119. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
```js
export const getPost = query.batch(v.string(), async (slugs) => {
  const posts = await db.sql`
    SELECT * FROM post
    WHERE slug = ANY(${slugs})
  `;
  return posts;
});
```
How does SvelteKit split the array result back out to the n functions? Like if I have functions that call `getPost('a')`, `getPost('b')`, `getPost('c')`, I'd get `['a', 'b', 'c']` on the server, and return the corresponding posts in an array from my query function, but how will SvelteKit know which item in that array to give back to my functions?
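For anyone wondering the same thing, here is one plausible way to coalesce same-tick calls (purely illustrative, not SvelteKit's actual implementation): collect the arguments of every call made in the same microtask queue, send them as one batch, and hand the i-th result back to the i-th caller. All names here (`createBatchedQuery`, the in-memory "server") are made up for the sketch.

```javascript
// A plausible sketch of same-tick call coalescing (NOT SvelteKit's actual
// implementation): calls made during the same synchronous burst are
// collected, sent as one batch, and the i-th result resolves the i-th call.

function createBatchedQuery(batchFn) {
  let pending = null;

  return function call(arg) {
    if (pending === null) {
      pending = { args: [], resolvers: [] };

      // after the current synchronous work finishes, fire one batch
      queueMicrotask(async () => {
        const batch = pending;
        pending = null;

        try {
          const results = await batchFn(batch.args);
          batch.resolvers.forEach((r, i) => r.resolve(results[i]));
        } catch (err) {
          batch.resolvers.forEach((r) => r.reject(err));
        }
      });
    }

    pending.args.push(arg);
    return new Promise((resolve, reject) => {
      pending.resolvers.push({ resolve, reject });
    });
  };
}

// hypothetical "server" side: receives every slug from the same tick at once
let serverCalls = 0;
const getPost = createBatchedQuery(async (slugs) => {
  serverCalls += 1;
  return slugs.map((slug) => ({ slug, title: slug.toUpperCase() }));
});

// three independent call sites, one round trip
const demo = Promise.all([getPost('a'), getPost('b'), getPost('c')]);
```

With index-based mapping like this, the batch function must return results in argument order; the later discussion in this thread moves away from that assumption.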
Is that a question that you want answered in the docs or a "curious about the implementation" question?
yes 😆
Added something to the docs; let me know if that helps / is enough.
Shouldn't this be the default behavior of `query` rather than a new type of remote function?
Comparing this upcoming feature with what tRPC has, it appears that this PR is equivalent to their … Will/Does SvelteKit have a similar functionality incorporated into …?
At an API level, no — it would be very weird and annoying if I always had to deal with arrays, when most queries are singular. I think what you're asking though is whether we should batch all simultaneous queries into a single request. We've talked about this a lot, and we've been careful to leave enough design space for us to switch to a batched implementation if it seems like the right approach. So far we've decided against it though, partly because it's a heck of a lot simpler as-is (make it work, make it right, then make it fast), partly because it would likely constrain our options when it comes to caching, when we get round to that.
No. There's no point. The use case for … If we were to offer something that is like … Separately, there will be a primitive for streaming: #14292
Recapping offline conversation: we want to change the signature to be like this, so that there are no shenanigans around unpredictably-sorted data, and so that we can error on individual items:

```js
export const getStuff = query.batch(v.string(), async (ids) => {
  const stuff = await db.sql`SELECT * FROM stuff WHERE id in (${ids.join(', ')})`;
  const lookup = new Map(stuff.map((thing) => [thing.id, thing]));
  return (id) => lookup.get(id) ?? error(404);
});
```
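To illustrate why the lookup-function contract sidesteps the ordering problem: the batch body resolves to a function, and the surrounding machinery applies that function to each original argument, so nothing depends on the order in which the data source returns rows. This is a minimal standalone sketch; `runBatch` and the in-memory `rows` are invented for the example.

```javascript
// Sketch of the revised contract: the batch body returns a lookup function,
// and the caller-side machinery applies it to each original argument, so
// result ordering never matters. `runBatch` and `rows` are made-up names.

async function runBatch(batchFn, args) {
  const get = await batchFn(args); // batch body resolves to (arg) => result
  return args.map((arg) => get(arg));
}

// deliberately "unpredictably sorted" data source
const rows = [
  { id: 'b', name: 'Bravo' },
  { id: 'a', name: 'Alpha' }
];

const getStuff = async (ids) => {
  const matching = rows.filter((row) => ids.includes(row.id));
  const lookup = new Map(matching.map((row) => [row.id, row]));

  // each caller gets its own row, or an error for its own missing id
  return (id) => {
    const row = lookup.get(id);
    if (row === undefined) throw new Error(`404: no row with id ${id}`);
    return row;
  };
};

const demo = runBatch(getStuff, ['a', 'b']);
```

Because the lookup function is invoked once per argument, a missing id can fail just that one caller instead of poisoning the whole batch.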
```js
for (let i = 0; i < batched.resolvers.length; i++) {
  batched.resolvers[i].resolve(get_result(batched.args[i], i));
}
```
I don't think this is quite right — if you have 10 items, you might resolve the first 3, then the 4th errors, and then you call `resolver.reject` on all of them in the `catch` clause; the first 3 rejections will be no-ops but the remaining 7 will all reject.

If the call to `run_remote_function` throws then we need to reject everything, but otherwise surely it needs to be this?

```js
for (let i = 0; i < batched.resolvers.length; i += 1) {
  const resolver = batched.resolvers[i];

  try {
    resolver.resolve(get_result(batched.args[i], i));
  } catch (error) {
    resolver.reject(error);
  }
}
```
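The reasoning here leans on standard promise semantics: settling is first-wins, so calling `reject()` on an already-resolved promise is a no-op, and a blanket rejection in a `catch` clause only affects resolvers that are still pending. A self-contained demonstration:

```javascript
// Demonstrates the promise semantics the comment above relies on: settling
// is first-wins, so reject() on an already-resolved promise does nothing,
// and a blanket rejection only affects still-pending resolvers.

const resolvers = [];
const promises = Array.from({ length: 3 }, () =>
  new Promise((resolve, reject) => {
    resolvers.push({ resolve, reject });
  })
);

resolvers[0].resolve('ok'); // promise 0 settles as fulfilled

// simulate the catch clause rejecting everything
const failure = new Error('batch failed');
resolvers.forEach((r) => r.reject(failure)); // no-op for promise 0

const demo = Promise.allSettled(promises);
```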
Started implementing this but stumbled into an error handling question: right now, we do call the `handleError` hook when it's a client->server invocation but don't when it's happening server->server. I'll gloss over it for now but there may be a more general mismatch with error handling and remote functions.

Update: The mismatch AFAICT ends up being:
- on calls on the server, the error is just thrown as-is, with no `handleError` hook handling in place
- on remote calls from the client, the error is routed through `handleError` directly

Right now this wouldn't make a difference, because on the server the error bubbles up all the way to the root, and will be handled there by our root `handleError` catcher. But once we have async SSR in place, and a boundary catches an error on the server (if we want that to happen; maybe a separate discussion), then it gets the raw error on the server but a transformed-to-`HttpError` error on the client.

Update: We concluded that we won't change the behavior of boundaries in SSR, i.e. they still won't catch errors, so the behavior will be the same, so it's fine as is.
Co-authored-by: Rich Harris <[email protected]>
Yes, that's what I thought this new remote function would do. I mean, I understood that the intention is for the front-end part that only cares about one piece of data to use the query that fetches only that piece of data, but without causing multiple calls if you have to fetch several similar pieces.

But I feel like in 90% of cases, you won't know in advance which pieces of a list to fetch, so an aggregate call will be necessary anyway, especially if it involves things like pagination and filters. But even when that's not the case, I feel like a much simpler solution would be the ability to add an item to the cache that hasn't been requested yet.

Here's a snippet of actual code from a React + TanStack Query app:

```ts
export const usersOptions = ({
  page = 1,
  size = 10,
  email,
  name,
}: NonNullable<UsersRequest["params"]>["query"] = {}) =>
  queryOptions({
    queryKey: ["users", page, size, email, name],
    queryFn: ({ signal }) =>
      apiClient.get["/v1/users"]({
        params: {
          query: {
            page,
            size,
            sort: "name,ASC",
            ...(email && { email }),
            ...(name && { name }),
          },
        },
        signal,
      }),
  });

export const userOptions = (id: string) =>
  queryOptions({
    queryKey: ["user", id],
    queryFn: ({ signal }) =>
      apiClient.get["/v1/users/{id}"]({
        params: { path: { id } },
        signal,
      }),
  });

export const useUsers = ({
  page = 1,
  size = 10,
  email,
  name,
  sort = "name,ASC",
}: NonNullable<UsersRequest["params"]>["query"] = {}) => {
  const queryClient = useQueryClient();
  const query = useSuspenseQuery(
    usersOptions({ page, size, email, name, sort }),
  );

  query.data.content?.forEach(user => {
    queryClient.setQueryData(userOptions(user.id ?? "").queryKey, user);
  });

  return query;
};

export const useUser = (id: string) => useSuspenseQuery(userOptions(id));
```

So when I render, for example, a table of users, the table uses the hook that fetches the list, but the rows render the data from the hook that fetches the individual item. The request for the individual items never happens, because their values have already been cached by the hook that fetches the list. Currently, if you try to add an item that has not yet been requested to the cache, the request is made anyway.
Clearly it's better to fetch the data with `getUser(id).set(userData)`; ...though it doesn't currently work as expected because the query isn't kept in the cache after a …

But that doesn't help in the case where you do need to get some user data that wasn't yet fetched for whatever reason. In this case, if you're getting multiple users simultaneously, it's obviously better to do it in one database query or external API call.

There are lots of reasons you might find yourself in that state. Suppose you're building something like the Netflix homepage, and you have a bunch of different rows containing movies, some of which are personalized recommendations and some of which are cached trending/top/whatever lists. It might not make sense to get all the lists in a single call (no need to keep regenerating the trending list for every user), so you have several calls to different endpoints. But since those lists will likely contain overlapping movies, it would be a reasonable choice to not include the actual movie data, but rather just a list of IDs. The client can then fetch the minimal amount of movie data, skipping anything it already has in its cache.

Of course, then you'll likely have clients making dozens of … Obviously this approach creates a waterfall! You could certainly come up with a more optimal solution. But that might require very significant refactoring, whereas turning …
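The Netflix-rows scenario is essentially what a cache combined with batching addresses: repeated ids are served from the cache, and the remaining unique ids go to the server in a single request. A rough sketch under that assumption; every name here (`cachedBatch`, `getMovie`) is hypothetical and this is not SvelteKit's API:

```javascript
// Rough sketch combining a client cache with same-tick batching: repeated
// ids are deduplicated via the cache, unique ids are fetched in one batch.
// All names are hypothetical; this is not SvelteKit's API.

function cachedBatch(batchFn) {
  const cache = new Map(); // id -> Promise<result>
  let pending = null;

  return function get(id) {
    if (cache.has(id)) return cache.get(id); // dedupe repeated ids

    if (pending === null) {
      pending = { ids: [], resolvers: new Map() };

      queueMicrotask(async () => {
        const batch = pending;
        pending = null;

        let lookup;
        try {
          lookup = await batchFn(batch.ids); // resolves to (id) => result
        } catch (err) {
          // the whole batch failed: reject everyone
          for (const { reject } of batch.resolvers.values()) reject(err);
          return;
        }

        for (const [id, { resolve, reject }] of batch.resolvers) {
          try {
            resolve(lookup(id));
          } catch (err) {
            reject(err); // per-item failures only reject their own caller
          }
        }
      });
    }

    const promise = new Promise((resolve, reject) => {
      pending.resolvers.set(id, { resolve, reject });
    });
    pending.ids.push(id);
    cache.set(id, promise);
    return promise;
  };
}

// hypothetical movie endpoint
const batches = [];
const getMovie = cachedBatch(async (ids) => {
  batches.push(ids);
  const data = new Map(ids.map((id) => [id, { id, title: `Movie ${id}` }]));
  return (id) => data.get(id);
});

// two "rows" with overlapping ids: only unique ids reach the batch function
const demo = Promise.all([getMovie(1), getMovie(2), getMovie(2), getMovie(3)]);
```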
```js
if (result.type === 'redirect') {
  // TODO double-check this
  await goto(result.location);
  await new Promise((r) => setTimeout(r, 100));
```
what's this about?
Copied from the existing query code. Last time I specifically tested this, I had to give the async function a bit of time before rejecting it to not cause problems with async Svelte. Not sure if that's still the case, hence the TODO.
Just tried deleting it; no tests failed. Not sure if that means it's safe to discard, but my inclination is to do so. At most we should be using `tick()` I think, but it might not be necessary at all.
I think this is good to go, aside from that last question about the `setTimeout`. The docs won't deploy, because they reference `query.batch`, which is unreleased. (Ideally we would update the docs preview workflow to install from the pkg.pr.new link, but that brings its own challenges.)
Implements `query.batch` to address the n+1 problem.

Please don't delete this checklist! Before submitting the PR, please make sure you do the following:

Tests
- Run the tests with `pnpm test` and lint the project with `pnpm lint` and `pnpm check`

Changesets
- Generate a changeset by running `pnpm changeset` and following the prompts. Changesets that add features should be `minor` and those that fix bugs should be `patch`. Please prefix changeset messages with `feat:`, `fix:`, or `chore:`.