Skip to content

Conversation

@roji
Copy link
Member

@roji roji commented Aug 26, 2025

Closes #28766

Description
Implements both whole-column updating (ExecuteUpdate(s => s.SetProperty(x => x.JsonColumn, foo))) and partial updating (ExecuteUpdate(s => s.SetProperty(x => x.JsonColumn.PropertyInside, foo))). For partial updates, supports scalars, objects and collections.

Ports over the previous ComplexTypeBulkUpdates tests (which were table-splitting-specific) to the association tests (same tests exercise for table splitting and JSON).

Customer impact
This allows users to use ExecuteUpdate on complex JSON columns, which was major missing functionality in these areas we're delivering for EF 10. This rounds off the work already done on both complex types and on JSON columns, to provide a more complete experience to users using these features.

How found
Known bug/missing feature component.

Regression
No.

Testing
Added.

Risk
Medium. The changes here are a bit larger than typical fixes going into rc.2, though the actual product changes here aren't huge (most of the changes are in the tests). Changes also almost mostly add new logic paths (for JSON) rather than changing existing ones, so this shouldn't be too risky.

@AndriySvyryd
Copy link
Member

Closes #28776

Wrong issue.

So, JSON bulk CUD will only work with json type on SQL Server?

@roji
Copy link
Member Author

roji commented Aug 26, 2025

So, JSON bulk CUD will only work with json type on SQL Server?

Yeah, it seems so:

CREATE TABLE foo (id INT, j json);
INSERT INTO foo (id, j) VALUES (1, CAST('{ "a": 8 }' AS JSON));
UPDATE foo SET j.modify('$.a', 9);
SELECT j FROM foo; -- Works: 9

DROP TABLE foo;
CREATE TABLE foo (id INT, j nvarchar(max));
INSERT INTO foo (id, j) VALUES (1, '{ "a": 8 }');
UPDATE foo SET j.modify('$.a', 9); -- Cannot call methods on nvarchar(max)

@roji
Copy link
Member Author

roji commented Aug 26, 2025

@AndriySvyryd what I can do tomorrow is look into falling back to JSON_MODIFY() when not using json date type - this should be really easy and allow it to be used in current/older versions of SQL Server.

@AndriySvyryd
Copy link
Member

@AndriySvyryd what I can do tomorrow is look into falling back to JSON_MODIFY() when not using json date type - this should be really easy and allow it to be used in current/older versions of SQL Server.

Ok. Just make sure to use JSON_VALUE/JSON_QUERY, see #36653

@roji roji force-pushed the ExecuteUpdateOnJson branch 2 times, most recently from c306ad6 to eef5adf Compare August 27, 2025 14:29
@roji roji marked this pull request as ready for review August 27, 2025 14:29
@roji roji requested a review from a team as a code owner August 27, 2025 14:29
@roji
Copy link
Member Author

roji commented Aug 27, 2025

@AndriySvyryd this should be ready for review. Partial JSON updates now work without the new json date type (via JSON_MODIFY), and also with SQLite (via json_set).

@roji roji added the ask-mode label Aug 27, 2025
@roji roji changed the title Implement ExecuteUpdate support for complex JSON [rc2] Implement ExecuteUpdate support for complex JSON Aug 27, 2025
@roji roji force-pushed the ExecuteUpdateOnJson branch from eef5adf to 5c98796 Compare August 27, 2025 14:33
@roji roji requested a review from artl93 August 27, 2025 14:36
Copy link
Member

@artl93 artl93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rc2. Finishing relevant work.

@AndriySvyryd AndriySvyryd requested a review from Copilot August 28, 2025 00:35
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements ExecuteUpdate support for complex JSON columns, enabling both whole-column updates and partial property updates within JSON data. The implementation rounds out the complex types and JSON columns functionality for EF 10 by adding bulk update operations.

Key changes:

  • Implements JSON partial update support for both SQL Server (using modify method for 2025+ and JSON_MODIFY for older versions) and SQLite (using json_set)
  • Reorganizes and consolidates bulk update tests from table-splitting-specific tests to shared association tests that work across both table splitting and JSON scenarios
  • Adds infrastructure for JSON partial updates through provider extension points

Reviewed Changes

Copilot reviewed 39 out of 40 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/EFCore.Relational/Query/RelationalQueryableMethodTranslatingExpressionVisitor.ExecuteUpdate.cs Major refactoring to support JSON partial updates with new logic for JsonScalarExpression and JsonQueryExpression handling
src/EFCore.SqlServer/Query/Internal/SqlServerQueryableMethodTranslatingExpressionVisitor.cs Implements SQL Server-specific JSON partial update using modify method for 2025+ and JSON_MODIFY fallback
src/EFCore.Sqlite.Core/Query/Internal/SqliteQueryableMethodTranslatingExpressionVisitor.cs Implements SQLite-specific JSON partial update using json_set function
src/EFCore.Relational/Query/Internal/RelationalJsonUtilities.cs New utility class for JSON serialization operations extracted from other classes
test/EFCore.Specification.Tests/Query/Associations/AssociationsBulkUpdateTestBase.cs New base test class for bulk update operations across different association mapping strategies
test/EFCore.Specification.Tests/BulkUpdates/ComplexTypeBulkUpdatesTestBase.cs Removed old table-splitting-specific bulk update tests
Multiple test files Consolidated test organization moving from provider-specific ComplexTypeBulkUpdates to shared associations testing
Files not reviewed (1)
  • src/EFCore.Relational/Properties/RelationalStrings.Designer.cs: Language not supported

UPDATE [r]
SET [r].[RequiredRelated] = JSON_MODIFY([r].[RequiredRelated], '$.String', JSON_VALUE([r].[OptionalRelated], '$.String')),
[r].[OptionalRelated] = JSON_MODIFY([r].[OptionalRelated], '$.String', @p)
Copy link
Member

@AndriySvyryd AndriySvyryd Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameters should be wrapped in JSON_VALUE/JSON_QUERY too. Add a test that sets a string property to a constant with non-ascii and special (e.g. \) characters.

Copy link
Member Author

@roji roji Aug 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test updates scalar properties, not nested complex types; unless I'm mistaken, for such updates we want to avoid JSON_QUERY precisely because we want special characters to be escaped, no? The current logic only adds a JSON_QUERY() wrapper when the thing being set (the left side) represents a structural type (or collection of structural type). Does that make sense?

In any case, added a seed data instance with special characters and a test which matches on it and updates it with other special chars - everything passes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test updates scalar properties, not nested complex types; unless I'm mistaken, for such updates we want to avoid JSON_QUERY precisely because we want special characters to be escaped, no?

I see, you currently only send JSON payload in parameters for structural types, not scalars. This could result in a different issue - different JSON representation depending on how the value was sent. I don't know the exact cases where this would be a problem, but test byte[], enum and spatial scalars.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constant vs. parameter thing is orthogonal to scalar vs. structural - the current code follows regular query rules, sending constants when the user provides the value in line, and parameters when the user uses a captured variable etc. (this logic applies to both scalars and structural types). And at least based on my understanding, with JSON_MODIFY() we should wrap the value with JSON_QUERY() iff the value represents a structural type/(i.e. a JSON document) - regardless of whether it's a constant or a parameter (or even a column, for that matter).

To summarize:

// Scalars:

// Inline value, so we generate a scalar constant:
...ExecuteUpdate(s => s.SetProperty(x => x.JsonThing.Int, x => 8)

// Captured variable, so we generate a SQL parameter:
var i = 8;
...ExecuteUpdate(s => s.SetProperty(x => x.JsonThing.Int, x => i)

// Not a captured variable because not inside a lambda (just like Skip/Take), so we generate a SQL parameter:
var i = 8;
...ExecuteUpdate(s => s.SetProperty(x => x.JsonThing.Int, i)

// Structural:

// Inline value, so we generate a string constant containing a JSON object (JSON_QUERY is required for JSON_MODIFY):
...ExecuteUpdate(s => s.SetProperty(x => x.JsonThing.Nested, x => new Nested { ... })

// Captured variable, so we generate a SQL parameter containing a JSON object (JSON_QUERY is required for JSON_MODIFY):
var nested = new Nested { ... };
...ExecuteUpdate(s => s.SetProperty(x => x.JsonThing.Int, x => nested)

(same with the non-lambda local variable)

Does the above make sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, despite all of the above you're right that I missed the special handling needed for any types that aren't native JSON types (string/int) - good catch.

AFAICT, for structural setting everything should be fine as-is, since we take the .NET object and pass it through SerializeComplexTypeToJson, which produces a JSON string that we just send to the database. But for scalars, I forgot to do the proper conversion to the JSON string representation via the mapping. I'll fix that.

On a side note, I'd to avoid starting to shove in more types into the model, as we've been doing; that doesn't scale, and doesn't work well for other providers which may have their own types they want to test. Instead, I'd prefer us to have a "type tests", where a provider can call a single test method which tests everything on that one specific store type - this would include testing this partial update (this is a bit what BuiltInDataTypesTestBase was supposed to be). Doing this is tracked by #33521.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a side note, I'd to avoid starting to shove in more types into the model, as we've been doing; that doesn't scale, and doesn't work well for other providers which may have their own types they want to test

I think it's fine here, since all types should have a JSON representation and that would make them compatible with any provider that supports JSON.
Also, in general it should be ok as long as it doesn't impact other tests. The unsupported tests would be skipped and the properties ignored in the model.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But even if sending as a parameter, what store type would it have? What about scalars that don't have a store type and can only be represented in JSON through the provided reader/writer?

We discussed a bit offline, but from what I can understand from looking at the JSON_MODIFY() equivalents on most databases, we'd be sending either scalars or parameters of either numeric, bool or string type mappings; for e.g. DateTime that would be e.g. an nvarchar(max) parameter containing the JSON representation (ISO8601), as returned from our type mapping's JsonValueReaderWriter.

I don't think we have a case of scalars with no store type - in ExecuteUpdate you're always some scalar property (in the model) to something, and so we take the type mapping from that scalar property (but I think I'm probably misunderstanding you)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in general it should be ok as long as it doesn't impact other tests. The unsupported tests would be skipped and the properties ignored in the model.

Sure, and it's probably OK to add a small set of universal types (bool, enum, byte[]...). But we should have a single place where a provider adds a single test saying "I support IPAddress", and then that test goes off and checks that IPAddress is supported everywhere. Rather than the provider writer having to figure out they need to add IPAddress to this test suite, to that test suite, and so on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have a case of scalars with no store type - in ExecuteUpdate you're always some scalar property (in the model) to something, and so we take the type mapping from that scalar property (but I think I'm probably misunderstanding you)

Yeah, technically they'll have some store type, but it could be a UDT (for SQL Server) or anything else that the JSON_MODIFY equivalent doesn't support. My point is that we should strive to be consistent between sending scalars in structural types and on their own. Even for string, the way JSON_MODIFY escapes it could be slightly different from the way STJ escapes it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even for string, the way JSON_MODIFY escapes it could be slightly different from the way STJ escapes it.

OK, but that's a very theoretical argument... At the end of the day, we rely on JSON_MODIFY() (and equivalents in other databases) to do the right thing and properly set JSON data types in the document (stirng, bool, numeric). Where that doesn't work, we probably will have to simply disable partial update. I think that's going to be similar for the (equality somewhat theoretical) case of cases where we don't have a type mapping - we need some way to transform values into something that can be passed to JSON_MODIFY() (BTW if we don't have a type mapping, I'm not sure how we can even serialize the entire document, so this doesn't seem specific to partial updating).

BTW one example of this that I ran into is #36689: we decided to support ulong for SQLite (not sure that was a great decision), but json_set() doesn't work correctly with values over long.MaxValue. But when serializing the entire document, everything obviously works since we're doing it all client-side.

Let's discuss if there's some angle I'm missing here.

@roji roji force-pushed the ExecuteUpdateOnJson branch from 5c98796 to 47fa51e Compare August 28, 2025 10:27
@roji roji requested a review from AndriySvyryd August 28, 2025 10:28
@roji
Copy link
Member Author

roji commented Aug 31, 2025

@AndriySvyryd pushed two additional commits (I recommend reviewing separately).

The first commit takes care of updating multiple properties within the same JSON column, modifying the API to allow providers to combine multiple property changes into the same setter (e.g. by chaining JSON_MODIFY() calls). It also extracts out TryTranslateSetters to be a non-local method so that SQL Server can override it (as pubternal) to work around the new modify only allowing a single update to the column: we translate all setters, and if we see the same column is updated more than once, we re-translate, avoiding modify for that column and using JSON_MODIFY instead. It looks big but it's mostly just moving code out to be overridable.

The second commit properly serializes scalars to JSON, exposing an API to allow providers to control the process (similar to what we have in the update pipeline). This also introduces minimal "store type tests", which is where we can put stuff that we want to test for each and every type (for now only the partial update is tested, but the idea is to add more stuff to it). This way, providers only have to add a single test there for each type they support, and they get the full range of coverage for everything. UPDATE: I pushed an additional commit that redoes this in a better way.

@roji roji force-pushed the ExecuteUpdateOnJson branch 3 times, most recently from cd4acab to 208af23 Compare September 1, 2025 10:21
roji added a commit to roji/efcore.pg that referenced this pull request Sep 1, 2025
And introduce new store type tests
@roji
Copy link
Member Author

roji commented Sep 1, 2025

Also implemented support for PG in npgsql/efcore.pg#3609, including arbitrary expression conversion from relational to JSON (via to_jsonb()).

@roji roji force-pushed the ExecuteUpdateOnJson branch from a9cdd9e to 1efd495 Compare September 2, 2025 15:42
@roji roji force-pushed the ExecuteUpdateOnJson branch from 1efd495 to 9450640 Compare September 2, 2025 16:47
private static string? ParameterJsonSerializer(QueryContext queryContext, string baseParameterName, JsonValueReaderWriter jsonValueReaderWriter)
{
var value = queryContext.Parameters[baseParameterName];
var jsonValue = value is null ? null : jsonValueReaderWriter.ToJsonString(value)[1..^1];
Copy link
Member

@AndriySvyryd AndriySvyryd Sep 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's correct to expect that the value returned by ToJsonString is surrounded by quotes. It returns a string containing JSON, not necessarily a string containing a JSON string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that ToJsonString() returns an unquoted string containing true/false for bool, an unquoted number for numeric, and an unquoted null for nulls. However, TrySerializeScalarToJson() checks for numeric/bool string (and later also for null) and doesn't call into ToJsonString() for those cases, since it assumes that these types can just be passed into the database partiaul update function (e.g. JSON_MODIFY()) directly.

I'll add assertions that the string always starts and ends with double quotes before trimming, but I think because of the above the logic is sound? I'll go ahead and merge like that, but let me know if I'm missing something and I'll fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Perhaps a comment here would help.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right - added one.

Copy link
Member

@AndriySvyryd AndriySvyryd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still feels like there are some issues lurking here, but I cannot think of any specific tests to add.

@roji
Copy link
Member Author

roji commented Sep 3, 2025

It still feels like there are some issues lurking here, but I cannot think of any specific tests to add.

Yeah, I'm sure there are, it's a tricky area. I'm reasonably confident that at the very least the mainstream usage scenarios are covered, we'll handle bugs as usual.

@roji roji enabled auto-merge (squash) September 3, 2025 09:59
@roji roji mentioned this pull request Sep 3, 2025
3 tasks
@roji roji merged commit 9a34953 into dotnet:release/10.0 Sep 3, 2025
7 checks passed
@roji roji deleted the ExecuteUpdateOnJson branch September 3, 2025 10:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ExecuteUpdate: support partial updating inside JSON documents

3 participants