Skip to content

Conversation

sirosen
Copy link
Member

@sirosen sirosen commented Aug 11, 2025

Addresses #1382

The runtime init signature of a validator can be seen with inspect:

>>> import inspect, jsonschema
>>> inspect.signature(jsonschema.validators.Draft7Validator.__init__)
<Signature (self, schema: 'referencing.jsonschema.Schema', resolver=None, format_checker: '_format.FormatChecker | None' = None, *, registry: 'referencing.jsonschema.SchemaRegistry' = <Registry (20 resources)>, _resolver=None) -> None>

This aligns the protocol's declared signature with that behavior more exactly with the following changes:

  • registry is now keyword-only, not keyword-or-positional, and has a
    default.

  • resolver is added to the declared signature, so users who are using
    it won't see typing-time discrepancies with the runtime. It is marked
    as Any and commented inline as deprecated, since it's unclear what
    else we could do to indicate its status.
    This means that code passing a resolver will continue to type check
    (previously it would not).

  • resolver is the second keyword-or-positional and format_checker is
    the third, meaning that a positional-only caller who passes, for
    example: Draft202012Validator(foo, None, bar) will have foo
    slotted as the schema and bar as the format_checker
    This would primarily impact callers with positional-args calling
    conventions, but is more reflective of what they'll see at runtime.

In order to remove resolver from the protocol signature, but match the runtime signatures well, some kind of placeholder is needed to indicate format_checker as positional-or-keyword.
Or else, a large number of overloads for __init__ could be declared to try to simulate its removal.

I considered such options but decided that they add undue complexity for what is supposed to be a hint for correct usage.

Assuming this or something like it merges, I'll backport to typeshed.


📚 Documentation preview 📚: https://python-jsonschema--1396.org.readthedocs.build/en/1396/

@Julian
Copy link
Member

Julian commented Aug 12, 2025

Thanks! Merging main will fix the license check, I still haven't gotten a chance to write a new tool unfortunately :/

The change looks correct obviously, but I'd love to have some actual tests... When I tried to fix this a few weeks ago I added a jsonschema/tests/typing_tests.py with:

"""
Static typing tests for jsonschema.
"""

from typing import assert_type

from jsonschema import Draft202012Validator
from jsonschema.protocols import Validator

assert_type(Draft202012Validator, type[Validator])

to at least address this specific test.

I forget what issue I ran into which made me unable to just add that immediately and fix this. Does it pass after your change here or do we run into some other broken annotation? If it passes (or if you otherwise have a recommendation for what to add to do this) could you add that as well?

The runtime init signature of a validator can be seen with `inspect`:

```pycon
>>> import inspect, jsonschema
>>> inspect.signature(jsonschema.validators.Draft7Validator.__init__)
<Signature (self, schema: 'referencing.jsonschema.Schema', resolver=None, format_checker: '_format.FormatChecker | None' = None, *, registry: 'referencing.jsonschema.SchemaRegistry' = <Registry (20 resources)>, _resolver=None) -> None>
```

This aligns the protocol's declared signature with that behavior more
exactly with the following changes:

`registry` is now keyword-only, not keyword-or-positional, and has a
default.

`resolver` is added to the declared signature, so users who are using
it won't see typing-time discrepancies with the runtime. It is marked
as `Any` and commented inline as deprecated, since it's unclear what
else we could do to indicate its status.
This means that code passing a resolver will continue to type check
(previously it would not).

`resolver` is the second keyword-or-positional and `format_checker` is
the third, meaning that a positional-only caller who passes, for
example: `Draft202012Validator(foo, None, bar)` will have `foo`
slotted as the schema and `bar` as the `format_checker`
This would primarily impact callers with positional-args calling
conventions, but is more reflective of what they'll see at runtime.

In order to remove `resolver` from the protocol signature, but match
the runtime signatures well, some kind of placeholder is needed to
indicate `format_checker` as positional-or-keyword. Or else, a large
number of overloads for `__init__` could be declared to try to
simulate its removal.
@sirosen sirosen force-pushed the improve-protocol-init-signature branch from 6b37b3f to a916d8f Compare August 12, 2025 20:08
@sirosen
Copy link
Member Author

sirosen commented Aug 12, 2025

I'd love to have some actual tests

Yeah, agreed. Given that there wasn't existing art to follow I didn't know about adding something new. But I'm happy to experiment a little and try to add something lightweight.

I forget what issue I ran into which made me unable to just add that immediately and fix this. Does it pass after your change here or do we run into some other broken annotation? If it passes (or if you otherwise have a recommendation for what to add to do this) could you add that as well?

I'll need to try it out to be sure, but I think assert_type is stricter than we want here.
I only know this because I've been corrected by typing experts on it, but apparently assert_type requires an exact match, not a subtype.

e.g.,

x: int = 1
assert_type(x, int | str)  # fails, because `x` is `int`, not `int | str`

But subtypes are what's checked by looking at what's assignable:

x: type[Validator]
x = Draft202012Validator  # only passes if `Draft202012Validator` is a subtype of `Validator`

I'll have to try some stuff out and I'll report back when I can make a viable test.

@Julian
Copy link
Member

Julian commented Aug 12, 2025

OK, cool thanks! Will wait to hear back. I guess given we have a runtime_checkable Protocol, we could just isinstance it too?? But will wait to hear if you come up with anything.

@vkuehn
Copy link

vkuehn commented Aug 18, 2025

@sirosen would that also fix this issue with validator

sirosen and others added 2 commits August 18, 2025 10:56
The typing tests are a new subdirectory of `jsonschema/tests/` which
are checked under a more stringent mypy configuration
(`--warn-unused-ignores`) in order to allow certain kinds of negative
tests against the declared types. [^1]

The first new test confirms that each validator matches the Validator
protocol, and furthermore that this is not vacuously true by way of
`Any`.

The test failed at first, as the return type of `create()` was not
annotated, and therefore under the declared types in `jsonschema`, all
of the validators were of type `Any`. To resolve, the return type of
`create` is now annotated as `type[Validator]`.

[^1]: Technically, the new nox configuration checks these files twice,
but only the second check, with `--warn-unused-ignores`, is doing all
of the necessary work.
@sirosen
Copy link
Member Author

sirosen commented Aug 18, 2025

Yes, based on a quick glance at that issue, I think it will fix that, but the change will need to roll out to typeshed as well, so it's not necessarily going to track closely with this PR.


I've figured out a testing approach which works, which I just pushed in a new commit. I was writing up a reply on why it wouldn't work the way I wanted and realized I was wrong. 😁

The testing approach I've taken here is to declare a separate subpackage of the tests, jsonschema/tests/typing/, and put modules in there which are meant as input to mypy as the testing tool. It's an approach I've used before with some success, and although it looks a little bit wonky, I think a solid docstring makes it clear enough to be maintainable.

One thing I had to change to get my test to pass cleanly was adding -> type[Validator] to the create() class factory.

I guess given we have a runtime_checkable Protocol, we could just isinstance it too?

We definitely can, but isinstance on a runtime_checkable protocol does basically the same work as ABC inheritance, where just having a method name is sufficient to count as implementing that method -- regardless of the signature.

My view is that runtime_checkable is better than nothing because at least it lets you catch at runtime that you're holding something totally unexpected like None, but it doesn't give enough safety to count as a good unit test.1

Footnotes

  1. I wonder if runtime_checkable could be extended with arguments so that we could at least get it to check signatures... I have some ideas. Maybe I'll start a DPO thread if I can cook up a prototype.

@Julian
Copy link
Member

Julian commented Aug 18, 2025

We definitely can, but isinstance on a runtime_checkable protocol does basically the same work as ABC inheritance, where just having a method name is sufficient to count as implementing that method -- regardless of the signature.

Uh right, fun. I'm getting "fond" memories that ~10 years ago I complained that zope.interface's implements had/has the same issue (not actually checking full signatures).

Anyways this looks fine I think, merging. Appreciated as usual!

@Julian Julian merged commit 460f4fa into python-jsonschema:main Aug 18, 2025
93 checks passed
@sirosen sirosen deleted the improve-protocol-init-signature branch August 18, 2025 19:35
sirosen added a commit to sirosen/typeshed that referenced this pull request Aug 19, 2025
In python-jsonschema/jsonschema#1396 , the type signature for
`Validator.__init__` is updated to better match the runtime signature.
This backports the fix to typeshed, keeping the copies of this data in
sync. python-jsonschema/jsonschema#1396 is, itself, a response to
feedback on `jsonschema` about the changes in `typeshed` python#14327.

In addition to the `__init__` fix, a couple of additional small changes
are made, both in `jsonschema` and here in the stubs:

1. In `jsonschema`, the type for `create()` in `validators.py` was
   updated to be notated with `-> type[Validator]`. This was necessary
   for internal testing on types to correctly read that validator
   classes created by this factory implement the protocol.

2. Here, in order to better guarantee that the types align, the
   `_Validator` class (which does not exist in `jsonschema`, but is only
   defined here in the stubs) now inherits from `Validator`.

3. The init signature for `_Validator` is updated to match

4. `tests/mypy_test.py` flags the `schema` instance variable annotation
   as mismatching between `Validator` and `_Validator`. Review against
   the `jsonschema` source reveals that `_Validator` was closer to
   correct, so `Validator` is fixed to match.

Any further changes (e.g., elimination of `_Validator` or changing
`create`'s return type annotation) are left as potential future work.
@edgarrmondragon
Copy link

I'm curious what's the process for backporting this to typeshed, to see if I can help 🙂. Just open a PR and don't touch METADATA.toml since we're still in the 4.25 line?

@sirosen
Copy link
Member Author

sirosen commented Aug 20, 2025

@edgarrmondragon, the offer is very much appreciated! I've actually already opened python/typeshed#14591 which I think is what we need (but a review of that is welcome).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants