Skip to content

Conversation

@Numpsy
Copy link
Contributor

@Numpsy Numpsy commented Mar 12, 2023

Refs #34

This is just replacing Newtonsoft.Json with System.Text.Json - it doesn't contain any of changes to the trimming related functionality mentioned in #34

Fixes #34

@danipen
Copy link
Owner

danipen commented Mar 12, 2023

Thanks for the contribution. Question: Should we use the generators to be compatible with AOT and remove the existing code?

@Numpsy
Copy link
Contributor Author

Numpsy commented Mar 13, 2023

I thought that being better for AOT was supposed to be one of the benefits of the source generator, but I haven't tested that myself

@danipen
Copy link
Owner

danipen commented Mar 13, 2023

Sorry, maybe I didn't explain myself very well.

So my two questions are

  1. As you already added the generator
    [JsonSerializable(typeof(GrammarDefinition))]
    Should we remove the KeepType stuff added by @hez2010?

  2. Should we remove the Newtonsoft dep in the props file?

@hez2010
Copy link
Contributor

hez2010 commented Mar 13, 2023

I think the answer to both is YES.

@Numpsy
Copy link
Contributor Author

Numpsy commented Mar 13, 2023

For (2), this change only changes the part that parsers grammers - there is another use of newtonsoft that does plists at

JsonReader reader = new JsonTextReader(contents);

@danipen
Copy link
Owner

danipen commented Mar 13, 2023

That's used to read the grammar file. We should migrate that too.

@Numpsy
Copy link
Contributor Author

Numpsy commented Mar 15, 2023

I had a small go a while back at replacing JsonTextReader with Utf8JsonReader and I think the small issue I had there was that the current APIs in the area all use Stream or StreamReader, whilst Utf8JsonReader operates on spans and sequences.

From what I recall, I think the situation was that as long as JsonPlistParser is given a StreamReader then it either needs to read the whole thing into an array to use as a span, or do something like https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/use-dom-utf8jsonreader-utf8jsonwriter?pivots=dotnet-7-0#read-from-a-stream-using-utf8jsonreader with chunks of data.
(Maybe none of the files it's reading are big enough for it to be an issue though, just thought I'd query first)

@danipen
Copy link
Owner

danipen commented Mar 15, 2023

Thanks for taking a look.

Yes, it's odd that the Utf8JsonReader doesn't allow to read from a Stream.

It seems that we need to do it manually ... Have a buffer and read chunks of bytes from the stream, let's say 2048 each time, and pass them to the Utf8JsonReader as a span.

@danipen
Copy link
Owner

danipen commented Mar 15, 2023

@Numpsy thanks for the effort in bringing System.Text.Json to this project. Unfortunately, using the UtfJsonReader to read from a stream is not trivial.

Probably we could go with the current changes, and let the JSONPListParser still use Newtonsoft, but I prefer not to rely on two different mechanisms (which involves downloading both NewtonSoft + System.Text.Json from NuGet).

Probably the dotnet guys add a UtfJsonStreamReader soon and we can revisit this stuff.

@Numpsy
Copy link
Contributor Author

Numpsy commented Mar 15, 2023

Hmm, that looks more complicated that the MS example :-(

Another possibility might be to see how it'd look using JsonDocument instead of using the reader directly?

@danipen
Copy link
Owner

danipen commented Mar 15, 2023

I gave a try to the MS example but I was unable to get it working.
image

I also gave a try to the StackOverflow solution which is not trivial. Also getting exceptions there.

So I'm not confident this is going to work fine unless the dotnet guys release a official UtfJsonStreamReader.

In the other hand, I'm not sure if we can get the tokens from the JsonDocument, but unfortunately, it requires loading the whole JSON grammar in memory to later build the IRawGrammar which doesn't sound like a good idea.

@danipen
Copy link
Owner

danipen commented Mar 15, 2023

That being said, if you give it a try and you find a solution for it, please let me know :-)

@danipen
Copy link
Owner

danipen commented Mar 15, 2023

I made a draft that uses System.Text.Json to read the JSONPListParser. Now is loading the whole bytes in memory (but it's working without exceptions. Not too bad since the biggest grammar is ~655KB.

But this is not an acceptable solution. A requirement for this PR to be merged would be read in chunks. If someone wants to take a look into it, it would be appreciated!

@danipen danipen marked this pull request as draft March 15, 2023 20:59
@Numpsy
Copy link
Contributor Author

Numpsy commented Mar 23, 2023

I'll try to give it another look next week

@Numpsy
Copy link
Contributor Author

Numpsy commented Mar 23, 2023

I found this old commit from when I was last playing with things, just in case it's of interest: Numpsy@15b2cef

@Numpsy
Copy link
Contributor Author

Numpsy commented Mar 24, 2023

I had a first go at making it read the data in chunks... I don't know if it's the best approach, but the unit tests are passing for me at least.

case JsonTokenType.String:
pList.StartElement("string");
pList.AddString((string)reader.Value);
pList.AddString(reader.GetString());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be possible to use CopyString instead of GetString here as well (pList.AddString appends the data to a StringBuilder, and it might be possible to use the array version of Append instead of the string version to reduce the number of temporary strings it creates)

@danipen danipen marked this pull request as ready for review March 25, 2023 00:00
@danipen danipen merged commit c741bd4 into danipen:master Mar 25, 2023
@Numpsy Numpsy deleted the rw/stj1 branch March 25, 2023 10:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Switch to System.Text.Json?

3 participants