Skip to content

Conversation

@sandersn
Copy link
Member

@sandersn sandersn commented Dec 9, 2024

  1. No incremental reparsing, although the non-incremental reparsing only reparses the statements that might contain await.
  2. No TransformFlags, just an approximation in the parser.

I haven't added all the code to turn off PossibleTopLevelAwait because I'd like some agreement that this approach is good enough. I made enough changes to pass our test suite, in ClassDeclaration and ImportDeclaration parsing, but there are 16-17 more places to save/overwrite the toplevel await tracking bool.

That means in the current commit, for example, a toplevel FunctionDeclaration that contains await somewhere inside will reparse the FunctionDeclaration, even though the old compiler would not. It's possible that reparsing this doesn't change anything anyway, but it's inefficient to reparse it when it's not needed.

1. No incremental reparsing, although the non-incremental reparsing only
reparses the statements that might contain await.
2. No TransformFlags, just an approximation in the parser.

I haven't added all the code to turn off PossibleTopLevelAwait because
I'm not convinced my approach is right. There are 16-17 more places to
save/overwrite the toplevel await tracking bool. I made enough changes
to pass our test suite, that's all.

It's also possible that reparsing, say, a FunctionDeclaration, doesn't
change anything anyway, but it's inefficient to reparse it
when it's not needed.
}

func (p *Parser) reparseTopLevelAwait(sourceFile *ast.SourceFile) *ast.Node {
statements := []*ast.Statement{}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part is a simplified port from tsc. It's simplified because it drops the attempt to use the incremental parser (which doesn't exist in tsgo).

1. Move possible-await cache to Parser.
2. Ignore "await" identifiers in the same places that tsc does, but
implemented in the parser instead. This also caches fewer statements
in the possible-await cache than the previous commit.

func (p *Parser) parseClassDeclarationOrExpression(pos int, hasJSDoc bool, modifiers *ast.ModifierList, kind ast.Kind) *ast.Node {
saveContextFlags := p.contextFlags
saveHasAwaitIdentifier := p.statementHasAwaitIdentifier
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are all the save/restores that approximate TransformFlags propagation in tsc.

@sandersn sandersn requested a review from rbuckton December 10, 2024 20:04
}

func (p *Parser) containsPossibleTopLevelAwait(node *ast.Node) bool {
return !(node.Flags&ast.NodeFlagsAwaitContext != 0) && p.getPossibleAwait(node)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return !(node.Flags&ast.NodeFlagsAwaitContext != 0) && p.getPossibleAwait(node)
return node.Flags&ast.NodeFlagsAwaitContext == 0 && p.getPossibleAwait(node)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, yes, I saw that as I pasted it and then forgot to go fix it.

@DanielRosenwasser
Copy link
Member

That means in the current commit, for example, a toplevel FunctionDeclaration that contains await somewhere inside will reparse the FunctionDeclaration, even though the old compiler would not. It's possible that reparsing this doesn't change anything anyway, but it's inefficient to reparse it when it's not needed.

That might actually have some effects on things. What errors do you issue on the following cases?

// @filename: a.ts
function foo(x = await(10)) {
}

// @filename: b.ts
async function bar(x = await(10)) {
}

// @filename: c.ts
export {};
export function fooExported(x = await(10)) {
}

// @filename: d.ts
export {};
export async function barExported(x = await(10)) {
}

Weirdly, we don't have any test case where await in a parameter initializer is syntactically ambiguous as both an AwaitExpression and as a CallExpression like in these examples. Would you be able to add it?

Copy link
Member

@DanielRosenwasser DanielRosenwasser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't really looked at the diagnostic stitching code yet, but I've left some comments on style and how we run through the statement list that you might want to consider.

Instead of tracking await-possible statements and searching to
reconstruct spans, build spans during parsing.

Then reparsing can read the spans directly, which simplifies code quite
a bit.

The downside is that I need to provide indices in parseList. Because Go
isn't Javascript, I decided to copy parseList to parseListIndex. Two
other options: change parseList to always provide indices then

- Add ignored indices to parseStatement et al.
- Add a parseList that passes an adapter func which ignores indices.

I don't have a strong opinion about which to use, although I lean
slightly to adding a second parameter to parseStatement et al.

The current code isn't tested because of the way AST baselines aren't in
main. I'll push a followup commit with fixes.
@sandersn
Copy link
Member Author

The new commit simplifies tracking a lot compared to Strada. Instead of tracking await-possible statements and searching to
reconstruct spans, it builds spans during parsing. Then reparsing can read the spans directly, which simplifies code quite a bit.

The downside is that I need to provide indices in parseList. Because Go isn't Javascript, I decided to copy parseList to parseListIndex. Two other options: change parseList to always provide indices then

  • Add ignored indices to parseStatement, parseHeritageClause et al.
  • Add a parseList that passes an adapter func which ignores indices.

If adapter funcs aren't optimised away, I'd say the current code is best. I tried adding _ int parameters to parseStatement, et al and didn't like it at all--normal calls to parseStatement now need to pass -1 and there are lots of those calls.

I've tested on our current test cases but I need to try Daniel's cases he suggested on this PR.

@jakebailey
Copy link
Member

If adapter funcs aren't optimised away, I'd say the current code is best. I tried adding _ int parameters to parseStatement, et al and didn't like it at all--normal calls to parseStatement now need to pass -1 and there are lots of those calls.

You can largely assume that simple one-liners that aren't generic are inlineable. If you wanted to double check, there are flags that you can pass to the compiler that would show that information.

@rbuckton
Copy link

rbuckton commented Dec 19, 2024

The downside is that I need to provide indices in parseList.

Since parseTopLevelStatement is only ever called for statements at the top of a SourceFile, we could just have a topLevelStatementIndex field that we increment in parseTopLevelStatement, and reset it to 0 if we reuse a Parser

@rbuckton
Copy link

rbuckton commented Dec 19, 2024

Weirdly, we don't have any test case where await in a parameter initializer is syntactically ambiguous as both an AwaitExpression and as a CallExpression like in these examples. Would you be able to add it?

Not Weirdly. await cannot be an AwaitExpression in a parameter initializer. Per https://tc39.es/ecma262/#sec-async-function-definitions-static-semantics-early-errors, it is a Syntax Error if the parameters contain AwaitExpression.

@sandersn
Copy link
Member Author

@DanielRosenwasser re your test cases:

a: no (parse) errors, await is legal in a non-module (no reparsing either).
b. Same.
c. no (parse) errors, there is a reparse but it doesn't change anything. Not sure if this is legal in a module.
d. Same.

@rbuckton Are (c) and (d) correct? await(10) is parseable as a function call even in a module, but should it be parsed that way?

@rbuckton
Copy link

@rbuckton Are (c) and (d) correct? await(10) is parseable as a function call even in a module, but should it be parsed that way?

await will always be parsed in the Await context of the function:

(c) should be an error because a module is always parsed in strict mode, and await is illegal as an identifier in strict mode, per https://tc39.es/ecma262/#sec-identifiers-static-semantics-early-errors, though we could choose to parse it as an Identifier and issue a grammar error.

(d) should be an error because it will be parsed as an AwaitExpression, but would be a syntax error per https://tc39.es/ecma262/#sec-async-function-definitions-static-semantics-early-errors, which we could also treat as a grammar error.

There are two other scenarios to consider:

// @filename: e.ts
function fooStrict(x = await(10)) {
  "use strict"
}

// @filename: f.ts
(function () {
  "use strict"
  function fooStrict(x = await(10)) {
  }
})()

These should be grammar errors because await is parsed as an Identifier, but fails https://tc39.es/ecma262/#sec-identifiers-static-semantics-early-errors since the function is strict-mode code.

Please note that V8 seems to have a bug here, as it incorrectly allows await as an identifier in strict-mode code, in violation of https://tc39.es/ecma262/#sec-identifiers-static-semantics-early-errors, though it correctly forbids yield as an identifier in strict mode per the same rule.

@rbuckton
Copy link

rbuckton commented Dec 19, 2024

These should be grammar errors because await is parsed as an Identifier, but fails https://tc39.es/ecma262/#sec-identifiers-static-semantics-early-errors since the function is strict-mode code.

Sorry, this was an incorrect interpretation of 13.1.1. 13.1.1 only disallows await if the goal symbol is Module.

To clarify:

  • (a) should parse as a call expression for the id await
  • (b) should parse as an AwaitExpression, but error due to 15.8.1
  • (c) should parse as a call expression for the id await, but error due to 13.1.1 because we are parsing a Module.
  • (d) should parse as an AwaitExpression, but error due to 15.8.1
  • (e) and (f) should be ignored as they were based on a faulty interpretation of 13.1.1

@sandersn
Copy link
Member Author

sandersn commented Dec 20, 2024

Ah, I see the difference. a,b,c,d,e,f all give grammar errors after parsing correctly. I think the PR's code is working the same as the old code -- by the end of parsing it produces exactly the same trees and exactly the same errors (none).

@rbuckton
Copy link

rbuckton commented Jan 7, 2025

Ah, I see the difference. a,b,c,d,e,f all give grammar errors after parsing correctly. I think the PR's code is working the same as the old code -- by the end of parsing it produces exactly the same trees and exactly the same errors (none).

I take it the grammar errors for this are not yet ported in the checker, or are they missing/broken in both compilers?

@sandersn
Copy link
Member Author

sandersn commented Jan 7, 2025

The grammar errors are present and correct in Strada.
Errors for (b) and (d) and two of (e)'s are ported. But this PR ports just the parser code.

p.parseExpected(ast.KindImportKeyword)
afterImportPos := p.nodePos()
// We don't parse the identifier here in await context, instead we will report a grammar error in the checker.
saveHasAwaitIdentifier := p.statementHasAwaitIdentifier
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we didn't have all of this manual saving/restoring in favor of some sort of defer but...

result.ScriptKind = p.scriptKind
}
}
p.possibleAwaitSpans = []int{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than an int array, you could make this a struct {} but I guess it's fine.

@sandersn sandersn merged commit 6f1526c into microsoft:main Jan 7, 2025
12 checks passed
@sandersn sandersn deleted the add-toplevel-await-notest branch January 7, 2025 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants