Commit 8a2a1b8

jimidle authored and parrt committed
docs: Write release notes and new instrcutions for using the Go target.
Signed-off-by: Jim.Idle <[email protected]>
1 parent 4bf93cc commit 8a2a1b8

2 files changed: +268 -45 lines changed

doc/go-changes.md: 179 additions & 0 deletions

# Changes to the Go Runtime over time

## v4.12.0 to v4.12.1

Strictly speaking, if ANTLR were a Go-only project following [SemVer](https://semver.org/), release v4.12.1 would be
at least a minor version change and arguably a bump to v5. However, we must follow the ANTLR conventions here or the
release numbers would quickly become confusing. I apologize for being unable to follow the Go release rules absolutely
to the letter.

There are a lot of changes and improvements in this release, but only the change of repo holding the runtime code,
and possibly the removal of interfaces, will cause any code changes. There are no breaking changes to the runtime
interfaces.

ANTLR Go Maintainer: [Jim Idle](https://github.com/jimidle) - Email: [[email protected]](mailto:[email protected])

### Code Relocation

For complicated reasons, including not breaking the builds of some users who use a monorepo and eschew modules, as well
as not making substantial changes to the internal test suite, the Go runtime code will continue to be maintained in
the main ANTLR4 repo `antlr/antlr4`. If you wish to contribute changes to the Go runtime code, please continue to submit
PRs to this main repo, against the `dev` branch.

Because the code sits in the main repo at about the depth of the Mariana Trench, the Go tools cannot resolve
the module correctly. After some debate, it was decided that we would create a dedicated release repo for the Go runtime
so that it will behave exactly as the Go tooling expects. This repo is auto-maintained and keeps both the dev and master
branches up to date.

Henceforth, all projects using the ANTLR Go runtime should import it as follows:

```go
import (
	"github.com/antlr4-go/antlr/v4"
)
```

And use the command:

```shell
go get github.com/antlr4-go/antlr/v4
```

to get the module. Running `go mod tidy` is probably the best way once imports have been changed.

Please note that there is no longer any source code kept in the ANTLR repo under `github.com/antlr/antlr4/runtime/Go/antlr`.
If you are using the code without modules, then sync the code from the new release repo.

### Documentation

Prior to this release, the godocs were essentially unusable, as the doc comments had been copied almost unchanged
from the Java runtime. The godocs are now properly formatted for Go and pkg.go.dev.

Please feel free to raise an issue if you find any remaining mistakes, or submit a PR (remember - not to the new repo).
It is expected that it might take a few iterations to get the docs 100% squeaky clean.

### Removal of Unnecessary Interfaces

The Go runtime was originally produced as almost a copy of the Java runtime but with Go syntax. This meant that everything
had an interface. There is no need to use interfaces in Go if there is only ever going to be one implementation of
some struct and its methods. Interfaces cause an extra dereference at runtime and are detrimental to performance if you
are trying to squeeze out every last nanosecond, which some users will be trying to do.

This is 99% an internal refactoring of the runtime with no outside effect on the user.

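As a rough illustration of the point about interfaces, the sketch below calls the same method once through an interface and once through a concrete pointer. The names `tokenSource` and `counter` are invented for this example and are not part of the ANTLR runtime; any speed difference depends on the Go compiler and should be measured rather than assumed.

```go
package main

import "fmt"

// tokenSource is a hypothetical interface with exactly one implementation,
// mirroring the Java-style design described above. It is not an ANTLR type.
type tokenSource interface {
	Next() int
}

// counter is the only implementation of tokenSource in this sketch.
type counter struct {
	n int
}

// Next returns the next value in the sequence 1, 2, 3, ...
func (c *counter) Next() int {
	c.n++
	return c.n
}

// sumViaInterface calls Next through the interface, so each call is
// dispatched dynamically via the interface's method table.
func sumViaInterface(src tokenSource, count int) int {
	total := 0
	for i := 0; i < count; i++ {
		total += src.Next()
	}
	return total
}

// sumViaStruct calls Next on the concrete type, which the compiler can
// resolve statically and may inline.
func sumViaStruct(src *counter, count int) int {
	total := 0
	for i := 0; i < count; i++ {
		total += src.Next()
	}
	return total
}

func main() {
	fmt.Println(sumViaInterface(&counter{}, 4)) // 1+2+3+4
	fmt.Println(sumViaStruct(&counter{}, 4))    // same result, direct calls
}
```

Both functions compute the same sum; only the call mechanism differs, which is the kind of overhead the refactoring removes.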
### Generated Recognizers Return *struct and not Interfaces

The generated recognizer code used to declare an interface for each parser and lexer. As they can only be implemented
by the generated code, the interfaces were removed. This is possibly the only place you may need to make a change to
your driver code.

If your code looked like this:

```go
var lexer = parser.NewMySqlLexer(nil)
var p = parser.NewMySqlParser(nil)
```

Or this:

```go
lexer := parser.NewMySqlLexer(nil)
p := parser.NewMySqlParser(nil)
```

Then no changes need to be made. However, if you predeclared the parser and lexer variables with their types, like
this:

```go
var lexer parser.MySqlLexer
var p parser.MySqlParser
// ...
lexer = parser.NewMySqlLexer(nil)
p = parser.NewMySqlParser(nil)
```

You will need to change your variable declarations to pointers (note the introduction of the `*` below):

```go
var lexer *parser.MySqlLexer
var p *parser.MySqlParser
// ...
lexer = parser.NewMySqlLexer(nil)
p = parser.NewMySqlParser(nil)
```

This is the only user-facing change that I can see. This change, though, has a very beneficial side effect: you
no longer need to type-assert the interface to the concrete struct in order to access the methods and data within it.
Any code you had that needed to do that will be cleaner and faster.

The performance improvement is worth the change and there was no tidy way for me to avoid it.

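The side effect described above can be sketched with stand-in types. Everything here (`Lexer`, `MyLexer`, `newLexerOld`, `newLexerNew`) is hypothetical and only imitates the shape of generated code; it does not use the real generated API.

```go
package main

import "fmt"

// Lexer is a hypothetical interface of the kind older generated code exposed.
type Lexer interface {
	NextToken() string
}

// MyLexer stands in for a generated lexer struct. It is not a real ANTLR type.
type MyLexer struct {
	Mode string
}

// NextToken returns a placeholder token that includes the lexer's mode.
func (l *MyLexer) NextToken() string { return "tok:" + l.Mode }

// newLexerOld imitates the old style: the constructor returns the interface.
func newLexerOld() Lexer { return &MyLexer{Mode: "default"} }

// newLexerNew imitates the new style: the constructor returns *MyLexer.
func newLexerNew() *MyLexer { return &MyLexer{Mode: "default"} }

// modeOld must type-assert before it can reach the struct's fields.
func modeOld() (string, bool) {
	lexer := newLexerOld()
	concrete, ok := lexer.(*MyLexer)
	if !ok {
		return "", false
	}
	return concrete.Mode, true
}

// modeNew reads the field directly; no assertion is needed.
func modeNew() string {
	return newLexerNew().Mode
}

func main() {
	mode, ok := modeOld()
	fmt.Println(mode, ok)
	fmt.Println(modeNew())
}
```

With the pointer return type, the assertion and its failure branch disappear from driver code.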
### Parser Error Recovery Does Not Use Panic

The generated parser code was again essentially trying to be Java code in disguise. This meant that every parser rule
executed a `defer {}` and a `recover()`, even if there were no outstanding parser errors. Parser errors were raised by
issuing a `panic()`!

While some major work has been performed in the Go compiler and runtime to make `defer {}` as fast as possible,
`recover()` is (relatively) slow as it is not meant to be used as a general error mechanism, but to recover from, say,
an internal library problem if that problem can be recovered to a known state.

The generated code now stores a recognition error and a flag in the main parser struct and uses `goto` to exit the
rule instead of a `panic()`. As might be imagined, this is significantly faster through the happy path. It is also
faster at generating errors.

The ANTLR runtime tests do check error raising and recovery, but if you find any differences in the error handling
behavior of your parsers, please raise an issue.

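The difference between the two error styles can be sketched as follows. This is a simplified illustration, not the code ANTLR generates: the types `panicParser` and `flagParser`, their fields, and the `errorExit` label are all assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

var errUnexpected = errors.New("unexpected token")

// panicParser sketches the old style: match panics on error and every
// rule pays for a deferred recover, even on the happy path.
type panicParser struct {
	tokens []string
	pos    int
	err    error
}

func (p *panicParser) match(want string) {
	if p.pos >= len(p.tokens) || p.tokens[p.pos] != want {
		panic(errUnexpected)
	}
	p.pos++
}

// rule matches "a" then "b", recovering any error raised by match.
func (p *panicParser) rule() {
	defer func() {
		if r := recover(); r != nil {
			if e, ok := r.(error); ok {
				p.err = e
				return
			}
			panic(r) // not a parse error: re-raise
		}
	}()
	p.match("a")
	p.match("b")
}

// flagParser sketches the new style: match records the error and sets a
// flag, and the rule leaves via goto when the flag is set.
type flagParser struct {
	tokens     []string
	pos        int
	err        error
	hasError   bool
	recoveries int
}

func (p *flagParser) match(want string) {
	if p.pos >= len(p.tokens) || p.tokens[p.pos] != want {
		p.err = errUnexpected
		p.hasError = true
		return
	}
	p.pos++
}

// rule matches "a" then "b" without defer, recover, or panic.
func (p *flagParser) rule() {
	p.match("a")
	if p.hasError {
		goto errorExit
	}
	p.match("b")
	if p.hasError {
		goto errorExit
	}
	return

errorExit:
	// Error reporting and recovery would happen here.
	p.recoveries++
}

func main() {
	old := &panicParser{tokens: []string{"a", "x"}}
	old.rule()
	fmt.Println("panic style:", old.err)

	cur := &flagParser{tokens: []string{"a", "x"}}
	cur.rule()
	fmt.Println("flag style:", cur.err)
}
```

In the flag style, a successful rule executes only ordinary branches, which is why the happy path benefits most.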
### Reduction in use of Pointers

Certain internal structs, such as interval sets, are small and immutable, but were being passed around as pointers
anyway. These have been changed to use copies, which resulted in significant performance increases in some cases.
There is more work to come in this regard.

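A minimal sketch of the idea, using a made-up `interval` type rather than the runtime's own structs: a small immutable value can be returned and passed as a copy, which avoids a pointer dereference and often avoids a heap allocation. Whether a given pointer actually escapes to the heap is decided by the compiler's escape analysis, so treat the comments as a general tendency.

```go
package main

import "fmt"

// interval is a hypothetical small, immutable half-open range [start, stop).
type interval struct {
	start, stop int
}

// newIntervalPtr returns a pointer; values created this way may be
// allocated on the heap if the pointer escapes.
func newIntervalPtr(start, stop int) *interval {
	return &interval{start: start, stop: stop}
}

// newIntervalVal returns a copy; two ints are cheap to copy and can stay
// on the stack or in registers.
func newIntervalVal(start, stop int) interval {
	return interval{start: start, stop: stop}
}

// containsPtr reads the fields through a pointer.
func containsPtr(iv *interval, x int) bool {
	return x >= iv.start && x < iv.stop
}

// containsVal works on its own copy; the caller's value cannot be changed.
func containsVal(iv interval, x int) bool {
	return x >= iv.start && x < iv.stop
}

func main() {
	fmt.Println(containsPtr(newIntervalPtr(3, 7), 5))
	fmt.Println(containsVal(newIntervalVal(3, 7), 7))
}
```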
### ATN Deserialization

There was a bug that caused a needed optimization to be skipped when the ATN and associated structures were
deserialized for the first time. This could have a significant performance effect on recognizers that were written
in a suboptimal way (as in poorly formed grammars). This is now fixed.

### Prediction Context Caching was not Working

This fix has a massive effect when reusing a parser for a second and subsequent run. Previously, the
PredictionContextCache merely used memory but did not speed up subsequent executions. This is now fixed, and you
should see a big difference in performance when reusing a parser. This single paragraph does not do this fix justice ;)

### Cumulative Performance Improvements

Though too numerous to mention individually, there are a lot of small performance improvements that add up. They range
from improvements in collection performance to slightly better algorithms or specific non-generic algorithms.

### Cumulative Memory Improvements

The real improvements in memory usage, allocation and garbage collection are saved for the next major release. However,
if your grammar is well-formed and does not require almost infinite passes using ALL(*), then both memory and performance
will be improved with this release.

### Bug Fixes

Other small bugs have been fixed, such as potential panics in funcs that did not check input parameters. There
are a lot of bug fixes in this release that most people were probably not aware of. All known bugs are fixed at the
time of release preparation.

### A Note on Poorly Constructed Grammars

Though I have made some significant strides in improving the performance of poorly formed grammars, those that are
particularly bad will see much less of an incremental improvement compared to those that are fairly well-formed.

This is deliberately so in this release, as I felt that those people who have put in effort to optimize the form of their
grammar are looking for performance, whereas those with grammars that parse in seconds, tens of seconds or even
minutes are presumed to not care about performance.

A particularly good (or bad) example is the MySQL grammar in the ANTLR grammar repository (apologies to the author
if you read this note - this isn't an attack). Although I have improved its runtime performance
drastically in the Go runtime, it still takes about a minute to parse complex select statements. As it is constructed,
there are no magic answers. I will look in more detail at improvements for such parsers, such as not freeing any
memory until the parse is finished (improved 100x in experiments).

The best advice I can give is to put some effort into the actual grammar itself. Well-formed grammars will potentially
see some huge improvements with this release. Badly formed grammars, not so much.