|
| 1 | +# Changes to the Go Runtime over time |
| 2 | + |
| 3 | +## v4.12.0 to v4.12.1 |
| 4 | + |
| 5 | +Strictly speaking, if ANTLR were a Go-only project following [SemVer](https://semver.org/), release v4.12.1 would be |
| 6 | +at least a minor version change and arguably a bump to v5. However, we must follow the ANTLR conventions here or the |
| 7 | +release numbers would quickly become confusing. I apologize for being unable to follow the Go release rules absolutely |
| 8 | +to the letter. |
| 9 | + |
| 10 | +There are a lot of changes and improvements in this release, but only the change of repo holding the runtime code, |
| 11 | +and possibly the removal of interfaces, should require any changes to your code. There are no breaking changes to the runtime |
| 12 | +interfaces. |
| 13 | + |
| 14 | +ANTLR Go Maintainer: [Jim Idle](https://github.com/jimidle) |
| 15 | + |
| 16 | +### Code Relocation |
| 17 | + |
| 18 | +For complicated reasons, including not breaking the builds of some users who use a monorepo and eschew modules, as well |
| 19 | +as not making substantial changes to the internal test suite, the Go runtime code will continue to be maintained in |
| 20 | +the main ANTLR4 repo `antlr/antlr4`. If you wish to contribute changes to the Go runtime code, please continue to submit |
| 21 | +PRs to this main repo, against the `dev` branch. |
| 22 | + |
| 23 | +The code is located in the main repo at about the depth of the Mariana Trench, which means that the Go tools cannot reconcile |
| 24 | +the module correctly. After some debate, it was decided that we would create a dedicated release repo for the Go runtime |
| 25 | +so that it will behave exactly as the Go tooling expects. This repo is auto-maintained and keeps both the dev and master |
| 26 | +branches up to date. |
| 27 | + |
| 28 | +Henceforth, all projects using the ANTLR Go runtime should import it as follows: |
| 29 | + |
| 30 | +```go |
| 31 | +import ( |
| 32 | +	"github.com/antlr4-go/antlr/v4" |
| 33 | +) |
| 34 | +``` |
| 35 | + |
| 36 | +Then fetch the module with the command: |
| 37 | + |
| 38 | +```shell |
| 39 | +go get github.com/antlr4-go/antlr/v4 |
| 40 | +``` |
| 41 | + |
| 42 | +Alternatively, running `go mod tidy` is probably the best way to get the module once imports have been changed. |
| 43 | + |
| 44 | +Please note that there is no longer any source code kept in the ANTLR repo under `github.com/antlr/antlr4/runtime/Go/antlr`. |
| 45 | +If you are using the code without modules, then sync the code from the new release repo. |
| 46 | + |
| 47 | +### Documentation |
| 48 | + |
| 49 | +Prior to this release, the godocs were essentially unusable, as the doc comments had been copied almost without |
| 50 | +change from the Java runtime. The godocs are now properly formatted for Go and pkg.go.dev. |
| 51 | + |
| 52 | +Please feel free to raise an issue if you find any remaining mistakes, or submit a PR (remember: not to the new repo). |
| 53 | +It is expected that it might take a few iterations to get the docs 100% squeaky clean. |
| 54 | + |
| 55 | +### Removal of Unnecessary Interfaces |
| 56 | + |
| 57 | +The Go runtime was originally produced as almost a copy of the Java runtime, but with Go syntax. This meant that everything |
| 58 | +had an interface. There is no need to use interfaces in Go if there is only ever going to be one implementation of |
| 59 | +some struct and its methods. Interfaces cause an extra dereference at runtime and are detrimental to performance if you |
| 60 | +are trying to squeeze out every last nanosecond, which some users will be trying to do. |
| 61 | + |
| 62 | +This is 99% an internal refactoring of the runtime with no outside effects to the user. |
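
As a rough illustration of the point above, the following sketch contrasts a constructor that returns an interface with one that returns a pointer to the concrete struct. All of the names here (`tokenSource`, `counter`, `newCounter`, and so on) are hypothetical and are not part of the ANTLR runtime; the sketch only shows the general Go pattern, not the actual refactoring.

```go
package main

import "fmt"

// tokenSource is a hypothetical interface with exactly one implementation,
// mirroring the "everything has an interface" style inherited from Java.
type tokenSource interface {
	NextID() int
}

// counter is the single concrete implementation (hypothetical).
type counter struct {
	next int
}

// NextID increments the counter and returns the new value.
func (c *counter) NextID() int {
	c.next++
	return c.next
}

// newCounterAsInterface mirrors the old style: the caller only sees the interface,
// so every method call is dispatched dynamically.
func newCounterAsInterface() tokenSource { return &counter{} }

// newCounter mirrors the new style: the caller gets the concrete *struct,
// so method calls are direct and fields are accessible without an assertion.
func newCounter() *counter { return &counter{} }

func main() {
	viaInterface := newCounterAsInterface()
	viaStruct := newCounter()
	fmt.Println(viaInterface.NextID(), viaStruct.NextID())

	// Reaching a field behind the interface needs a type assertion;
	// with the concrete pointer it is a plain field access.
	c, ok := viaInterface.(*counter)
	fmt.Println(ok, c.next, viaStruct.next)
}
```

With only one implementation, the interface adds indirection without adding flexibility, which is the trade-off this section describes.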
| 63 | + |
| 64 | +### Generated Recognizers Return *struct and not Interfaces |
| 65 | + |
| 66 | +The generated recognizer code previously declared an interface for each parser and lexer. As they can only be implemented by the |
| 67 | +generated code, the interfaces were removed. This is possibly the only place where you may need to make a change to |
| 68 | +your driver code. |
| 69 | + |
| 70 | +If your code looked like this: |
| 71 | + |
| 72 | +```go |
| 73 | +var lexer = parser.NewMySqlLexer(nil) |
| 74 | +var p = parser.NewMySqlParser(nil) |
| 75 | +``` |
| 76 | + |
| 77 | +Or this: |
| 78 | + |
| 79 | +```go |
| 80 | +lexer := parser.NewMySqlLexer(nil) |
| 81 | +p := parser.NewMySqlParser(nil) |
| 82 | +``` |
| 83 | + |
| 84 | +Then no changes need to be made. However, if you predeclared the parser and lexer variables with their types, like |
| 85 | +this: |
| 86 | + |
| 87 | +```go |
| 88 | +var lexer parser.MySqlLexer |
| 89 | +var p parser.MySqlParser |
| 90 | +// ... |
| 91 | +lexer = parser.NewMySqlLexer(nil) |
| 92 | +p = parser.NewMySqlParser(nil) |
| 93 | +``` |
| 94 | + |
| 95 | +You will need to change your variable declarations to pointers (note the introduction of the `*` below): |
| 96 | + |
| 97 | +```go |
| 98 | +var lexer *parser.MySqlLexer |
| 99 | +var p *parser.MySqlParser |
| 100 | +// ... |
| 101 | +lexer = parser.NewMySqlLexer(nil) |
| 102 | +p = parser.NewMySqlParser(nil) |
| 103 | +``` |
| 104 | + |
| 105 | +This is the only user-facing change that I can see. It does, however, have a very beneficial side effect: you |
| 106 | +no longer need to cast the interface into a struct so that you can access methods and data within it. Any code you |
| 107 | +had that needed to do that will be cleaner and faster. |
| 108 | + |
| 109 | +The performance improvement is worth the change and there was no tidy way for me to avoid it. |
| 110 | + |
| 111 | +### Parser Error Recovery Does Not Use Panic |
| 112 | + |
| 113 | +The generated parser code was again essentially trying to be Java code in disguise. This meant that every parser rule |
| 114 | +executed a `defer {}` and a `recover()`, even if there were no outstanding parser errors. Parser errors were raised by |
| 115 | +issuing a `panic()`! |
| 116 | + |
| 117 | +While some major work has been performed in the Go compiler and runtime to make `defer {}` as fast as possible, |
| 118 | +`recover()` is (relatively) slow, as it is not meant to be used as a general error mechanism, but to recover from, say, |
| 119 | +an internal library problem when the program can be returned to a known state. |
| 120 | + |
| 121 | +The generated code now stores a recognition error and a flag in the main parser struct and uses `goto` to exit the |
| 122 | +rule instead of a `panic()`. As might be imagined, this is significantly faster through the happy path. It is also |
| 123 | +faster at generating errors. |
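
The difference between the two approaches can be sketched as follows. This is a minimal, self-contained illustration and not the code that ANTLR generates: `miniParser`, `match`, `pair`, `pairWithPanic`, and `errUnexpected` are all hypothetical names invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnexpected is a hypothetical stand-in for a recognition error.
var errUnexpected = errors.New("unexpected token")

// miniParser is a hypothetical parser that keeps its error state in the struct.
type miniParser struct {
	tokens []string
	pos    int
	err    error // stored recognition error
	hasErr bool  // flag checked after each match
}

// match consumes the expected token, or records an error instead of panicking.
func (p *miniParser) match(want string) {
	if p.pos < len(p.tokens) && p.tokens[p.pos] == want {
		p.pos++
		return
	}
	p.err = fmt.Errorf("%w: want %q at position %d", errUnexpected, want, p.pos)
	p.hasErr = true
}

// pair parses "(" ")" and leaves the rule via goto when the flag is set.
// The happy path pays for neither defer nor recover.
func (p *miniParser) pair() {
	p.match("(")
	if p.hasErr {
		goto errorExit
	}
	p.match(")")
	if p.hasErr {
		goto errorExit
	}
	return

errorExit:
	// Simplistic recovery for the sketch: skip the rest of the input.
	p.pos = len(p.tokens)
}

// pairWithPanic shows the older style: every call sets up defer and recover,
// and an error is signalled with panic.
func pairWithPanic(tokens []string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%w: %v", errUnexpected, r)
		}
	}()
	for i, want := range []string{"(", ")"} {
		if i >= len(tokens) || tokens[i] != want {
			panic(fmt.Sprintf("want %q at position %d", want, i))
		}
	}
	return nil
}

func main() {
	good := &miniParser{tokens: []string{"(", ")"}}
	good.pair()
	bad := &miniParser{tokens: []string{"(", "x"}}
	bad.pair()
	fmt.Println(good.err, bad.err)
	fmt.Println(pairWithPanic([]string{"(", ")"}), pairWithPanic([]string{"(", "x"}))
}
```

Both versions report the same kind of error; the flag-and-`goto` version simply avoids the `defer`/`recover` machinery when nothing goes wrong.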
| 124 | + |
| 125 | +The ANTLR runtime tests do check error raising and recovery, but if you find any differences in the error handling |
| 126 | +behavior of your parsers, please raise an issue. |
| 127 | + |
| 128 | +### Reduction in use of Pointers |
| 129 | + |
| 130 | +Certain internal structs, such as interval sets, are small and immutable, but were being passed around as pointers |
| 131 | +anyway. These have been changed to use copies, which resulted in significant performance increases in some cases. |
| 132 | +There is more work to come in this regard. |
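
To show what passing small immutable structs by value looks like, here is a minimal sketch. The `interval` type below is hypothetical and deliberately simplified (a half-open range of two ints); it is not the runtime's own type, and the sketch makes no claim about how the runtime implements its interval sets.

```go
package main

import "fmt"

// interval is a hypothetical small, immutable value: two ints describing
// the half-open range [start, stop).
type interval struct {
	start, stop int
}

// contains uses a value receiver, so the two ints are copied
// rather than reached through a pointer.
func (i interval) contains(x int) bool {
	return x >= i.start && x < i.stop
}

// union returns a new interval covering both inputs instead of mutating either.
func (i interval) union(o interval) interval {
	start, stop := i.start, i.stop
	if o.start < start {
		start = o.start
	}
	if o.stop > stop {
		stop = o.stop
	}
	return interval{start: start, stop: stop}
}

func main() {
	a := interval{start: 1, stop: 5}
	b := interval{start: 3, stop: 9}
	u := a.union(b)
	fmt.Println(a.contains(3), a.contains(5), u)
}
```

Because such a value is only two words wide, copying it is cheap and it need not be allocated on the heap, which is the kind of saving this section refers to.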
| 133 | + |
| 134 | +### ATN Deserialization |
| 135 | + |
| 136 | +When the ATN and associated structures were deserialized for the first time, a bug caused a needed |
| 137 | +optimization to fail to be executed. This could have a significant performance effect on recognizers that were written |
| 138 | +in a suboptimal way (as in poorly formed grammars). This is now fixed. |
| 139 | + |
| 140 | +### Prediction Context Caching was not Working |
| 141 | + |
| 142 | +The PredictionContextCache merely used memory but did not speed up subsequent executions, which has a massive effect |
| 143 | +when reusing a parser for a second and subsequent run. This is now fixed, and you should see a big difference in |
| 144 | +performance when reusing a parser. This single paragraph does not do this fix justice ;) |
| 145 | + |
| 146 | +### Cumulative Performance Improvements |
| 147 | + |
| 148 | +Though too numerous to mention individually, there are a lot of small performance improvements that add up. They range |
| 149 | +from improvements in collection performance to slightly better algorithms or specific non-generic algorithms. |
| 150 | + |
| 151 | +### Cumulative Memory Improvements |
| 152 | + |
| 153 | +The real improvements in memory usage, allocation and garbage collection are saved for the next major release. However, |
| 154 | +if your grammar is well-formed and does not require almost infinite passes using ALL(*), then both memory and performance |
| 155 | +will be improved with this release. |
| 156 | + |
| 157 | +### Bug Fixes |
| 158 | + |
| 159 | +Other small bug fixes have been addressed, such as potential panics in funcs that did not check input parameters. There |
| 160 | +are a lot of bug fixes in this release that most people were probably not aware of. All known bugs are fixed at the |
| 161 | +time of release preparation. |
| 162 | + |
| 163 | +### A Note on Poorly Constructed Grammars |
| 164 | + |
| 165 | +Though I have made some significant strides in improving the performance of poorly formed grammars, those that are |
| 166 | +particularly bad will see much less of an incremental improvement compared to those that are fairly well-formed. |
| 167 | + |
| 168 | +This is deliberately so in this release, as I felt that those people who have put in effort to optimize the form of their |
| 169 | +grammar are looking for performance, whereas those who have grammars that parse in seconds, tens of seconds or even |
| 170 | +minutes are presumed not to care about performance. |
| 171 | + |
| 172 | +A particularly good (or bad) example is the MySQL grammar in the ANTLR grammar repository (apologies to the Author |
| 173 | +if you read this note - this isn't an attack). Although I have improved its runtime performance |
| 174 | +drastically in the Go runtime, it still takes about a minute to parse complex select statements. As it is constructed, |
| 175 | +there are no magic answers. I will look in more detail at improvements for such parsers, such as not freeing any |
| 176 | +memory until the parse is finished (improved 100x in experiments). |
| 177 | + |
| 178 | +The best advice I can give is to put some effort into the actual grammar itself. Well-formed grammars will potentially |
| 179 | +see some huge improvements with this release. Badly formed grammars, not so much. |