# Versioned Binary Application Record Encoding (VBARE)
2
2
3
-
_Simple schema evoluation with maximum performance_
3
+
_Simple schema evolution with maximum performance_
4
4
5
-
VBARE is a tiny extension to [BARE](https://baremessages.org/) that provides a way of handling schema evoluation.
5
+
VBARE is a tiny extension to [BARE](https://baremessages.org/) that provides a way of handling schema evolution.
6
6
7
7
## Preface: What is BARE?
8
8
@@ -33,57 +33,58 @@ VBARE is a tiny extension to [BARE](https://baremessages.org/) that provides a w
33
33
34
34
Also see the [IETF specification](https://www.ietf.org/archive/id/draft-devault-bare-11.html).
35
35
36
-
## Project goals
36
+
## Project Goals
37
37
38
-
- fast -- self-contained binary encoding, akin to a tuple ->
39
-
- simple -- can rewrite in under an hour
40
-
- portable -- cross-language & well standardized
38
+
**Goals:**
39
+
- **Fast**: Self-contained binary encoding, similar to a tuple structure
40
+
- **Simple**: Can be reimplemented in under an hour
41
+
- **Portable**: Cross-language support with well-defined standardization
41
42
42
-
non-goals:
43
+
**Non-goals:**
44
+
- **Data compactness**: That's what gzip is for
45
+
- **Provide an RPC layer**: This is trivial to implement yourself based on your specific requirements
43
46
44
-
- data compactness -> that's what gzip is for
45
-
- provide an rpc layer -> this is trivial to do yourself based on your specific requirements
46
-
47
-
## Use cases
47
+
## Use Cases
48
48
49
49
- Defining network protocols
50
-
- Storing data at rest that needs to be able to be upgraded
51
-
- Binary data in the database
50
+
- Storing data at rest that needs to be upgradeable:
51
+
- Binary data in databases
52
52
- File formats
53
53
54
-
## At a glance
54
+
## At a Glance
55
55
56
-
- Every message has a version associated with it
57
-
- either pre-negotiated (via something like an http request query parameter/handshake) or embedded int he message itself
56
+
- Every message has a version associated with it, either:
57
+
- Pre-negotiated (via mechanisms like HTTP request query parameters or handshakes)
58
+
- Embedded in the message itself
58
59
- Applications provide functions to upgrade between protocol versions
59
-
- There is no evolution semantics in the schema itself, just copy and paste the schema to write the new one
60
+
- There are no evolution semantics in the schema itself — simply copy and paste the schema to write a new version
60
61
61
-
## evolutino philosophy
62
+
## Evolution Philosophy
62
63
63
-
-declare discrete versions with predefined version indexes
64
-
-manual evolutions simplify the application logic by putting complex defaults in your app code
65
-
-stop making big breaking v1 -> v2 changes, make much smaller changes with more flexibility
66
-
-reshaping structures is important -- not just changing types and names
64
+
- Declare discrete versions with predefined version indexes
65
+
- Manual evolutions simplify application logic by putting complex defaults in your application code
66
+
- Stop making big breaking v1 to v2 changes; instead, make much smaller changes with more flexibility
67
+
- Reshaping structures is important, not just changing types and names
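
As an illustration of the last two points, here is a minimal sketch of a single migration that reshapes a structure and picks a default in application code. The `UserV1` and `UserV2` types, their fields, and the `"en"` default are hypothetical and exist only for this example; they are not part of VBARE or any real schema.

```python
from dataclasses import dataclass


@dataclass
class UserV1:
    # Hypothetical v1 schema: a single free-form name field.
    name: str


@dataclass
class UserV2:
    # Hypothetical v2 schema: the name is split and a new field is added.
    first_name: str
    last_name: str
    display_language: str


def upgrade_user_v1_to_v2(old: UserV1) -> UserV2:
    """Reshape a v1 record into v2, choosing defaults in application code."""
    # Split on the first space; last_name is empty when there is no space.
    first, _, last = old.name.partition(" ")
    # The default language is an application decision, not a schema feature.
    return UserV2(first_name=first, last_name=last, display_language="en")
```

Because the migration is ordinary code, it can split, merge, or move fields in ways that a schema-level default value could not express.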
67
68
68
-
## specification
69
+
## Specification
69
70
70
-
### versions
71
+
### Versions
71
72
72
-
each schema version is a monotomically incrementing <TODO: integer type>
73
+
Each schema version is a monotonically incrementing integer. _[TODO: Specify exact integer type]_
73
74
74
-
### embedded version
75
+
### Embedded Version
75
76
76
-
embedded version works by inserting a <TODO: integer type> integer at the beginning of the buffer. this integer is used to define which version of the schema is being used.
77
+
Embedded versioning works by inserting an integer at the beginning of the buffer. This integer identifies which version of the schema was used to encode the message. _[TODO: Specify exact integer type]_
77
78
78
-
the layout looks like this:
79
+
The layout looks like this:
79
80
80
81
```
81
-
TODO
82
+
[TODO: Add layout diagram]
82
83
```
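
Until the layout is specified, the idea can be sketched as follows. The integer type is still a TODO above, so the 2-byte little-endian unsigned prefix used here is purely an assumption for illustration, as are the function names `wrap_with_version` and `unwrap_version`.

```python
import struct

# Assumption: a 2-byte little-endian unsigned integer prefix.
# The specification above has not yet fixed the integer type.
VERSION_PREFIX = struct.Struct("<H")


def wrap_with_version(version: int, payload: bytes) -> bytes:
    """Prepend the schema version to an already-encoded BARE payload."""
    return VERSION_PREFIX.pack(version) + payload


def unwrap_version(buffer: bytes) -> tuple[int, bytes]:
    """Split a buffer into (schema version, remaining payload bytes)."""
    if len(buffer) < VERSION_PREFIX.size:
        raise ValueError("buffer too short to contain a version prefix")
    (version,) = VERSION_PREFIX.unpack_from(buffer)
    return version, buffer[VERSION_PREFIX.size:]
```

The reader would call `unwrap_version` first, then decode the remaining bytes with the schema that matches the returned version.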
83
84
84
-
### pre-negotiated version
85
+
### Pre-negotiated Version
85
86
86
-
often times, you speicty the protocol version outside of the message iteself. for eaxmple, if making an http request with the version in the path like `POST /v3/users`, we can extract version 3 from the path. in this case, VBARE does not insert a version in to the buffer. for this, vbare simply acts as a simple step function for upgrading/downgrading version data structures.
87
+
Often, you specify the protocol version outside of the message itself. For example, when making an HTTP request with the version in the path like `POST /v3/users`, we can extract version 3 from the path. In this case, VBARE does not insert a version into the buffer. For this use case, VBARE simply acts as a step function for upgrading or downgrading versioned data structures.
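
A minimal sketch of that step function, assuming the version arrives in the request path. The path pattern, the `UPGRADES` and `DOWNGRADES` tables, and the record fields are hypothetical; plain dictionaries stand in for decoded BARE structures, and this is not the API of any VBARE implementation.

```python
import re
from typing import Callable, Dict

Record = dict  # stand-in for a decoded BARE structure

# Hypothetical single-step migrations, keyed by the version they start from.
UPGRADES: Dict[int, Callable[[Record], Record]] = {
    1: lambda r: {"first_name": r["name"], "last_name": ""},
    2: lambda r: {**r, "email": None},
}
DOWNGRADES: Dict[int, Callable[[Record], Record]] = {
    3: lambda r: {k: v for k, v in r.items() if k != "email"},
    2: lambda r: {"name": (r["first_name"] + " " + r["last_name"]).strip()},
}


def version_from_path(path: str) -> int:
    """Extract the version from a path such as /v3/users."""
    match = re.match(r"^/v(\d+)/", path)
    if match is None:
        raise ValueError(f"no version segment in path: {path!r}")
    return int(match.group(1))


def convert(record: Record, from_version: int, to_version: int) -> Record:
    """Walk one version at a time, upward or downward, until the target."""
    version = from_version
    while version < to_version:
        record = UPGRADES[version](record)
        version += 1
    while version > to_version:
        record = DOWNGRADES[version](record)
        version -= 1
    return record
```

A server handling `POST /v1/users` could call `convert(body, version_from_path("/v1/users"), 3)` to work with the latest shape internally, then convert its response back down to version 1 before replying.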
87
88
88
89
## Implementations
89
90
@@ -94,9 +95,9 @@ often times, you speicty the protocol version outside of the message iteself. fo
94
95
95
96
([Full list of BARE implementations](https://baremessages.org/))
96
97
97
-
_Adding an implementation takes less than an hour -- it's really that simple._
98
+
_Adding an implementation takes less than an hour — it's really that simple._
- [Data at rest](https://github.com/rivet-dev/engine/tree/bbdf1c1c49e307ba252186aa4d75a9452d74fca7/sdks/schemas/data)
@@ -109,91 +110,49 @@ _Adding an implementation takes less than an hour -- it's really that simple._
109
110
110
111
## Embedded vs Negotiated Version
111
112
112
-
TODO
113
+
_[TODO: Add detailed comparison]_
114
+
115
+
## Comparison with Other Formats
116
+
117
+
[Read more](./docs/COMPARISON.md)
113
118
114
119
## Clients vs Servers
115
120
116
-
- Only servers need to ahve the evolutions steps
117
-
-clients just send their version
121
+
- Only servers need to have the evolution steps
122
+
- Clients just send their version
118
123
119
124
## Downsides
120
125
121
-
- extensive migration code
122
-
- the older the version the more migration steps (though these migration steps should be effectively free)
123
-
- migration steps are not portable across langauges, but only the server needs to the migration step. so usually this is only done once.
124
-
125
-
## Comparison
126
-
127
-
- Protobuf (versioned: yes)
128
-
- unbelievably poorly designed protocol
129
-
- makes migrations your problem at runtime by making everything optional
130
-
- even worse, makes properties have a default value (ie integers) which leads to subtle bugs with serious concequenses
131
-
- tracking field numbers in a file is a pain in the ass
132
-
- Cap'n'proto (versioned: yes)
133
-
- includes the rpc layer as part of the library, this is out of the scope of what we want in our schema design
134
-
- of the schema languages we evaluated, this provides by far the most flexible schema migrations
135
-
- has poor language support. technically most major languages are supported, but the qulaity of the ipmlementations are lacking. i suspect this is largely due to the complexity of capnproto itself compared to other protocols.
136
-
- generics are cool. but we opt for simplicity with more repetition.
137
-
- the learning curve seems the steepest of any other tool
138
-
- cap'n'web (versioned: no)
139
-
- this is focused on rpc with json. not relevant to what we needed.
140
-
- cbor/messagepack/that mongodb one (versioned: self-describing)
141
-
- does not have a schema, it's completley self-describing
142
-
- requires encoding the entire key, not suitable for our needs
143
-
- Flatbuffers (versioned: yes)
144
-
- intented as a high performance encoding similar to protobuf
145
-
- still uses indexes like protobuf, unless you use structs
146
-
- to achieve what we wanted, we'd have to use just structs
- considered borsh instead of bare, but bare seemed significantly simpler and more focused
158
-
- rust options like postcard/etc (versioned: no)
159
-
- also provides self-contained binary encoding
160
-
- not cross platform
161
-
162
-
other deatils not included in this evaluation:
163
-
- number compression (ie static 64 bits vs using minimal bits)
164
-
- zero-copy ser/de
165
-
- json support & extensions
166
-
- rpc
126
+
- Extensive migration code required
127
+
- The older the version, the more migration steps needed (though these migration steps should be effectively free)
128
+
- Migration steps are not portable across languages, but only the server needs the migration steps, so this is usually only implemented once
167
129
168
130
## FAQ
169
131
170
132
### Why is copying the entire schema for every version better than using decorators for gradual migrations?
171
133
172
-
-decorators are limited and get very complicated
173
-
-it's unclear what version of the protocol a decorator takes effect -- this is helpful
174
-
-generated sdks become more and more bloated with every change
175
-
-you need a validation build step for your validators
176
-
-things you can do with manual migrations
134
+
- Decorators are limited and become very complicated over time
135
+
- It's unclear at what version of the protocol a decorator takes effect; explicit versions make this clear
136
+
- Generated SDKs become more and more bloated with every change
137
+
- You need a validation build step for your validators
138
+
- Manual migrations provide more flexibility for complex transformations
177
139
178
140
### Why not include RPC?
179
141
180
-
RPC interfaces are trivial to implement yourself. Libraries that provide RPC interfaces tend to add extra bloat & cognitive load over things like abstracting transports, compatibility with the language's async runtime, and complex codegen to implement handlers.
181
-
182
-
Usually, you just want a `ToServer` and `ToClient` union that looks like this: [ToClient example](https://github.com/rivet-dev/rivetkit/blob/b81d9536ba7ccad4449639dd83a770eb7c353617/packages/rivetkit/schemas/client-protocol/v1.bare#L34), [ToServer example](https://github.com/rivet-dev/rivetkit/blob/b81d9536ba7ccad4449639dd83a770eb7c353617/packages/rivetkit/schemas/client-protocol/v1.bare#L56)
142
+
RPC interfaces are trivial to implement yourself. Libraries that provide RPC interfaces tend to add extra bloat and cognitive load through things like abstracting transports, compatibility with the language's async runtime, and complex codegen to implement handlers.
183
143
144
+
Usually, you just want a `ToServer` and `ToClient` union that looks like this:
### Isn't copying the schema going to result in a lot of duplicate code?
186
149
187
-
- yes. after enough pain and suffering of running production APIS, this is what you will end up doing manually, but in a much more painful way.
188
-
- having schema versions also makes it much easier to reason about how clients are connecting to your system/the state of an application. incremental migrations dno't let you consider other properties/structures.
189
-
- this also lets you reshape your structures.
150
+
Yes. After enough pain and suffering from running production APIs, this is what you will end up doing manually anyway, just in a much more painful way. Having schema versions also makes it much easier to reason about how clients are connecting to your system and about the state of an application. Incremental migrations don't let you consider other properties or structures. This approach also lets you reshape your structures.
190
151
191
152
### Don't migration steps get repetitive?
192
153
193
-
- most of the time, structures will match exactly. most languages can provide a 1:1 migration.
194
-
- the most complicated migration steps will be very deeply nested structures that changed, but that's pretty simple
154
+
Most of the time, structures will match exactly, and most languages can provide a 1:1 migration. The most complicated migration steps will be for deeply nested structures that changed, but even that is relatively straightforward.
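
For the common case where a structure is unchanged between versions, the migration can be a field-for-field copy. The sketch below assumes two hypothetical, flat dataclasses with identical fields; it is one way to express a 1:1 migration in Python, not something VBARE itself provides.

```python
from dataclasses import asdict, dataclass


@dataclass
class SettingsV1:
    theme: str
    font_size: int


@dataclass
class SettingsV2:
    # Identical to v1: nothing in this structure changed between versions.
    theme: str
    font_size: int


def upgrade_settings_v1_to_v2(old: SettingsV1) -> SettingsV2:
    """Copy every field across unchanged (works for flat structures)."""
    return SettingsV2(**asdict(old))
```

For nested structures, each changed inner type would need its own small migration, which the outer migration calls while copying the remaining fields as-is.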
- Number compression (e.g., static 64 bits vs using minimal bits)
5
+
- Zero-copy serialization/deserialization
6
+
- JSON support & extensions
7
+
- RPC
8
+
9
+
### Protobuf (versioned: yes)
10
+
- Poorly designed protocol in our opinion
11
+
- Makes migrations your problem at runtime by making everything optional
12
+
- Even worse, properties have default values (e.g., integers) which leads to subtle bugs with serious consequences
13
+
- Tracking field numbers in a file is tedious
14
+
15
+
### Cap'n Proto (versioned: yes)
16
+
- Includes the RPC layer as part of the library, which is outside the scope of what we want in our schema design
17
+
- Of the schema languages we evaluated, this provides by far the most flexible schema migrations
18
+
- Has poor language support — technically most major languages are supported, but the quality of the implementations is lacking. We suspect this is largely due to the complexity of Cap'n Proto itself compared to other protocols
19
+
- Generics are interesting, but we opt for simplicity with more repetition
20
+
- The learning curve seems the steepest of any tool we evaluated
21
+
22
+
### Cap'n Web (versioned: no)
23
+
- This is focused on RPC with JSON, which is not relevant to our needs