A Bug in TokuDB's Hot Expansion of Text and Blob Columns
Some of our customers were seeing MySQL server crashes when accessing some TokuDB tables. The stack traces in the MySQL error log showed a SIGSEGV in the Protocol::net_store_data function, which MySQL uses to send a query's result set to the MySQL client. Some investigation showed that these crashes occurred on TokuDB tables in which a text column type had been expanded. Something was wrong with TokuDB's hot text column expansion. We describe the bug in more detail and what we changed to fix it.
TokuDB stores the values for text and blob columns at the end of each row as a linked list. Each text or blob value is stored as a length followed by its value. To get the 3rd text or blob value, one needs to decode the 1st and 2nd text or blob lengths to find the 3rd one. If the length fields are not coded properly, then the linked list will not be traversed properly, and the TokuDB row decoder will pass invalid pointers to MySQL. These invalid pointers will cause MySQL to return the wrong data to the client (if the memory pointer is still mapped into the address space), or crash (if the memory pointer is unmapped).
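The traversal described above can be sketched in a few lines of Python. This is a simplified model, not TokuDB's actual row format: the function names `encode_blobs` and `decode_blobs`, the little-endian byte order, and the per-column `len_sizes` list are all assumptions made for illustration.

```python
def encode_blobs(values, len_sizes):
    """Pack each value as <length field><bytes>; len_sizes[i] is the
    width in bytes of the i-th length field (hypothetical layout)."""
    out = bytearray()
    for value, n in zip(values, len_sizes):
        out += len(value).to_bytes(n, "little")
        out += value
    return bytes(out)


def decode_blobs(buf, len_sizes):
    """Walk the list front to back. Value k can only be located after
    the lengths of values 0..k-1 have been decoded."""
    values = []
    pos = 0
    for n in len_sizes:
        length = int.from_bytes(buf[pos:pos + n], "little")
        pos += n
        values.append(buf[pos:pos + length])
        pos += length
    return values


row_tail = encode_blobs([b"first", b"second", b"third"], [1, 2, 4])
third = decode_blobs(row_tail, [1, 2, 4])[2]
```

Because each value's offset depends on every preceding length, a single misread length field shifts the start of all later values.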
The hot column expansion code for text and blob columns assumed that the length field of every text and blob column was a 4-byte integer. Because the length field was assumed to be unchanged by a text or blob column expansion, the TokuDB expansion code did not send an update message into the fractal tree to change it.
Unfortunately, the size of the length field depends on the type of the text or blob column. For example, a tinytext length field is 1 byte long, a text length field is 2 bytes long, a mediumtext length field is 3 bytes long, and a longtext length field is 4 bytes long. A broadcast update message is needed to change the coding of the expanded text or blob length fields so that they can be decoded properly with the new schema. Since we did not send such a message, the size of the length field in the encoded row did not match the size of the length field used for the new schema. This caused TokuDB's row decoder to break.
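To make the mismatch concrete, here is a small sketch of what happens when a row written under a tinytext schema is read under a text schema without being rewritten. The table `LENGTH_BYTES` reflects the MySQL length-field widths; the helper `read_length`, the little-endian byte order, and the sample row bytes are assumptions for illustration, not TokuDB code.

```python
# Width in bytes of the length field for each MySQL text type.
LENGTH_BYTES = {"tinytext": 1, "text": 2, "mediumtext": 3, "longtext": 4}


def read_length(buf, pos, col_type):
    """Decode the length field at pos using the width implied by col_type."""
    n = LENGTH_BYTES[col_type]
    return int.from_bytes(buf[pos:pos + n], "little")


# A row tail written while the column was tinytext: a 1-byte length (2),
# then the 2-byte value b"ab".
row_tail = b"\x02ab"

written_as = read_length(row_tail, 0, "tinytext")  # the real length
misread_as = read_length(row_tail, 0, "text")      # also consumes b"a" as length
```

Under the old schema the length decodes as 2. Under the new schema the decoder consumes the first byte of the value as the high byte of the length and gets 24834 (0x6102), which points far past the end of the row.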
We added code to TokuDB's row decoder that verifies the correct coding of the text and blob linked list. If the list cannot be decoded, then an error is returned to the MySQL client. We do not have a general purpose change to recover the data for rows with text or blob columns that cannot be decoded. We can make a special TokuDB build that can be used to recover the data for a specific table.
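A minimal sketch of this kind of verification follows. The exact checks TokuDB performs are not described here, so the ones below are assumptions: every length field and every value must lie inside the row buffer, and the list must end exactly at the end of the buffer. The names `RowDecodeError` and `decode_blobs_checked` are hypothetical.

```python
class RowDecodeError(Exception):
    """Raised instead of handing out-of-range data to the caller."""


def decode_blobs_checked(buf, len_sizes):
    values = []
    pos = 0
    for i, n in enumerate(len_sizes):
        # The length field itself must fit in the buffer.
        if pos + n > len(buf):
            raise RowDecodeError(f"length field {i} runs past end of row")
        length = int.from_bytes(buf[pos:pos + n], "little")
        pos += n
        # The value it describes must fit as well.
        if pos + length > len(buf):
            raise RowDecodeError(f"value {i} runs past end of row")
        values.append(buf[pos:pos + length])
        pos += length
    # Assumed extra check: no unexplained bytes after the last value.
    if pos != len(buf):
        raise RowDecodeError("trailing bytes after last value")
    return values


row_tail = b"\x02ab\x03xyz"          # written with 1-byte length fields
ok = decode_blobs_checked(row_tail, [1, 1])
try:
    decode_blobs_checked(row_tail, [2, 1])  # schema now expects 2 bytes
    failed = False
except RowDecodeError:
    failed = True
```

With the matching widths the row decodes normally; with the mismatched widths the decoder raises an error rather than returning a slice that does not belong to the row.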
TokuDB's hot text and blob expansion software now sends broadcast update messages into the fractal tree that, when executed, expand the size of the text or blob column's length field in the row's linked list.
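The effect of such an update on a single row can be sketched as a rewrite of the length fields from the old widths to the new ones. This models only the transformation of the row bytes, not the fractal tree message itself; the function name `expand_length_fields`, the little-endian byte order, and the per-column width lists are assumptions for illustration.

```python
def expand_length_fields(buf, old_sizes, new_sizes):
    """Re-encode each <length><value> pair, widening the length field
    from old_sizes[i] bytes to new_sizes[i] bytes. Values are unchanged."""
    out = bytearray()
    pos = 0
    for old_n, new_n in zip(old_sizes, new_sizes):
        if new_n < old_n:
            raise ValueError("expansion cannot shrink a length field")
        length = int.from_bytes(buf[pos:pos + old_n], "little")
        pos += old_n
        out += length.to_bytes(new_n, "little")
        out += buf[pos:pos + length]
        pos += length
    return bytes(out)


# First column expanded from tinytext (1 byte) to text (2 bytes);
# the second column stays tinytext.
old_tail = b"\x02ab\x03xyz"
new_tail = expand_length_fields(old_tail, [1, 1], [2, 1])
```

After the rewrite, the first length field occupies 2 bytes, so a decoder using the new schema's widths finds both values at the correct offsets.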
- The blob and text expansion feature is disabled in TokuDB 7.0.4. This version detects rows with blobs that cannot be decoded and reports an error, rather than crashing or returning the wrong data to the client as prior TokuDB versions do.
- The blob and text expansion feature is implemented correctly on the master branch and is scheduled for the next TokuDB release.