Bulk Fetch
The TokuDB bulk fetch algorithm speeds up fractal tree scans by returning multiple rows per tree search rather than a single row. This amortizes the cost of a fractal tree search over several rows. We have observed a speedup of between 2x and 5x when using bulk fetch compared to not using it.
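The amortization argument can be illustrated with a toy model. The sketch below is not TokuDB code: `ToyIndex`, `search_after`, `make_index`, and `scan` are hypothetical names, and a `std::map` stands in for the fractal tree. It only counts how many tree searches a full scan needs for a given number of rows returned per search.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Row = std::pair<int, std::string>;

// Hypothetical stand-in for a fractal tree index (std::map is used only
// for illustration; it is not how TokuDB stores data).
struct ToyIndex {
    std::map<int, std::string> rows;
    std::size_t searches = 0;  // number of tree searches performed so far

    // One tree search: returns up to max_rows rows whose key is greater
    // than *after_key, or rows from the start of the index if after_key
    // is nullptr.
    std::vector<Row> search_after(const int* after_key, std::size_t max_rows) {
        ++searches;
        std::vector<Row> out;
        auto it = after_key ? rows.upper_bound(*after_key) : rows.begin();
        for (; it != rows.end() && out.size() < max_rows; ++it) {
            out.push_back(*it);
        }
        return out;
    }
};

// Build an index with keys 0 .. n-1.
ToyIndex make_index(int n) {
    ToyIndex index;
    for (int k = 0; k < n; ++k) {
        index.rows[k] = "row";
    }
    return index;
}

// Scan the whole index, asking for batch_size rows per search.
// batch_size == 1 models a scan without bulk fetch.
// Returns the number of rows read.
std::size_t scan(ToyIndex& index, std::size_t batch_size) {
    std::size_t rows_read = 0;
    bool have_last = false;
    int last_key = 0;
    for (;;) {
        std::vector<Row> batch =
            index.search_after(have_last ? &last_key : nullptr, batch_size);
        if (batch.empty()) {
            break;  // the final search finds nothing and ends the scan
        }
        rows_read += batch.size();
        last_key = batch.back().first;
        have_last = true;
    }
    return rows_read;
}
```

In this model, scanning 1000 rows one row at a time performs 1001 searches, while scanning them 100 rows at a time performs 11. The ratio of search counts is a property of the toy model only and says nothing about the wall-clock speedup of a real scan.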
The MySQL query executor uses one of several table scanning algorithms to read rows from a table. An index scan retrieves all of the rows from an index. A range scan retrieves all of the rows from an index within a key range. TokuDB can use its bulk fetch algorithm for both types of scan. However, the MySQL query executor must notify TokuDB that an index or range scan is about to happen. The prepare_index_scan handler method informs the storage engine that an index scan will happen. Similarly, the prepare_range_scan handler method informs the storage engine that a range scan will happen. We placed calls to these methods in the appropriate places in the MySQL query executor code for the following statements:
- Simple select
select ... from s where ...
- Partition bug documented in #261
- Create select
- Insert select
insert into t select ... from s where ...
- Fixed in #263
- Range delete
delete from t where ...
- Documented in #264
- Range update
update t set ... where ...
- Documented in #265
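The notification protocol between the executor and the storage engine can be sketched roughly as follows. This is a simplified model under stated assumptions, not the MySQL handler interface: `SketchHandler`, `executor_index_scan`, and `executor_range_scan` are hypothetical, keys are plain integers, and the method signatures here are invented for illustration. Only the method names prepare_index_scan, prepare_range_scan, and index_next come from the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a storage engine handler.
// The real handler methods take different arguments.
class SketchHandler {
public:
    explicit SketchHandler(std::vector<int> sorted_keys)
        : keys_(std::move(sorted_keys)) {}

    // Hint from the executor: a full index scan is about to start.
    // In this sketch the hint also records the scan bounds.
    void prepare_index_scan() {
        bulk_fetch_ = true;
        pos_ = 0;
        end_ = keys_.size();
    }

    // Hint from the executor: a scan over the inclusive key range
    // [lo, hi] is about to start.
    void prepare_range_scan(int lo, int hi) {
        bulk_fetch_ = true;
        pos_ = static_cast<std::size_t>(
            std::lower_bound(keys_.begin(), keys_.end(), lo) - keys_.begin());
        end_ = static_cast<std::size_t>(
            std::upper_bound(keys_.begin(), keys_.end(), hi) - keys_.begin());
    }

    // Return the next key in the prepared scan, or false when exhausted.
    bool index_next(int& key_out) {
        if (pos_ >= end_) {
            return false;
        }
        key_out = keys_[pos_++];
        return true;
    }

    bool bulk_fetch_enabled() const { return bulk_fetch_; }

private:
    std::vector<int> keys_;
    std::size_t pos_ = 0;
    std::size_t end_ = 0;
    bool bulk_fetch_ = false;
};

// Executor side: announce the scan first, then read rows.
std::size_t executor_index_scan(SketchHandler& h) {
    h.prepare_index_scan();
    std::size_t n = 0;
    int key = 0;
    while (h.index_next(key)) {
        ++n;
    }
    return n;
}

std::size_t executor_range_scan(SketchHandler& h, int lo, int hi) {
    h.prepare_range_scan(lo, hi);
    std::size_t n = 0;
    int key = 0;
    while (h.index_next(key)) {
        ++n;
    }
    return n;
}
```

The point of the sketch is the ordering: the engine learns about the scan before the first row is requested, so it can choose its fetch strategy up front.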
Bulk fetch should be used for index and range scans on simple tables as well as partitioned tables.
We recently found that bulk fetch is NOT being used for index scans on partitioned TokuDB tables. The bug is described here. Index scans on partitions do not use bulk fetch because the tokudb::prepare_index_scan method is not currently called by the partition storage engine. We will either (1) call it from the partition storage engine, or (2) have TokuDB detect that an index scan is happening and then start bulk fetch.
Why not start the bulk fetch algorithm when tokudb::index_next is called and bulk fetch has not yet started?
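The lazy-start idea raised by this question can be sketched as below. Everything here is hypothetical and greatly simplified: `LazyStartHandler` models an engine that may start bulk fetch on the first index_next call, and `NonForwardingWrapper` models a layer that, like the partition storage engine described above, never calls prepare_index_scan on the engine underneath it. The sketch does not settle whether lazy start is appropriate for every caller of index_next; it only shows the mechanism.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical engine model. With lazy_start enabled, the first
// index_next call starts bulk fetch if no prepare call was received.
class LazyStartHandler {
public:
    explicit LazyStartHandler(bool lazy_start) : lazy_start_(lazy_start) {}

    // Explicit notification from the caller.
    void prepare_index_scan() { bulk_fetch_started_ = true; }

    // Read one row. Counts whether the row was served with or
    // without bulk fetch.
    void index_next() {
        if (lazy_start_ && !bulk_fetch_started_) {
            bulk_fetch_started_ = true;  // start bulk fetch on demand
        }
        if (bulk_fetch_started_) {
            ++bulk_rows_;
        } else {
            ++single_rows_;
        }
    }

    bool bulk_fetch_started() const { return bulk_fetch_started_; }
    std::size_t bulk_rows() const { return bulk_rows_; }
    std::size_t single_rows() const { return single_rows_; }

private:
    bool lazy_start_;
    bool bulk_fetch_started_ = false;
    std::size_t bulk_rows_ = 0;
    std::size_t single_rows_ = 0;
};

// Hypothetical wrapper that scans the inner handler without ever
// calling prepare_index_scan on it.
class NonForwardingWrapper {
public:
    explicit NonForwardingWrapper(LazyStartHandler& inner) : inner_(inner) {}

    void scan(std::size_t row_count) {
        for (std::size_t i = 0; i < row_count; ++i) {
            inner_.index_next();
        }
    }

private:
    LazyStartHandler& inner_;
};
```

Without lazy start, a scan through the wrapper reads every row without bulk fetch. With lazy start, the same scan uses bulk fetch from the first row, with no change to the wrapper.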