According to Wikipedia, "a Fractal Tree index is a tree data structure that keeps data sorted and allows searches and sequential access in the same time as a B-tree but with insertions and deletions that are asymptotically faster than a B-tree."
It is important to mention that Fractal Tree indexes are to some extent based on B-Tree, but with key improvements to enhance performance. So how do Fractal Tree indexes achieve this better performance than B-Tree indexes? The answer lies in I/O. It is well known that for traditional storage engines using B-Tree indexes, their performance is great when their working data set fits in memory. In fact, TokuDB's fractal trees do not have any inherent advantage over InnoDB or other B-Tree based storage engines if data fits in memory.
With B-Tree indexing storage engines like InnoDB, the bottleneck hits when the data is bigger than can fit in main memory. However with TokuDB, performance remains consistence even when the data does not fit main memory anymore. This is basically where Fractal Tree indexing gives TokuDB a performance advantage over InnoDB and other engines using B-Tree indexing. According to the Percona documentation [1], "The innovation of Fractal Tree indexes is that as your working set grows larger than main memory, write performance stays consistent. With each I/O servicing many more writes, Fractal Tree indexes reduce the amount of I/O done by a LARGE factor, thereby eliminating the I/O bottleneck that B-Trees have". Because of this I/O reduction, Fractal Tree indexes don’t require indexes to fit in memory, and TokuDB is able to achieve such high sustained write performance on data that does not fit in memory. This innovation is why Fractal Tree indexes perform so well on write-heavy benchmarks [1].
It is important to note that TokuDB will not take some well running existing system and enhance its performance. TokuDB will ensure that systems will continue to perform well as data grows and eventually no-longer fits into memory.
Below are more results of my findings:
Each Fractal Tree Index is stored in it's own file. The good side about about this is that space allocated to an index is reclaimed back to the filesystem almost instantly when an index is dropped. The bad side about it is that you end up with too many created files in your datadir as demonstrated below. The datadir looks overwhelming compared to what we are used to.
george.chilumbu@db-kdb-slave-5:~$ sudo ls -l /var/lib/mysql/ total 262400 -rw-rw---- 1 mysql mysql 56 Feb 3 13:14 auto.cnf -rw-r----- 1 mysql root 38995 Feb 5 11:38 db-kdb-slave-5.err -rw-rw---- 1 mysql mysql 6 Feb 5 11:38 db-kdb-slave-5.pid -rw-rw---- 1 mysql mysql 2224 Feb 5 11:38 db-kdb-slave-5-slow.log -rw-r--r-- 1 root root 0 Feb 3 13:13 debian-5.6.flag -rw-rw---- 1 mysql mysql 67108864 Feb 5 11:38 ibdata1 -rw-rw---- 1 mysql mysql 67108864 Feb 5 11:38 ibdata2 -rw-rw---- 1 mysql mysql 67108864 Feb 5 11:38 ib_logfile0 -rw-rw---- 1 mysql mysql 67108864 Feb 5 11:38 ib_logfile1 -rw------- 1 mysql mysql 71688 Feb 5 13:02 log000000000011.tokulog29 drwx------ 2 mysql mysql 4096 Feb 3 13:14 mysql -rw-rw---- 1 mysql mysql 174 Feb 5 10:39 mysqld-bin.000001 -rw-rw---- 1 mysql mysql 174 Feb 5 10:46 mysqld-bin.000002 -rw-rw---- 1 mysql mysql 174 Feb 5 11:36 mysqld-bin.000003 -rw-rw---- 1 mysql mysql 1435 Feb 5 13:01 mysqld-bin.000004 -rw-rw---- 1 mysql mysql 80 Feb 5 11:38 mysqld-bin.index -rw------- 1 mysql mysql 11 Feb 3 13:14 mysql_upgrade_info drwx------ 2 mysql mysql 4096 Feb 3 13:14 performance_schema drwx------ 2 mysql mysql 4096 Feb 5 13:01 test -rw-rw---- 1 mysql mysql 32768 Feb 5 13:02 tokudb.directory -rw-rw---- 1 mysql mysql 16384 Feb 3 13:37 tokudb.environment -rw------- 1 mysql mysql 0 Feb 3 13:37 __tokudb_lock_dont_delete_me_data -rw------- 1 mysql mysql 0 Feb 3 13:37 __tokudb_lock_dont_delete_me_environment -rw------- 1 mysql mysql 0 Feb 3 13:37 __tokudb_lock_dont_delete_me_logs -rw------- 1 mysql mysql 0 Feb 3 13:37 __tokudb_lock_dont_delete_me_recovery -rw------- 1 mysql mysql 0 Feb 3 13:37 __tokudb_lock_dont_delete_me_temp -rw-rw---- 1 mysql mysql 32768 Feb 5 13:02 tokudb.rollback george.chilumbu@db-kdb-slave-5:~$
To avoid all these extra files created by TOkuDB and fractal Tree indexes, i have created a second datadir just for files as a result of Fractal indexes. So in the test database, i create a table called tb1 with indexes as demonstrated below and try to see what files will be created by the Fractal Tree indexes.
CREATE TABLE tb1( id NOT NULL AUTO_INCREMENT, name VARCHAR(20) NOT NULL, email VARCHAR(20) NOT NULL, PRIMARY KEY(id), KEY name(name), CLUSTERING KEY email(email) ) ENGINE=TokuDB CHARSET=utf8mb4;
And the new toku_datadir looks like this:
george.chilumbu@db-kdb-slave-5:~$ ls /var/lib/tokudb/ total 136 drwxr-xr-x 2 mysql 4096 Feb 5 13:01 ./ drwxr-xr-x 54 root 4096 Feb 4 19:48 ../ -rw-rw---- 1 mysql 16896 Feb 5 13:01 _test_tb1_key_email_22_3_1d_B_0.tokudb -rw-rw---- 1 mysql 16896 Feb 5 13:01 _test_tb1_key_name_1e_3_1d_B_0.tokudb -rw-rw---- 1 mysql 32768 Feb 5 13:00 _test_tb1_main_12_2_1d.tokudb -rw-rw---- 1 mysql 65536 Feb 5 13:01 _test_tb1_status_12_1_1d.tokudb -rw------- 1 mysql 0 Feb 4 20:02 __tokudb_lock_dont_delete_me_data -rw------- 1 mysql 0 Feb 4 20:02 __tokudb_lock_dont_delete_me_temp george.chilumbu@db-kdb-slave-5:~$
And when you delete the table, the toku_datadir
george.chilumbu@db-kdb-slave-5:~$ ls /var/lib/tokudb/ total 8 drwxr-xr-x 2 mysql 4096 Feb 5 13:40 ./ drwxr-xr-x 54 root 4096 Feb 4 19:48 ../ -rw------- 1 mysql 0 Feb 4 20:02 __tokudb_lock_dont_delete_me_data -rw------- 1 mysql 0 Feb 4 20:02 __tokudb_lock_dont_delete_me_temp
comment:5Changed 8 months ago by george.chilumbu
One of the keys to exploiting TokuDB’s strength in indexing is to make use of clustering secondary indexes. Clustering index store data by key and include the entire document as the payload (rather than pointer to the document). Percona Server parser and query optimizer support Multiple Clustering Keys when TokuDB engine is used. This means that the query optimizer will avoid primary clustered index reads and replace them by secondary clustered index reads in certain scenarios.
To create a primary key on a TOkuDB table, you can do
Since the WHERE clause is on the clustering index, and includes the column 'name', the query will perform super fast as it avoids lookups in the primary table to satisfy the query. TokuDB makes clustering indexes feasible because of its excellent compression and very high indexing rates. Some benchmarks to see if clustering indexes + high compression would increase server performance as far as reads are concerned would be something to consider.
We already know that Fractal Tree are all about start fast, stay fast as opposed to B-Tree which start fast, and slow down as data grows, especially if it grows more than the size of available memory. We also know that Fractal-Tree reduce IO significantly.
To create a primary key on a TOkuDB table, you can do
mysql> CREATE TABLE tb1 ( -> id INT NOT NULL auto_increment, -> name VARCHAR(20) NOT NULL, -> email VARCHAR(64) NOT NULL, -> PRIMARY KEY(id), -> CLUSTERING KEY (email)) ENGINE = TokuDB; Query OK, 0 rows affected (0.02 sec) mysql> show create table tb1\G *************************** 1. row *************************** Table: tb1 Create Table: CREATE TABLE `tb1` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(20) NOT NULL, `email` varchar(64) NOT NULL, PRIMARY KEY (`id`), CLUSTERING KEY `email` (`email`) ) ENGINE=TokuDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec)The primary table is indexed on column id. But as we see, there is a secondary clustering index on column email. Unlike non-clustered indexes, clustering indexes include all the columns of a table and can be used as covering indexes. So the query below would run much faster on a clustering index on column email:
SELECT name FROM tb1 WHERE email IN ('johndoe@gmail.com','anndoe@yahoo.com','pgriffin@hotmail.com');
Since the WHERE clause is on the clustering index, and includes the column 'name', the query will perform super fast as it avoids lookups in the primary table to satisfy the query. TokuDB makes clustering indexes feasible because of its excellent compression and very high indexing rates. Some benchmarks to see if clustering indexes + high compression would increase server performance as far as reads are concerned would be something to consider.
We already know that Fractal Tree are all about start fast, stay fast as opposed to B-Tree which start fast, and slow down as data grows, especially if it grows more than the size of available memory. We also know that Fractal-Tree reduce IO significantly.
comment:6Changed 8 months ago by george.chilumbu
One cool feature with TOkuDB is the ability to add indexes to an existing table without locking. This feature is referred to as Hot Index creation. With Hot Index creation, inserts and queries would continue to work on that table as normal while the index is being created. But for Hot Index creation to work, the tokudb_create_index_online variables has to be enabled in the my.cnf file, or temporarily using the following command:
How fast it takes adding an index using the Hot creating an index depends on how busy the mysqld server is with other tasks. You can use the SHOW PROCESSLIST command to see the progress of the hot index creation.
If more than one hot CREATE INDEX is issued for a particular table, the indexes will be created serially. An index creation that is waiting for another to complete will be shown as Locked in SHOW PROCESSLIST. It is recommended that each CREATE INDEX be allowed to complete before the next one is started [1].
CONCLUSION:
using pt-online-schema-change also does not block the table. The above was just to highlight that an internal feature exists within TokuDB that allows adding an index without blocking the table. But with InnoDB tables, i always use pt-online-schema-change and it works great for altering tables, especially large tables.
I noticed that TokuDB does not pre-allocate memory to buffer pools as InnoDB does. The reason is that InnoDB nodes/pages are fixed size and map directly to disk block of the same size within the InnoDB data files. An InnoDB node/page never moves within the data file and can be very easily calculated exactly where it will reside within the file.
TokuDB is quite different in that its nodes can vary in their physical size and may even (temporarily) exceed the defined node size for the indices. The TokuDB data files operate much like dynamic memory complete with dead/unused space within and nodes have no permanent location within the file. The TokuDB cache is allocated as needed as TokuDB nodes and node partitions (message buffers, basement nodes) are loaded into memory and evicted.
REFERENCES:
[1] https://www.percona.com/doc/percona-server/5.6/tokudb/using_tokudb.html#compression-details
mysql> SET tokudb_create_index_online=on;Hot index creation is invoked using the CREATE INDEX command after setting tokudb_create_index_online as demonstrated below:
mysql> show create table tb1\G *************************** 1. row *************************** Table: tb1 Create Table: CREATE TABLE `tb1` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(20) NOT NULL, `email` varchar(64) NOT NULL, PRIMARY KEY (`id`), CLUSTERING KEY `email` (`email`) ) ENGINE=TokuDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec) mysql> CREATE INDEX name ON tb1 (name); Query OK, 0 rows affected (0.63 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> show create table tb1\G *************************** 1. row *************************** Table: tb1 Create Table: CREATE TABLE `tb1` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(20) NOT NULL, `email` varchar(64) NOT NULL, PRIMARY KEY (`id`), CLUSTERING KEY `email` (`email`), KEY `name` (`name`) ) ENGINE=TokuDB DEFAULT CHARSET=utf8 1 row in set (0.00 sec)Using the alter table statement puts a lock on the table while creating an index, and would make a table unavailable for a long period of time if the table has a large data set.
How fast it takes adding an index using the Hot creating an index depends on how busy the mysqld server is with other tasks. You can use the SHOW PROCESSLIST command to see the progress of the hot index creation.
If more than one hot CREATE INDEX is issued for a particular table, the indexes will be created serially. An index creation that is waiting for another to complete will be shown as Locked in SHOW PROCESSLIST. It is recommended that each CREATE INDEX be allowed to complete before the next one is started [1].
CONCLUSION:
using pt-online-schema-change also does not block the table. The above was just to highlight that an internal feature exists within TokuDB that allows adding an index without blocking the table. But with InnoDB tables, i always use pt-online-schema-change and it works great for altering tables, especially large tables.
I noticed that TokuDB does not pre-allocate memory to buffer pools as InnoDB does. The reason is that InnoDB nodes/pages are fixed size and map directly to disk block of the same size within the InnoDB data files. An InnoDB node/page never moves within the data file and can be very easily calculated exactly where it will reside within the file.
TokuDB is quite different in that its nodes can vary in their physical size and may even (temporarily) exceed the defined node size for the indices. The TokuDB data files operate much like dynamic memory complete with dead/unused space within and nodes have no permanent location within the file. The TokuDB cache is allocated as needed as TokuDB nodes and node partitions (message buffers, basement nodes) are loaded into memory and evicted.
REFERENCES:
[1] https://www.percona.com/doc/percona-server/5.6/tokudb/using_tokudb.html#compression-details
[2] https://www.percona.com/blog/2013/07/02/tokumx-fractal-treer-indexes-what-are-they/
No comments:
Post a Comment