DictionaryPartitionLookup
 | Will be released soon!
Currently this partition lookup is only available in the development version. It will be released with version 0.3 of HSCALE. |
The dictionary partition lookup typically is the one to be used in real world scenarios. It combines freely configurable partitions with hashing functions to reduce the number of configured partitions.
This lookup module comes with per table configuration stored in a database and administrative commands to set up partitioned tables and add partitions at runtime.
Hashing functions
Instead of adding a partition for every possible key, you can use hashing functions to group keys together.
| Hashing function |
NONE() |
| Description |
Simply do nothing. |
| Hashing function |
MOD(x) |
| Description |
Perform a modulus x on the given key. With this function you get a fixed number of partitions (x). Only works on numeric keys. |
| Hashing function |
DIV(x) |
| Description |
Divide the key by x. Only works on numeric keys. |
| Hashing function |
PREFIX(x) |
| Description |
Use the first x characters of the key. Works on numeric and string types. |
Administrative SQL commands
Administrative SQL commands can be issued on every HSCALE proxy.
Table setup
To turn a table into a partitioned table you have to make sure that it exists and that it is only used through HSCALE proxies. Then execute the following SQL:
HSCALE SETUP_TABLE(table_name, partition_column, default_table, backend, hashing_function)
| Parameter |
Description |
| table_name |
The name of the table to be partitioned. It has to exist and must not be partitioned yet. |
| partition_column |
The name of the column that is used for partitioning. |
| default_table |
Name of the default table that is used for partition keys that are not assigned to a partition yet. The table table_name will be renamed to default_table. |
| backend |
The backend index the original table is on. |
| hashing_function |
One of the hashing functions (see above). |
Add partitions
Partitions can be added on the fly.
HSCALE ADD_PARTITION(table, partition_name, partition_key, backend);
| Parameter |
Description |
| table_name |
The name of the table. It has to be partitioned. |
| partition_name |
The name of the partition. Usually it is the same as partition_key but you can assign multiple partitions to the same table. This can be used to implement a fine grained partitioning scheme right from the start without using too many partition tables. The partitions can be moved to other tables and backends later on. |
| partition_key |
The value AFTER the hashing function has been applied. Example: Say you use MOD(3) as hashing function. Possible values for partition_key are 0, 1 and 2. Note: It is not checked whether partition_key is within the value range of the hashing function. |
| backend |
The backend index the the partition will be created on. |
This command creates the partition table (named table_partition_name) on the given backend.
Multiple instances of HSCALE and locking
Since you can have multiple instances HSCALE running in parallel on the same data, some sort of "locking" had to be implemented. So executing an administrative SQL command will lock the current HSCALE instance for a couple of seconds (reloadInterval + reloadForceWait, see configuration options). So it is best practice to use a dedicated instance of HSCALE for issuing administrative commands.
Example
# Create table on backend 2
CREATE TABLE user (
name VARCHAR(255) NOT NULL,
age INTEGER
);
# Set up the table to use the PREFIX hashing function
HSCALE SETUP_TABLE('user', 'name', 'user_default', 2, 'PREFIX(3)');
# Add some partitions
# The first two partitions go to the same table ('good_guys')
HSCALE ADD_PARTITION('user', 'good_guys', 'mar', 1);
HSCALE ADD_PARTITION('user', 'good_guys', 'pau', 1);
HSCALE ADD_PARTITION('user', 'dia', 'dia', 3);
# Add data
INSERT INTO user (name, age) VALUES ('mark', 23); # => user_good_guys on backend 1
INSERT INTO user (name, age) VALUES ('maria', 21); # => user_good_guys on backend 1
INSERT INTO user (name, age) VALUES ('diana', 32); # => user_dia on backend 3
INSERT INTO user (name, age) VALUES ('george', 42); # => user_default on backend 2
Auto create partitions
Enabling this feature will let HSCALE automatically create a new partition for every new partition key. The partition will be placed on the backend with the least number of partitions.
Locking and "nowait"
Adding a partition needs to lock the current HSCALE instance (see above). So enabling "auto create partitions" will cause the current HSCALE instance to block every time a new partition is added. This is the default behavior and is completely safe.
If you do not use full partition scans (i.e. queries scanning all partitions like SELECT * FROM table) or can afford to read dirty data for a couple of seconds (reloadInterval + reloadForceWait, see configuration options) you can enable the "nowait" mode. This will create the partition and does not wait until all other instances of HSCALE have refreshed their configuration.
If you only use a single instance of HSCALE then you can always use "nowait".
Configuration options
| Option: |
partitionLookupModule |
| Value: |
optivo.hscale.dictionaryPartitionLookup |
| Required: |
|
| Option: |
configServer |
| Value: |
Connection to the MySQL server holding the HSCALE configuration. Format:
configServer = {
host = "127.0.0.1",
port = "3306",
user = "root",
password = "pass",
db = "hscale_config",
testSql = "A SELECT statement that is used 'ping' the server. Default: SELECT 1"
initSql = {
"SET NAMES utf8",
"another SQL statement"
}
}
initSql is a list of SQL statements that will be executed every time the connection is established.
The user has to have at least CREATE, SELECT and INSERT privileges. |
| Required: |
|
| Option: |
backendAuth |
| Value: |
Credentials that are used to connect to every backend. The user has to have at least CREATE and INSERT privileges. Format:
backendAuth = {
user = "root",
password = "pass",
initSql = {
"SET NAMES utf8",
"another SQL statement"
}
}
initSql is a list of SQL statements that will be executed every time the connection is established. |
| Required: |
|
| Option: |
dictionaryTablePrefix |
| Value: |
Prefix of the internal configuration tables. |
| Default: |
"hscale_dict_" |
| Required: |
|
| Option: |
reloadInterval |
| Value: |
Number of seconds the configuration is cached. If you use a single instance of HSCALE you can set it to high value (600 or more). Configuration reload causes the whole HSCALE instance to block. |
| Default: |
10 seconds |
| Required: |
|
| Option: |
reloadForceWait |
| Value: |
The number of seconds that is waited additionally to reloadInterval whenever the partition configuration is changed. Set to 0 if you are running a single instance of HSCALE. |
| Default: |
2 seconds |
| Required: |
|
| Option: |
autoCreatePartitions |
| Value: |
"on" enables automatic creation of new partitions, "nowait" enables it too but skips locking, nil turns it off. |
| Default: |
nil |
| Required: |
|
| Option: |
forceCreateConfigTables |
| Value: |
true or false/nil. Should HSCALE try to create the internal configuration tables every time the configuration is reloaded? This option is mostly used in test environments. Don't configure it. |
| Default: |
nil |
| Required: |
|