clickhouse MaterializeMySQL 同步 mysql 数据

张映 发表于 2021-08-27

分类目录: clickhouse

标签:, ,

clickhouse不支持自增ID,primary key可以重复。这对于习惯了mysql的人来说,肯定很不爽。如果写入到mysql,mysql同步数据到clickhouse,然后从clickhouse来读,这样就很完美了。

1,MaterializeMySQL存储引擎,从20.8.*才开始支持该引擎

testjian :] select version();

SELECT version()

┌─version()─┐
│ 20.8.3.18 │
└───────────┘

2,配置mysql,支持同步

# vim /etc/my.cnf  //改完重启

server-id=400101
log-bin=mysql-bin
gtid_mode = on
enforce_gtid_consistency = on
binlog_format = ROW

gtid_mode = on
enforce_gtid_consistency = on
binlog_format = ROW

mysql> show global variables like '%gtid%';  //查看配置
select @@binlog_format;   //查看配置

3,配置clickhouse allow_experimental_database_materialize_mysql

testtiflash :] set allow_experimental_database_materialize_mysql = 1;

//查看
testtiflash :] select value from system.settings where name = 'allow_experimental_database_materialize_mysql';

SELECT value
FROM system.settings
WHERE name = 'allow_experimental_database_materialize_mysql'

┌─value─┐
│ 1 │
└───────┘

1 rows in set. Elapsed: 0.001 sec.

注意,select @@allow_experimental_database_materialize_mysql;查看一直都是0,如果不改会报以下错误

Code: 336, e.displayText() = DB::Exception: MaterializeMySQL is an experimental database engine. Enable allow_experimental_database_materialize_mysql to use it. (version 20.8.3.18)

4,mysql准备测试数据

create database test;

CREATE TABLE `tank` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `name` varchar(30) NULL COMMENT '姓名',
 `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin AUTO_INCREMENT=1 COMMENT='测试';

insert into test.tank (name) values('tank');
insert into test.tank (name) values('tank1');
insert into test.tank (name) values('tank2');
insert into test.tank (name) values('tank3');

5,clickhouse中创建同步表

testtiflash :] CREATE DATABASE test ON CLUSTER clickhouse_test_netjoy ENGINE = MaterializeMySQL('10.0.40.200:3306', 'test', 'usename', 'password');

CREATE DATABASE test ON CLUSTER clickhouse_test_netjoy
ENGINE = MaterializeMySQL('10.0.40.200:3306', 'test', 'usename', 'password')

┌─host───────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ 10.0.55.17 │ 9000 │      0 │       │                   3 │                1 │
│ 10.0.10.23 │ 9000 │      0 │       │                   2 │                1 │
│ 10.0.10.24 │ 9000 │      0 │       │                   1 │                1 │
└────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
┌─host───────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ 10.0.55.16 │ 9000 │      0 │       │                   0 │                0 │
└────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘

4 rows in set. Elapsed: 0.466 sec.

MaterializeMySQL语法:

CREATE DATABASE [IF NOT EXISTS] db_name [ON CLUSTER cluster]
ENGINE = MaterializeMySQL('host:port', ['database' | database], 'user', 'password')
SETTINGS
include_tables ='a,b,c...',
exclude_tables ='e,f,g...',

结果:增,删,改都能同步,增加表,都能同步。将test同步到4个节点,每个节点数据都一样,这样就没什么意思了。如果用MaterializeMySQL引擎,就不能发挥clickhouse集群优势。MaterializeMySQL新的特性,稳定性有待考查。

 

 



转载请注明
作者:海底苍鹰
地址:http://blog.51yip.com/clickhouse/2529.html