背景
关于 clickhouse 是什么请自行查阅官网:
https://clickhouse.yandex/
clickhouse 官方性能测试:
https://clickhouse.yandex/benchmark.html
单机部署(以前的文章也有写过单机部署)
参考:
https://clickhouse.yandex/docs/en/getting_started/ https://github.com/Altinity/clickhouse-rpm-install
测试环境:
操作系统:centos 7.1
安装服务:clickhouse,mysql
安装 mysql 是测试 clickhouse 从 mysql 同步数据。
clickhouse 安装:
[root@ck-server-01 sync]# clickhouse-client -h 127.0.0.1ClickHouse client version 19.9.2.4.Connecting to 127.0.0.1:9000 as user default.Connected to ClickHouse server version 19.9.2 revision 54421.ck-server-01 :) show databases;SHOW DATABASES┌─name────┐│ default ││ system │└─────────┘2 rows in set. Elapsed: 0.003 sec.
补充一点,在官方的文档里面有几点建议:
关闭大页
调整内存使用
关闭 cpu 节能模式
echo 'performance' | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governorecho 0 > /proc/sys/vm/overcommit_memoryecho 'never' > /sys/kernel/mm/transparent_hugepage/enabled
mysql 部署请自行部署。
安装同步程序依赖的包
yum -y install pypy-libs pypy pypy-develwget https://bootstrap.pypa.io/get-pip.pypypy get-pip.py/usr/lib64/pypy-5.0.1/bin/pip install MySQL-python/usr/lib64/pypy-5.0.1/bin/pip install mysql-replication/usr/lib64/pypy-5.0.1/bin/pip install clickhouse-driver/usr/lib64/pypy-5.0.1/bin/pip install redis
proxysql 安装及配置
mysql -uadmin -padmin -h127.0.0.1 -P6032INSERT INTO clickhouse_users VALUES ('clicku','clickp',1,100);LOAD CLICKHOUSE USERS TO RUNTIME;SAVE CLICKHOUSE USERS TO DISK;
[root@ck-server-01 sync]# mysql -u clicku -pclickp -h 127.0.0.1 -P6090mysql: [Warning] Using a password on the command line interface can be insecure.Welcome to the MySQL monitor. Commands end with ; or \g.Your MySQL connection id is 28356Server version: 5.5.30 (ProxySQL ClickHouse Module)Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.Oracle is a registered trademark of Oracle Corporation and/or itsaffiliates. Other names may be trademarks of their respectiveowners.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.mysql> show databases;+---------+| name |+---------+| default || system |+---------+
mysql 同步数据到 clickhouse
mysql> use yayun;Database changedmysql> show create table tb1\G*************************** 1. row ***************************Table: tb1Create Table: CREATE TABLE `tb1` (`id` int(10) unsigned NOT NULL AUTO_INCREMENT,`pay_money` decimal(20,2) NOT NULL DEFAULT '0.00',`pay_day` date NOT NULL,`pay_time` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',PRIMARY KEY (`id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb41 row in set (0.00 sec)1. clickhouse 里面建库,建表。
ck-server-01 :) create database yayun;CREATE DATABASE yayunOk.0 rows in set. Elapsed: 0.021 sec.
GRANT REPLICATION SLAVE, REPLICATION CLIENT, SELECT ON *.* TO 'ch_repl'@'127.0.0.1' identified by '123';
ck-server-01 :) use yayun;USE yayunOk.0 rows in set. Elapsed: 0.001 sec.ck-server-01 :) CREATE TABLE tb1:-] ENGINE = MergeTree:-] PARTITION BY toYYYYMM(pay_time):-] ORDER BY (pay_time) AS:-] SELECT *:-] FROM mysql('127.0.0.1:3306', 'yayun', 'tb1', 'ch_repl', '123') ;CREATE TABLE tb1ENGINE = MergeTreePARTITION BY toYYYYMM(pay_time)ORDER BY pay_time ASSELECT *FROM mysql('127.0.0.1:3306', 'yayun', 'tb1', 'ch_repl', '123')Ok.0 rows in set. Elapsed: 0.031 sec.
ck-server-01 :) show create table tb1;SHOW CREATE TABLE tb1┌─statement────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐│ CREATE TABLE yayun.tb1 (`id` UInt32, `pay_money` String, `pay_day` Date, `pay_time` DateTime) ENGINE = MergeTree PARTITION BY toYYYYMM(pay_time) ORDER BY pay_time SETTINGS index_granularity = 8192 │└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘1 rows in set. Elapsed: 0.002 sec.
[root@ck-server-01 sync]# cat metainfo.conf# 从这里同步数据[master_server]host='127.0.0.1'port=3306user='ch_repl'passwd='123'server_id=101# redis配置信息,用于存放pos点[redis_server]host='127.0.0.1'port=6379passwd='12345'log_pos_prefix='log_pos_'#把log_position记录到文件[log_position]file='./repl_pos.log'# ch server信息,数据同步以后写入这里[clickhouse_server]host=127.0.0.1port=9000passwd=''user='default'#字段大小写. 1是大写,0是小写column_lower_upper=0# 需要同步的数据库[only_schemas]schemas='yayun'# 需要同步的表[only_tables]tables='tb1'# 指定库表跳过DML语句(update,delete可选)[skip_dmls_sing]skip_delete_tb_name = ''skip_update_tb_name = ''#跳过所有表的DML语句(update,delete可选)[skip_dmls_all]#skip_type = 'delete'#skip_type = 'delete,update'skip_type = ''[bulk_insert_nums]#多少记录提交一次insert_nums=10#选择每隔多少秒同步一次,负数表示不启用,单位秒interval=60# 同步失败告警收件人[failure_alarm]mail_host= 'xxx'mail_port= 25mail_user= 'xxx'mail_pass= 'xxx'mail_send_from = 'xxx'alarm_mail = 'xxx'#日志存放路径[repl_log]log_dir="/tmp/relication_mysql_clickhouse.log"
设置 pos 点:
mysql> show master status;+------------------+----------+--------------+------------------+-------------------+| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |+------------------+----------+--------------+------------------+-------------------+| mysql-bin.000069 | 4024404 | | | |+------------------+----------+--------------+------------------+-------------------+1 row in set (0.00 sec)
[root@ck-server-01 sync]# cat repl_pos.log[log_position]filename = mysql-bin.000069position = 4024404
启动同步程序:
[root@ck-server-01 sync]# pypy mysql-clickhouse-replication.py --helpusage: Data Replication to clikhouse [-h] [-c CONF] [-d] [-l]mysql data is copied to clikhouseoptional arguments:-h, --help show this help message and exit-c CONF, --conf CONF Data synchronization information file-d, --debug Display SQL information-l, --logtoredis log position to redis ,default fileBy dengyayun @2019[root@ck-server-01 sync]#默认 pos 点就是记录文件,无需再指定记录 binlog pos 方式
[root@ck-server-01 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug11:59:54 INFO 开始同步数据时间 2019-07-17 11:59:5411:59:54 INFO 从服务器 127.0.0.1:3306 同步数据11:59:54 INFO 读取binlog: mysql-bin.000069:402440411:59:54 INFO 同步到clickhouse server 127.0.0.1:900011:59:54 INFO 同步到clickhouse的数据库: ['yayun']11:59:54 INFO 同步到clickhouse的表: ['tb1']
mysql 插入 10 条数据:
mysql> insert into tb1 (pay_money,pay_day,pay_time)values('66.22','2019-06-29','2019-06-29 14:00:00'),('66.22','2019-06-29','2019-06-29 14:00:00'),('66.22','2019-06-29','2019-06-29 14:00:00'),('66.22','2019-06-29','2019-06-29 14:00:00'),('66.22','2019-06-29','2019-06-29 14:00:00'),('66.22','2019-06-29','2019-06-29 14:00:00'),('66.22','2019-06-29','2019-06-29 14:00:00'),('66.22','2019-06-29','2019-06-29 14:00:00'),('66.22','2019-06-29','2019-06-29 14:00:00'),('66.22','2019-06-29','2019-06-29 14:00:00') ;Query OK, 10 rows affected (0.01 sec)Records: 10 Duplicates: 0 Warnings: 0mysql> select * from tb1;+----+-----------+------------+---------------------+| id | pay_money | pay_day | pay_time |+----+-----------+------------+---------------------+| 1 | 66.22 | 2019-06-29 | 2019-06-29 14:00:00 || 3 | 66.22 | 2019-06-29 | 2019-06-29 14:00:00 || 5 | 66.22 | 2019-06-29 | 2019-06-29 14:00:00 || 7 | 66.22 | 2019-06-29 | 2019-06-29 14:00:00 || 9 | 66.22 | 2019-06-29 | 2019-06-29 14:00:00 || 11 | 66.22 | 2019-06-29 | 2019-06-29 14:00:00 || 13 | 66.22 | 2019-06-29 | 2019-06-29 14:00:00 || 15 | 66.22 | 2019-06-29 | 2019-06-29 14:00:00 || 17 | 66.22 | 2019-06-29 | 2019-06-29 14:00:00 || 19 | 66.22 | 2019-06-29 | 2019-06-29 14:00:00 |
同步程序日志输出:
[root@ck-server-01 sync]# pypy mysql-clickhouse-replication.py --conf metainfo.conf --debug12:12:09 INFO 开始同步数据时间 2019-07-17 12:12:0912:12:09 INFO 从服务器 127.0.0.1:3306 同步数据12:12:09 INFO 读取binlog: mysql-bin.000069:402440412:12:09 INFO 同步到clickhouse server 127.0.0.1:900012:12:09 INFO 同步到clickhouse的数据库: ['yayun']12:12:09 INFO 同步到clickhouse的表: ['tb1']12:12:09 INFO INSERT 数据插入SQL: INSERT INTO yayun.tb1 VALUES, [{u'id': 1, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 3, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 5, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 7, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 9, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 11, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 13, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 15, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 17, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}, {u'id': 19, u'pay_money': '66.22', u'pay_day': datetime.date(2019, 6, 29), u'pay_time': datetime.datetime(2019, 6, 29, 14, 0)}]
clickhouse 数据查询:
ck-server-01 :) select * from tb1;SELECT *FROM tb1┌─id─┬─pay_money─┬────pay_day─┬────────────pay_time─┐│ 1 │ 66.22 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 3 │ 66.22 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 5 │ 66.22 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 7 │ 66.22 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 9 │ 66.22 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 11 │ 66.22 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 13 │ 66.22 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 15 │ 66.22 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 17 │ 66.22 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 19 │ 66.22 │ 2019-06-29 │ 2019-06-29 14:00:00 │└────┴───────────┴────────────┴─────────────────────┘10 rows in set. Elapsed: 0.005 sec.
mysql 数据更新:
mysql> update tb1 set pay_money='88.88';Query OK, 10 rows affected (0.00 sec)Rows matched: 10 Changed: 10 Warnings: 0mysql> select * from tb1;+----+-----------+------------+---------------------+| id | pay_money | pay_day | pay_time |+----+-----------+------------+---------------------+| 1 | 88.88 | 2019-06-29 | 2019-06-29 14:00:00 || 3 | 88.88 | 2019-06-29 | 2019-06-29 14:00:00 || 5 | 88.88 | 2019-06-29 | 2019-06-29 14:00:00 || 7 | 88.88 | 2019-06-29 | 2019-06-29 14:00:00 || 9 | 88.88 | 2019-06-29 | 2019-06-29 14:00:00 || 11 | 88.88 | 2019-06-29 | 2019-06-29 14:00:00 || 13 | 88.88 | 2019-06-29 | 2019-06-29 14:00:00 || 15 | 88.88 | 2019-06-29 | 2019-06-29 14:00:00 || 17 | 88.88 | 2019-06-29 | 2019-06-29 14:00:00 || 19 | 88.88 | 2019-06-29 | 2019-06-29 14:00:00 |+----+-----------+------------+---------------------+10 rows in set (0.00 sec)
clickhouse 数据查询:
ck-server-01 :) select * from tb1;SELECT *FROM tb1┌─id─┬─pay_money─┬────pay_day─┬────────────pay_time─┐│ 1 │ 88.88 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 3 │ 88.88 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 5 │ 88.88 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 7 │ 88.88 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 9 │ 88.88 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 11 │ 88.88 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 13 │ 88.88 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 15 │ 88.88 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 17 │ 88.88 │ 2019-06-29 │ 2019-06-29 14:00:00 ││ 19 │ 88.88 │ 2019-06-29 │ 2019-06-29 14:00:00 │└────┴───────────┴────────────┴─────────────────────┘10 rows in set. Elapsed: 0.009 sec.
代码地址:
https://github.com/yymysql/mysql-clickhouse-replication
总结
参考资料
关于 ClickHouse 的技术内容,你们还有什么想知道的吗?赶紧留言告诉小编吧!