谈谈clickhouse优化

2024-10-14 运维 clickhouse 0 评论字数统计: 791(字) 阅读时长: 3(分)

ClickHouse版本

23.10.1.396

问题描述

情况如上图，项目运行一段时间，随着数据的累计，服务器内存占用率已经接近100%，通过top命令查看内存占用情况，发现clickhouse进程占用了80%的内存，导致服务器内存占用率接近100%

参考资料

[1] Clickhouse常见配置解析
[2] Clickhouse 参数优化配置

一轮优化

比较匆忙的看了资料，配置文件中有这样一个描述 ,于是将其改成了8G看看效果

<!-- Maximum memory usage (resident set size) for server process.
     Zero value or unset means default. Default is "max_server_memory_usage_to_ram_ratio" of available physical RAM.
     If the value is larger than "max_server_memory_usage_to_ram_ratio" of available physical RAM, it will be cut down.

     The constraint is checked on query execution time.
     If a query tries to allocate memory and the current memory usage plus allocation is greater
      than specified threshold, exception will be thrown.

     It is not practical to set this constraint to small values like just a few gigabytes,
      because memory allocator will keep this amount of memory in caches and the server will deny service of queries.
  -->
<max_server_memory_usage>8G</max_server_memory_usage>

二轮优化

在运行了大概2个月之后，情况没有改善，内存仍然继续增长，但是本次的提示如上图超过的数值跟我文件中配置的数值有偏差，当然中间也尝试过配置
使用类似 SET max_memory_usage = 3 * 1024 * 1024 * 1024; 这样的语句去动态配置，但是使用如下语句查询返回还是没变化

select  *  from `system`.settings  where name in(
        'max_memory_usage_for_user', 
        'max_memory_usage', 
        'max_bytes_before_external_group_by', 
        'max_bytes_before_external_sort', 
        'max_memory_usage_for_all_queries', 
        'max_concurrent_queries_for_user',
        'max_concurrent_queries_for_all_users',
        'max_concurrent_queries',
        'max_server_memory_usage',
        'max_server_memory_usage_to_ram_ratio',
        'max_thread_pool_size',
        'distributed_aggregation_memory_efficient'
);

后来增加了配置项

config.xml

max_server_memory_usage_to_ram_ratio

这里按照总内存的百分比计算

<clickhouse>
  <!-- On memory constrained environments you may have to set this to value larger than 1.-->
  <max_server_memory_usage_to_ram_ratio>0.2</max_server_memory_usage_to_ram_ratio>
</clickhouse>

users.xml

max_bytes_before_external_group_by

max_bytes_before_external_sort

这里设置分组查询和排序时内存的最大使用阈值1G，超过使用硬盘

<clickhouse>
  <profiles>
      <default>
        <max_bytes_before_external_group_by>1073741824</max_bytes_before_external_group_by>
        <max_bytes_before_external_sort>1073741824</max_bytes_before_external_sort>
      </default>
  <profiles>
</clickhouse>

2024-10-14 15:08:09 后面跟进看看优化效果

三轮优化

控制内存有点治标不治本，这次优化方面如下

控制集群节点的数据存储权重

config.xml

<clickhouse>
    <remote_servers>
        <cluster_3S_1R>
             <shard>
                <internal_replication>true</internal_replication>
                #权重
                <weight>10</weight>
                <replica>
                      <host>...</host>
                      <port>...</port>
                      <user>...</user>          
                      <password>...</password>
                </replica>
             </shard>

             <shard>
                <internal_replication>true</internal_replication>
                #权重
                <weight>45</weight>
                <replica>
                      <host>...</host>
                      <port>...</port>
                      <user>...</user>          
                      <password>...</password>
                </replica>
             </shard>

             <shard>
                <internal_replication>true</internal_replication>
                #权重
                <weight>45</weight>
                <replica>
                      <host>...</host>
                      <port>...</port>
                      <user>...</user>          
                      <password>...</password>
                </replica>
             </shard>
        </cluster_3S_1R>
    </remote_servers>
</clickhouse>

控制数据的保存时间

1 2	alter table table_data on cluster cluster_3S_1R modify TTL create_time + toIntervalDay(45) ;

2024-11-02 09:00:09 后面跟进看看优化效果

完结

本文链接： https://qdteng.online/2024/10/14/谈谈clickhouse优化/

版权声明： 本博客所有文章除特别声明外，均采用 CC BY 4.0 CN协议许可协议。转载请注明出处！

holdenJava & Linux

聊聊？

谈谈clickhouse优化

ClickHouse版本

问题描述

参考资料

一轮优化

二轮优化

config.xml

users.xml

三轮优化

控制集群节点的数据存储权重

控制数据的保存时间

holdenJava & Linux