本次案例的数据库为 Percona-Server-5.7.17-11
起因是 导入大量数据的时候, 数据库被oom kill 了
查看 /var/log/message 发现如下信息 :
Dec 12 14:27:21 localhost kernel: Out of memory: Kill process 2904 (mysqld) score 958 or sacrifice childDec 12 14:27:21 localhost kernel: Killed process 2904 (mysqld) total-vm:8670416kB, anon-rss:7580372kB, file-rss:0kB, shmem-rss:0kB
最开始是怀疑是内存不足,在磁盘性能充足的情况下打开了swap ,但结果还是OOM:
Dec 12 14:49:05 localhost kernel: Out of memory: Kill process 8009 (mysqld) score 975 or sacrifice childDec 12 14:49:05 localhost kernel: Killed process 8009 (mysqld) total-vm:19620452kB, anon-rss:7566848kB, file-rss:0kB, shmem-rss:0kB
由于数据库使用量很少, 应该不至于OOM的, 通过检查 /etc/my.cnf 最终发现一个可疑配置导致了内存不足
open_files_limit=65535
这个配置本来是控制文件打开数的
文件打开数 open_files_limitmysql> show variables like 'open%';+------------------+-------+| Variable_name | Value |+------------------+-------+| open_files_limit | 1024 | #mysql总共能够打开的文件的数量+------------------+-------+mysql> show global status like 'open%file%';+---------------+-------+| Variable_name | Value |+---------------+-------+| Open_files | 79 | # 系统当前打开的文件数| Opened_files | 278 | # 系统打开过的文件总数+---------------+-------+比较合适的设置:Open_files / open_files_limit * 100% <= 75%
最后将配置设置为10000后 ,问题消失 ,数据库启动后占用的内存也没这么多了
官方文档说明如下
open_files_limitSystem Variable Name open_files_limit Scope Global Dynamic No Permitted Values Type integer Default 5000, with possible adjustment Minimum 0 Maximum platform dependentThe number of files that the operating system permits mysqld to open. The value of this variable at runtime is the real value permitted by the system and might be different from the value you specify at server startup. The value is 0 on systems where MySQL cannot change the number of open files.The effective open_files_limit value is based on the value specified at system startup (if any) and the values of max_connections and table_open_cache, using these formulas:1) 10 + max_connections + (table_open_cache * 2)2) max_connections * 53) operating system limit if positive4) if operating system limit is Infinity: open_files_limit value specified at startup, 5000 if noneThe server attempts to obtain the number of file descriptors using the maximum of those three values. If that many descriptors cannot be obtained, the server attempts to obtain as many as the system will permit.