《Dubbo 实现原理与源码解析 —— 精品合集》 《Netty 实现原理与源码解析 —— 精品合集》
《Spring 实现原理与源码解析 —— 精品合集》 《MyBatis 实现原理与源码解析 —— 精品合集》
《Spring MVC 实现原理与源码解析 —— 精品合集》 《数据库实体设计合集》
《Spring Boot 实现原理与源码解析 —— 精品合集》 《Java 面试题 + Java 学习指南》

摘要: 原创出处 blog.csdn.net/yeweiouyang/article/details/54948846 「yeweiouyang」欢迎转载,保留摘要,谢谢!


🙂🙂🙂关注**微信公众号:【芋道源码】**有福利:

  1. RocketMQ / MyCAT / Sharding-JDBC 所有源码分析文章列表
  2. RocketMQ / MyCAT / Sharding-JDBC 中文注释源码 GitHub 地址
  3. 您对于源码的疑问每条留言将得到认真回复。甚至不知道如何读源码也可以请教噢
  4. 新的源码解析文章实时收到通知。每周更新一篇左右
  5. 认真的源码交流微信群。

ELK(Elasticsearch+LogStash+Kibana),最近使用ELK处理了一些平台日志,下面以**「mysql连接数监控」**记录部署流程。

背景

平台缺失针对mysql连接数的告警,一旦mysql连接数打满,将直接影响平台的使用。另外,对于日志信息既没有可视化界面进行操作,也没有一套有效的实时监控策略。

收益

  1. 当异常触发时能够及时通过短信、邮件等方式通知相关负责人员
  2. 建立日志可视化界面,使得日志分析更加便捷

1. 软件版本

软件 版本
Logstash v2.3.4
Filebeat v1.3.1
ElasticSearch v2.3.3
Kibana v4.5.1
ElastAlert v0.1.4

2. 解决方案

2.1. 监控架构图

mysql_connection_monitor

2.2. mysql连接数查询

mysql的连接通常是一个请求占用一个连接,如果该请求(insert,delete,update,select)长时间没有执行完毕,则会造成连接的堆积,迅速地消耗完数据库的连接数,目前ph平台线上数据库的最大连接数是1000个。

这里使用一个shell脚本来持续监控mysql连接数情况,每分钟查询一次mysql的连接数,并写入到日志文件

日志样例参考:mysql连接数日志样例 shell脚本: mysql连接数查询脚本 轮询机制: crontab任务,每分钟轮询一次

# query mysql connection
* * * * * /bin/sh /home/disk5/query_mysql_connection_log.sh > /dev/null

mysql连接数日志样例

2017-01-20 00:01:01 machine_0001=4
2017-01-20 00:01:01 machine_0002=56
2017-01-20 00:01:01 machine_0003=13
2017-01-20 00:01:01 machine_0004=87
2017-01-20 00:01:01 total_connection_number=160
==========

2.3. FileBeat配置

FileBeat配置文件请参见:附录-filebeat配置文件

FileBeat`负责监控**mysql连接数查询**产生的log(参考[mysql连接数日志样例](https://blog.csdn.net/yeweiouyang/article/details/54948846#mysql连接数日志样例)),并将**不以===开头**的内容上报到`LogStash

配置信息

配置项 配置值
是否合并多行 No
轮询时间间隔 120s
文档类型 mysql_connection_log
监控路径 /home/disk5/logs/mysql_connection_*
筛选规则 不以===开头的log

2.4. LogStash配置

LogStash配置文件请参见:附录-logstash配置文件

正则匹配使用grok debug工具进行调试(grok debug

描述

收集FileBeat发送过来的log信息,获取日志时间和错误信息

输入

2017-01-20 10:18:01 machine_0001=62

正则匹配

%{TIMESTAMP_ISO8601:time}\s+%{USER:machine}=%{NUMBER:connection_num}

其中:TIMESTAMP_ISO8601、USER、NUMBER是LogStash的grok pattern变量

将得到: *time字段:日志时间* *machine字段:机器host* *connection_num字段:机器持有mysql的连接数*

输出

time = 2017-01-20 10:18:01
machine = machine_0001
connection_num = 62

3.5. Elasticsearch配置

Elasticsearch模板请参见:附录-elasticsearch模板

模板名称:template_mysql_connection_log ES索引(index): mysql-connection-log-%{+YYYY.MM.dd} ES类型(type): mysql_connection_log

字段信息

字段名 字段类型 备注
message string 原始log信息
tags string
@timestamp date log产生时间
host string
count long
source string
input_type string
type string
offset long
@version string
machine string 机器host
connection_num long 机器持有mysql的连接数

3.6. Kibana查看各机器连接数趋势

趋势图 ph_mysql_connection_line_chart_sample

3.7. ElastAlert配置

ElastAlert配置文件请参见:附录-elastalert配置文件

每10秒轮询Elasticsearch的mysql-connection-log-*索引,若在10分钟内mysql总连接数超过750个的次数超过2次,则向相关人员发送告警短信

附录

mysql连接数查询脚本

#!/bin/bash
source /etc/profile

# Title: Online Query Mysql Connection
# Author: ouyangyewei
#
# Create: ouyangyewei, 2017/01/18
# Update: ouyangyewei, 2017/01/19, add total_connection_number

FID=`readlink -f $0 | md5sum | awk '{print $1}'`
LOG_FILE=/home/disk5/logs/mysql_connection_$(date +"%Y-%m-%d").log
# ----------------------------------------------

function get_process_list() {
mysql -uroot \
-pxxx \
-hxxx \
-P3306 \
-e 'show processlist' \
--silent \
--skip-column-names | awk '
{
if ($3=="user" && $4!="NULL") {
split($4, machine, ":");
print machine[1];
}
if ($3!="user" && $4!="user"){
split($3, machine, ":");
print machine[1];
}
}' | sort | uniq -c > /tmp/$FID
}

function run() {
# get current mysql connection status
get_process_list;

TIMESTAMP=`date +"%F %T"`
if [[ -f /tmp/$FID ]]; then
sum=0
while read line
do
machine=`echo $line | awk '{print $2}'`
connect_number=`echo $line | awk '{print $1}'`
sum=$(($sum+$connect_number))
echo "$TIMESTAMP $machine=$connect_number" >> $LOG_FILE
done < /tmp/$FID
echo "$TIMESTAMP total_connection_number=$sum" >> $LOG_FILE
echo "---------------------------------------" >> $LOG_FILE

# remove tmp file
rm -rf /tmp/$FID
fi
}
# ----------------------------------------------

# starup
run


FileBeat配置文件

filebeat:
prospectors:
-
paths:
- /home/disk5/logs/mysql_connection_*
input_type: log
document_type: mysql_connection_log
ignore_older: 84h
scan_frequency: 120s
exclude_lines: ["^==="]

output:
logstash:
hosts: ["xxx:8044"]

logging:
level: debug
to_files: true
to_syslog: false
files:
path: /var/log/mybeat
name: mybeat.log
keepfiles: 7


LogStash配置文件

input {
beats {
port => 8044
}
}
filter {
if [type] == "mysql_connection_log" {
grok {
patterns_dir => ["/conf/patterns"]
match => {
"message" => "%{TIMESTAMP_ISO8601:time}\s+%{USER:machine}=%{NUMBER:connection_num}"
}
remove_field => ["beat"]
}
date {
match => ["time", "yy-MM-dd HH:mm:ss"]
remove_field => ["time"]
}
}
}
output {
if [type]=="mysql_connection_log" {
elasticsearch {
hosts => ["xxx:8096","xxx:8096"]
index => "mysql-connection-log-%{+YYYY.MM.dd}"
}
}
}


Elasticsearch模板

curl -XPUT 'localhost:9200/_template/template_mysql_connection_log?pretty' -d'
{
"order": 0,
"template": "mysql-connection-log-*",
"settings": {},
"mappings": {
"palo-log": {
"properties": {
"tags": {
"index": "not_analyzed",
"type": "string"
},
"message": {
"index": "not_analyzed",
"type": "string"
},
"@version": {
"type": "string"
},
"@timestamp": {
"format": "strict_date_optional_time||epoch_millis",
"type": "date"
},
"source": {
"index": "not_analyzed",
"type": "string"
},
"offset": {
"type": "long"
},
"type": {
"index": "not_analyzed",
"type": "string"
},
"input_type": {
"index": "not_analyzed",
"type": "string"
},
"count": {
"type": "long"
},
"host": {
"index": "not_analyzed",
"type": "string"
},
"machine": {
"index": "not_analyzed",
"type": "string"
},
"connection_num": {
"index": "not_analyzed",
"type": "long"
}
}
}
},
"aliases": {}
}'


ElastAlert配置文件

# Alert when the rate of events exceeds a threshold

# (Optional)
# Elasticsearch host
es_host: xx.xx.xx.xx

# (Optional)
# Elasticsearch port
es_port: 8096

# (OptionaL) Connect with SSL to Elasticsearch
#use_ssl: True

# (Optional) basic-auth username and password for Elasticsearch
#es_username: someusername
#es_password: somepassword

# (Required)
# Rule name, must be unique
name: MysqlConnectionRule

# (Required)
# Type of alert.
# the frequency rule type alerts when num_events events occur with timeframe time
type: frequency
# type: any

# (Required)
# Index to search, wildcard supported
index: mysql-connection-log-*

# (Required, frequency specific)
# Alert when this many documents matching the query occur within a timeframe
num_events: 3

# (Required, frequency specific)
# num_events must occur within this amount of time to trigger an alert
timeframe:
minutes: 10
# hours: 1

# (Required)
# A list of Elasticsearch filters used for find events
# These filters are joined with AND and nested in a filtered query
# For more info: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html
filter:
- range:
connection_num:
from: 750

# (Required)
# The alert is use when a match is found
alert:
- command
command: [
"curl",
"-X POST",
"-d",
'{"appId":"xxx", "token":"xxx", "alertList":[{"channel":"sms", "description":"Mysql连接数告警:当前总连接数为%(connection_num)s!", "receiver":"ouyangyew"}]}',
"http://xxx.baidu.com/alert/push"
]

# (required, email specific)
# a list of email addresses to send alerts to
# email:
# - "elastalert@example.com"

文章目录
  1. 1. 1. 软件版本
  2. 2. 2. 解决方案
    1. 2.1. 2.1. 监控架构图
    2. 2.2. 2.2. mysql连接数查询
      1. 2.2.1. mysql连接数日志样例
    3. 2.3. 2.3. FileBeat配置
    4. 2.4. 2.4. LogStash配置
    5. 2.5. 3.5. Elasticsearch配置
    6. 2.6. 3.6. Kibana查看各机器连接数趋势
    7. 2.7. 3.7. ElastAlert配置
  3. 3. 附录
    1. 3.1. mysql连接数查询脚本
    2. 3.2. FileBeat配置文件
    3. 3.3. LogStash配置文件
    4. 3.4. Elasticsearch模板
    5. 3.5. ElastAlert配置文件