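ELK (Elasticsearch + LogStash + Kibana): I recently used ELK to process some platform logs. This post records the deployment process, using "MySQL connection count monitoring" as the example.
Background
The platform has no alerting on the number of MySQL connections. Once the connections are exhausted, the platform becomes unusable. In addition, there is neither a visual interface for working with log data nor an effective real-time monitoring strategy.
Benefits
1. When an anomaly is triggered, the people responsible are notified promptly by SMS, email, or similar channels
2. A visual log interface makes log analysis more convenient
1. Software versions
Software | Version |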
Logstash | v2.3.4 |
Filebeat | v1.3.1 |
ElasticSearch | v2.3.3 |
Kibana | v4.5.1 |
ElastAlert | v0.1.4 |
2. Solution
2.1. Monitoring architecture diagram

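2.2. Querying the MySQL connection count
Normally each request occupies one MySQL connection. If a request (insert, delete, update, select) takes a long time to finish, connections pile up and quickly exhaust the database's connection limit. The production database of the ph platform currently allows at most 1000 connections.
A shell script monitors the connection count continuously: it queries the MySQL connection count once per minute and appends the result to a log file.
Log sample: see "MySQL connection log sample" below
Shell script: see "MySQL connection query script" in the appendix
Polling mechanism: crontab job, run once per minute
    # query mysql connection
    * * * * * /bin/sh /home/disk5/query_mysql_connection_log.sh > /dev/null
MySQL connection log sample
    2017-01-20 00:01:01 machine_0001=4
    2017-01-20 00:01:01 machine_0002=56
    2017-01-20 00:01:01 machine_0003=13
    2017-01-20 00:01:01 machine_0004=87
    2017-01-20 00:01:01 total_connection_number=160
    ==========
2.3. FileBeat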
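Configuration
For the FileBeat configuration file, see Appendix: FileBeat configuration file
FileBeat monitors the log produced by the MySQL connection query (see the MySQL connection log sample) and ships every line that does not start with === to LogStash.
Configuration details
Setting | Value |
Merge multiple lines | No |
Polling interval | 120s |
Document type | mysql_connection_log |
Monitored path | /home/disk5/logs/mysql_connection_* |
Filter rule | log lines not starting with === |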
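The filter rule can be tried out locally before FileBeat is deployed. The sketch below is not FileBeat itself: it only mimics the effect of `exclude_lines: ["^==="]` with `grep -v` on a few lines taken from the log sample. The helper name `filter_lines` is hypothetical.

```shell
#!/bin/sh
# Mimic the effect of FileBeat's exclude_lines: ["^==="] (illustration only).
# filter_lines is a hypothetical helper, not part of FileBeat.
filter_lines() {
    # Drop every line that starts with "===", keep everything else.
    grep -v '^==='
}

# Lines from the log sample; only the separator line is dropped.
printf '%s\n' \
    '2017-01-20 00:01:01 machine_0001=4' \
    '2017-01-20 00:01:01 total_connection_number=160' \
    '==========' | filter_lines
```

Only the two data lines are printed, which is what LogStash would receive.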
2.4. LogStash
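Configuration
For the LogStash configuration file, see Appendix: LogStash configuration file
The regular expression was debugged with the grok debug tool.
Description
LogStash collects the log lines sent by FileBeat and extracts the log time and the connection data (machine and connection count).
Input
    2017-01-20 10:18:01 machine_0001=62
Regex match
    %{TIMESTAMP_ISO8601:time}\s+%{USER:machine}=%{NUMBER:connection_num}
Here TIMESTAMP_ISO8601, USER, and NUMBER are LogStash grok patterns. The match yields:
time field: log time
machine field: machine host
connection_num field: number of MySQL connections held by the machine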
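To see what the pattern captures without running LogStash, the match can be approximated with `sed`. This is only a rough sketch: the sub-expressions are simplified stand-ins for the grok patterns (for instance, NUMBER is reduced to an unsigned integer), and the helper name `parse_line` is hypothetical.

```shell
#!/bin/sh
# Rough approximation of:
#   %{TIMESTAMP_ISO8601:time}\s+%{USER:machine}=%{NUMBER:connection_num}
# parse_line is a hypothetical helper; the regex is a simplified stand-in
# for the real grok patterns.
parse_line() {
    printf '%s\n' "$1" | sed -n -E \
        's/^([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2})[[:space:]]+([A-Za-z0-9._-]+)=([0-9]+)$/time=\1 | machine=\2 | connection_num=\3/p'
}

# A matching line prints its three fields; a non-matching line prints nothing.
parse_line '2017-01-20 10:18:01 machine_0001=62'
parse_line '=========='
```

Lines that do not match produce no output here. In LogStash, an event whose message does not match is tagged `_grokparsefailure` rather than dropped.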
Output
    time = 2017-01-20 10:18:01
    machine = machine_0001
    connection_num = 62
2.5. Elasticsearch
Configuration
For the Elasticsearch template, see Appendix: Elasticsearch template
Template name: template_mysql_connection_log
ES index: mysql-connection-log-%{+YYYY.MM.dd}
ES type: mysql_connection_log
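LogStash expands `%{+YYYY.MM.dd}` from each event's `@timestamp`, which is stored in UTC, so on a host east of UTC an event logged shortly after local midnight lands in the previous day's index. As a small sketch, the name of the index that receives events at this moment can be printed with `date`. The helper name `current_index` is hypothetical.

```shell
#!/bin/sh
# Print the daily index name that events arriving now would be written to.
# current_index is a hypothetical helper; -u makes date use UTC, the same
# clock LogStash uses when expanding %{+YYYY.MM.dd}.
current_index() {
    date -u +'mysql-connection-log-%Y.%m.%d'
}

current_index
```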
Field information
Field | Type | Notes |
message | string | raw log line |
tags | string | |
@timestamp | date | time the log entry was produced |
host | string | |
count | long | |
source | string | |
input_type | string | |
type | string | |
offset | long | |
@version | string | |
machine | string | machine host |
connection_num | long | number of MySQL connections held by the machine |
2.6. Kibana
View the connection count trend for each machine
Trend chart
2.7. ElastAlert
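Configuration
For the ElastAlert configuration file, see Appendix: ElastAlert configuration file
ElastAlert polls the Elasticsearch mysql-connection-log-* index every 10 seconds. If the total MySQL connection count is 750 or higher more than 2 times within 10 minutes, an alert SMS is sent to the people responsible.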
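The alert condition can be approximated offline to get a feel for when it fires. The sketch below is not how ElastAlert works internally: ElastAlert counts matching documents by timestamp, whereas this version assumes exactly one sample per minute, so that "10 minutes" becomes "the 10 most recent samples". The helper name `check_frequency` and the input format (one connection count per line) are assumptions made for illustration.

```shell
#!/bin/sh
# Approximate the frequency rule: alert when at least num_events of the
# last `window` samples are >= threshold.
# check_frequency is a hypothetical helper; it assumes one sample per minute.
check_frequency() {
    awk -v threshold=750 -v num_events=3 -v window=10 '
    {
        # hit[NR] is 1 when this sample reaches the threshold.
        hit[NR] = ($1 + 0 >= threshold + 0) ? 1 : 0
        count += hit[NR]
        # Slide the window: forget the sample that just fell out of it.
        if (NR > window) count -= hit[NR - window]
        if (count >= num_events) alert = 1
    }
    END { print (alert ? "ALERT" : "OK") }'
}

# Three samples at or above 750 within the window trigger the alert.
printf '%s\n' 100 800 760 100 900 | check_frequency
```

With the values above the threshold is reached three times inside one window, so the function prints ALERT.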
Appendix
MySQL connection query script
    #!/bin/bash
    source /etc/profile

    # Title: Online Query Mysql Connection
    # Author: ouyangyewei
    #
    # Create: ouyangyewei, 2017/01/18
    # Update: ouyangyewei, 2017/01/19, add total_connection_number

    # Unique temp file name derived from the script path
    FID=`readlink -f $0 | md5sum | awk '{print $1}'`
    LOG_FILE=/home/disk5/logs/mysql_connection_$(date +"%Y-%m-%d").log

    # ----------------------------------------------
    function get_process_list() {
        # Count current connections per client host and write them to /tmp/$FID
        mysql -uroot \
              -pxxx \
              -hxxx \
              -P3306 \
              -e 'show processlist' \
              --silent \
              --skip-column-names | awk '
        {
            if ($3=="user" && $4!="NULL") {
                split($4, machine, ":");
                print machine[1];
            }
            if ($3!="user" && $4!="user"){
                split($3, machine, ":");
                print machine[1];
            }
        }' | sort | uniq -c > /tmp/$FID
    }

    function run() {
        # get current mysql connection status
        get_process_list;

        TIMESTAMP=`date +"%F %T"`
        if [[ -f /tmp/$FID ]]; then
            sum=0
            while read line
            do
                machine=`echo $line | awk '{print $2}'`
                connect_number=`echo $line | awk '{print $1}'`
                sum=$(($sum+$connect_number))
                echo "$TIMESTAMP $machine=$connect_number" >> $LOG_FILE
            done < /tmp/$FID
            echo "$TIMESTAMP total_connection_number=$sum" >> $LOG_FILE
            # separator line; starts with === so that FileBeat's exclude_lines skips it
            echo "==========" >> $LOG_FILE
            # remove tmp file
            rm -rf /tmp/$FID
        fi
    }

    # ----------------------------------------------
    # startup
    run
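The per-host counting step of the script (`sort | uniq -c`) can be tried without a database. The sketch below feeds a few made-up `host:port` values, standing in for the Host column of `show processlist`, through the same pipeline and prints them in the `machine=count` form used in the log. The helper name `count_by_host` and the sample values are hypothetical.

```shell
#!/bin/sh
# Count connections per client host from "host:port" lines on stdin.
# count_by_host is a hypothetical helper; the sample values below are made up.
count_by_host() {
    # Strip the port, group identical hosts, then print "host=count".
    awk -F: '{print $1}' | sort | uniq -c | awk '{print $2 "=" $1}'
}

printf '%s\n' \
    'machine_0001:51234' \
    'machine_0002:40022' \
    'machine_0001:51240' | count_by_host
```

Here machine_0001 appears twice and machine_0002 once, so the output lists them with counts 2 and 1.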
FileBeat configuration file
    filebeat:
      prospectors:
        - paths:
            - /home/disk5/logs/mysql_connection_*
          input_type: log
          document_type: mysql_connection_log
          ignore_older: 84h
          scan_frequency: 120s
          exclude_lines: ["^==="]
    output:
      logstash:
        hosts: ["xxx:8044"]
    logging:
      level: debug
      to_files: true
      to_syslog: false
      files:
        path: /var/log/mybeat
        name: mybeat.log
        keepfiles: 7
LogStash configuration file
    input {
      beats {
        port => 8044
      }
    }
    filter {
      if [type] == "mysql_connection_log" {
        grok {
          patterns_dir => ["/conf/patterns"]
          match => { "message" => "%{TIMESTAMP_ISO8601:time}\s+%{USER:machine}=%{NUMBER:connection_num}" }
          remove_field => ["beat"]
        }
        date {
          # four-digit year, matching timestamps such as 2017-01-20 10:18:01
          match => ["time", "yyyy-MM-dd HH:mm:ss"]
          remove_field => ["time"]
        }
      }
    }
    output {
      if [type]=="mysql_connection_log" {
        elasticsearch {
          hosts => ["xxx:8096","xxx:8096"]
          index => "mysql-connection-log-%{+YYYY.MM.dd}"
        }
      }
    }
Elasticsearch template
    curl -XPUT 'localhost:9200/_template/template_mysql_connection_log?pretty' -d'
    {
      "order": 0,
      "template": "mysql-connection-log-*",
      "settings": {},
      "mappings": {
        "mysql_connection_log": {
          "properties": {
            "tags": { "index": "not_analyzed", "type": "string" },
            "message": { "index": "not_analyzed", "type": "string" },
            "@version": { "type": "string" },
            "@timestamp": { "format": "strict_date_optional_time||epoch_millis", "type": "date" },
            "source": { "index": "not_analyzed", "type": "string" },
            "offset": { "type": "long" },
            "type": { "index": "not_analyzed", "type": "string" },
            "input_type": { "index": "not_analyzed", "type": "string" },
            "count": { "type": "long" },
            "host": { "index": "not_analyzed", "type": "string" },
            "machine": { "index": "not_analyzed", "type": "string" },
            "connection_num": { "index": "not_analyzed", "type": "long" }
          }
        }
      },
      "aliases": {}
    }'
ElastAlert configuration file
    # Alert when the rate of events exceeds a threshold

    # (Optional)
    # Elasticsearch host
    es_host: xx.xx.xx.xx

    # (Optional)
    # Elasticsearch port
    es_port: 8096

    # (Optional) Connect with SSL to Elasticsearch
    #use_ssl: True

    # (Optional) basic-auth username and password for Elasticsearch
    #es_username: someusername
    #es_password: somepassword

    # (Required)
    # Rule name, must be unique
    name: MysqlConnectionRule

    # (Required)
    # Type of alert.
    # the frequency rule type alerts when num_events events occur within timeframe time
    type: frequency
    # type: any

    # (Required)
    # Index to search, wildcard supported
    index: mysql-connection-log-*

    # (Required, frequency specific)
    # Alert when this many documents matching the query occur within a timeframe
    num_events: 3

    # (Required, frequency specific)
    # num_events must occur within this amount of time to trigger an alert
    timeframe:
      minutes: 10
      # hours: 1

    # (Required)
    # A list of Elasticsearch filters used to find events
    # These filters are joined with AND and nested in a filtered query
    # For more info: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html
    filter:
    - range:
        connection_num:
          from: 750

    # (Required)
    # The alert is used when a match is found
    alert:
    - command
    command: ["curl", "-X", "POST", "-d", '{"appId":"xxx", "token":"xxx", "alertList":[{"channel":"sms", "description":"MySQL connection alert: current total connection count is %(connection_num)s!", "receiver":"ouyangyew"}]}', "http://xxx.baidu.com/alert/push"]

    # (required, email specific)
    # a list of email addresses to send alerts to
    # email:
    # - "elastalert@example.com"