Ambari Custom Alert Notifications

  1. Create a custom alert notification target:
curl -i -u admin:admin -H 'X-Requested-By: ambari' -X POST "http://ambari-server:8080/api/v1/alert_targets" -d '
  {
    "AlertTarget":
      {
        "name": "test_dispatcher",
        "description": "Custom Notification Dispatcher",
        "notification_type": "ALERT_SCRIPT",
        "global": true,
        "alert_states": ["CRITICAL","WARNING","UNKNOWN","OK"],
        "properties": {
          "ambari.dispatch-property.script": "notification.dispatch.alert.script"
        }
      }
  }'
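For readers who prefer scripting the call, the request above can be sketched in Python 3 with only the standard library. The function name build_alert_target_request is hypothetical; the URL, credentials, header, and payload are copied from the curl command. The function only builds the request object and does not contact the server.

```python
import base64
import json
from urllib.request import Request


def build_alert_target_request(base_url, user, password):
    """Build (but do not send) the POST request that creates the alert target."""
    payload = {
        "AlertTarget": {
            "name": "test_dispatcher",
            "description": "Custom Notification Dispatcher",
            "notification_type": "ALERT_SCRIPT",
            "global": True,
            "alert_states": ["CRITICAL", "WARNING", "UNKNOWN", "OK"],
            "properties": {
                # Name of the ambari.properties key that holds the script path.
                "ambari.dispatch-property.script": "notification.dispatch.alert.script"
            },
        }
    }
    # Equivalent of curl's "-u user:password" (HTTP Basic authentication).
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    return Request(
        base_url + "/api/v1/alert_targets",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-Requested-By": "ambari",
            "Authorization": "Basic " + token.decode("ascii"),
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it, for example with build_alert_target_request("http://ambari-server:8080", "admin", "admin").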
  2. Edit the ambari.properties file and add the following line:
notification.dispatch.alert.script=/var/lib/ambari-alerts/scripts/scaler-notification.py
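To confirm that the property was saved as intended, a small lookup like the following may help. It is a simplified sketch, not a full Java properties parser: it only understands key=value lines and skips comments, and the helper name read_property is hypothetical.

```python
def read_property(properties_text, key):
    """Return the value of `key` from key=value style text, or None if absent."""
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        name, separator, value = line.partition("=")
        if separator and name.strip() == key:
            return value.strip()
    return None
```

Reading the contents of ambari.properties and calling read_property(text, "notification.dispatch.alert.script") should return the script path configured above.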
  3. Write the scaler-notification.py file:
#!/usr/bin/env python
from datetime import datetime
import sys

LOG_FILE = "/var/log/ambari-server/custom_notification.log"


def test_notification():
  # The alert details arrive as positional command-line arguments.
  definition_name = sys.argv[1]
  definition_label = sys.argv[2]
  service_name = sys.argv[3]
  alert_state = sys.argv[4]
  alert_text = sys.argv[5]
  timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
  notification_data = (timestamp + ' ************* My Alert Dispatcher Logic Here ************'
                       + " -- " + definition_name + " -- " + definition_label
                       + " -- " + service_name + " -- " + alert_state
                       + " -- " + alert_text + " -- ")

  # Append the notification to a log file, one entry per line.
  # Replace this with your own logic.
  with open(LOG_FILE, "a") as log_file:
    log_file.write(notification_data + "\n")


if __name__ == '__main__':
  test_notification()
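Before wiring the script into Ambari, it can be useful to run it by hand with the same five positional arguments it reads from sys.argv. The helper below is a minimal sketch; run_dispatcher is a hypothetical name, and any argument values you pass are sample data, not real alert definitions.

```python
import subprocess
import sys


def run_dispatcher(script_path, definition_name, definition_label,
                   service_name, alert_state, alert_text):
    """Run the dispatcher script with five positional arguments; return its exit code."""
    command = [sys.executable, script_path, definition_name, definition_label,
               service_name, alert_state, alert_text]
    return subprocess.call(command)
```

An exit code of 0 suggests the script ran cleanly. Note that the user running it needs write access to /var/log/ambari-server for the log entry to be written.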
  4. Make scaler-notification.py executable:
chmod +x /var/lib/ambari-alerts/scripts/scaler-notification.py
  5. Restart the Ambari server:
ambari-server restart
  6. Stop a service or component to trigger an alert condition.

  7. Check the log file written by the custom alert notification:

tail -f /var/log/ambari-server/custom_notification.log

The actual alert entries should now appear in the log.
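If the log entries need to be processed further, each one can be split back into its fields. The following is a sketch that assumes the entry layout produced by the script in this post (a 19-character timestamp, the marker text, then five values separated by " -- "); parse_entry and FIELDS are hypothetical names.

```python
FIELDS = ("definition_name", "definition_label", "service_name",
          "alert_state", "alert_text")


def parse_entry(line):
    """Split one dispatcher log entry into a dict of its fields."""
    # Limit the split so that " -- " inside the alert text is preserved.
    parts = line.rstrip("\n").split(" -- ", 5)
    if len(parts) != 6:
        raise ValueError("unexpected log entry format")
    prefix, values = parts[0], parts[1:]
    # The script appends a trailing " -- " after the alert text.
    if values[-1].endswith(" -- "):
        values[-1] = values[-1][:-4]
    entry = dict(zip(FIELDS, values))
    # '%Y-%m-%d %H:%M:%S' always renders as 19 characters.
    entry["timestamp"] = prefix[:19]
    return entry
```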

Troubleshooting

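  1. If no alert log file is generated after configuration, check /var/log/ambari-server/ambari-server.log. It should contain an error message that names the script path Ambari tried to run.

In that case, copy the script to that path and make it executable with chmod +x filename.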
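The two conditions described here (the script exists at the expected path, and it is executable) can also be checked programmatically. This is a minimal sketch; check_script is a hypothetical helper, and it tests permissions for the user running the check, which may differ from the user the Ambari server runs as.

```python
import os


def check_script(path):
    """Return a list of problems that would stop the script from being run."""
    problems = []
    if not os.path.isfile(path):
        problems.append("missing: " + path)
    elif not os.access(path, os.X_OK):
        problems.append("not executable: " + path)
    return problems
```

An empty list from check_script("/var/lib/ambari-alerts/scripts/scaler-notification.py") means neither problem was found.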