Nagios Monitoring and Notification Concepts

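A host's check-host-alive check uses check_interval 5, so it runs every five minutes. After the first non-OK result, the check interval drops to one check every 10 seconds. After ten consecutive non-OK checks, the host enters the HARD state and a notification is sent.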

check_interval                  5
max_check_attempts       10
notification_interval          25
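
The SOFT-to-HARD progression described above can be sketched as a small simulation. This is only a model of the behaviour seen in this article, not Nagios code: the function name and the `retry_s` parameter are hypothetical, and the 10-second spacing is the value observed in the log, not one read from a configuration file.

```python
from dataclasses import dataclass


@dataclass
class HostCheckResult:
    offset_s: int    # seconds since the first failed check
    state_type: str  # "SOFT" or "HARD"
    attempt: int     # 1-based attempt counter


def simulate_host_failure(max_check_attempts: int = 10, retry_s: int = 10):
    """Model consecutive failed host checks until the HARD state is reached.

    retry_s is an assumption taken from the 10-second spacing in the log.
    """
    results = []
    for attempt in range(1, max_check_attempts + 1):
        # Only the final attempt turns the problem into a HARD state.
        state_type = "HARD" if attempt == max_check_attempts else "SOFT"
        results.append(HostCheckResult((attempt - 1) * retry_s, state_type, attempt))
    return results


timeline = simulate_host_failure()
for r in timeline:
    print(f"+{r.offset_s:>3}s  DOWN;{r.state_type};{r.attempt}")
```

Under these assumptions the tenth attempt lands 90 seconds after the first failure, which is when the notification would go out.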

nagios.log

2008-12-02.10:11:45 [1228183905] HOST ALERT: ssorc.tw;DOWN;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
2008-12-02.10:11:55 [1228183915] HOST ALERT: ssorc.tw;DOWN;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
2008-12-02.10:12:05 [1228183925] HOST ALERT: ssorc.tw;DOWN;SOFT;3;CRITICAL - Plugin timed out after 10 seconds
2008-12-02.10:12:15 [1228183935] HOST ALERT: ssorc.tw;DOWN;SOFT;4;CRITICAL - Plugin timed out after 10 seconds
2008-12-02.10:12:25 [1228183945] HOST ALERT: ssorc.tw;DOWN;SOFT;5;CRITICAL - Plugin timed out after 10 seconds
2008-12-02.10:12:35 [1228183955] HOST ALERT: ssorc.tw;DOWN;SOFT;6;CRITICAL - Plugin timed out after 10 seconds
2008-12-02.10:12:45 [1228183965] HOST ALERT: ssorc.tw;DOWN;SOFT;7;CRITICAL - Plugin timed out after 10 seconds
2008-12-02.10:12:55 [1228183975] HOST ALERT: ssorc.tw;DOWN;SOFT;8;CRITICAL - Plugin timed out after 10 seconds
2008-12-02.10:13:05 [1228183985] HOST ALERT: ssorc.tw;DOWN;SOFT;9;CRITICAL - Plugin timed out after 10 seconds
2008-12-02.10:13:15 [1228183995] HOST ALERT: ssorc.tw;DOWN;HARD;10;CRITICAL - Plugin timed out after 10 seconds
2008-12-02.10:13:15 [1228183995] HOST NOTIFICATION: nagios-admin-email-cross;ssorc.tw;DOWN;host-notify-by-email;CRITICAL - Plugin timed out after 10 seconds
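
To check the spacing in the log above, the epoch value in square brackets can be parsed out of each HOST ALERT line. The sketch below uses three of the lines verbatim; the regular expression and the `parse_host_alert` helper are illustrative and assume the semicolon-separated layout shown in this excerpt.

```python
import re

LOG_LINES = [
    "2008-12-02.10:11:45 [1228183905] HOST ALERT: ssorc.tw;DOWN;SOFT;1;CRITICAL - Plugin timed out after 10 seconds",
    "2008-12-02.10:11:55 [1228183915] HOST ALERT: ssorc.tw;DOWN;SOFT;2;CRITICAL - Plugin timed out after 10 seconds",
    "2008-12-02.10:13:15 [1228183995] HOST ALERT: ssorc.tw;DOWN;HARD;10;CRITICAL - Plugin timed out after 10 seconds",
]

# [epoch] HOST ALERT: host;state;state_type;attempt;plugin output
HOST_ALERT = re.compile(r"\[(\d+)\] HOST ALERT: ([^;]+);([^;]+);([^;]+);(\d+);(.*)")


def parse_host_alert(line):
    """Return the fields of a HOST ALERT line as a dict, or None if no match."""
    m = HOST_ALERT.search(line)
    if m is None:
        return None
    epoch, host, state, state_type, attempt, output = m.groups()
    return {
        "epoch": int(epoch),
        "host": host,
        "state": state,
        "state_type": state_type,
        "attempt": int(attempt),
        "output": output,
    }


alerts = [parse_host_alert(line) for line in LOG_LINES]
first = alerts[0]["epoch"]
elapsed = [a["epoch"] - first for a in alerts]
for a, e in zip(alerts, elapsed):
    print(f'attempt {a["attempt"]:>2} {a["state_type"]} at +{e}s')
```

Attempt 2 is 10 seconds after attempt 1, and attempt 10 is 90 seconds after it, i.e. nine retries of 10 seconds each.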

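While the host is DOWN, any other service monitored on it, for example HTTP, is also Critical. It is only shown in red in the interface; no notification is sent for it.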

2008-12-02.10:13:15 [1228183995] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
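
The rule described here, that a service problem on a DOWN host does not produce its own notification, can be written as a simple predicate. This is a deliberately simplified sketch: the function name is hypothetical, and real Nagios applies further filters (notification periods, contact options, acknowledgements, and so on) that are not modelled.

```python
def should_send_service_problem_notification(
    host_state: str, service_state: str, service_state_type: str
) -> bool:
    """Simplified model: notify only for HARD service problems on an UP host."""
    is_problem = service_state != "OK"
    is_hard = service_state_type == "HARD"
    host_is_up = host_state == "UP"
    return is_problem and is_hard and host_is_up


# The situation in the log line above: host DOWN, HTTP CRITICAL and HARD.
print(should_send_service_problem_notification("DOWN", "CRITICAL", "HARD"))
# The same service problem while the host is reachable.
print(should_send_service_problem_notification("UP", "CRITICAL", "HARD"))
```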

When the host returns to UP, only the Host UP notification is sent.

2008-12-02.10:32:25 [1228185145] HOST ALERT: ssorc.tw;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 29.41 ms
2008-12-02.10:32:25 [1228185145] HOST NOTIFICATION: nagios-admin-email-cross;ssorc.tw;UP;host-notify-by-email;PING OK - Packet loss = 0%, RTA = 29.41 ms
2008-12-02.10:32:25 [1228185145] SERVICE ALERT: ssorc.tw;HTTP;OK;HARD;1;HTTP OK HTTP/1.1 200 OK - 69255 bytes in 2.491 seconds
As for check-host-alive and check_ping, they are the same thing; only the thresholds used to judge the result differ.

When monitoring a service:
max_check_attempts         6
normal_check_interval       5
retry_check_interval          1
notification_interval          25
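The service is checked every five minutes. After the first non-OK result, the interval changes to one minute. When six checks in a row are non-OK, the first notification is sent, and the check interval returns to five minutes. If the service is still non-OK 25 minutes later, the second notification is sent.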

Timeline (each row is one check; Interval is the time since the previous check):

Interval   Check result   State         Notification
-          OK
5 min      not OK         SOFT 1
1 min      not OK         SOFT 2
1 min      not OK         SOFT 3
1 min      not OK         SOFT 4
1 min      not OK         SOFT 5
1 min      not OK         SOFT 6/HARD   Alert 1
5 min      not OK         HARD
5 min      not OK         HARD
5 min      not OK         HARD
5 min      not OK         HARD
5 min      not OK         HARD          Alert 2 (25 min after Alert 1)
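
The service schedule can also be simulated to see where the two alerts fall. The sketch below is a simplified model rather than Nagios's scheduler: the function and parameter names are hypothetical, times are in minutes counted from the last OK check, and a re-notification is assumed to go out at the first check that happens at least notification_interval minutes after the previous one.

```python
def simulate_service_failure(
    max_check_attempts=6,
    normal_min=5,
    retry_min=1,
    notification_min=25,
    horizon_min=40,
):
    """Return (minute, state_type, notified) for each failed check."""
    events = []
    t = normal_min  # first failing check, one normal interval after the OK check at t=0
    attempt = 1
    last_notified = None
    while t <= horizon_min:
        hard = attempt >= max_check_attempts
        # Notify on reaching HARD, then again once notification_min has elapsed.
        notify = hard and (last_notified is None or t - last_notified >= notification_min)
        if notify:
            last_notified = t
        events.append((t, "HARD" if hard else "SOFT", notify))
        # SOFT states are rechecked at the retry interval, HARD at the normal one.
        t += normal_min if hard else retry_min
        attempt += 1
    return events


events = simulate_service_failure()
for minute, state_type, notified in events:
    print(f"t={minute:>2} min  {state_type}{'  -> notification' if notified else ''}")
```

With the values from this article, the model sends notifications at minute 10 and minute 35, i.e. 25 minutes apart.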

Figure:
attachments/200812/7123476032.png
Log:
2008-12-02.11:33:35 [1228188815] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
2008-12-02.11:34:45 [1228188885] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;2;CRITICAL - Socket timeout after 10 seconds
2008-12-02.11:35:55 [1228188955] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;3;CRITICAL - Socket timeout after 10 seconds
2008-12-02.11:36:55 [1228189015] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;4;CRITICAL - Socket timeout after 10 seconds
2008-12-02.11:38:05 [1228189085] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;SOFT;5;CRITICAL - Socket timeout after 10 seconds
2008-12-02.11:38:55 [1228189135] SERVICE ALERT: ssorc.tw;HTTP;CRITICAL;HARD;6;CRITICAL - Socket timeout after 10 seconds
2008-12-02.11:38:55 [1228189135] SERVICE NOTIFICATION: nagios-admin-email-cross;ssorc.tw;HTTP;CRITICAL;notify-by-email;CRITICAL - Socket timeout after 10 seconds
2008-12-02.12:04:05 [1228190645] SERVICE NOTIFICATION: nagios-admin-email-cross;ssorc.tw;HTTP;CRITICAL;notify-by-email;CRITICAL - Socket timeout after 10 seconds
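
As a quick check of the log above against the configured values, the epoch timestamps can be subtracted directly. The three numbers are copied from the log lines; the variable names are only for illustration.

```python
first_soft = 1228188815           # SERVICE ALERT ... SOFT;1
hard_and_first_notification = 1228189135   # SERVICE ALERT ... HARD;6 and first notification
second_notification = 1228190645  # second SERVICE NOTIFICATION

to_hard_s = hard_and_first_notification - first_soft
renotify_s = second_notification - hard_and_first_notification

# divmod(seconds, 60) gives (minutes, remaining seconds)
print("SOFT 1 to HARD:", divmod(to_hard_s, 60))      # (5, 20)
print("Alert 1 to Alert 2:", divmod(renotify_s, 60))  # (25, 10)
```

The observed 5 min 20 s and 25 min 10 s are close to the nominal five one-minute retries and the 25-minute notification_interval.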

Update 2010-09-10: added a diagram of the host notification cycle (file: nagios.txt, 2.22 KB).

attachments/201009/5052152668.jpg

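In Nagios 3, the host definition has a retry_interval parameter that sets how often the host is checked while it is in a SOFT state. In Nagios 2, it looks like this interval can only be 10 seconds.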
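
To see what retry_interval means for the time until a host reaches the HARD state, here is a small worked calculation. It assumes that retry_interval is counted in units of interval_length seconds with the default of 60, and that every retry fails; the helper name and the example value of 1 are hypothetical, not settings taken from this article.

```python
def seconds_until_hard(max_check_attempts: int, retry_interval: float,
                       interval_length: int = 60) -> float:
    """Time from the first failed check to the HARD state.

    Attempt 1 is the first failure, so only (max_check_attempts - 1)
    retries are spaced by retry_interval.
    """
    return (max_check_attempts - 1) * retry_interval * interval_length


# Example: max_check_attempts 10 with a hypothetical retry_interval of 1
print(seconds_until_hard(10, 1))  # 540 seconds, i.e. 9 minutes
```

For comparison, the fixed 10-second retries seen in the host log earlier reach HARD after (10 - 1) x 10 = 90 seconds.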
