Guardian Configuration
Guardian Configuration overview
After completing the Guardian installation process, you may want to customize the software's configuration. You can configure Guardian through the console or through the WHM Plugin/Local Interface Settings. The following options can be altered according to your needs:
Restart interval
Guardian checks whether each service is up every 0.5 seconds. If a service is down, Guardian automatically restarts it. However, the restart itself may take longer than 0.5 seconds, so the service may still appear down on the next check even though the restart is already in progress. To avoid multiple unnecessary restarts, Guardian waits 30 seconds before initiating a second restart of the same service. This default 30-second period can be customized, as described in the configuration sections of this article.
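The throttling behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not Guardian's actual code: the class name `RestartThrottle`, its method `should_restart` and the injectable `clock` parameter are hypothetical and chosen for this example.

```python
import time


class RestartThrottle:
    """Hypothetical helper: allows at most one restart per service per interval."""

    def __init__(self, time_between_restarts=30.0, clock=time.monotonic):
        # 30 seconds mirrors the documented default restart interval.
        self.time_between_restarts = time_between_restarts
        self._clock = clock
        self._last_restart = {}  # service name -> time of the last restart

    def should_restart(self, service, is_up):
        """Return True if a restart should be issued on this check."""
        if is_up:
            return False
        now = self._clock()
        last = self._last_restart.get(service)
        if last is not None and now - last < self.time_between_restarts:
            # A restart was issued recently and may still be in progress.
            return False
        self._last_restart[service] = now
        return True
```

With a 0.5 second check interval, a service that needs a few seconds to come back is reported as down on several consecutive checks, but only the first of those checks triggers a restart.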
Values for Normal, High and Critical load levels
Guardian monitors the server load and, depending on its value, takes measures to prevent overload. It applies a hierarchy of measures (renice, ionice, pause, kill) for the different load levels.
There are three load values, Normal, High and Critical, on which these measures are based. They are set by default to 8, 15 and 20 respectively, but can be customized. The default logic is as follows:
Any Load
Guardian monitors all processes running on the server and kills those that have been running for more than 120 seconds, unless they are owned by protected users. This is the default setting and can be configured.
Normal Load
If the average load on the server is above the Normal Load value:
* PHP processes (child of suexec) that are running more than 15 seconds are killed
* ionice and renice are applied to the currently running rsync and archive processes - those are automatically removed when the load drops below the normal value
High Load
If the average load on the server is above the High Load value:
* PHP processes (child of suexec) that are running more than 15 seconds are killed
* IMAP processes that are running more than 360 seconds are killed
* All rsync and archive processes are PAUSED
Critical Load
If the average load on the server is above the Critical Load value:
* All PHP processes are killed, no matter how long they have been running
* IMAP processes that are running more than 360 seconds are killed
* All archive processes are killed
These settings are Guardian defaults. You can change them at any time. For details, refer to the Guardian kill logic article.
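As a rough illustration of the default logic above, the sketch below maps a process and the current load average to a measure. It is not Guardian's implementation. The function names `load_level` and `decide`, the process kinds and the returned strings are all hypothetical. Two simplifications are assumptions of this sketch and are not stated in the article: rsync processes stay paused under Critical load, and the 120-second rule is applied only to processes that are not PHP, IMAP, rsync or archive processes.

```python
LONG_PROCESS_TIME = 120  # seconds, default for any process
LONG_IMAP_TIME = 360     # seconds, default for IMAP processes
LONG_PHP_TIME = 15       # seconds, default for PHP processes run via suexec
PROTECTED_USERS = frozenset(
    {"root", "mysql", "nscd", "named", "mailnull", "postgres", "cpanel"}
)


def load_level(load_avg, normal=8.0, high=15.0, critical=20.0):
    """Return the highest threshold the load average is above."""
    if load_avg > critical:
        return "critical"
    if load_avg > high:
        return "high"
    if load_avg > normal:
        return "normal"
    return "low"


def decide(kind, owner, runtime, load_avg):
    """Return 'kill', 'pause', 'renice' or 'none' for one process."""
    if owner in PROTECTED_USERS:
        return "none"  # processes of protected users are never touched
    level = load_level(load_avg)
    if kind == "php":
        if level == "critical":
            return "kill"  # all PHP processes, regardless of runtime
        if level in ("normal", "high") and runtime > LONG_PHP_TIME:
            return "kill"
        return "none"
    if kind == "imap":
        if level in ("high", "critical") and runtime > LONG_IMAP_TIME:
            return "kill"
        return "none"
    if kind in ("archive", "rsync"):
        if level == "critical":
            # Assumption: rsync stays paused; the article only says archives are killed.
            return "kill" if kind == "archive" else "pause"
        if level == "high":
            return "pause"
        if level == "normal":
            return "renice"  # stands for both renice and ionice
        return "none"
    # Assumption: the any-load rule is applied only to the remaining kinds.
    return "kill" if runtime > LONG_PROCESS_TIME else "none"
```

For example, `decide("php", "alice", 20, 10.0)` returns `"kill"` because the load is above the Normal value and the process has run longer than 15 seconds, while the same process owned by `root` is left alone.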
Monitored services
By default, Guardian monitors all the services listed below, unless you configure it to exclude any of them:
The most common services run on a hosting server using cPanel:
* cpanel - The cPanel for the server
* directadmin - The DirectAdmin control panel
* cpanellogd - The cPanel log daemon
* crond - The Cron Jobs daemon
* dovecot - The Dovecot secure IMAP server
* courier - The Courier IMAP server
* exim - The exim mail server
* ftp - The pureftpd daemon
* httpd - The web server: Apache, LiteSpeed, nginx
* ionotify - The daemon used to catch received callbacks about IO activity
* klogd - The Kernel Logging Daemon
* mysql - The MySQL server
* mailquotad - The Mailquota Daemon
* named - The Domain Name System (DNS) server
* nscd - The Name Service Cache Daemon
* postgres - The PostgreSQL server
* pop3/imap - The POP3 and IMAP mail services
* plesk - The Plesk control panel
* proftpd - The proftpd daemon
* qmail - The qmail SMTP service
* smtp - The SMTP (Simple Mail Transfer Protocol) service
* syslogd - The Linux system logging utilities
* lfd - ConfigServer Firewall (CSF) Login Failure Daemon
* clamav - Antivirus engine for detecting Trojans, viruses, malware and other malicious threats
* cdp-agent - R1Soft backup agent
* cdp-server - R1Soft backup server
1H Software services:
* lifesigns - a daemon that is part of the 1H Guardian
* cpustatsd - a statistics daemon that is part of 1H Hive
* hawk - The 1H Hawk software
* licd - The 1H Licensing Daemon
Important: Other services can be added to the list by 1H per customer request.
Important: If any of these services is not detected on your hosting server, Guardian automatically excludes it from the list.
Protected users
By default, several system users are protected, which means that processes run by those users remain untouched regardless of the current server load. You can add users to and remove users from this protected list, but it is highly recommended to add only system users. The default list of protected users includes:
protected users: root, mysql, nscd, named, mailnull, postgres, cpanel
Configuration through console
To configure Guardian through the console, manually edit this file:
/usr/local/1h/etc/guardian.conf
It includes the following:
pidfile=/usr/local/1h/var/run/guardian.pid
status_file=/tmp/guardian.status
logfile=/usr/local/1h/var/log/guardian.log
kills_log=/usr/local/1h/var/log/guardian-kills.log
error_log=/usr/local/1h/var/log/guardian-error.log
restart_dir=/usr/local/1h/lib/guardian/services
init_dir=/usr/local/1h/lib/guardian/init
stop_dir=/usr/local/1h/lib/guardian/svcstop
protected_users=root,mysql,nscd,named,mailnull,postgres,cpanel
archivers_re=rar|tar |gzip|bzip2|zip|zcat
check_services=apache,mysql,exim,dovecot,ftp,postgres,crond,cpanel,named,zendaemon,hawk,nscd,cpustatsd,mailquotad,cpanellogd,lifesigns
load_vars=20,15,8
debug=0
time_between_restarts=30
mysql_idle_check=1
long_process_time=120
long_imap_time=360
long_php_time=15
pause_arch=1
normal_kill_php=1
high_kill_arch=0
high_kill_imap=1
high_kill_smtp=1
critical_kill_ftp=1
critical_kill_php=1
critical_kill_arch=1
critical_kill_mail=1
long_procs_exclude=0
exclude_long_re=
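If you want to inspect this file from a script, a minimal sketch of reading its key=value lines could look like the following. The function names `parse_guardian_conf` and `load_thresholds` are hypothetical and not part of Guardian. Skipping lines without an `=` sign is an assumption of this sketch, since the article does not describe how Guardian itself treats such lines.

```python
def parse_guardian_conf(text):
    """Parse key=value lines into a dict, keeping the values verbatim."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # assumption: ignore lines without "="
        # Values are not stripped, because spaces can be significant
        # (for example inside archivers_re).
        settings[key.strip()] = value
    return settings


def load_thresholds(settings):
    """Return (critical, high, normal) from load_vars, in the file's order."""
    critical, high, normal = (float(v) for v in settings["load_vars"].split(","))
    return critical, high, normal
```

Reading the file could then be as simple as `parse_guardian_conf(open("/usr/local/1h/etc/guardian.conf").read())`. Note that `load_vars` lists the values from critical down to normal, so `load_vars=20,15,8` yields `(20.0, 15.0, 8.0)`.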
- The values of variables that contain a path must not end with a trailing /.
- The values of protected_users and check_services must not contain any spaces between the commas, and the lines must not have trailing spaces. The users added here are protected and their processes are not touched during high and critical loads.
- archivers_re is a regular expression that is copied from the line directly into Guardian. With this RE you can easily add more commands to be paused and/or reniced and ioniced.
- load_vars must contain 3 numbers, either integers or floats, separated by commas. The line must not have trailing spaces. The numbers define the critical, high and normal load values, in that order.
- debug has only two values: 0 (disabled) and 1 (enabled). This variable enables or disables the debugging information.
- time_between_restarts must be an integer. It controls how long Guardian waits before the next restart attempt after 3 failed attempts. This prevents overloading the server with constant service restarts.
- long_process_time must be an integer. The value is a time in seconds. If a process runs for more than the set amount of time, it is killed and logged to /usr/local/1h/var/log/guardian-kills.log. Note that this excludes PHP processes run by SuExec as well as IMAP processes. The limits for those two are defined as follows:
- long_imap_time must be an integer. The value is a time in seconds. If an IMAP process runs for more than the set amount of time, it is killed and logged to /usr/local/1h/var/log/guardian-kills.log. This rule applies only if the average server load is above the Normal value.
- long_php_time must be an integer. The value is a time in seconds. If a PHP process run by SuExec has not finished within the specified period and the server load is above the Normal load specified in the Guardian configuration, the process is killed and logged to /usr/local/1h/var/log/guardian-kills.log.
- long_procs_exclude must be 1 or 0. It tells Guardian whether there are any protected processes that should not be killed upon reaching the long_process_time limit.
- exclude_long_re is a regular expression that is copied from the line directly into Guardian. With this RE you specify which processes should not be killed by Guardian, no matter how long they have been running.
Note: If you change the location of the status file, be sure to change it in lifesigns.conf as well. Failure to do so will cause LifeSigns to consider Guardian as down.
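The formatting rules above are easy to break when editing by hand, so a small pre-flight check can help. The sketch below is hypothetical and not a 1H tool: the function name `validate`, the grouping of keys into tuples and the error messages are assumptions made for this example. It takes the settings as a dict of strings and checks the regular expressions with Python's `re` module, which may not accept exactly the same syntax as Guardian's own regex engine.

```python
import re

# Key groupings inferred from the sample configuration (an assumption).
PATH_KEYS = ("pidfile", "status_file", "logfile", "kills_log", "error_log",
             "restart_dir", "init_dir", "stop_dir")
LIST_KEYS = ("protected_users", "check_services")
INT_KEYS = ("time_between_restarts", "long_process_time",
            "long_imap_time", "long_php_time")
FLAG_KEYS = ("debug", "long_procs_exclude")
REGEX_KEYS = ("archivers_re", "exclude_long_re")


def validate(settings):
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []
    for key in PATH_KEYS:
        if settings.get(key, "").endswith("/"):
            problems.append(f"{key}: must not end with /")
    for key in LIST_KEYS:
        if re.search(r"\s", settings.get(key, "")):
            problems.append(f"{key}: must not contain spaces")
    load_vars = settings.get("load_vars")
    if load_vars is not None:
        parts = load_vars.split(",")
        try:
            numbers = [float(p) for p in parts]
        except ValueError:
            numbers = []
        if len(numbers) != 3 or load_vars != load_vars.strip():
            problems.append("load_vars: expected 3 numbers and no trailing spaces")
    for key in FLAG_KEYS:
        if key in settings and settings[key] not in ("0", "1"):
            problems.append(f"{key}: must be 0 or 1")
    for key in INT_KEYS:
        if key in settings and not settings[key].isdigit():
            problems.append(f"{key}: must be an integer")
    for key in REGEX_KEYS:
        try:
            re.compile(settings.get(key, ""))
        except re.error as exc:
            problems.append(f"{key}: invalid regular expression ({exc})")
    return problems
```

Running `validate` on the default values shown earlier returns an empty list, while a value such as `protected_users=root, mysql` is reported because of the space after the comma.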
Configuration through Local Interface Settings/WHM Plugin
The Guardian configuration can be adjusted via both the 1H WHM Plugin and the Settings section of the 1H Local Interface.
Both interfaces provide the same options.
In this panel you can modify the restart interval and the Normal, High and Critical load values, enable or disable monitoring for the predefined services in the configuration, and add or remove protected users.
Guardian Articles
Here you can find a list of relevant articles regarding the 1H Guardian Software.
- Guardian Introduction
- Guardian Requirements
- Guardian Installation
- Guardian Post Installation
- Guardian Configuration
- Guardian System Daemon
- Email Notifications
- Guardian kill logic
- Real Time Status
- Downtime Stats
- Availability Reports
- Enable or disable service monitoring
- Server Groups management and How to add a new Server to an existing Server Group