monitor.git
12 years ago: Setting tag monitor-3.1-3
Stephen Soltesz [Fri, 27 May 2011 23:58:51 +0000 (19:58 -0400)]
Setting tag monitor-3.1-3
fixing syntax errors

12 years ago: Fix syntax errors in python files.
Stephen Soltesz [Fri, 27 May 2011 23:51:07 +0000 (19:51 -0400)]
Fix syntax errors in python files.

12 years ago: Setting tag monitor-3.1-2
Stephen Soltesz [Fri, 27 May 2011 23:29:36 +0000 (19:29 -0400)]
Setting tag monitor-3.1-2
Add better requirements list, work with TurboGears packaged by fedora,
Remove some zabbix files
Add a controllers_local.py for custom extensions

12 years ago: Added commands for interacting with google's spreadsheets:
Stephen Soltesz [Fri, 27 May 2011 22:40:15 +0000 (22:40 +0000)]
Added commands for interacting with google's spreadsheets:
    statistics/add-google-record.py
    statistics/get-records.py
    Added commands for parsing RT database and generating figure.
    Added command for dumping monitor db to CSV (dump_db_m3_raw.py)
    minor fixes

12 years ago: Add a command-line tool for creating google spreadsheets, and appending values
Stephen Soltesz [Sun, 22 May 2011 20:34:51 +0000 (20:34 +0000)]
Add a command-line tool for creating google spreadsheets, and appending values

12 years ago: Several updates to policy, repair, and automate script.
Stephen Soltesz [Sun, 22 May 2011 01:37:13 +0000 (01:37 +0000)]
Several updates to policy, repair, and automate script.
Add a check for the number of sshkeys added to the current agent
Separate police.py and repair.py for failboot nodes

12 years ago: Add dependency on myplc
root [Fri, 13 May 2011 19:20:52 +0000 (19:20 +0000)]
Add dependency on myplc

12 years ago: Several fixes to configuration, dependencies, and support tools.
root [Fri, 13 May 2011 19:00:19 +0000 (19:00 +0000)]
Several fixes to configuration, dependencies, and support tools.
   Added missing dependencies to .spec
   Added configurable dbuser, dbname, and dbhost  >= myplc-5.0-18
   Added external commands to fetch.py
   Added googlevis JavaScript template

12 years ago: add default email
Stephen Soltesz [Sat, 7 May 2011 05:13:54 +0000 (05:13 +0000)]
add default email

12 years ago: Merge branch 'master' of ssh://soltesz@git.planet-lab.org/git/monitor
root [Sat, 7 May 2011 00:21:21 +0000 (00:21 +0000)]
Merge branch 'master' of ssh://soltesz@git.planet-lab.org/git/monitor

12 years ago: Merge branch 'master' of git://git.planet-lab.org/monitor
Stephen Soltesz [Fri, 6 May 2011 20:51:53 +0000 (20:51 +0000)]
Merge branch 'master' of git://git.planet-lab.org/monitor

12 years ago: Merge branch 'master' of git://git.planet-lab.org/monitor
root [Fri, 6 May 2011 20:51:53 +0000 (20:51 +0000)]
Merge branch 'master' of git://git.planet-lab.org/monitor

12 years ago: Setting tag monitor-3.1-1
Stephen Soltesz [Fri, 6 May 2011 20:42:43 +0000 (16:42 -0400)]
Setting tag monitor-3.1-1
last tag before some more major changes

12 years ago: First of a series of significant changes to how monitor is organized.
root [Fri, 6 May 2011 20:33:44 +0000 (20:33 +0000)]
First of a series of significant changes to how monitor is organized.
   mailer.py -- uses CC rather than AdminCC to filter messages that are copied
            to our support list.  Requires additional scrips in RT.
    controllers_local.py -- supports local extensions to the web interface.

13 years ago: Many small updates and fixes:
Stephen Soltesz [Wed, 13 Apr 2011 19:31:43 +0000 (19:31 +0000)]
Many small updates and fixes:
better logging in plc.py

13 years ago: check agg['plc_node_stats']. this was causing the monitor deployment on PLE to fail...
Barış Metin [Wed, 10 Nov 2010 15:54:09 +0000 (15:54 +0000)]
check agg['plc_node_stats']. this was causing the monitor deployment on PLE to fail every now and again.

13 years ago: prevent lost data in case upload fails
Stephen Soltesz [Thu, 7 Oct 2010 18:24:24 +0000 (18:24 +0000)]
prevent lost data in case upload fails

13 years ago: (no commit message)
Stephen Soltesz [Tue, 28 Sep 2010 23:50:09 +0000 (23:50 +0000)]

13 years ago: run daily, with collection log script
Stephen Soltesz [Tue, 28 Sep 2010 23:48:25 +0000 (23:48 +0000)]
run daily, with collection log script

13 years ago: simple incremental collection script, and environment variable for ssh
Stephen Soltesz [Tue, 28 Sep 2010 23:48:02 +0000 (23:48 +0000)]
simple incremental collection script, and environment variable for ssh

13 years ago: log all bash-command line commands and upload them centrally
Stephen Soltesz [Tue, 28 Sep 2010 18:13:12 +0000 (18:13 +0000)]
log all bash-command line commands and upload them centrally

13 years ago: add a directory for running nagios scale/performance tests
Stephen Soltesz [Wed, 15 Sep 2010 20:27:12 +0000 (20:27 +0000)]
add a directory for running nagios scale/performance tests
add 'testing' support to plc_hosts_to_nagios and plc_users_to_nagios
multiple pattern checks in checkrt.py

13 years ago: rename db data collection
Stephen Soltesz [Tue, 27 Jul 2010 20:53:16 +0000 (20:53 +0000)]
rename db data collection

13 years ago: new files for dumping and parsing logs
Stephen Soltesz [Mon, 26 Jul 2010 16:49:17 +0000 (16:49 +0000)]
new files for dumping and parsing logs

13 years ago: add support for monitoring the plc servers and api
Stephen Soltesz [Tue, 20 Jul 2010 18:05:05 +0000 (18:05 +0000)]
add support for monitoring the plc servers and api
print more descriptive status messages from checkpcu
enable notifications for SiteOnline status for sites

13 years ago: add areSlicesEnabled and isSiteEnabled convenience checks
Stephen Soltesz [Tue, 29 Jun 2010 22:23:14 +0000 (22:23 +0000)]
add areSlicesEnabled and isSiteEnabled convenience checks

13 years ago: add rt3 dependency,
Stephen Soltesz [Tue, 29 Jun 2010 22:10:54 +0000 (22:10 +0000)]
add rt3 dependency,

13 years ago: moved Time() class to generic.py
Stephen Soltesz [Tue, 29 Jun 2010 22:09:48 +0000 (22:09 +0000)]
moved Time() class to generic.py

13 years ago: added rtcheck & escalation commands to plc_hosts_*
Stephen Soltesz [Tue, 29 Jun 2010 22:04:34 +0000 (22:04 +0000)]
added rtcheck & escalation commands to plc_hosts_*
changed hostescalation to serviceescalation for site cluster, to make it
    depend on the rtcheck status.  Now if there are open tickets, the
    escalation will stop
added new code to actions/escalation.py to mirror actual behavior.

13 years ago: add checkrt to indicate when a site has new or open tickets
Stephen Soltesz [Tue, 29 Jun 2010 22:01:36 +0000 (22:01 +0000)]
add checkrt to indicate when a site has new or open tickets
add checkescalation to infer the penalty applied to a site based on the state
    of its site and slices
add extra RT configuration fields to auth.py

13 years ago: add support for the myops object tags. Applies to sites, slices, and persons.
Stephen Soltesz [Tue, 29 Jun 2010 20:54:48 +0000 (20:54 +0000)]
add support for the myops object tags.  Applies to sites, slices, and persons.
    Sites with 'exempt_site_until' are not disabled
    Persons with 'exempt_site_until' are not emailed
    Slices with 'exempt_slice_until' are not suspended

    This feature will replace the 'blacklist' command line tool.

    Currently, there is no GUI support for Person or Site Tags.

13 years ago: add a warning when given loginbase returns nothing
Stephen Soltesz [Mon, 28 Jun 2010 15:47:47 +0000 (15:47 +0000)]
add a warning when given loginbase returns nothing
add two time functions to convert strings to timestamp or datetime objects

13 years ago: add real checks for RebootNodeWithPCU. Report errors returned by API
Stephen Soltesz [Fri, 25 Jun 2010 21:17:43 +0000 (21:17 +0000)]
add real checks for RebootNodeWithPCU.  Report errors returned by API
add notes_url to pcu service

13 years ago: add comon_analysis graph
Stephen Soltesz [Fri, 25 Jun 2010 15:40:50 +0000 (15:40 +0000)]
add comon_analysis graph

13 years ago: (no commit message)
Stephen Soltesz [Mon, 21 Jun 2010 20:59:37 +0000 (20:59 +0000)]

13 years ago: a simple auth file for accessing remote plc
Stephen Soltesz [Mon, 21 Jun 2010 20:37:41 +0000 (20:37 +0000)]
a simple auth file for accessing remote plc

13 years ago: simplify plc_users_to_nagios imports as with plc_hosts...
Stephen Soltesz [Mon, 21 Jun 2010 20:30:59 +0000 (20:30 +0000)]
simplify plc_users_to_nagios imports as with plc_hosts...

13 years ago: typo
Stephen Soltesz [Mon, 21 Jun 2010 20:27:16 +0000 (20:27 +0000)]
typo

13 years ago: make plc.py simpler to reduce the dependencies for plc_hosts_to_nagios.
Stephen Soltesz [Mon, 21 Jun 2010 20:26:05 +0000 (20:26 +0000)]
make plc.py simpler to reduce the dependencies for plc_hosts_to_nagios.
add cron script to regenerate config files daily.
add dependencies and setup to monitor-nagios rpm
improve monitor-nagios.init script (I still think it may need to only be run once).

13 years ago: add an escalation for a bad pcu status.
Stephen Soltesz [Mon, 21 Jun 2010 18:13:46 +0000 (18:13 +0000)]
add an escalation for a bad pcu status.
every observed service has an associated action

13 years ago: add check to see if mysqld is running in init script
Stephen Soltesz [Fri, 18 Jun 2010 23:05:43 +0000 (23:05 +0000)]
add check to see if mysqld is running in init script

13 years ago: create a skeleton init script for monitor-nagios. not sure if this really
Stephen Soltesz [Fri, 18 Jun 2010 22:57:02 +0000 (22:57 +0000)]
create a skeleton init script for monitor-nagios.  not sure if this really
needs to run every time, since setup only needs to happen once.

13 years ago: typo
Stephen Soltesz [Fri, 18 Jun 2010 22:11:35 +0000 (22:11 +0000)]
typo

13 years ago: attempting to separate server and nagios packages explicitly
Stephen Soltesz [Fri, 18 Jun 2010 22:09:43 +0000 (22:09 +0000)]
attempting to separate server and nagios packages explicitly

13 years ago: update nagios scripts with new paths
Stephen Soltesz [Fri, 18 Jun 2010 21:55:13 +0000 (21:55 +0000)]
update nagios scripts with new paths
add monitor-nagios package to spec file
remove pcucontrol from setup.py

13 years ago: move files into function-specific directories
Stephen Soltesz [Fri, 18 Jun 2010 21:44:49 +0000 (21:44 +0000)]
move files into function-specific directories

13 years ago: move nagios files to nagios dir
Stephen Soltesz [Fri, 18 Jun 2010 21:43:17 +0000 (21:43 +0000)]
move nagios files to nagios dir

13 years ago: add a nagios dir to the monitor tree
Stephen Soltesz [Fri, 18 Jun 2010 21:40:16 +0000 (21:40 +0000)]
add a nagios dir to the monitor tree

13 years ago: add a module for generating nagios configuration objects from python objects
Stephen Soltesz [Fri, 18 Jun 2010 21:24:39 +0000 (21:24 +0000)]
add a module for generating nagios configuration objects from python objects
improved generation for plc sites/hosts
  separated site escalation from notification
  host reboot stubs
  host pcu service check stubs

13 years ago: move some routines from plccache to generic to avoid pulling in db routines
Stephen Soltesz [Fri, 18 Jun 2010 21:21:08 +0000 (21:21 +0000)]
move some routines from plccache to generic to avoid pulling in db routines

13 years ago: add external commands as stubs for the nagios plugins
Stephen Soltesz [Fri, 18 Jun 2010 21:19:44 +0000 (21:19 +0000)]
add external commands as stubs for the nagios plugins

13 years ago: convert some sites and users into a nagios configuration
Stephen Soltesz [Fri, 4 Jun 2010 23:16:01 +0000 (23:16 +0000)]
convert some sites and users into a nagios configuration
added hostescalation, automated reboot, custom notify commands
needs more testing

13 years ago: add logging to reboot.py
Stephen Soltesz [Fri, 4 Jun 2010 21:56:10 +0000 (21:56 +0000)]
add logging to reboot.py

13 years ago: rename and split plc2nagios file
Stephen Soltesz [Thu, 3 Jun 2010 18:31:01 +0000 (18:31 +0000)]
rename and split plc2nagios file

13 years ago: add some service escalation templates
Stephen Soltesz [Thu, 3 Jun 2010 17:35:30 +0000 (17:35 +0000)]
add some service escalation templates

13 years ago: add generic routines for manipulating lists from PLCAPI
Stephen Soltesz [Tue, 25 May 2010 21:15:27 +0000 (21:15 +0000)]
add generic routines for manipulating lists from PLCAPI

13 years ago: fcdistro -> distroname
Barış Metin [Fri, 21 May 2010 08:39:49 +0000 (08:39 +0000)]
fcdistro -> distroname

13 years ago: Branch 3.0 for module Monitor created (as new trunk) from tag Monitor-3.0-35
Stephen Soltesz [Thu, 20 May 2010 19:26:57 +0000 (19:26 +0000)]
Branch 3.0 for module Monitor created (as new trunk) from tag Monitor-3.0-35

13 years ago: Setting tag Monitor-3.0-35
Stephen Soltesz [Thu, 20 May 2010 19:25:55 +0000 (19:25 +0000)]
Setting tag Monitor-3.0-35
Add CSV link on Advanced query
Preparing to branch

13 years ago: add a CSV format link to the advanced query page
Stephen Soltesz [Thu, 20 May 2010 17:46:14 +0000 (17:46 +0000)]
add a CSV format link to the advanced query page

13 years ago: Setting tag Monitor-3.0-34
Barış Metin [Wed, 12 May 2010 15:00:59 +0000 (15:00 +0000)]
Setting tag Monitor-3.0-34
* copy selections to clipboard on Advanced Query page
* RPM Pattern as regexp
* scan ipmi port

13 years ago: match rpm pattern with regexp
Barış Metin [Tue, 11 May 2010 20:02:10 +0000 (20:02 +0000)]
match rpm pattern with regexp

13 years ago: move "copy to clipboard" button to table header
Barış Metin [Tue, 11 May 2010 19:37:27 +0000 (19:37 +0000)]
move "copy to clipboard" button to table header

13 years ago: scan ipmi port too
Barış Metin [Mon, 10 May 2010 17:33:44 +0000 (17:33 +0000)]
scan ipmi port too

13 years ago: in Advanced Query, select rows and copy values to clipboard in csv format.
Barış Metin [Thu, 6 May 2010 10:26:36 +0000 (10:26 +0000)]
in Advanced Query, select rows and copy values to clipboard in csv format.

13 years ago: set default pagesize on all views to 999
Thierry Parmentelat [Wed, 5 May 2010 07:55:43 +0000 (07:55 +0000)]
set default pagesize on all views to 999

13 years ago: Setting tag Monitor-3.0-33
Barış Metin [Tue, 27 Apr 2010 10:46:24 +0000 (10:46 +0000)]
Setting tag Monitor-3.0-33
handle hostname changes

13 years ago: handle hostname changes
Barış Metin [Mon, 26 Apr 2010 08:59:53 +0000 (08:59 +0000)]
handle hostname changes

14 years ago: Setting tag Monitor-3.0-32
Thierry Parmentelat [Tue, 20 Apr 2010 08:27:27 +0000 (08:27 +0000)]
Setting tag Monitor-3.0-32
from this version, suitable for 5.0
requires bootcd with the new 5.0 naming style 3-part nodefamily

14 years ago: for 5.0, requires bootcd with new 3-part nodefamily
Thierry Parmentelat [Tue, 20 Apr 2010 08:25:12 +0000 (08:25 +0000)]
for 5.0, requires bootcd with new 3-part nodefamily

14 years ago: Setting tag Monitor-3.0-31
Stephen Soltesz [Mon, 12 Apr 2010 14:50:50 +0000 (14:50 +0000)]
Setting tag Monitor-3.0-31
added fix for node delete/add causing conflicts in MyOps db.
added statistics scripts

14 years ago: fixes bug in myops for a node with different node_id. This occurs when
Stephen Soltesz [Thu, 8 Apr 2010 19:34:35 +0000 (19:34 +0000)]
fixes bug in myops for a node with different node_id.  This occurs when
    deleting and then adding a node with the same name in plc.

14 years ago: fix path
Barış Metin [Tue, 6 Apr 2010 13:15:50 +0000 (13:15 +0000)]
fix path

14 years ago: add myops_restoration
Stephen Soltesz [Thu, 25 Mar 2010 19:51:00 +0000 (19:51 +0000)]
add myops_restoration

14 years ago: fixed typo on logger name for exceptions.
Stephen Soltesz [Sat, 13 Mar 2010 20:00:27 +0000 (20:00 +0000)]
fixed typo on logger name for exceptions.

14 years ago: add new scripts
Stephen Soltesz [Tue, 2 Mar 2010 19:30:13 +0000 (19:30 +0000)]
add new scripts

14 years ago: ops... fix path.
Barış Metin [Tue, 16 Feb 2010 14:28:43 +0000 (14:28 +0000)]
ops... fix path.

14 years ago: R routines for printing some statistics
Stephen Soltesz [Thu, 11 Feb 2010 20:14:07 +0000 (20:14 +0000)]
R routines for printing some statistics

14 years ago: (no commit message)
Stephen Soltesz [Thu, 11 Feb 2010 20:12:52 +0000 (20:12 +0000)]

14 years ago: test
Stephen Soltesz [Thu, 11 Feb 2010 20:08:28 +0000 (20:08 +0000)]
test

14 years ago: add more info to sliceavg
Stephen Soltesz [Thu, 21 Jan 2010 20:22:14 +0000 (20:22 +0000)]
add more info to sliceavg
parserpms does a better job of sorting and converting entries with multiple versions

14 years ago: add a conversion class for datetime and time stamps, since I need this all the time.
Stephen Soltesz [Thu, 21 Jan 2010 20:15:57 +0000 (20:15 +0000)]
add a conversion class for datetime and time stamps, since I need this all the time.
'Created' value in mailer.py is causing problems for PLE
move print statements to stderr in plccache.py and comon.py
add an 'escapeName' routine in dbpickle to allow filepaths in output names
fix bug in scanapi that missed debug node if there was no bootmanager.log
add checks for yum.config files

14 years ago: replace some print statements to stderr
Stephen Soltesz [Thu, 21 Jan 2010 19:47:29 +0000 (19:47 +0000)]
replace some print statements to stderr
add HistorySiteRecord to checksync

14 years ago: Setting tag Monitor-3.0-30
Barış Metin [Thu, 21 Jan 2010 10:50:38 +0000 (10:50 +0000)]
Setting tag Monitor-3.0-30
* fix paths for automate script

14 years ago: fix paths
Barış Metin [Wed, 20 Jan 2010 14:40:11 +0000 (14:40 +0000)]
fix paths

14 years ago: Setting tag Monitor-3.0-29
Barış Metin [Tue, 22 Dec 2009 17:12:17 +0000 (17:12 +0000)]
Setting tag Monitor-3.0-29
- separate pcucontrol as an svn module
- restore easy_install back into post install stage of server-deps
- template improvements for web interface

14 years ago: move easy_install calls back to post install.
Barış Metin [Tue, 22 Dec 2009 15:54:28 +0000 (15:54 +0000)]
move easy_install calls back to post install.
running easy_install didn't work as I thought it would; every now and
again it fails and breaks our build.

14 years ago: require pcucontrol.
Barış Metin [Tue, 22 Dec 2009 12:14:39 +0000 (12:14 +0000)]
require pcucontrol.

14 years ago: remove pcucontrol from Monitor.spec
Barış Metin [Tue, 22 Dec 2009 12:03:57 +0000 (12:03 +0000)]
remove pcucontrol from Monitor.spec

14 years ago: move pcucontrol package into pcucontrol module.
Barış Metin [Tue, 22 Dec 2009 12:02:27 +0000 (12:02 +0000)]
move pcucontrol package into pcucontrol module.

14 years ago: move nodelist.kid headers into node_template.kid to remove redundancy.
Stephen Soltesz [Fri, 18 Dec 2009 21:13:30 +0000 (21:13 +0000)]
move nodelist.kid headers into node_template.kid to remove redundancy.
comment-out the boot/down summary at the top of the nodelist.kid page;  ...

14 years ago: work around the lack of libm.a on f12
Thierry Parmentelat [Fri, 18 Dec 2009 18:17:36 +0000 (18:17 +0000)]
work around the lack of libm.a on f12

14 years ago: merged pcucontrol into monitor-server. although monitor-pcucontrol may
Barış Metin [Fri, 18 Dec 2009 16:08:30 +0000 (16:08 +0000)]
merged pcucontrol into monitor-server. although monitor-pcucontrol may
be utilized as a separate package it makes managing the %files more
complicated for the moment. if we need to generalize it at some
point, we can manage it in a separate rpm (and/or svn module?)

14 years ago: ok, don't break anything on f8 too :)
Barış Metin [Thu, 17 Dec 2009 21:12:39 +0000 (21:12 +0000)]
ok, don't break anything on f8 too :)

14 years ago: fix f12 build
Barış Metin [Thu, 17 Dec 2009 21:02:10 +0000 (21:02 +0000)]
fix f12 build

14 years ago: Setting tag Monitor-3.0-28
Barış Metin [Thu, 17 Dec 2009 16:27:38 +0000 (16:27 +0000)]
Setting tag Monitor-3.0-28
do not need buildrequires. a new tag to fix centos builds

14 years ago: comment out buildrequires
Barış Metin [Thu, 17 Dec 2009 14:40:38 +0000 (14:40 +0000)]
comment out buildrequires

14 years ago: Setting tag Monitor-3.0-27
Barış Metin [Thu, 17 Dec 2009 11:52:31 +0000 (11:52 +0000)]
Setting tag Monitor-3.0-27
fix rpm build issues

14 years ago: setuptools don't really care about --build-directory.
Barış Metin [Thu, 17 Dec 2009 11:42:59 +0000 (11:42 +0000)]
setuptools don't really care about --build-directory.
It's just easier to export TMPDIR. Thanks to Thierry.

14 years ago: add *egg/ directories to the package. easy_install can bring in
Barış Metin [Thu, 17 Dec 2009 09:59:00 +0000 (09:59 +0000)]
add *egg/ directories to the package. easy_install can bring in
additional dependencies (that's the case for f12 build).