monitor.git
15 years agoAM nagios/plc2nagios.py
Stephen Soltesz [Thu, 31 Jul 2008 20:40:22 +0000 (20:40 +0000)]
AM   nagios/plc2nagios.py
a script I wrote a while ago to translate the plc db into a nagios
configuration file.  might be helpful for someone else trying a better
approach with nagios

M    syncplcdb.py
fixed a bug to avoid an inconsistency in the PLCDB wrt federation
migration.

AM   kill.cmd.sh
continue running even if a command fails.

_M   bootcd
renamed, and added to the repository. also added the ignore set property.

M    getconf.py
renamed to look in bootcd dir.

A    docs
AM   docs/ipalprotocol.pdf
A    docs/ilo2-auto-export-buffer-setup.pdf
documents that might be helpful for others maintaining the PCUs

AM   rtinfo.py
sketch of code to read through a rt db cache and show useful info like
'last updated by email', which is not visible through the gui.

M    reboot.py
updated to include custom code for the new PCU in plab1-itec.uni-klu.ac.at

_M   ssh
A    nodediff.py
template for comparing the nodes up or down between two time periods.

15 years ago(no commit message)
Stephen Soltesz [Thu, 31 Jul 2008 20:26:41 +0000 (20:26 +0000)]

15 years agoAdded the AMT sample app from the IntelAMTSDK. It pulls in all cpp and
Stephen Soltesz [Wed, 30 Jul 2008 22:01:08 +0000 (22:01 +0000)]
Added the AMT sample app from the IntelAMTSDK.  It pulls in all cpp and
include files necessary to compile it.

15 years agoI will try to get the rpm to work with lower-case name
Stephen Soltesz [Wed, 30 Jul 2008 20:55:58 +0000 (20:55 +0000)]
I will try to get the rpm to work with lower-case name

15 years agoMassive commit of all changes, and added files for the Monitor-server package.
Stephen Soltesz [Wed, 30 Jul 2008 20:55:23 +0000 (20:55 +0000)]
Massive commit of all changes, and added files for the Monitor-server package.

15 years agoAdding third-party module used for Monitor's web pages.
Stephen Soltesz [Wed, 30 Jul 2008 20:05:07 +0000 (20:05 +0000)]
Adding third-party module used for Monitor's web pages.

15 years agoadded for the first time
Stephen Soltesz [Wed, 30 Jul 2008 20:02:24 +0000 (20:02 +0000)]
added for the first time

15 years agoadd spec files for the server-side rpm package of monitor
Stephen Soltesz [Wed, 30 Jul 2008 19:36:04 +0000 (19:36 +0000)]
add spec files for the server-side rpm package of monitor

15 years agoThe most current version of everything.
Stephen Soltesz [Mon, 21 Jul 2008 16:30:31 +0000 (16:30 +0000)]
The most current version of everything.

15 years agoTagging module Monitor - Monitor-1.0-5
Stephen Soltesz [Fri, 18 Jul 2008 18:00:30 +0000 (18:00 +0000)]
Tagging module Monitor - Monitor-1.0-5
Incremental improvements

15 years agoCompletes support for the ePowerSwitch series.
Stephen Soltesz [Thu, 10 Jul 2008 18:16:07 +0000 (18:16 +0000)]
Completes support for the ePowerSwitch series.

Does not support the 8XM, from site 'fem'.

15 years agoIncludes support for IntelAMT as well as better support for existing IPAL over
Stephen Soltesz [Thu, 3 Jul 2008 22:53:24 +0000 (22:53 +0000)]
Includes support for IntelAMT as well as better support for existing IPAL over
a proprietary interface at port 9100.

15 years agoTake out pcu handling in this file, since it is handled separately by
Stephen Soltesz [Mon, 30 Jun 2008 20:44:30 +0000 (20:44 +0000)]
Take out pcu handling in this file, since it is handled separately by
grouprins.py now

15 years agoScript designed to help transfer the 'power-users' from the public plc into a
Stephen Soltesz [Tue, 24 Jun 2008 21:05:40 +0000 (21:05 +0000)]
Script designed to help transfer the 'power-users' from the public plc into a
private plc, complete with all their sites, slices, and pre-registered ssh
keys.  The goal was to make their experience of the test-plc equal to the
public-plc, such that all they needed to do was log into the node without
visiting the test-plc's interface.

15 years agoTool to find stray node network entries in the PLC db. There were currently
Stephen Soltesz [Tue, 24 Jun 2008 19:24:24 +0000 (19:24 +0000)]
Tool to find stray node network entries in the PLC db.  There are currently
289 nn entries that are not associated with a valid node.  This seems like an
error to me.

15 years agotext sketch of the sqlobject model to be designed for monitor
Stephen Soltesz [Mon, 23 Jun 2008 18:22:39 +0000 (18:22 +0000)]
text sketch of the sqlobject  model to be designed for monitor

15 years ago(no commit message)
Stephen Soltesz [Mon, 23 Jun 2008 17:20:55 +0000 (17:20 +0000)]

15 years agoInclude other options for the iLO, since 'reset' doesn't work when the machine
Stephen Soltesz [Mon, 23 Jun 2008 17:05:42 +0000 (17:05 +0000)]
Include other options for the iLO, since 'reset' doesn't work when the machine
is powered off.  TODO: add the check to power the host On if it is off.

15 years agoa template for a tool that will spit out the configuration for a node to see
Stephen Soltesz [Mon, 23 Jun 2008 17:04:48 +0000 (17:04 +0000)]
a template for a tool that will spit out the configuration for a node to see
if it has any errors.

15 years agocommit of tools I use, but are not documented or guaranteed to work for anyone
Stephen Soltesz [Mon, 23 Jun 2008 17:04:08 +0000 (17:04 +0000)]
commit of tools I use, but are not documented or guaranteed to work for anyone
else.

15 years agosimple script to collect the info Scott requested when a site leaves PL.
Stephen Soltesz [Mon, 23 Jun 2008 17:00:06 +0000 (17:00 +0000)]
simple script to collect the info Scott requested when a site leaves PL.

15 years agoMassive commit. Just put all local changes into svn.
Stephen Soltesz [Mon, 23 Jun 2008 16:57:53 +0000 (16:57 +0000)]
Massive commit.  Just put all local changes into svn.

15 years agoadd timeout
Stephen Soltesz [Mon, 16 Jun 2008 18:48:34 +0000 (18:48 +0000)]
add timeout

15 years agoFor dumping the diagnose_out file.
Stephen Soltesz [Tue, 20 May 2008 19:43:20 +0000 (19:43 +0000)]
For dumping the diagnose_out file.

15 years agoallow RT module to be removed.
Stephen Soltesz [Tue, 20 May 2008 19:42:15 +0000 (19:42 +0000)]
allow RT module to be removed.

15 years agoThese modules are not used.
Stephen Soltesz [Tue, 20 May 2008 19:37:20 +0000 (19:37 +0000)]
These modules are not used.

15 years agofor access to the www.printbadnodes module
Stephen Soltesz [Tue, 20 May 2008 19:34:03 +0000 (19:34 +0000)]
for access to the www.printbadnodes module

15 years agoclean kernel parsing.
Stephen Soltesz [Mon, 19 May 2008 18:45:23 +0000 (18:45 +0000)]
clean kernel parsing.

15 years agoAdding the model for log records
Stephen Soltesz [Mon, 19 May 2008 18:43:26 +0000 (18:43 +0000)]
Adding the model for log records

15 years agoupdate
Stephen Soltesz [Mon, 19 May 2008 18:37:48 +0000 (18:37 +0000)]
update

15 years agoadding files
Stephen Soltesz [Mon, 19 May 2008 18:36:27 +0000 (18:36 +0000)]
adding files

15 years agoTagging module Monitor - Monitor-1.0-4
Stephen Soltesz [Mon, 19 May 2008 17:54:33 +0000 (17:54 +0000)]
Tagging module Monitor - Monitor-1.0-4
tagging everything for OneLab tech-transfer.

15 years agonew messages for alpha node groups, etc.
Stephen Soltesz [Mon, 19 May 2008 17:53:26 +0000 (17:53 +0000)]
new messages for alpha node groups, etc.

15 years agomass commit
Stephen Soltesz [Mon, 19 May 2008 17:52:56 +0000 (17:52 +0000)]
mass commit

15 years agoRun process with timeout, and allow an arbitrary path for the source of the
Stephen Soltesz [Tue, 13 May 2008 18:16:11 +0000 (18:16 +0000)]
Run process with timeout, and allow an arbitrary path for the source of the
pickle files, instead of the default PICKLE_PATH
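
The two changes in this commit can be sketched as follows. This is a minimal illustration in present-day Python 3, not the code from the repository: `run_with_timeout`, `load_pickle`, the `.pkl` suffix handling, and the value of `PICKLE_PATH` are assumptions made for the example.

```python
import os
import pickle
import subprocess
import sys

# Assumed default directory; the real PICKLE_PATH value is not shown in the log.
PICKLE_PATH = "pdb"


def run_with_timeout(argv, timeout_s):
    """Run argv and return (returncode, stdout).

    The return code is None when the process exceeded timeout_s;
    subprocess.run() kills the child before raising TimeoutExpired.
    """
    try:
        done = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None, ""
    return done.returncode, done.stdout


def load_pickle(name, path=PICKLE_PATH):
    """Load <path>/<name>.pkl; callers may pass any directory as path."""
    with open(os.path.join(path, name + ".pkl"), "rb") as handle:
        return pickle.load(handle)
```

Returning `None` on timeout lets a cron job record the failure and move on to the next host rather than hang.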

15 years agofixed call to hpilo script. I think I added a timeout too.
Stephen Soltesz [Tue, 13 May 2008 18:13:55 +0000 (18:13 +0000)]
fixed call to hpilo script.  I think I added a timeout too.
now works correctly with findbad.py cron job.  Doesn't hang indefinitely now.

15 years agoRead nodes from a given file, for batch updates when using nodequery and
Stephen Soltesz [Tue, 13 May 2008 18:11:59 +0000 (18:11 +0000)]
Read nodes from a given file, for batch updates when using nodequery and
nodereboot or grouprins.py

15 years ago(no commit message)
Stephen Soltesz [Tue, 13 May 2008 18:10:44 +0000 (18:10 +0000)]

15 years agoImprovements for older records. Consolidated code related to ending a
Stephen Soltesz [Tue, 13 May 2008 18:09:47 +0000 (18:09 +0000)]
Improvements for older records.  Consolidated code related to ending a
record.

15 years agoTagging module Monitor - Monitor-1.0-3
Stephen Soltesz [Fri, 9 May 2008 21:31:19 +0000 (21:31 +0000)]
Tagging module Monitor - Monitor-1.0-3

15 years agoA few changes to improve upon the script:
Marc Fiuczynski [Tue, 6 May 2008 02:55:18 +0000 (02:55 +0000)]
A few changes to improve upon the script:

- try to make it stand alone python script
  - uses xmlrpc directly; no longer needs to import plc module

- fetches nodenetworks for all hosts and caches it locally
  to avoid having to invoke the API n times (where n is the
  # of nodes at the PLC).

Still needs:

- a proper help/usage message printed

- a way to export full functionality (e.g., delete)

- a way to specify XMLRPC_SERVER as a command line option, as
  now it by default assumes www.planet-lab.org/PLCAPI
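
The caching idea in this commit, one bulk fetch followed by local lookups instead of n API calls, might look roughly like the sketch below. It assumes the server exposes a `GetNodeNetworks(auth)` XML-RPC method returning records with a `node_id` field; `fetch_all_nodenetworks` and `index_by_node` are hypothetical names, and the script's actual code is not shown in the log.

```python
import xmlrpc.client


def fetch_all_nodenetworks(server_url, auth):
    """Fetch every nodenetwork record in a single XML-RPC call.

    Assumption: the server has a GetNodeNetworks(auth) method.
    """
    api = xmlrpc.client.ServerProxy(server_url, allow_none=True)
    return api.GetNodeNetworks(auth)


def index_by_node(nodenetworks):
    """Group records by node_id so per-node lookups stay local."""
    by_node = {}
    for record in nodenetworks:
        by_node.setdefault(record["node_id"], []).append(record)
    return by_node
```

Passing `server_url` as a parameter also leaves room for the command-line option the commit lists as still missing.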

15 years agoTagging module Monitor - Monitor-1.0-2
Stephen Soltesz [Mon, 5 May 2008 17:58:09 +0000 (17:58 +0000)]
Tagging module Monitor - Monitor-1.0-2

15 years agolast typo
Stephen Soltesz [Mon, 5 May 2008 17:01:20 +0000 (17:01 +0000)]
last typo

15 years agofixes to make them more stand-alone and general.
Stephen Soltesz [Mon, 5 May 2008 16:58:42 +0000 (16:58 +0000)]
fixes to make them more stand-alone and general.

15 years agocheck consistency of specfiles:
Thierry Parmentelat [Mon, 5 May 2008 12:09:39 +0000 (12:09 +0000)]
check consistency of specfiles:
* set pldistro in release when needed (Monitor)
* remove it when already part of the rpm name (bootcd, noderepo)

15 years agoMajor improvements. Actually useful for daily operations.
Stephen Soltesz [Fri, 2 May 2008 19:18:20 +0000 (19:18 +0000)]
Major improvements.  Actually useful for daily operations.

16 years agoTagging module Monitor - Monitor-1.0-1
Stephen Soltesz [Wed, 23 Apr 2008 21:00:10 +0000 (21:00 +0000)]
Tagging module Monitor - Monitor-1.0-1
This should be ready for 4.2rc2

16 years agoAdd a field for the currently observed status as well as the PLC db
Stephen Soltesz [Mon, 14 Apr 2008 17:59:45 +0000 (17:59 +0000)]
Add a field for the currently observed status as well as the PLC db
configuration.

16 years agoAdd an option to end a monitor record for a node. This results in the
Stephen Soltesz [Mon, 14 Apr 2008 17:59:17 +0000 (17:59 +0000)]
Add an option to end a monitor record for a node.  This results in the
accounting starting over.

16 years agoAdded a convenience script for making a single command line call.
Stephen Soltesz [Mon, 14 Apr 2008 17:58:36 +0000 (17:58 +0000)]
Added a convenience script for making a single command line call.

16 years agoinstructs user how to create the 'auth.py' file.
Stephen Soltesz [Fri, 11 Apr 2008 21:02:53 +0000 (21:02 +0000)]
instructs user how to create the 'auth.py' file.

16 years agoThis is a template script for adding the 'Site Assistant' user into the myPLC
Stephen Soltesz [Fri, 11 Apr 2008 20:59:45 +0000 (20:59 +0000)]
This is a template script for adding the 'Site Assistant' user into the myPLC
db, creating an rsa key, uploading it to the user account, and eventually
doing some other post-processing setup for monitor.

16 years ago- additional functions for displaying the pcu.
Stephen Soltesz [Wed, 9 Apr 2008 17:19:37 +0000 (17:19 +0000)]
- additional functions for displaying the pcu.

16 years ago- add reporting of pcu state
Stephen Soltesz [Wed, 9 Apr 2008 17:17:59 +0000 (17:17 +0000)]
- add reporting of pcu state

16 years ago- add a checked time to each record.
Stephen Soltesz [Wed, 9 Apr 2008 17:16:13 +0000 (17:16 +0000)]
- add a checked time to each record.

16 years ago- some code cleaning.
Stephen Soltesz [Wed, 9 Apr 2008 17:15:53 +0000 (17:15 +0000)]
- some code cleaning.
- fixed the bug that missed entries in act_all with no previous records.
- take RT tickets into account better.

16 years ago- tweaks.
Stephen Soltesz [Wed, 9 Apr 2008 17:14:48 +0000 (17:14 +0000)]
- tweaks.

16 years ago- added a checked time value
Stephen Soltesz [Wed, 9 Apr 2008 17:13:58 +0000 (17:13 +0000)]
- added a checked time value
- added new kernel version.  need a better way to do this.

16 years ago- additional regular rotations.
Stephen Soltesz [Wed, 9 Apr 2008 17:13:05 +0000 (17:13 +0000)]
- additional regular rotations.

16 years ago-added commands to get and set the ticket status so this can be done automatically...
Stephen Soltesz [Wed, 9 Apr 2008 17:09:33 +0000 (17:09 +0000)]
-added commands to get and set the ticket status so this can be done automatically on node restoration.

16 years agoindent change
Stephen Soltesz [Wed, 9 Apr 2008 17:08:58 +0000 (17:08 +0000)]
indent change

16 years ago- cleaning of code.
Stephen Soltesz [Wed, 9 Apr 2008 17:08:38 +0000 (17:08 +0000)]
- cleaning of code.
- save all of the RT db.

16 years ago-Some code cleaning to remove old ipal implementation.
Stephen Soltesz [Wed, 9 Apr 2008 17:06:36 +0000 (17:06 +0000)]
-Some code cleaning to remove old ipal implementation.
-Better pcuid mappings to different pcus.
-takes command line argument when run as a program 'reboot.py <hostname>'

16 years agoadd simple command line tools for manipulating node groups, and for querying
Stephen Soltesz [Wed, 9 Apr 2008 16:56:35 +0000 (16:56 +0000)]
add simple command line tools for manipulating node groups, and for querying
the information collected by monitor for a given node.

16 years agotake away the lowercase 'm'onitor.spec.
Stephen Soltesz [Wed, 9 Apr 2008 13:58:37 +0000 (13:58 +0000)]
take away the lowercase 'm'onitor.spec.

16 years agoAdded two requirements.
Stephen Soltesz [Tue, 8 Apr 2008 21:20:31 +0000 (21:20 +0000)]
Added two requirements.

16 years agocapitalize for build?
Stephen Soltesz [Tue, 8 Apr 2008 20:59:25 +0000 (20:59 +0000)]
capitalize for build?

16 years agoused the wrong spec file as a template.
Stephen Soltesz [Tue, 8 Apr 2008 20:50:19 +0000 (20:50 +0000)]
used the wrong spec file as a template.

16 years agoInitial add of monitor spec, init, and cron file for the monitor root account scripts
Stephen Soltesz [Tue, 8 Apr 2008 20:30:58 +0000 (20:30 +0000)]
Initial add of monitor spec, init, and cron file for the monitor root account scripts

16 years agoSimpler interface to api. Given a single object, it preserves the auth
Stephen Soltesz [Mon, 7 Apr 2008 20:49:49 +0000 (20:49 +0000)]
Simpler interface to api.  Given a single object, it preserves the auth
variable and passes it to all subsequent calls transparently.
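
One way to build such an interface is a thin proxy that prepends the stored auth value to every call. The sketch below is an assumption about the general shape, not the committed code: `AuthAPI` and `FakeBackend` are hypothetical names, and `FakeBackend` stands in for a real XML-RPC server proxy.

```python
class AuthAPI:
    """Wrap an API object and pass auth as the first argument of every call."""

    def __init__(self, api, auth):
        self._api = api
        self._auth = auth

    def __getattr__(self, name):
        # Only reached for names not found on the wrapper itself,
        # so _api and _auth resolve normally.
        method = getattr(self._api, name)

        def call(*args):
            return method(self._auth, *args)

        return call


class FakeBackend:
    """Stand-in backend for illustration; echoes what it received."""

    def GetNodes(self, auth, node_filter):
        return (auth["Username"], node_filter)
```

A caller then writes `api.GetNodes(["host"])` and never handles the auth structure after construction.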

16 years agoa key for the monitor user.
Stephen Soltesz [Fri, 4 Apr 2008 20:18:59 +0000 (20:18 +0000)]
a key for the monitor user.

16 years agoBasic script to collect ssh_rsa_keys for all nodes and dump into a known_hosts
Stephen Soltesz [Fri, 21 Mar 2008 17:26:49 +0000 (17:26 +0000)]
Basic script to collect ssh_rsa_keys for all nodes and dump into a known_hosts
file.  Problems:

 * needs to be updated periodically.
 * needs to co-exist with a user's non-pl entries in known_hosts
 * there doesn't seem to be a way to configure ssh to read two known_hosts files.

16 years agoAdd a new BayTech prompt type.
Stephen Soltesz [Thu, 28 Feb 2008 21:14:40 +0000 (21:14 +0000)]
Add a new BayTech prompt type.

16 years agoThis should be a global view of all things Monitor is doing, with
Stephen Soltesz [Tue, 11 Dec 2007 22:50:16 +0000 (22:50 +0000)]
This should be a global view of all things Monitor is doing, with an
instantaneous view of the health of a Site, a Node, its PCUs, and what actions
have been taken by Monitor or what external states are blocking its progress.

16 years agoAdded a variety of filters to limit the nodes displayed. Also, added a 'nodesonly...
Stephen Soltesz [Tue, 11 Dec 2007 22:49:15 +0000 (22:49 +0000)]
Added a variety of filters to limit the nodes displayed.  Also, added a 'nodesonly' option

16 years agoadded some minor status message at the end
Stephen Soltesz [Tue, 11 Dec 2007 22:48:37 +0000 (22:48 +0000)]
added some minor status message at the end

16 years agojust assume that the host is up by using -P0 arg to nmap. Without this, nmap missed...
Stephen Soltesz [Tue, 11 Dec 2007 22:47:47 +0000 (22:47 +0000)]
just assume that the host is up by using -P0 arg to nmap. Without this, nmap missed some hosts that really were up.

16 years agobetter support for PCUs
Stephen Soltesz [Tue, 11 Dec 2007 22:47:04 +0000 (22:47 +0000)]
better support for PCUs

16 years agoadd a global version of getListFromFile()
Stephen Soltesz [Tue, 11 Dec 2007 22:46:14 +0000 (22:46 +0000)]
add a global version of getListFromFile()

16 years agoCache more stuff from plc in local files.
Stephen Soltesz [Tue, 11 Dec 2007 22:45:50 +0000 (22:45 +0000)]
Cache more stuff from plc in local files.

16 years agorecord a node's boot_state according to PLC's db.
Stephen Soltesz [Tue, 11 Dec 2007 22:45:09 +0000 (22:45 +0000)]
record a node's boot_state according to PLC's db.

16 years agoAdded support for sending Ctrl-C to some of the BayTechs, with the help of pexpect.py
Stephen Soltesz [Tue, 11 Dec 2007 22:44:32 +0000 (22:44 +0000)]
Added support for sending Ctrl-C to some of the BayTechs, with the help of pexpect.py

16 years agoThis is a better module for dealing with SSH logins using 'expect' like
Stephen Soltesz [Tue, 11 Dec 2007 22:43:55 +0000 (22:43 +0000)]
This is a better module for dealing with SSH logins using 'expect' like
processing.  This should replace pyssh eventually.

16 years agoDisabled the print when loading a pkl file
Stephen Soltesz [Tue, 11 Dec 2007 22:40:46 +0000 (22:40 +0000)]
Disabled the print when loading a pkl file

16 years agomade the 'get' function global to allow calls from other modules.
Stephen Soltesz [Tue, 11 Dec 2007 22:40:16 +0000 (22:40 +0000)]
made the 'get' function global to allow calls from other modules.

16 years agoadded getPersons() wrapper
Stephen Soltesz [Tue, 11 Dec 2007 22:39:15 +0000 (22:39 +0000)]
added getPersons() wrapper

16 years ago(no commit message)
Stephen Soltesz [Tue, 11 Dec 2007 22:38:51 +0000 (22:38 +0000)]

16 years agoAdded a fix for HPiLO that got lost somehow.
Stephen Soltesz [Wed, 28 Nov 2007 22:24:48 +0000 (22:24 +0000)]
Added a fix for HPiLO that got lost somehow.

16 years agoAdded two more APC models for brazil and berlin.
Stephen Soltesz [Wed, 28 Nov 2007 18:40:43 +0000 (18:40 +0000)]
Added two more APC models for brazil and berlin.

16 years agoTake out code to show passwords
Stephen Soltesz [Tue, 27 Nov 2007 20:22:46 +0000 (20:22 +0000)]
Take out code to show passwords

16 years agotypo. Missed a shell=True arg to Popen.
Stephen Soltesz [Tue, 27 Nov 2007 20:22:22 +0000 (20:22 +0000)]
typo.  Missed a shell=True arg to Popen.

16 years agosteps to make racadm workable.
Stephen Soltesz [Tue, 27 Nov 2007 18:33:23 +0000 (18:33 +0000)]
steps to make racadm workable.

16 years agoBetter code for PCU types. Class based with specific exceptions for different
Stephen Soltesz [Tue, 27 Nov 2007 18:30:46 +0000 (18:30 +0000)]
Better code for PCU types.  Class based with specific exceptions for different
error conditions.  Support for HPiLO, DRAC via racadm, and special cases for a
variety of weird configurations.
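
A class-based layout with specific exceptions could be organized along these lines. Every name here (`PCUError`, `UnsupportedModel`, `PortOutOfRange`, `PCU`, `DemoPCU`, `MODELS`, `reboot_port`) is hypothetical; the sketch only illustrates the pattern and does not reproduce reboot.py.

```python
class PCUError(Exception):
    """Base class so callers can catch any PCU failure at once."""


class UnsupportedModel(PCUError):
    """No class is registered for the requested PCU model."""


class PortOutOfRange(PCUError):
    """The requested outlet does not exist on this PCU."""


class PCU:
    """Common interface; each model overrides reboot()."""

    def __init__(self, hostname):
        self.hostname = hostname

    def reboot(self, port):
        raise NotImplementedError


class DemoPCU(PCU):
    """Stand-in model that only reports what it would do."""

    def reboot(self, port):
        if port < 1:
            raise PortOutOfRange("port %d on %s" % (port, self.hostname))
        return "rebooted port %d on %s" % (port, self.hostname)


# Model name to class; real models (HPiLO, DRAC, ...) would register here.
MODELS = {"demo": DemoPCU}


def reboot_port(model, hostname, port):
    """Look up the class for a model and reboot one port."""
    try:
        pcu_class = MODELS[model]
    except KeyError:
        raise UnsupportedModel(model) from None
    return pcu_class(hostname).reboot(port)
```

Distinct exception types let the caller report an unsupported model differently from a bad port while still catching `PCUError` as a fallback.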

16 years agoUpdated findbadpcu.py with changes made in reboot.py. Simpler interface and
Stephen Soltesz [Tue, 27 Nov 2007 18:29:50 +0000 (18:29 +0000)]
Updated findbadpcu.py with changes made in reboot.py.  Simpler interface and
return values.

16 years agoupdated readme for DELL RAC3/4
Stephen Soltesz [Mon, 26 Nov 2007 23:51:09 +0000 (23:51 +0000)]
updated readme for DELL RAC3/4

16 years agoAdding subdirectories for remote commands to control ILO and DRAC cards over
Stephen Soltesz [Mon, 12 Nov 2007 21:21:05 +0000 (21:21 +0000)]
Adding subdirectories for remote commands to control ILO and DRAC cards over
HTTPS.  The iloxml should probably be a subdirectory of cmdhttps...

16 years agoPolicy.py includes updates to better handle PCUs
Stephen Soltesz [Wed, 7 Nov 2007 21:22:38 +0000 (21:22 +0000)]
Policy.py includes updates to better handle PCUs

emailTxt includes new messages related to PCUs

16 years agoAdded 'FORCED' to handle some special actions
Stephen Soltesz [Wed, 7 Nov 2007 21:21:28 +0000 (21:21 +0000)]
Added 'FORCED' to handle some special actions

16 years agoAdd a retry to the apc_reboot() for which there are different models.
Stephen Soltesz [Wed, 7 Nov 2007 20:33:44 +0000 (20:33 +0000)]
Add a retry to the apc_reboot() for which there are different models.
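
The log does not say how the retry is implemented. A generic retry helper, under the assumption that a failed attempt raises an exception, might look like this; `with_retry` and its parameters are hypothetical.

```python
import time


def with_retry(action, attempts=2, delay_s=0.0):
    """Call action() up to `attempts` times and return its first success.

    Re-raises the last exception if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return action()
        except Exception as error:
            last_error = error
            # Pause between attempts, but not after the final one.
            if attempt + 1 < attempts:
                time.sleep(delay_s)
    raise last_error
```

For PCUs with differing models, `action` could be a closure that tries the next menu sequence on each call.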

16 years agoAdded new sequence for apc_reboot()
Stephen Soltesz [Wed, 7 Nov 2007 19:52:12 +0000 (19:52 +0000)]
Added new sequence for apc_reboot()