monitor.git
15 years agoTake out pcu handling in this file, since it is handled separately by
Stephen Soltesz [Mon, 30 Jun 2008 20:44:30 +0000 (20:44 +0000)]
Take out pcu handling in this file, since it is handled separately by
grouprins.py now

15 years agoScript designed to help transfer the 'power-users' from the public plc into a
Stephen Soltesz [Tue, 24 Jun 2008 21:05:40 +0000 (21:05 +0000)]
Script designed to help transfer the 'power-users' from the public plc into a
private plc, complete with all their sites, slices, and pre-registered ssh
keys.  The goal was to make their experience of the test-plc equal to the
public-plc, such that all they needed to do was log into the node without
visiting the test-plc's interface.

15 years agoTool to find stray node network entries in the PLC db. There are currently
Stephen Soltesz [Tue, 24 Jun 2008 19:24:24 +0000 (19:24 +0000)]
Tool to find stray node network entries in the PLC db.  There are currently
289 nn entries that are not associated with a valid node.  This seems like an
error to me.
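
The check described in this message can be sketched roughly as follows. This is not the committed tool; the function name and the record fields (`node_id`, `nodenetwork_id`) are assumptions about what the PLC API returns, and the inputs are plain lists of dicts rather than live API results.

```python
def find_stray_nodenetworks(nodenetworks, nodes):
    """Return node network records whose node_id matches no known node.

    Both arguments are lists of dicts. The field names are assumed,
    not taken from the original script.
    """
    valid_ids = {node["node_id"] for node in nodes}
    return [nn for nn in nodenetworks if nn.get("node_id") not in valid_ids]
```

A record with a missing or empty `node_id` is treated as stray as well, which may or may not match what the original tool counted.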

15 years agotext sketch of the sqlobject model to be designed for monitor
Stephen Soltesz [Mon, 23 Jun 2008 18:22:39 +0000 (18:22 +0000)]
text sketch of the sqlobject  model to be designed for monitor

15 years ago(no commit message)
Stephen Soltesz [Mon, 23 Jun 2008 17:20:55 +0000 (17:20 +0000)]

15 years agoInclude other options for the iLO, since 'reset' doesn't work when the machine
Stephen Soltesz [Mon, 23 Jun 2008 17:05:42 +0000 (17:05 +0000)]
Include other options for the iLO, since 'reset' doesn't work when the machine
is powered off.  TODO: add the check to power the host On if it is off.

15 years agoa template for a tool that will spit out the configuration for a node to see
Stephen Soltesz [Mon, 23 Jun 2008 17:04:48 +0000 (17:04 +0000)]
a template for a tool that will spit out the configuration for a node to see
if it has any errors.

15 years agocommit of tools I use, but are not documented or guaranteed to work for anyone
Stephen Soltesz [Mon, 23 Jun 2008 17:04:08 +0000 (17:04 +0000)]
commit of tools I use, but are not documented or guaranteed to work for anyone
else.

15 years agosimple script to collect the info Scott requested when a site leaves PL.
Stephen Soltesz [Mon, 23 Jun 2008 17:00:06 +0000 (17:00 +0000)]
simple script to collect the info Scott requested when a site leaves PL.

15 years agoMassive commit. Just put all local changes into svn.
Stephen Soltesz [Mon, 23 Jun 2008 16:57:53 +0000 (16:57 +0000)]
Massive commit.  Just put all local changes into svn.

15 years agoadd timeout
Stephen Soltesz [Mon, 16 Jun 2008 18:48:34 +0000 (18:48 +0000)]
add timeout

15 years agoFor dumping the diagnose_out file.
Stephen Soltesz [Tue, 20 May 2008 19:43:20 +0000 (19:43 +0000)]
For dumping the diagnose_out file.

15 years agoallow RT module to be removed.
Stephen Soltesz [Tue, 20 May 2008 19:42:15 +0000 (19:42 +0000)]
allow RT module to be removed.

15 years agoThese modules are not used.
Stephen Soltesz [Tue, 20 May 2008 19:37:20 +0000 (19:37 +0000)]
These modules are not used.

15 years agofor access to the www.printbadnodes module
Stephen Soltesz [Tue, 20 May 2008 19:34:03 +0000 (19:34 +0000)]
for access to the www.printbadnodes module

16 years agoclean kernel parsing.
Stephen Soltesz [Mon, 19 May 2008 18:45:23 +0000 (18:45 +0000)]
clean kernel parsing.

16 years agoAdding the model for log records
Stephen Soltesz [Mon, 19 May 2008 18:43:26 +0000 (18:43 +0000)]
Adding the model for log records

16 years agoupdate
Stephen Soltesz [Mon, 19 May 2008 18:37:48 +0000 (18:37 +0000)]
update

16 years agoadding files
Stephen Soltesz [Mon, 19 May 2008 18:36:27 +0000 (18:36 +0000)]
adding files

16 years agoTagging module Monitor - Monitor-1.0-4
Stephen Soltesz [Mon, 19 May 2008 17:54:33 +0000 (17:54 +0000)]
Tagging module Monitor - Monitor-1.0-4
tagging everything for OneLab tech-transfer.

16 years agonew messages for alpha node groups, etc.
Stephen Soltesz [Mon, 19 May 2008 17:53:26 +0000 (17:53 +0000)]
new messages for alpha node groups, etc.

16 years agomass commit
Stephen Soltesz [Mon, 19 May 2008 17:52:56 +0000 (17:52 +0000)]
mass commit

16 years agoRun process with timeout, and allow an arbitrary path for the source of the
Stephen Soltesz [Tue, 13 May 2008 18:16:11 +0000 (18:16 +0000)]
Run process with timeout, and allow an arbitrary path for the source of the
pickle files, instead of the default PICKLE_PATH
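
A minimal sketch of the two ideas in this message, written for current Python 3 rather than the 2008 code base. The helper names, the `.pkl` suffix handling, and the value of `PICKLE_PATH` are illustrative assumptions; only the name `PICKLE_PATH` comes from the message.

```python
import os
import pickle
import subprocess
import sys

PICKLE_PATH = "."  # assumed default; the real value is not shown in the log


def run_with_timeout(argv, timeout_s):
    """Run argv and return (returncode, stdout), or (None, "") on timeout."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising this exception.
        return None, ""
    return done.returncode, done.stdout


def load_pickle(name, path=None):
    """Load <name>.pkl from path, falling back to PICKLE_PATH."""
    directory = path if path is not None else PICKLE_PATH
    with open(os.path.join(directory, name + ".pkl"), "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    print(run_with_timeout([sys.executable, "-c", "print('ok')"], 10))
```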

16 years agofixed call to hpilo script. I think I added a timeout too.
Stephen Soltesz [Tue, 13 May 2008 18:13:55 +0000 (18:13 +0000)]
fixed call to hpilo script.  I think I added a timeout too.
now works correctly with findbad.py cron job.  Doesn't hang indefinitely now.

16 years agoRead nodes from a given file, for batch updates when using nodequery and
Stephen Soltesz [Tue, 13 May 2008 18:11:59 +0000 (18:11 +0000)]
Read nodes from a given file, for batch updates when using nodequery and
nodereboot or grouprins.py

16 years ago(no commit message)
Stephen Soltesz [Tue, 13 May 2008 18:10:44 +0000 (18:10 +0000)]

16 years agoImprovements for older records. Consolidated code related to ending a
Stephen Soltesz [Tue, 13 May 2008 18:09:47 +0000 (18:09 +0000)]
Improvements for older records.  Consolidated code related to ending a
record.

16 years agoTagging module Monitor - Monitor-1.0-3
Stephen Soltesz [Fri, 9 May 2008 21:31:19 +0000 (21:31 +0000)]
Tagging module Monitor - Monitor-1.0-3

16 years agoA few changes to improve upon the script:
Marc Fiuczynski [Tue, 6 May 2008 02:55:18 +0000 (02:55 +0000)]
A few changes to improve upon the script:

- try to make it stand alone python script
  - uses xmlrpc directly; no longer needs to import plc module

- fetches nodenetworks for all hosts and caches it locally
  to avoid having to invoke the API n times (where n is the
  # of nodes at the PLC).

Still needs:

- a proper help/usage message printed

- a way to export full functionality (e.g., delete)

- a way to specify XMLRPC_SERVER as a command line option, as
  now it by default assumes www.planet-lab.org/PLCAPI
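
The caching idea in the message above (fetch all node networks once, then look them up locally) might look like the sketch below. The function name and the `node_id` field are assumptions; the records are plain dicts standing in for whatever the single API call returned.

```python
def index_by_node(nodenetworks):
    """Group node network records by node_id for local lookups.

    Building this index once replaces one API call per node with
    a dictionary lookup per node.
    """
    by_node = {}
    for nn in nodenetworks:
        by_node.setdefault(nn["node_id"], []).append(nn)
    return by_node
```

With the index in hand, `by_node.get(node_id, [])` answers the per-node question without touching the server again.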

16 years agoTagging module Monitor - Monitor-1.0-2
Stephen Soltesz [Mon, 5 May 2008 17:58:09 +0000 (17:58 +0000)]
Tagging module Monitor - Monitor-1.0-2

16 years agolast typo
Stephen Soltesz [Mon, 5 May 2008 17:01:20 +0000 (17:01 +0000)]
last typo

16 years agofixes to make them more stand-alone and general.
Stephen Soltesz [Mon, 5 May 2008 16:58:42 +0000 (16:58 +0000)]
fixes to make them more stand-alone and general.

16 years agocheck consistency of specfiles:
Thierry Parmentelat [Mon, 5 May 2008 12:09:39 +0000 (12:09 +0000)]
check consistency of specfiles:
* set pldistro in release when needed (Monitor)
* remove it when already part of the rpm name (bootcd, noderepo)

16 years agoMajor improvements. Actually useful for daily operations.
Stephen Soltesz [Fri, 2 May 2008 19:18:20 +0000 (19:18 +0000)]
Major improvements.  Actually useful for daily operations.

16 years agoTagging module Monitor - Monitor-1.0-1
Stephen Soltesz [Wed, 23 Apr 2008 21:00:10 +0000 (21:00 +0000)]
Tagging module Monitor - Monitor-1.0-1
This should be ready for 4.2rc2

16 years agoAdd a field for the currently observed status as well as the PLC db
Stephen Soltesz [Mon, 14 Apr 2008 17:59:45 +0000 (17:59 +0000)]
Add a field for the currently observed status as well as the PLC db
configuration.

16 years agoAdd an option to end a monitor record for a node. This results in the
Stephen Soltesz [Mon, 14 Apr 2008 17:59:17 +0000 (17:59 +0000)]
Add an option to end a monitor record for a node.  This results in the
accounting starting over.

16 years agoAdded a convenience script for making a single command line call.
Stephen Soltesz [Mon, 14 Apr 2008 17:58:36 +0000 (17:58 +0000)]
Added a convenience script for making a single command line call.

16 years agoinstructs user how to create the 'auth.py' file.
Stephen Soltesz [Fri, 11 Apr 2008 21:02:53 +0000 (21:02 +0000)]
instructs user how to create the 'auth.py' file.

16 years agoThis is a template script for adding the 'Site Assistant' user into the myPLC
Stephen Soltesz [Fri, 11 Apr 2008 20:59:45 +0000 (20:59 +0000)]
This is a template script for adding the 'Site Assistant' user into the myPLC
db, creating an rsa key, uploading it to the user account, and eventually
doing some other post-processing setup for monitor.

16 years ago- additional functions for displaying the pcu.
Stephen Soltesz [Wed, 9 Apr 2008 17:19:37 +0000 (17:19 +0000)]
- additional functions for displaying the pcu.

16 years ago- add reporting of pcu state
Stephen Soltesz [Wed, 9 Apr 2008 17:17:59 +0000 (17:17 +0000)]
- add reporting of pcu state

16 years ago- add a checked time to each record.
Stephen Soltesz [Wed, 9 Apr 2008 17:16:13 +0000 (17:16 +0000)]
- add a checked time to each record.

16 years ago- some code cleaning.
Stephen Soltesz [Wed, 9 Apr 2008 17:15:53 +0000 (17:15 +0000)]
- some code cleaning.
- fixed the bug that missed entries in act_all with no previous records.
- take RT tickets into account better.

16 years ago- tweaks.
Stephen Soltesz [Wed, 9 Apr 2008 17:14:48 +0000 (17:14 +0000)]
- tweaks.

16 years ago- added a checked time value
Stephen Soltesz [Wed, 9 Apr 2008 17:13:58 +0000 (17:13 +0000)]
- added a checked time value
- added new kernel version.  need a better way to do this.

16 years ago- additional regular rotations.
Stephen Soltesz [Wed, 9 Apr 2008 17:13:05 +0000 (17:13 +0000)]
- additional regular rotations.

16 years ago-added commands to get and set the ticket status so this can be done automatically...
Stephen Soltesz [Wed, 9 Apr 2008 17:09:33 +0000 (17:09 +0000)]
-added commands to get and set the ticket status so this can be done automatically on node restoration.

16 years agoindent change
Stephen Soltesz [Wed, 9 Apr 2008 17:08:58 +0000 (17:08 +0000)]
indent change

16 years ago- cleaning of code.
Stephen Soltesz [Wed, 9 Apr 2008 17:08:38 +0000 (17:08 +0000)]
- cleaning of code.
- save all of the RT db.

16 years ago-Some code cleaning to remove old ipal implementation.
Stephen Soltesz [Wed, 9 Apr 2008 17:06:36 +0000 (17:06 +0000)]
-Some code cleaning to remove old ipal implementation.
-Better pcuid mappings to different pcus.
-takes command line argument when run as a program 'reboot.py <hostname>'

16 years agoadd simple command line tools for manipulating node groups, and for querying
Stephen Soltesz [Wed, 9 Apr 2008 16:56:35 +0000 (16:56 +0000)]
add simple command line tools for manipulating node groups, and for querying
the information collected by monitor for a given node.

16 years agotake away the lowercase 'm'onitor.spec.
Stephen Soltesz [Wed, 9 Apr 2008 13:58:37 +0000 (13:58 +0000)]
take away the lowercase 'm'onitor.spec.

16 years agoAdded two requirements.
Stephen Soltesz [Tue, 8 Apr 2008 21:20:31 +0000 (21:20 +0000)]
Added two requirements.

16 years agocapitalize for build?
Stephen Soltesz [Tue, 8 Apr 2008 20:59:25 +0000 (20:59 +0000)]
capitalize for build?

16 years agoused the wrong spec file as a template.
Stephen Soltesz [Tue, 8 Apr 2008 20:50:19 +0000 (20:50 +0000)]
used the wrong spec file as a template.

16 years agoInitial add of monitor spec, init, and cron file for the monitor root account scripts
Stephen Soltesz [Tue, 8 Apr 2008 20:30:58 +0000 (20:30 +0000)]
Initial add of monitor spec, init, and cron file for the monitor root account scripts

16 years agoSimpler interface to api. Given a single object, it preserves the auth
Stephen Soltesz [Mon, 7 Apr 2008 20:49:49 +0000 (20:49 +0000)]
Simpler interface to api.  Given a single object, it preserves the auth
variable and passes it to all subsequent calls transparently.
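
One way to get the behaviour described here is a thin wrapper that prepends the stored auth value to every call. This is a sketch under assumptions: the class names, the stub server, and the `GetNodes` method are illustrative and are not taken from the committed module.

```python
class AuthProxy:
    """Wrap an API object and pass a stored auth value as the first argument."""

    def __init__(self, server, auth):
        self._server = server
        self._auth = auth

    def __getattr__(self, name):
        # Only reached for names not set in __init__, i.e. API method names.
        method = getattr(self._server, name)

        def call(*args):
            return method(self._auth, *args)

        return call


class FakeServer:
    """Stand-in for a remote API; it echoes what it received."""

    def GetNodes(self, auth, node_filter):
        return (auth, node_filter)


api = AuthProxy(FakeServer(), {"Username": "user@example.org"})
print(api.GetNodes({"hostname": "node1.example.org"}))
```

In real use the wrapped object would presumably be something like an `xmlrpc.client.ServerProxy`, which resolves method names dynamically in the same way.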

16 years agoa key for the monitor user.
Stephen Soltesz [Fri, 4 Apr 2008 20:18:59 +0000 (20:18 +0000)]
a key for the monitor user.

16 years agoBasic script to collect ssh_rsa_keys for all nodes and dump into a known_hosts
Stephen Soltesz [Fri, 21 Mar 2008 17:26:49 +0000 (17:26 +0000)]
Basic script to collect ssh_rsa_keys for all nodes and dump into a known_hosts
file.  Problems:

 * needs to be updated periodically.
 * needs to co-exist with a user's non-pl entries in known_hosts
 * there doesn't seem to be a way to configure ssh to read two known_hosts files.
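
The core of such a script could be sketched as below. The record fields (`hostname`, `ssh_rsa_key`) and the assumption that the stored key already has the form `ssh-rsa AAAA...` are guesses about the PLC data, not details confirmed by the log.

```python
def known_hosts_lines(nodes):
    """Build known_hosts lines ("<hostname> <keytype> <key>") from node records.

    Nodes without a recorded key are skipped.
    """
    lines = []
    for node in nodes:
        key = (node.get("ssh_rsa_key") or "").strip()
        if not key:
            continue
        lines.append("%s %s" % (node["hostname"], key))
    return lines
```

Writing these lines to a file of their own, separate from the user's personal entries, keeps periodic regeneration from clobbering anything else.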

16 years agoAdd a new BayTech prompt type.
Stephen Soltesz [Thu, 28 Feb 2008 21:14:40 +0000 (21:14 +0000)]
Add a new BayTech prompt type.

16 years agoThis should be a global view of all things Monitor is doing, with
Stephen Soltesz [Tue, 11 Dec 2007 22:50:16 +0000 (22:50 +0000)]
This should be a global view of all things Monitor is doing, with
an instantaneous view of the health of a Site, Node, and its PCUs, and of what actions
have been taken by Monitor or what external states are blocking its progress.

16 years agoAdded a variety of filters to limit the nodes displayed. Also, added a 'nodesonly...
Stephen Soltesz [Tue, 11 Dec 2007 22:49:15 +0000 (22:49 +0000)]
Added a variety of filters to limit the nodes displayed.  Also, added a 'nodesonly' option

16 years agoadded some minor status message at the end
Stephen Soltesz [Tue, 11 Dec 2007 22:48:37 +0000 (22:48 +0000)]
added some minor status message at the end

16 years agojust assume that the host is up by using -P0 arg to nmap. Without this, nmap missed...
Stephen Soltesz [Tue, 11 Dec 2007 22:47:47 +0000 (22:47 +0000)]
just assume that the host is up by using -P0 arg to nmap. Without this, nmap missed some hosts that really were up.
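
For illustration, a command line of this kind could be assembled as in the sketch below. The helper name and its parameters are hypothetical. Newer nmap releases document the same "skip host discovery" option as `-Pn`, which is what the sketch uses; `-P0` is the older spelling named in the commit.

```python
def nmap_command(host, ports, skip_ping=True):
    """Build an nmap argument list for scanning the given ports on host."""
    argv = ["nmap"]
    if skip_ping:
        # Treat the host as up instead of relying on the discovery probe.
        argv.append("-Pn")
    argv += ["-p", ",".join(str(port) for port in ports), host]
    return argv
```

The list form can be handed directly to `subprocess.run()` without involving a shell.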

16 years agobetter support for PCUs
Stephen Soltesz [Tue, 11 Dec 2007 22:47:04 +0000 (22:47 +0000)]
better support for PCUs

16 years agoadd a global version of getListFromFile()
Stephen Soltesz [Tue, 11 Dec 2007 22:46:14 +0000 (22:46 +0000)]
add a global version of getListFromFile()

16 years agoCache more stuff from plc in local files.
Stephen Soltesz [Tue, 11 Dec 2007 22:45:50 +0000 (22:45 +0000)]
Cache more stuff from plc in local files.

16 years agorecord a node's boot_state according to PLC's db.
Stephen Soltesz [Tue, 11 Dec 2007 22:45:09 +0000 (22:45 +0000)]
record a node's boot_state according to PLC's db.

16 years agoAdded support for sending Ctrl-C to some of the BayTechs, with the help of pexpect.py
Stephen Soltesz [Tue, 11 Dec 2007 22:44:32 +0000 (22:44 +0000)]
Added support for sending Ctrl-C to some of the BayTechs, with the help of pexpect.py

16 years agoThis is a better module for dealing with SSH logins using 'expect' like
Stephen Soltesz [Tue, 11 Dec 2007 22:43:55 +0000 (22:43 +0000)]
This is a better module for dealing with SSH logins using 'expect' like
processing.  This should replace pyssh eventually.

16 years agoDisabled the print when loading a pkl file
Stephen Soltesz [Tue, 11 Dec 2007 22:40:46 +0000 (22:40 +0000)]
Disabled the print when loading a pkl file

16 years agomade the 'get' function global to allow calls from other modules.
Stephen Soltesz [Tue, 11 Dec 2007 22:40:16 +0000 (22:40 +0000)]
made the 'get' function global to allow calls from other modules.

16 years agoadded getPersons() wrapper
Stephen Soltesz [Tue, 11 Dec 2007 22:39:15 +0000 (22:39 +0000)]
added getPersons() wrapper

16 years ago(no commit message)
Stephen Soltesz [Tue, 11 Dec 2007 22:38:51 +0000 (22:38 +0000)]

16 years agoAdded a fix for HPiLO that got lost somehow.
Stephen Soltesz [Wed, 28 Nov 2007 22:24:48 +0000 (22:24 +0000)]
Added a fix for HPiLO that got lost somehow.

16 years agoAdded two more APC models for brazil and berlin.
Stephen Soltesz [Wed, 28 Nov 2007 18:40:43 +0000 (18:40 +0000)]
Added two more APC models for brazil and berlin.

16 years agoTake out code to show passwords
Stephen Soltesz [Tue, 27 Nov 2007 20:22:46 +0000 (20:22 +0000)]
Take out code to show passwords

16 years agotypo. Missed a shell=True arg to Popen.
Stephen Soltesz [Tue, 27 Nov 2007 20:22:22 +0000 (20:22 +0000)]
typo.  Missed a shell=True arg to Popen.
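
As background for this fix: when a command is given to `Popen` as a single string that relies on shell syntax, it needs `shell=True` to be interpreted by the shell. The sketch below is a generic illustration with a hypothetical helper name, not the code that was patched.

```python
import subprocess


def run_shell(command):
    """Run a command string through the shell; return (returncode, stdout)."""
    proc = subprocess.Popen(command, shell=True,
                            stdout=subprocess.PIPE, text=True)
    out, _ = proc.communicate()
    return proc.returncode, out


if __name__ == "__main__":
    # "&&" is shell syntax, so this only works because shell=True is set.
    print(run_shell("echo one && echo two"))
```

Because the string goes through a shell, it should only ever be built from trusted input.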

16 years agosteps to make racadm workable.
Stephen Soltesz [Tue, 27 Nov 2007 18:33:23 +0000 (18:33 +0000)]
steps to make racadm workable.

16 years agoBetter code for PCU types. Class based with specific exceptions for different
Stephen Soltesz [Tue, 27 Nov 2007 18:30:46 +0000 (18:30 +0000)]
Better code for PCU types.  Class based with specific exceptions for different
error conditions.  Support for HPiLO, DRAC via racadm, and special cases for a
variety of weird configurations.

16 years agoUpdated findbadpcu.py with changes made in reboot.py. Simpler interface and
Stephen Soltesz [Tue, 27 Nov 2007 18:29:50 +0000 (18:29 +0000)]
Updated findbadpcu.py with changes made in reboot.py.  Simpler interface and
return values.

16 years agoupdated readme for DELL RAC3/4
Stephen Soltesz [Mon, 26 Nov 2007 23:51:09 +0000 (23:51 +0000)]
updated readme for DELL RAC3/4

16 years agoAdding subdirectories for remote commands to control ILO and DRAC cards over
Stephen Soltesz [Mon, 12 Nov 2007 21:21:05 +0000 (21:21 +0000)]
Adding subdirectories for remote commands to control ILO and DRAC cards over
HTTPS.  The iloxml should probably be a subdirectory of cmdhttps...

16 years agoPolicy.py includes updates to better handle PCUs
Stephen Soltesz [Wed, 7 Nov 2007 21:22:38 +0000 (21:22 +0000)]
Policy.py includes updates to better handle PCUs

emailTxt includes new messages related to PCUs

16 years agoAdded 'FORCED' to handle some special actions
Stephen Soltesz [Wed, 7 Nov 2007 21:21:28 +0000 (21:21 +0000)]
Added 'FORCED' to handle some special actions

16 years agoAdd a retry to the apc_reboot() for which there are different models.
Stephen Soltesz [Wed, 7 Nov 2007 20:33:44 +0000 (20:33 +0000)]
Add a retry to the apc_reboot() for which there are different models.

16 years agoAdded new sequence for apc_reboot()
Stephen Soltesz [Wed, 7 Nov 2007 19:52:12 +0000 (19:52 +0000)]
Added new sequence for apc_reboot()

16 years agoadded some new cr
Stephen Soltesz [Wed, 7 Nov 2007 18:22:05 +0000 (18:22 +0000)]
added some new cr

16 years agotrying to get ipal_reboot() to function properly for cambridge nodes.
Stephen Soltesz [Wed, 7 Nov 2007 18:06:17 +0000 (18:06 +0000)]
trying to get ipal_reboot() to function properly for cambridge nodes.

16 years agoIgnore empty 'portstatus' dicts. This just means the ports are down.
Stephen Soltesz [Mon, 5 Nov 2007 22:33:08 +0000 (22:33 +0000)]
Ignore empty 'portstatus' dicts.  This just means the ports are down.

16 years agoAllow queries using sitefilter regular expressions, rather than a single
Stephen Soltesz [Mon, 5 Nov 2007 22:32:19 +0000 (22:32 +0000)]
Allow queries using sitefilter regular expressions, rather than a single
loginbase.  Allows displaying common sites like 'cernet*'.
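
A sitefilter of this kind could be applied as in the sketch below. The function name and the sample loginbases are illustrative assumptions. Note that, read as a regular expression, `cernet*` means "cerne" followed by zero or more "t", which still matches every loginbase starting with "cernet".

```python
import re


def filter_sites(loginbases, sitefilter):
    """Return the loginbases that match the regular expression at their start."""
    pattern = re.compile(sitefilter)
    return [lb for lb in loginbases if pattern.match(lb)]
```

A plain loginbase still works as a filter, since a literal string is also a valid pattern, although it then matches as a prefix rather than exactly.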

16 years agoTake PCUs into account. Need to test.
Stephen Soltesz [Mon, 5 Nov 2007 22:30:35 +0000 (22:30 +0000)]
Take PCUs into account.  Need to test.

16 years agoremoves function definitions consolidated in reboot.py
Stephen Soltesz [Mon, 5 Nov 2007 22:29:50 +0000 (22:29 +0000)]
removes function definitions consolidated in reboot.py

16 years agoNew message for PCU errors. Refers to the pl-virtual-03 pcu status page
Stephen Soltesz [Mon, 5 Nov 2007 22:29:28 +0000 (22:29 +0000)]
New message for PCU errors.  Refers to the pl-virtual-03 pcu status page

16 years agoadded several utility functions for rebooting nodes from Monitor's diagnose and
Stephen Soltesz [Mon, 5 Nov 2007 22:28:53 +0000 (22:28 +0000)]
added several utility functions for rebooting nodes from Monitor's diagnose and
action scripts.

16 years agominor changes to reflect the new Drupal-Book format for the Tech Guide
Stephen Soltesz [Mon, 5 Nov 2007 19:17:54 +0000 (19:17 +0000)]
minor changes to reflect the new Drupal-Book format for the Tech Guide

16 years agocollects all nodes associated with a list of loginbase patterns
Stephen Soltesz [Mon, 5 Nov 2007 17:16:28 +0000 (17:16 +0000)]
collects all nodes associated with a list of loginbase patterns

16 years agoMinor description of the dependencies that Monitor has for connecting to:
Stephen Soltesz [Fri, 2 Nov 2007 21:51:59 +0000 (21:51 +0000)]
Minor description of the dependencies that Monitor has for connecting to:

  * RT
  * MySQL
  * and local database output formats.

16 years agoRun the findbad* commands and copy the files to the appropriate locations.
Stephen Soltesz [Fri, 2 Nov 2007 21:48:37 +0000 (21:48 +0000)]
Run the findbad* commands and copy the files to the appropriate locations.