Structure:
    monitor module
        plc wrapper
        util functions
        pkl database access
        database models
        third-party data sources
    pcucontrol
        maps types to code
        reboot.py
        interface.py
        transport:
            pyssh
            ssh
            telnetlib
        models:
            hpilo cmds
            intelamt cmds
            racadm cmd
            ipmitool cmd
    web
        cgi scripts
        tgweb project...
    cmds
        py scripts
        node site pcu query grouprins bootman rpyc

###############################

for each node:
    Check Status -> if Pass Threshold -> Create Issue -> Take Action -> email
                                                                        bm
                                                                        pcu
                                                                        plc reset
                                                                        apply penalties
                                                                        flag for admin

for each issue
    check issue.status
    if issue.status is "open":
        issue.take_next_action()
    if issue.closed:
        issue.shutdown()
    if issue.paused:
        pass

action_list for issuetype (pcudown)
    send email
    yield
    send email, apply penalty
    yield
    send email, apply second penalty
    yield
    send email

action_list for issuetype (badhardware)
action_list for issuetype (dnserror)
action_list for issuetype (nodeconfig)
action_list for issuetype (oldbootcd)

action_list for issuetype (nodedown)
    if pcuok, reboot
    yield
    if pcuok, and reboot failed, set rins, reboot
    yield
    create_issue pcubroken
    send email
    yield
    send email, apply penalty
    yield
    send email, apply second penalty
    yield
    send email

TOOLS:
* add a '--nocache' to the default set of options.
* add a cache parameter in the monitor.conf file.

TODO:
* install openssh-server, passwd, perl-libwww-perl (for rt), rt-3.4.1, MySQL-python
* had to mount -t devpts devpts /dev/pts to get ssh to work inside the chroot.
  also, disable the pam modules in /etc/pam.d/sshd
* blue
* auto configuration for php configuration. maybe run translation of
  monitor.conf before loading monitorconfig.php?
* blue2
* A setup script of some kind would be nice that walks through:
    - writing monitorconfig.py
    - creation of monitorconfig.php
    - run syncplcdb.py
    - testapi.py
    - findbad.py on sample site.
    - nodebad.py
    - findbadpcus.py
    - nodequery.py
    - nodegroups.py
    - loads webpage for those retrieved values to confirm setup succeeded.
* reimplement the config.py / .config mechanism. I'd like for many commands
  to share very similar arguments or argument sets, as well as have some
  common config options. I'm not sure of the best way to do this.
    - features of config.py
        * parse arguments and return an object with attributes equal to the
          parser values.
        * maintain values consistently across modules at run time.
        * have default values that are not specified at each run time.
        * easy to import and use
    - config module is available via 'import config' or as returned by
      parsermodule.parse_args()
    - python supports load-once modules, so subsequent imports refer to the
      same module object.
* have package pull in threadpool from easy_install
* place PKL files in a real database
* clean up plc.py; there's a lot of redundant code.
* figure out python paths for user commands.
    - directories for pickle files.
    - add user in rpm install
    - user permissions for data files for day-to-day operations.
* fix BayTechCtrlCUnibe expect script.
* separate modules into different, logical categories, and create a python
  module as part of the install: command line, configuration, policy,
  data model, data access, object interfaces.

Lower priority:
* Add a more structured, 'automate' library of scripts and means of making
  batch calls, etc.
* add a third package for user tools that will interact with the Monitor
  service. Mostly, I'm guessing this would be queries for the live status of
  nodes and a more reliable 'reboot' and 'reinstall' mechanism than currently
  available with PLC.

Done:
* Find a better location to place and pull the PKL files currently in the
  pdb directory. Ultimately, these should be stored in a real DB. Until
  then, they should sit in a location that is accessible from the www
  scripts, backend scripts, and user utilities.
* nodebad loads plc_hn2lb unconditionally
* nodeinfo loads act_all unconditionally
* change findbad.py default db name
* remove deps on www.printbadnodes
* reboot.py loads findbadpcus unconditionally.
* nodequery loads findbad unconditionally
* unified_model loads findbad unconditionally
* threadpool package.
* build cmdamt with g++ prior to packaging
* www/*.py need appropriate access to database.py, config.py,
  monitorconfig.py, etc.
    - need to convert monitor.conf into monitorconf.sh and monitorconf.php
* pull out global configuration information from various files, like rt_db,
  mailer.py, auth.py, and any others. Create a single configuration file
  from which all others pull.
    - convert plc and other files to use the new monitorconfig.py rather
      than auth, or plc.*
    - need to alter all import 'auth' statements.
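
Sketch: the "for each issue" loop and the pcudown action_list near the top of these notes map fairly directly onto Python generators, where each yield means "stop here until the next monitoring pass". The following is only a rough illustration under assumptions, not the Monitor implementation: the Issue class, its log / exhausted / shut_down attributes, pcudown_actions() and process() are all hypothetical names, and the actions are recorded as strings rather than actually sending mail or applying penalties.

```python
# Hypothetical sketch of generator-based action lists.
# None of these names come from the Monitor code base.

def pcudown_actions(log):
    """Escalation steps for a 'pcudown' issue; each yield ends one pass."""
    log.append("send email")
    yield
    log.append("send email, apply penalty")
    yield
    log.append("send email, apply second penalty")
    yield
    log.append("send email")


class Issue:
    def __init__(self, issuetype, action_list, status="open"):
        self.issuetype = issuetype
        self.status = status          # "open", "closed" or "paused"
        self.log = []                 # stand-in for real side effects
        self.exhausted = False        # True once the action list has run out
        self.shut_down = False
        self._actions = action_list(self.log)

    def take_next_action(self):
        """Run the action list up to its next yield."""
        if self.exhausted:
            return
        try:
            next(self._actions)
        except StopIteration:
            # The final step has just run; there is nothing left to escalate.
            self.exhausted = True

    def shutdown(self):
        """Placeholder for whatever cleanup a closed issue needs."""
        self.shut_down = True


def process(issues):
    """One monitoring pass over all known issues."""
    for issue in issues:
        if issue.status == "open":
            issue.take_next_action()
        elif issue.status == "closed":
            issue.shutdown()
        elif issue.status == "paused":
            pass  # leave paused issues untouched
```

Calling process() once per monitoring pass advances every open issue by exactly one escalation step. The generator itself remembers where it left off, so no separate step counter is needed. Note that the position is held in memory only; keeping it across restarts would need something extra (for example storing a step index), which this sketch does not attempt.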