Stephen Soltesz [Tue, 28 Jul 2009 22:47:16 +0000 (22:47 +0000)]
add -ldl, missing from original patch
Stephen Soltesz [Mon, 27 Jul 2009 16:52:07 +0000 (16:52 +0000)]
no tabs...
Stephen Soltesz [Mon, 27 Jul 2009 16:15:39 +0000 (16:15 +0000)]
Add additional messages regarding kinds of boot failures due to
notinstalled
filesystem corrupted
mount failed
missing kernel
Each of these events occurs with enough frequency that differentiating them
is helpful both for operators and for the user.
Stephen Soltesz [Sat, 25 Jul 2009 05:19:48 +0000 (05:19 +0000)]
added file missing from last commit to use 'runlevel' rather than 'bootstate'
Stephen Soltesz [Fri, 24 Jul 2009 20:42:29 +0000 (20:42 +0000)]
remove 'failboot' from possible boot states
update run level rather than the boot state
only update boot state for manual calls and reinstall->boot
Stephen Soltesz [Wed, 22 Jul 2009 23:44:30 +0000 (23:44 +0000)]
also return if fsck fails; it's a bad bad idea to try to mount if fsck has just failed.
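The rule above (do not mount after a failed fsck) can be sketched as follows. This is a minimal illustration, not the BootManager code: the function name, the injectable `run` parameter, the choice of `e2fsck -p`, and the decision to treat only exit codes 0 and 1 as success are all assumptions.

```python
import subprocess

def fsck_then_mount(device, mountpoint, run=subprocess.call):
    """Check a filesystem and mount it only if the check succeeded.

    Hypothetical helper: `run` takes an argv list and returns an exit code,
    so it can be replaced by a fake in tests.
    """
    status = run(["e2fsck", "-p", device])
    # e2fsck exits 0 when clean and 1 when errors were corrected; anything
    # else is treated as a failure here (an assumption of this sketch).
    if status not in (0, 1):
        return False  # return early: never mount after a failed check
    return run(["mount", device, mountpoint]) == 0
```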
Stephen Soltesz [Wed, 22 Jul 2009 23:05:47 +0000 (23:05 +0000)]
missed the 'merge' conflict stuff
Stephen Soltesz [Wed, 22 Jul 2009 23:02:31 +0000 (23:02 +0000)]
svn merge -c 9766 https://svn.planet-lab.org/svn/BootManager/branches/3.2 into trunk
Stephen Soltesz [Fri, 17 Jul 2009 22:06:30 +0000 (22:06 +0000)]
sync sys clock to hardware to prevent fsck reboot bug for systems with wrong
hardware clocks. there may be a better place for this code in bm.
Stephen Soltesz [Mon, 13 Jul 2009 17:36:03 +0000 (17:36 +0000)]
disable runlevelagent when it is installed by bootmanager as part of myplc.
Marc Fiuczynski [Mon, 29 Jun 2009 18:17:52 +0000 (18:17 +0000)]
Tagging module BootManager - BootManager-4.3-9
Special handling for "forcedeth" ethernet NIC.
Marc Fiuczynski [Sat, 20 Jun 2009 16:39:27 +0000 (16:39 +0000)]
https://bugzilla.redhat.com/show_bug.cgi?id=178557
lspci says that the builtin ethernet on nVidia nForce3 chipset is a
Bridge. It should be a Communications device.
This specifically was a problem on planetlab2.icu.ac.kr, but there
were a few others as well for which we worked around the problem by
inserting a known-to-work NIC card.
With this little special exception we basically treat this device as a
network device and can therefore use this NIC.
Stephen Soltesz [Mon, 15 Jun 2009 18:56:22 +0000 (18:56 +0000)]
Tagging module BootManager - BootManager-4.3-8
include a fix for public pl dealing with old/new boot images and root
environments
Stephen Soltesz [Mon, 8 Jun 2009 17:40:03 +0000 (17:40 +0000)]
clear immutable attribute before writing file, to address wide-spread issues
seen on public pl related to 2.6.12 boot cd and 2.6.22 root context.
Thierry Parmentelat [Fri, 15 May 2009 13:39:30 +0000 (13:39 +0000)]
Tagging module BootManager - BootManager-4.3-7
review selection nodefamily at bootstrapfs install-time
now based on (1) tags (2) nodefamily and (3) defaults
this is required on very old bootcd
Thierry Parmentelat [Fri, 15 May 2009 13:01:13 +0000 (13:01 +0000)]
use arch and pldistro tag first, then /etc/planetlab/nodefamily, then defaults
this is required for very old bootCD's that don't have a nodefamily
also this is compliant with what GetBootMedium is currently doing
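The lookup order described above (tags, then /etc/planetlab/nodefamily, then defaults) might look roughly like this. It is a sketch under assumptions: the function name, the default values, the per-field fallback, and the "pldistro-arch" file format (inferred from the 'onelab-x86_64' example elsewhere in this log) are not taken from the actual code.

```python
def pick_nodefamily(tags, nodefamily_path,
                    default_pldistro="planetlab", default_arch="i386"):
    """Return (pldistro, arch): tags win, then the nodefamily file, then defaults."""
    pldistro, arch = default_pldistro, default_arch
    try:
        with open(nodefamily_path) as f:
            content = f.read().strip()
        # Assumed file format: "<pldistro>-<arch>", e.g. "onelab-x86_64".
        file_pldistro, sep, file_arch = content.rpartition("-")
        if sep and file_pldistro and file_arch:
            pldistro, arch = file_pldistro, file_arch
    except IOError:
        # Very old bootCDs have no nodefamily file; keep the defaults.
        pass
    # Tags, when set, override whatever was found so far.
    pldistro = tags.get("pldistro") or pldistro
    arch = tags.get("arch") or arch
    return pldistro, arch
```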
Marc Fiuczynski [Wed, 29 Apr 2009 20:49:39 +0000 (20:49 +0000)]
Use modprobe module to write out /etc/modprobe.conf.
Tagging module BootManager - BootManager-4.3-6
Thierry Parmentelat [Wed, 22 Apr 2009 17:55:28 +0000 (17:55 +0000)]
Tagging module BootManager - BootManager-4.3-5
minor updates - using the new modprobe module *not* in this tag
Thierry Parmentelat [Wed, 22 Apr 2009 17:24:20 +0000 (17:24 +0000)]
for centos5.3 - workaround for mkinitrd, as our kernel lacks the dm-mem-cache module
Faiyaz Ahmed [Fri, 17 Apr 2009 18:52:10 +0000 (18:52 +0000)]
if i find the guy who thought that was a good idea....
Marc Fiuczynski [Wed, 15 Apr 2009 18:26:11 +0000 (18:26 +0000)]
Uses modprobe to write out /etc/modprobe.conf properly
Thierry Parmentelat [Wed, 8 Apr 2009 19:53:06 +0000 (19:53 +0000)]
Tagging module BootManager - BootManager-4.3-4
load device mapper if needed, for centos5-based bootcd variant
Thierry Parmentelat [Wed, 8 Apr 2009 19:51:06 +0000 (19:51 +0000)]
fix for centos5-based variant bootCD, where device mapper needs some help
Thierry Parmentelat [Wed, 25 Mar 2009 05:53:41 +0000 (05:53 +0000)]
Tagging module BootManager - BootManager-4.3-3
renumbered 4.3
New step StartRunLevelAgent
various other tweaks
Barış Metin [Mon, 23 Mar 2009 18:19:17 +0000 (18:19 +0000)]
fix the typo in the second line
Barış Metin [Mon, 23 Mar 2009 16:28:02 +0000 (16:28 +0000)]
move the comment as it causes problems with e100 and e1000.
this fixes the interface problem we had with plc 4.3 / centos5
Barış Metin [Thu, 19 Mar 2009 17:11:29 +0000 (17:11 +0000)]
create SYSIMG_PATH/{vservers,proc} before trying to mount them.
Thierry Parmentelat [Mon, 16 Mar 2009 20:45:08 +0000 (20:45 +0000)]
more renumbering 5.0 into 4.3
Thierry Parmentelat [Mon, 16 Mar 2009 14:21:11 +0000 (14:21 +0000)]
mass-renaming 5.0 into 4.3 - db still named planetlab5 and planetlab5.sql
Thierry Parmentelat [Mon, 16 Mar 2009 13:58:43 +0000 (13:58 +0000)]
svn-keywords
Stephen Soltesz [Fri, 13 Mar 2009 15:59:08 +0000 (15:59 +0000)]
merged from branch, to remove 'sorted' which is not supported in older
versions of python.
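For context, the `sorted` builtin only appeared in Python 2.4. A replacement that also works on older interpreters copies the sequence and sorts the copy in place; the helper name below is hypothetical and this is not necessarily how the commit was written.

```python
def sorted_copy(items):
    """Return a new sorted list without relying on the sorted() builtin."""
    result = list(items)  # copy, so the caller's sequence is left untouched
    result.sort()         # list.sort() exists in old Python versions too
    return result
```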
Stephen Soltesz [Thu, 26 Feb 2009 16:06:22 +0000 (16:06 +0000)]
same as on branch.
Thierry Parmentelat [Sun, 22 Feb 2009 23:26:24 +0000 (23:26 +0000)]
review dependencies globally : fewer are attached to myplc directly, and more are attached to the other PL subcomponents
Stephen Soltesz [Wed, 18 Feb 2009 23:47:38 +0000 (23:47 +0000)]
Patch to BootManager to implement the proposed run_level
http://lists.planet-lab.org/pipermail/devel/2009-February/003283.html
basically:
BM will save its session key to make it available to the RunlevelAgent.py
exported by monitor-runlevelagent package. This script is started and
runs continuously or until the system enters production.
build.sh copies the necessary files from the monitor-runlevelagent package
Stephen Soltesz [Sat, 14 Feb 2009 02:05:08 +0000 (02:05 +0000)]
preserve the 'session' variable across BootManager runs. This also makes the
file accessible to the RunlevelAgent.
Stephen Soltesz [Wed, 4 Feb 2009 16:10:41 +0000 (16:10 +0000)]
merge 'sorted' change into trunk.
Thierry Parmentelat [Fri, 30 Jan 2009 18:47:55 +0000 (18:47 +0000)]
another module to blacklist for poweredge 175
Thierry Parmentelat [Wed, 28 Jan 2009 22:58:09 +0000 (22:58 +0000)]
Tagging module BootManager - BootManager-5.0-2
most of the actual network config job moved to (py)plnet
support for RAWDISK
network interfaces deterministically sorted
does not use nodegroups anymore for getting node arch and other extensions
drop yum-based extensions
debug sshd started as early as possible
timestamped and uploadable logs (requires upload-bmlog.php from nodeconfig/)
cleaned up (drop support for bootcdv2)
still needs testing
Daniel Hokka Zakrisson [Mon, 19 Jan 2009 21:24:52 +0000 (21:24 +0000)]
Return the values from the API.
Stephen Soltesz [Sat, 10 Jan 2009 02:04:51 +0000 (02:04 +0000)]
run ValidateNodeInstall in debug/disabled/diagnose mode to fsck/mount the fs
before leaving the system. this is handy for admins who visit the node.
Stephen Soltesz [Sat, 10 Jan 2009 01:15:40 +0000 (01:15 +0000)]
merged from branch. run fsck before mounts.
Thierry Parmentelat [Thu, 18 Dec 2008 09:47:51 +0000 (09:47 +0000)]
a bit more explicit/helpful error message
Daniel Hokka Zakrisson [Wed, 17 Dec 2008 16:42:41 +0000 (16:42 +0000)]
Add support for a rawdisk model option, to let a slice use the drives which aren't required for the node/slices to function.
Daniel Hokka Zakrisson [Mon, 15 Dec 2008 22:05:06 +0000 (22:05 +0000)]
Use the correct variable.
Daniel Hokka Zakrisson [Mon, 15 Dec 2008 22:00:37 +0000 (22:00 +0000)]
Commit 11339 for the trunk.
Thierry Parmentelat [Fri, 28 Nov 2008 14:36:02 +0000 (14:36 +0000)]
renaming SliceAttribute into SliceTag and InterfaceSetting into InterfaceTag
Thierry Parmentelat [Tue, 25 Nov 2008 11:28:48 +0000 (11:28 +0000)]
fix how bm gets tags - test node now installs in 1'40'' for download+extract
Thierry Parmentelat [Mon, 24 Nov 2008 20:28:40 +0000 (20:28 +0000)]
download plain bootstrapfs tarball if the plain-bootstrapfs tag is set
Daniel Hokka Zakrisson [Thu, 20 Nov 2008 20:48:16 +0000 (20:48 +0000)]
Use the PCI slot to order the interfaces.
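Ordering by PCI slot could be sketched as below. The names are hypothetical, and the sketch assumes slots are given in the usual "domain:bus:device.function" hexadecimal form; the real code may obtain and compare slots differently.

```python
def pci_slot_key(slot):
    """Turn 'dddd:bb:dd.f' into a tuple of ints so slots compare numerically."""
    domain, bus, devfn = slot.split(":")
    dev, func = devfn.split(".")
    return (int(domain, 16), int(bus, 16), int(dev, 16), int(func, 16))

def order_interfaces(slots_by_name):
    """Return interface names ordered by the PCI slot each one sits in."""
    return sorted(slots_by_name,
                  key=lambda name: pci_slot_key(slots_by_name[name]))
```

Sorting on the slot rather than on the kernel-assigned name gives the same order on every boot, regardless of the order in which drivers were loaded.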
Thierry Parmentelat [Mon, 17 Nov 2008 14:07:59 +0000 (14:07 +0000)]
* improve availability - reliability : start a fallback sshd very early in the bm logic
* bm log upload (formerly known as alpina-logs) is avail. back again (requires nodeconfig-5.0-2)
* code cleanup : remove support for bootCD-2.x
see http://svn.planet-lab.org/ticket/427
Marc Fiuczynski [Wed, 5 Nov 2008 13:59:35 +0000 (13:59 +0000)]
Daniel rightly pointed out that my change to use
interface.get('hostname',hostname) instead of an if/then/else on
interface['hostname'] was not semantically equivalent. Reverted to
the original code for this case.
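One plausible way the two forms differ is shown here. The original code is not reproduced in this log, so the truthiness test in the second function is an assumption, as are both function names.

```python
def hostname_via_get(interface, hostname):
    # Falls back to `hostname` only when the key is absent.
    return interface.get('hostname', hostname)

def hostname_via_branch(interface, hostname):
    # Falls back whenever the stored value is empty or None
    # (assumed shape of the original if/then/else).
    if interface['hostname']:
        return interface['hostname']
    else:
        return hostname
```

With `interface = {'hostname': None}`, the first form returns `None` while the second returns the fallback, so the two are not interchangeable.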
Marc Fiuczynski [Tue, 4 Nov 2008 21:01:48 +0000 (21:01 +0000)]
take close() out of conditional
Marc Fiuczynski [Tue, 4 Nov 2008 21:01:28 +0000 (21:01 +0000)]
Use more python native coding convention
Thierry Parmentelat [Sun, 2 Nov 2008 03:11:56 +0000 (03:11 +0000)]
the extensions node tag
Thierry Parmentelat [Tue, 21 Oct 2008 15:53:12 +0000 (15:53 +0000)]
uses the extension tag rather than nodegroups
Marc Fiuczynski [Tue, 21 Oct 2008 10:28:43 +0000 (10:28 +0000)]
PCI_* now defined in pypci module
Thierry Parmentelat [Tue, 21 Oct 2008 08:58:08 +0000 (08:58 +0000)]
not all platforms have the python2 hook
Thierry Parmentelat [Tue, 21 Oct 2008 08:54:23 +0000 (08:54 +0000)]
cosmetic
Thierry Parmentelat [Tue, 21 Oct 2008 05:41:02 +0000 (05:41 +0000)]
quick fix for broken build
Marc Fiuczynski [Fri, 17 Oct 2008 21:38:06 +0000 (21:38 +0000)]
change handling of dynamically loaded drivers
Marc Fiuczynski [Fri, 17 Oct 2008 19:29:13 +0000 (19:29 +0000)]
Minor clean up:
- moved definitions of PCI_BASE_CLASS_NETWORK, PCI_BASE_CLASS_STORAGE,
and PCI_ANY out to the pypci module. This way the pl_hwinit code
in the BootCD can use those definitions.
- fixed variable reference in print statement for "Unable to read".
- cleaned up code to print device modules in the main() function
Stephen Soltesz [Thu, 2 Oct 2008 20:56:40 +0000 (20:56 +0000)]
merged changes from branch
Stephen Soltesz [Tue, 30 Sep 2008 21:50:47 +0000 (21:50 +0000)]
change order so old boot cds don't keep pulling the boot manager from plc.
and successfully write out 'CANCEL_BOOT' flag.
Stephen Soltesz [Tue, 30 Sep 2008 21:08:33 +0000 (21:08 +0000)]
Update MINIMUM_BOOT_VERSION to 3,0
Thierry Parmentelat [Thu, 25 Sep 2008 20:14:41 +0000 (20:14 +0000)]
imports fixed and revisited
Thierry Parmentelat [Thu, 25 Sep 2008 14:43:35 +0000 (14:43 +0000)]
attempt to display timestamps during boot manager steps
Thierry Parmentelat [Wed, 10 Sep 2008 15:52:57 +0000 (15:52 +0000)]
Tagging module BootManager - BootManager-5.0-1
reflects new names from the data model
Thierry Parmentelat [Fri, 22 Aug 2008 16:28:49 +0000 (16:28 +0000)]
repaired - thanks fred
Stephen Soltesz [Tue, 19 Aug 2008 21:33:21 +0000 (21:33 +0000)]
update to modules. catches other e1000-like modules in newer hpdc7800
hardware, also ignore the i82875p_edac module, since it seems to cause
planetlab-3.cs.princeton.edu to hang
Thierry Parmentelat [Thu, 14 Aug 2008 09:58:55 +0000 (09:58 +0000)]
fix build
Stephen Soltesz [Fri, 25 Jul 2008 21:02:56 +0000 (21:02 +0000)]
Change all boot_state names based on new names, also add dependency of
bootmanager on 5.0 version of PLCAPI.
Stephen Soltesz [Tue, 22 Jul 2008 00:15:38 +0000 (00:15 +0000)]
Remove the .sgn file also. Otherwise the symlinks will break the signature
Stephen Soltesz [Mon, 21 Jul 2008 23:41:36 +0000 (23:41 +0000)]
Accept an argument that designates a node group name. Based on this name
modify the configuration file to look for the bootstrap fs images in
/boot/NODEGROUP/
This is easier to do automatically, rather than by hand.
Marc Fiuczynski [Fri, 27 Jun 2008 20:12:04 +0000 (20:12 +0000)]
move the UpdateNodeConfiguration step after the NodeUpdate step in ChainBoot
Thierry Parmentelat [Wed, 28 May 2008 10:20:23 +0000 (10:20 +0000)]
tmp fix before we use node tags
Thierry Parmentelat [Tue, 27 May 2008 09:40:21 +0000 (09:40 +0000)]
...
Thierry Parmentelat [Mon, 26 May 2008 14:13:19 +0000 (14:13 +0000)]
moving towards 5.0
Thierry Parmentelat [Mon, 26 May 2008 13:07:12 +0000 (13:07 +0000)]
Branch 5.0 for module BootManager created from tag BootManager-3.2-7
Thierry Parmentelat [Sat, 24 May 2008 16:19:50 +0000 (16:19 +0000)]
Tagging module BootManager - BootManager-3.2-7
dont unload cpqphp
Thierry Parmentelat [Sat, 24 May 2008 16:18:31 +0000 (16:18 +0000)]
dont unload cpqphp
Thierry Parmentelat [Thu, 24 Apr 2008 17:01:27 +0000 (17:01 +0000)]
Tagging module BootManager - BootManager-3.2-6
changes in the state automaton logic
root+swap = 7G
usb-key threshold increased to 17 G
bootstrapfs selection logic altered - uses /etc/planetlab/nodefamily instead of GetPlcRelease
Thierry Parmentelat [Thu, 24 Apr 2008 07:47:51 +0000 (07:47 +0000)]
simpler heuristic, same as for creating bootcd - less code, fewer bugs
Thierry Parmentelat [Wed, 23 Apr 2008 15:35:06 +0000 (15:35 +0000)]
review how to figure nodefamily:
* use /etc/planetlab/nodefamily from the bootcd if present
* no more hard-coded list of known pldistros; has i386 and x86_64 hard-coded
* change semantics of nodegroup names:
** may be an arch, like 'x86_64'
** or a nodefamily as-is, e.g. 'onelab-x86_64'
Stephen Soltesz [Mon, 21 Apr 2008 18:34:24 +0000 (18:34 +0000)]
BM now recognizes two distinct forms of 'debug' mode.
1) 'dbg' comes from failing to reach 'boot' state.
2) 'diag' comes from an admin explicitly setting the node to
'diagnostic'/'debug' mode.
Previously there was no way to tell the diff. And, users trying to
reconfigure their machines, would reboot the node, but not reset the boot
state through the gui. The result would be that the machine would pull the
current boot state from PLC (which happened to be debug from previous
failures) and the node would appear to the user to still be broken. This adds
unnecessary time to the support@ list in misunderstanding what's going on for the
user, when everything is actually in place on the node; it's just in the wrong
boot state.
Thus, having BM automatically try to perform the 'bootRun()' when in 'dbg'
state will prevent a correct configuration from stalling the node from coming
online. In this way, we have moved the requirement for setting the boot state
through the GUI by the user to BM.
I think this is preferable.
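The distinction between the two debug states can be sketched as a small dispatch. Only the state names 'dbg' and 'diag' and the step name 'bootRun' come from the message above; the function name and the 'debugRun' step are hypothetical.

```python
def next_step(boot_state):
    """Pick the step to run for the two debug-like boot states."""
    if boot_state == 'dbg':
        # Reached after an earlier failure: retry the normal boot, so a node
        # whose configuration has since been fixed comes back online.
        return 'bootRun'
    if boot_state == 'diag':
        # Set explicitly by an admin: stay in debug mode.
        return 'debugRun'
    raise ValueError("not a debug state: %r" % (boot_state,))
```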
Stephen Soltesz [Mon, 21 Apr 2008 16:31:38 +0000 (16:31 +0000)]
Add additional size to ROOT_SIZE to allow netflow data to be collected without
running out of room. Our initial trials on the public pl with 3GB root
partitions resulted in out of space errors, and lost netflow logs.
A preferable solution might be to have an independent partition, or to have a
reserved space in the /vserver partition, but adding a new partition is too
large a change too soon prior to release of 4.2. And, putting non-slice data
into /vserver seems inappropriate design.
Stephen Soltesz [Fri, 11 Apr 2008 21:18:01 +0000 (21:18 +0000)]
Got complaints via RT that users are using usb sticks with 8GB of flash. This
causes BM to try to add the usb stick as part of the LVM volume. This fix
just increases the size of the flash device to avoid being included.
preferably, BM should avoid adding the device's bootmedia to the LVM. :-)
Thierry Parmentelat [Wed, 26 Mar 2008 09:34:01 +0000 (09:34 +0000)]
Tagging module BootManager - BootManager-3.2-5
renamed step InstallBootstrapRPM into InstallBootstrapFS
reviewed selection of bootstrapfs, based on nodegroups, for multi-arch deployment
import pypcimap rather than pypciscan
initial downloading of plc_config made more robust
root and /vservers file systems mounted ext3
calls to BootGetNodeDetails replaced with GetNodes/GetNodeNetworks
also seems to be using session-based authentication rather than former hmac-based one
Stephen Soltesz [Mon, 24 Mar 2008 16:33:36 +0000 (16:33 +0000)]
display the source file path in the log messages to determine specifically
which 'bootstrapfs' was downloaded, since we're using different directories to
differentiate nodegroup images.
Faiyaz Ahmed [Tue, 18 Mar 2008 20:13:43 +0000 (20:13 +0000)]
bump config version
Faiyaz Ahmed [Tue, 18 Mar 2008 19:48:44 +0000 (19:48 +0000)]
wrong api server.
Faiyaz Ahmed [Tue, 18 Mar 2008 15:57:22 +0000 (15:57 +0000)]
Key error. fixed. Sorry for the delay, Thierry.
Faiyaz Ahmed [Mon, 17 Mar 2008 19:22:58 +0000 (19:22 +0000)]
wrong key. Fixed.
Faiyaz Ahmed [Mon, 17 Mar 2008 15:44:36 +0000 (15:44 +0000)]
remove debug line.
Faiyaz Ahmed [Fri, 14 Mar 2008 20:53:12 +0000 (20:53 +0000)]
Keep session around instead of calling the API for a new one every time a call is made.
Faiyaz Ahmed [Fri, 14 Mar 2008 19:16:15 +0000 (19:16 +0000)]
BootNotifyOwners now accepts session auth.
Faiyaz Ahmed [Fri, 14 Mar 2008 03:08:43 +0000 (03:08 +0000)]
use Notify API function
Faiyaz Ahmed [Fri, 14 Mar 2008 02:07:09 +0000 (02:07 +0000)]
syntax error.
Faiyaz Ahmed [Fri, 14 Mar 2008 02:02:11 +0000 (02:02 +0000)]
get site_id
Faiyaz Ahmed [Fri, 14 Mar 2008 00:51:28 +0000 (00:51 +0000)]
Unclear why this was needed. Removed Boot API calls.
Faiyaz Ahmed [Thu, 13 Mar 2008 23:31:57 +0000 (23:31 +0000)]
dont know how this got in there, but reverting to original copy