From: Planet-Lab Support
Date: Fri, 21 Jan 2005 03:35:15 +0000 (+0000)
Subject: This commit was manufactured by cvs2svn to create tag
X-Git-Tag: before-ckrm_E16rc1-mem-controller-fix-merge^0
X-Git-Url: http://git.onelab.eu/?a=commitdiff_plain;h=43afcf71063ca7e32b981fc404737d7a78b545ac;p=linux-2.6.git

This commit was manufactured by cvs2svn to create tag 'before-ckrm_E16rc1-mem-controller-fix-merge'.
---

diff --git a/Documentation/ckrm/ckrm_basics b/Documentation/ckrm/ckrm_basics
deleted file mode 100644
index cfd9a9256..000000000
--- a/Documentation/ckrm/ckrm_basics
+++ /dev/null
@@ -1,66 +0,0 @@
-CKRM Basics
--------------
-A brief review of CKRM concepts and terminology will help make installation
-and testing easier. For more details, please visit http://ckrm.sf.net.
-
-Currently there are two class types, taskclass and socketclass for grouping,
-regulating and monitoring tasks and sockets respectively.
-
-To avoid repeating instructions for each classtype, this document assumes a
-task to be the kernel object being grouped. By and large, one can replace task
-with socket and taskclass with socketclass.
-
-RCFS depicts a CKRM class as a directory. Hierarchy of classes can be
-created in which children of a class share resources allotted to
-the parent. Tasks can be classified to any class which is at any level.
-There is no correlation between parent-child relationship of tasks and
-the parent-child relationship of classes they belong to.
-
-Without a Classification Engine, class is inherited by a task. A privileged
-user can reassigned a task to a class as described below, after which all
-the child tasks under that task will be assigned to that class, unless the
-user reassigns any of them.
-
-A Classification Engine, if one exists, will be used by CKRM to
-classify a task to a class. The Rule based classification engine uses some
-of the attributes of the task to classify a task. When a CE is present
-class is not inherited by a task.
-
-Characteristics of a class can be accessed/changed through the following magic
-files under the directory representing the class:
-
-shares: allows to change the shares of different resources managed by the
- class
-stats: allows to see the statistics associated with each resources managed
- by the class
-target: allows to assign a task to a class. If a CE is present, assigning
- a task to a class through this interface will prevent CE from
- reassigning the task to any class during reclassification.
-members: allows to see which tasks has been assigned to a class
-config: allow to view and modify configuration information of different
- resources in a class.
-
-Resource allocations for a class is controlled by the parameters:
-
-guarantee: specifies how much of a resource is guranteed to a class. A
- special value DONT_CARE(-2) mean that there is no specific
- guarantee of a resource is specified, this class may not get
- any resource if the system is runing short of resources
-limit: specifies the maximum amount of resource that is allowed to be
- allocated by a class. A special value DONT_CARE(-2) mean that
- there is no specific limit is specified, this class can get all
- the resources available.
-total_guarantee: total guarantee that is allowed among the children of this
- class. In other words, the sum of "guarantee"s of all children
- of this class cannot exit this number.
-max_limit: Maximum "limit" allowed for any of this class's children. In
- other words, "limit" of any children of this class cannot exceed
- this value.
-
-None of this parameters are absolute or have any units associated with
-them. These are just numbers(that are relative to its parents') that are
-used to calculate the absolute number of resource available for a specific
-class.
-
-Note: The root class has an absolute number of resource units associated with it.
-
diff --git a/Documentation/ckrm/core_usage b/Documentation/ckrm/core_usage
deleted file mode 100644
index 6b5d808c3..000000000
--- a/Documentation/ckrm/core_usage
+++ /dev/null
@@ -1,72 +0,0 @@
-Usage of CKRM without a classification engine
------------------------------------------------
-
-1. Create a class
-
- # mkdir /rcfs/taskclass/c1
- creates a taskclass named c1 , while
- # mkdir /rcfs/socket_class/s1
- creates a socketclass named s1
-
-The newly created class directory is automatically populated by magic files
-shares, stats, members, target and config.
-
-2. View default shares
-
- # cat /rcfs/taskclass/c1/shares
-
- "guarantee=-2,limit=-2,total_guarantee=100,max_limit=100" is the default
- value set for resources that have controllers registered with CKRM.
-
-3. change shares of a
-
- One or more of the following fields can/must be specified
- res= #mandatory
- guarantee=
- limit=
- total_guarantee=
- max_limit=
- e.g.
- # echo "res=numtasks,limit=20" > /rcfs/taskclass/c1
-
- If any of these parameters are not specified, the current value will be
- retained.
-
-4. Reclassify a task (listening socket)
-
- write the pid of the process to the destination class' target file
- # echo 1004 > /rcfs/taskclass/c1/target
-
- write the "\" string to the destination class' target file
- # echo "0.0.0.0\32770" > /rcfs/taskclass/c1/target
-
-5. Get a list of tasks (sockets) assigned to a taskclass (socketclass)
-
- # cat /rcfs/taskclass/c1/members
- lists pids of tasks belonging to c1
-
- # cat /rcfs/socket_class/s1/members
- lists the ipaddress\port of all listening sockets in s1
-
-6. Get the statictics of different resources of a class
-
- # cat /rcfs/tasksclass/c1/stats
- shows c1's statistics for each resource with a registered resource
- controller.
-
- # cat /rcfs/socket_class/s1/stats
- show's s1's stats for the listenaq controller.
-
-7.
View the configuration values of the resources associated with a class
-
- # cat /rcfs/taskclass/c1/config
- shows per-controller config values for c1.
-
-8. Change the configuration values of resources associated with a class
- Configuration values are different for different resources. the comman
- field "res=" must always be specified.
-
- # echo "res=numtasks,parameter=value" > /rcfs/taskclass/c1/config
- to change (without any effect), the value associated with .
-
-
diff --git a/Documentation/ckrm/crbce b/Documentation/ckrm/crbce
deleted file mode 100644
index dfb4b1e96..000000000
--- a/Documentation/ckrm/crbce
+++ /dev/null
@@ -1,33 +0,0 @@
-CRBCE
-----------
-
-crbce is a superset of rbce. In addition to providing automatic
-classification, the crbce module
-- monitors per-process delay data that is collected by the delay
-accounting patch
-- collects data on significant kernel events where reclassification
-could occur e.g. fork/exec/setuid/setgid etc., and
-- uses relayfs to supply both these datapoints to userspace
-
-To illustrate the utility of the data gathered by crbce, we provide a
-userspace daemon called crbcedmn that prints the header info received
-from the records sent by the crbce module.
-
-0. Ensure that a CKRM-enabled kernel with following options configured
- has been compiled. At a minimum, core, rcfs, atleast one classtype,
- delay-accounting patch and relayfs. For testing, it is recommended
- all classtypes and resource controllers be compiled as modules.
-
-1. Ensure that the Makefile's BUILD_CRBCE=1 and KDIR points to the
- kernel of step 1 and call make.
- This also builds the userspace daemon, crbcedmn.
-
-2..9 Same as rbce installation and testing instructions,
- except replacing rbce.ko with crbce.ko
-
-10. Read the pseudo daemon help file
- # ./crbcedmn -h
-
-11.
Run the crbcedmn to display all records being processed
- # ./crbcedmn
-
diff --git a/Documentation/ckrm/installation b/Documentation/ckrm/installation
deleted file mode 100644
index 0c9033891..000000000
--- a/Documentation/ckrm/installation
+++ /dev/null
@@ -1,70 +0,0 @@
-Kernel installation
------------------------------
-
- = version of mainline Linux kernel
- = version of CKRM
-
-Note: It is expected that CKRM versions will change fairly rapidly. Hence once
-a CKRM version has been released for some , it will only be made
-available for future 's until the next CKRM version is released.
-
-1. Patch
-
- Apply ckrm/kernel//ckrm-.patch to a mainline kernel
- tree with version .
-
- If CRBCE will be used, additionally apply the following patches, in order:
- delayacctg-.patch
- relayfs-.patch
-
-
-2. Configure
-
-Select appropriate configuration options:
-
-a. for taskclasses
-
- General Setup-->Class Based Kernel Resource Management
-
- [*] Class Based Kernel Resource Management
- Resource Class File System (User API)
- [*] Class Manager for Task Groups
- Number of Tasks Resource Manager
-
-b. To test socket_classes and multiple accept queue controller
-
- General Setup-->Class Based Kernel Resource Management
- [*] Class Based Kernel Resource Management
- Resource Class File System (User API)
- [*] Class Manager for socket groups
- Multiple Accept Queues Resource Manager
-
- Device Drivers-->Networking Support-->Networking options-->
- [*] Network packet filtering (replaces ipchains)
- [*] IP: TCP Multiple accept queues support
-
-c. To test CRBCE later (requires 2a.)
-
- File Systems-->Pseudo filesystems-->
- Relayfs filesystem support
- (enable all sub fields)
-
- General Setup-->
- [*] Enable delay accounting
-
-
-3. Build, boot into kernel
-
-4.
Enable rcfs
-
- # insmod /fs/rcfs/rcfs.ko
- # mount -t rcfs rcfs /rcfs
-
- This will create the directories /rcfs/taskclass and
- /rcfs/socketclass which are the "roots" of subtrees for creating
- taskclasses and socketclasses respectively.
-
-5. Load numtasks and listenaq controllers
-
- # insmod /kernel/ckrm/ckrm_tasks.ko
- # insmod /kernel/ckrm/ckrm_listenaq.ko
diff --git a/Documentation/ckrm/rbce_basics b/Documentation/ckrm/rbce_basics
deleted file mode 100644
index fd66ef2fb..000000000
--- a/Documentation/ckrm/rbce_basics
+++ /dev/null
@@ -1,67 +0,0 @@
-Rule-based Classification Engine (RBCE)
-------------------------------------------
-
-The ckrm/rbce directory contains the sources for two classification engines
-called rbce and crbce. Both are optional, built as kernel modules and share much
-of their codebase. Only one classification engine (CE) can be loaded at a time
-in CKRM.
-
-
-With RBCE, user can specify rules for how tasks are classified to a
-class. Rules are specified by one or more attribute-value pairs and
-an associated class. The tasks that match all the attr-value pairs
-will get classified to the class attached with the rule.
-
-The file rbce_info under /rcfs/ce directory details the functionality
-of different files available under the directory and also details
-about attributes that can are used to define rules.
-
-order: When multiple rules are defined the rules are executed
- according to the order of a rule. Order can be specified
- while defining a rule. If order is not specified, the
- highest order will be assigned to the rule(i.e, the new
- rule will be executed after all the previously defined
- evaluate false). So, order of rules is important as that
- will decide, which class a task will get assigned to.
For
- example, if we have the two following rules: r1:
- uid=1004,order=10,class=/rcfs/taskclass/c1 r2:
- uid=1004,cmd=grep,order=20,class=/rcfs/taskclass/c2 then,
- the task "grep" executed by user 1004 will always be
- assigned to class /rcfs/taskclass/c1, as rule r1 will be
- executed before r2 and the task successfully matched the
- rule's attr-value pairs. Rule r2 will never be consulted
- for the command. Note: The order in which the rules are
- displayed(by ls) has no correlation with the order of the
- rule.
-
-dependency: Rules can be defined to be depend on another rule. i.e a
- rule can be dependent on one rule and has its own
- additional attr-value pairs. the dependent rule will
- evaluate true only if all the attr-value pairs of both
- rules are satisfied. ex: r1: gid=502,class=/rcfs/taskclass
- r2: depend=r1,cmd=grep,class=rcfstaskclass/c1 r2 is a
- dependent rule that depends on r1, a task will be assigned
- to /rcfs/taskclass/c1 if its gid is 502 and the executable
- command name is "grep". If a task's gid is 502 but the
- command name is _not_ "grep" then it will be assigned to
- /rcfs/taskclass
-
- Note: The order of dependent rule must be _lesser_ than the
- rule it depends on, so that it is evaluated _before the
- base rule is evaluated. Otherwise the base rule will
- evaluate true and the task will be assigned to the class of
- that rule without the dependent rule ever getting
- evaluated. In the example above, order of r2 must be lesser
- than order of r1.
-
-app_tag: a task can be attached with a tag(ascii string), that becomes
- an attribute of that task and rules can be defined with the
- tag value.
-
-state: states are at two levels in RBCE. The entire RBCE can be
- enabled or disabled which writing 1 or 0 to the file
- rbce_state under /rcfs/ce. Disabling RBCE, would mean that
- the rules defined in RBCE will not be utilized for
- classifying a task to a class. A specific rule can be
- enabled/disabled by changing the state of that rule.
Once
- it is disabled, the rule will not be evaluated.
diff --git a/Documentation/ckrm/rbce_usage b/Documentation/ckrm/rbce_usage
deleted file mode 100644
index 6d1592646..000000000
--- a/Documentation/ckrm/rbce_usage
+++ /dev/null
@@ -1,98 +0,0 @@
-Usage of CKRM with RBCE
--------------------------
-
-0. Ensure that a CKRM-enabled kernel with following options configured
- has been compiled. At a minimum, core, rcfs and atleast one
- classtype. For testing, it is recommended all classtypes and
- resource controllers be compiled as modules.
-
-1. Change ckrm/rbce/Makefile's KDIR to point to this compiled kernel's source
- tree and call make
-
-2. Load rbce module.
- # insmod ckrm/rbce/rbce.ko
- Note that /rcfs has to be mounted before this.
- Note: this command should populate the directory /rcfs/ce with files
- rbce_reclassify, rbce_tag, rbce_info, rbce_state and a directory
- rules.
-
- Note2: If these are not created automatically, just create them by
- using the commands touch and mkdir.(bug that needs to be fixed)
-
-3. Defining a rule
- Rules are defined by creating(by writing) to a file under the
- /rcfs/ce/rules directory by concatinating multiple attribute value
- pairs.
-
- Note that the classes must be defined before defining rules that
- uses the classes. eg: the command # echo
- "uid=1004,class=/rcfs/taskclass/c1" > /rcfs/ce/rules/r1 will define
- a rule r1 that classifies all tasks belong to user id 1004 to class
- /rcfs/taskclass/c1
-
-4. Viewing a rule
- read the corresponding file.
- to read rule r1, issue the command:
- # cat /rcfs/ce/rules/r1
-
-5. Changing a rule
-
- Changing a rule is done the same way as defining a rule, the new
- rule will include the old set of attr-value pairs slapped with new
- attr-value pairs.
eg: if the current r2 is
- uid=1004,depend=r1,class=/rcfs/taskclass/c1
- (r1 as defined in step 3)
-
- the command:
- # echo gid=502 > /rcfs/ce/rules/r1
- will change the rule to
- r1: uid=1004,gid=502,depend=r1,class=/rcfs/taskclass/c1
-
- the command:
- # echo uid=1005 > /rcfs/ce/rules/r1
- will change the rule to
- r1: uid=1005,class=/rcfs/taskclass/c1
-
- the command:
- # echo class=/rcfs/taskclass/c2 > /rcfs/ce/rules/r1
- will change the rule to
- r1: uid=1004,depend=r1,class=/rcfs/taskclass/c2
-
- the command:
- # echo depend=r4 > /rcfs/ce/rules/r1
- will change the rule to
- r1: uid=1004,depend=r4,class=/rcfs/taskclass/c2
-
- the command:
- # echo +depend=r4 > /rcfs/ce/rules/r1
- will change the rule to
- r1: uid=1004,depend=r1,depend=r4,class=/rcfs/taskclass/c2
-
- the command:
- # echo -depend=r1 > /rcfs/ce/rules/r1
- will change the rule to
- r1: uid=1004,class=/rcfs/taskclass/c2
-
-6. Checking the state of RBCE
- State(enabled/disabled) of RBCE can be checked by reading the file
- /rcfs/ce/rbce_state, it will show 1(enabled) or 0(disabled).
- By default, RBCE is enabled(1).
- ex: # cat /rcfs/ce/rbce_state
-
-7. Changing the state of RBCE
- State of RBCE can be changed by writing 1(enable) or 0(disable).
- ex: # echo 1 > cat /rcfs/ce/rbce_state
-
-8. Checking the state of a rule
- State of a rule is displayed in the rule. Rule can be viewed by
- reading the rule file. ex: # cat /rcfs/ce/rules/r1
-
-9. Changing the state of a rule
-
- State of a rule can be changed by writing "state=1"(enable) or
- "state=0"(disable) to the corresponding rule file. By defeault, the
- rule is enabled when defined.
ex: to disable an existing rule r1, - issue the command - # echo "state=0" > /rcfs/ce/rules/r1 - - diff --git a/Makefile b/Makefile index 4d94580e0..68753a836 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ VERSION = 2 PATCHLEVEL = 6 SUBLEVEL = 8 -EXTRAVERSION = -1.521.2.5.planetlab +EXTRAVERSION = -1.521.1.planetlab NAME=Zonked Quokka # *DOCUMENTATION* diff --git a/configs/kernel-2.6.8-i686-planetlab.config b/configs/kernel-2.6.8-i686-planetlab.config index ea66387e5..5bed0a847 100644 --- a/configs/kernel-2.6.8-i686-planetlab.config +++ b/configs/kernel-2.6.8-i686-planetlab.config @@ -1053,7 +1053,12 @@ CONFIG_NLS_UTF8=m # # Kernel hacking # -# CONFIG_CRASH_DUMP is not set +CONFIG_CRASH_DUMP=y +CONFIG_CRASH_DUMP_BLOCKDEV=y +# CONFIG_CRASH_DUMP_NETDEV is not set +# CONFIG_CRASH_DUMP_MEMDEV is not set +# CONFIG_CRASH_DUMP_COMPRESS_RLE is not set +# CONFIG_CRASH_DUMP_COMPRESS_GZIP is not set CONFIG_DEBUG_KERNEL=y CONFIG_EARLY_PRINTK=y CONFIG_DEBUG_STACKOVERFLOW=y @@ -1079,7 +1084,7 @@ CONFIG_VSERVER_LEGACY=y CONFIG_INOXID_UGID24=y # CONFIG_INOXID_INTERN is not set # CONFIG_INOXID_RUNTIME is not set -# CONFIG_VSERVER_DEBUG is not set +CONFIG_VSERVER_DEBUG=y # # Security options diff --git a/fs/ext2/acl.c b/fs/ext2/acl.c index d232026b4..74acc7846 100644 --- a/fs/ext2/acl.c +++ b/fs/ext2/acl.c @@ -9,7 +9,6 @@ #include #include #include -#include #include "ext2.h" #include "xattr.h" #include "acl.h" @@ -292,9 +291,6 @@ ext2_permission(struct inode *inode, int mask, struct nameidata *nd) { int mode = inode->i_mode; - /* Prevent vservers from escaping chroot() barriers */ - if (IS_BARRIER(inode) && !vx_check(0, VX_ADMIN)) - return -EACCES; /* Nobody gets write access to a read-only fs */ if ((mask & MAY_WRITE) && (IS_RDONLY(inode) || (nd && MNT_IS_RDONLY(nd->mnt))) && diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c index fe9c6a13b..1ef02bccb 100644 --- a/fs/ext2/inode.c +++ b/fs/ext2/inode.c @@ -1030,7 +1030,7 @@ void ext2_set_inode_flags(struct inode *inode) { 
unsigned int flags = EXT2_I(inode)->i_flags; - inode->i_flags &= ~(S_SYNC|S_APPEND|S_IMMUTABLE|S_IUNLINK|S_BARRIER|S_NOATIME|S_DIRSYNC); + inode->i_flags &= ~(S_SYNC|S_APPEND|S_IMMUTABLE|S_NOATIME|S_DIRSYNC); if (flags & EXT2_SYNC_FL) inode->i_flags |= S_SYNC; if (flags & EXT2_APPEND_FL) diff --git a/fs/ext2/ioctl.c b/fs/ext2/ioctl.c index 594c16c80..f6043a6e2 100644 --- a/fs/ext2/ioctl.c +++ b/fs/ext2/ioctl.c @@ -50,11 +50,11 @@ int ext2_ioctl (struct inode * inode, struct file * filp, unsigned int cmd, * * This test looks nicer. Thanks to Pauline Middelink */ - if (((oldflags & EXT2_IMMUTABLE_FL) || + if ((oldflags & EXT2_IMMUTABLE_FL) || ((flags ^ oldflags) & - (EXT2_APPEND_FL | EXT2_IMMUTABLE_FL | EXT2_IUNLINK_FL))) - && !capable(CAP_LINUX_IMMUTABLE)) { - return -EPERM; + (EXT2_APPEND_FL | EXT2_IMMUTABLE_FL))) { + if (!capable(CAP_LINUX_IMMUTABLE)) + return -EPERM; } flags = flags & EXT2_FL_USER_MODIFIABLE; diff --git a/fs/ext3/acl.c b/fs/ext3/acl.c index e89cb306c..cc26948d5 100644 --- a/fs/ext3/acl.c +++ b/fs/ext3/acl.c @@ -11,7 +11,6 @@ #include #include #include -#include #include "xattr.h" #include "acl.h" @@ -297,9 +296,6 @@ ext3_permission(struct inode *inode, int mask, struct nameidata *nd) { int mode = inode->i_mode; - /* Prevent vservers from escaping chroot() barriers */ - if (IS_BARRIER(inode) && !vx_check(0, VX_ADMIN)) - return -EACCES; /* Nobody gets write access to a read-only fs */ if ((mask & MAY_WRITE) && (IS_RDONLY(inode) || (nd && nd->mnt && MNT_IS_RDONLY(nd->mnt))) && diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c index 7bc33d5f5..962aef215 100644 --- a/fs/ext3/inode.c +++ b/fs/ext3/inode.c @@ -2474,7 +2474,7 @@ void ext3_set_inode_flags(struct inode *inode) { unsigned int flags = EXT3_I(inode)->i_flags; - inode->i_flags &= ~(S_SYNC|S_APPEND|S_IMMUTABLE|S_IUNLINK|S_BARRIER|S_NOATIME|S_DIRSYNC); + inode->i_flags &= ~(S_SYNC|S_APPEND|S_IMMUTABLE|S_NOATIME|S_DIRSYNC); if (flags & EXT3_SYNC_FL) inode->i_flags |= S_SYNC; if (flags & 
EXT3_APPEND_FL) diff --git a/fs/ext3/ioctl.c b/fs/ext3/ioctl.c index f58d49736..37bd4509d 100644 --- a/fs/ext3/ioctl.c +++ b/fs/ext3/ioctl.c @@ -59,11 +59,11 @@ int ext3_ioctl (struct inode * inode, struct file * filp, unsigned int cmd, * * This test looks nicer. Thanks to Pauline Middelink */ - if (((oldflags & EXT3_IMMUTABLE_FL) || + if ((oldflags & EXT3_IMMUTABLE_FL) || ((flags ^ oldflags) & - (EXT3_APPEND_FL | EXT3_IMMUTABLE_FL | EXT3_IUNLINK_FL))) - && !capable(CAP_LINUX_IMMUTABLE)) { - return -EPERM; + (EXT3_APPEND_FL | EXT3_IMMUTABLE_FL))) { + if (!capable(CAP_LINUX_IMMUTABLE)) + return -EPERM; } /* diff --git a/fs/ioctl.c b/fs/ioctl.c index 6404b0c10..96a1b601e 100644 --- a/fs/ioctl.c +++ b/fs/ioctl.c @@ -173,19 +173,6 @@ asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg) error = vx_proc_ioctl(filp->f_dentry->d_inode, filp, cmd, arg); break; #endif - case FIOC_SETIATTR: - case FIOC_GETIATTR: - /* - * Verify that this filp is a file object, - * not (say) a socket. - */ - error = -ENOTTY; - if (S_ISREG(filp->f_dentry->d_inode->i_mode) || - S_ISDIR(filp->f_dentry->d_inode->i_mode)) - error = vc_iattr_ioctl(filp->f_dentry, - cmd, arg); - break; - default: error = -ENOTTY; if (S_ISREG(filp->f_dentry->d_inode->i_mode)) diff --git a/fs/namei.c b/fs/namei.c index 656430d6b..34da5b453 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -165,10 +165,6 @@ int vfs_permission(struct inode * inode, int mask) { umode_t mode = inode->i_mode; - /* Prevent vservers from escaping chroot() barriers */ - if (IS_BARRIER(inode) && !vx_check(0, VX_ADMIN)) - return -EACCES; - if (mask & MAY_WRITE) { /* * Nobody gets write access to a read-only fs. 
@@ -214,6 +210,20 @@ int vfs_permission(struct inode * inode, int mask) return -EACCES; } +static inline int xid_permission(struct inode *inode, int mask, struct nameidata *nd) +{ + if (inode->i_xid == 0) + return 0; + if (vx_check(inode->i_xid, VX_ADMIN|VX_WATCH|VX_IDENT)) + return 0; +/* + printk("VSW: xid=%d denied access to %p[#%d,%lu] »%*s«.\n", + vx_current_xid(), inode, inode->i_xid, inode->i_ino, + nd->dentry->d_name.len, nd->dentry->d_name.name); +*/ + return -EACCES; +} + int permission(struct inode * inode,int mask, struct nameidata *nd) { int retval; @@ -227,6 +237,8 @@ int permission(struct inode * inode,int mask, struct nameidata *nd) (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode))) return -EROFS; + if ((retval = xid_permission(inode, mask, nd))) + return retval; if (inode->i_op && inode->i_op->permission) retval = inode->i_op->permission(inode, submask, nd); else @@ -2013,13 +2025,8 @@ asmlinkage long sys_link(const char __user * oldname, const char __user * newnam error = path_lookup(to, LOOKUP_PARENT, &nd); if (error) goto out; - /* - * We allow hard-links to be created to a bind-mount as long - * as the bind-mount is not read-only. Checking for cross-dev - * links is subsumed by the superblock check in vfs_link(). 
- */ - error = -EROFS; - if (MNT_IS_RDONLY(old_nd.mnt)) + error = -EXDEV; + if (old_nd.mnt != nd.mnt) goto out_release; new_dentry = lookup_create(&nd, 0); error = PTR_ERR(new_dentry); diff --git a/fs/rcfs/magic.c b/fs/rcfs/magic.c index 1cada33e5..38281ee3d 100644 --- a/fs/rcfs/magic.c +++ b/fs/rcfs/magic.c @@ -504,7 +504,7 @@ shares_write(struct file *file, const char __user * buf, } } - printk(KERN_DEBUG "Set %s shares to %d %d %d %d\n", + printk(KERN_ERR "Set %s shares to %d %d %d %d\n", resname, newshares.my_guarantee, newshares.my_limit, diff --git a/fs/rcfs/rootdir.c b/fs/rcfs/rootdir.c index d827db662..6da575ed6 100644 --- a/fs/rcfs/rootdir.c +++ b/fs/rcfs/rootdir.c @@ -91,7 +91,7 @@ int rcfs_mkroot(struct rcfs_magf *mfdesc, int mfcount, struct dentry **rootde) return -EINVAL; rootdesc = &mfdesc[0]; - printk(KERN_DEBUG "allocating classtype root <%s>\n", rootdesc->name); + printk("allocating classtype root <%s>\n", rootdesc->name); dentry = rcfs_create_internal(rcfs_rootde, rootdesc, 0); if (!dentry) { diff --git a/fs/rcfs/super.c b/fs/rcfs/super.c index f013df226..871b7fb17 100644 --- a/fs/rcfs/super.c +++ b/fs/rcfs/super.c @@ -164,7 +164,7 @@ static int rcfs_fill_super(struct super_block *sb, void *data, int silent) clstype = ckrm_classtypes[i]; if (clstype == NULL) continue; - printk(KERN_DEBUG "A non null classtype\n"); + printk("A non null classtype\n"); if ((rc = rcfs_register_classtype(clstype))) continue; // could return with an error too diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c index a70801f35..f8babe603 100644 --- a/fs/reiserfs/xattr.c +++ b/fs/reiserfs/xattr.c @@ -1338,10 +1338,6 @@ __reiserfs_permission (struct inode *inode, int mask, struct nameidata *nd, { umode_t mode = inode->i_mode; - /* Prevent vservers from escaping chroot() barriers */ - if (IS_BARRIER(inode) && !vx_check(0, VX_ADMIN)) - return -EACCES; - if (mask & MAY_WRITE) { /* * Nobody gets write access to a read-only fs. 
diff --git a/include/linux/ckrm_classqueue.h b/include/linux/ckrm_classqueue.h index 3041c8179..a825336cb 100644 --- a/include/linux/ckrm_classqueue.h +++ b/include/linux/ckrm_classqueue.h @@ -28,8 +28,7 @@ #include -#define CLASSQUEUE_SIZE 1024 // acb: changed from 128 -//#define CLASSQUEUE_SIZE 128 +#define CLASSQUEUE_SIZE 128 #define CQ_BITMAP_SIZE ((((CLASSQUEUE_SIZE+1+7)/8)+sizeof(long)-1)/sizeof(long)) /** diff --git a/include/linux/ckrm_mem_inline.h b/include/linux/ckrm_mem_inline.h index 221f93601..815fdaa49 100644 --- a/include/linux/ckrm_mem_inline.h +++ b/include/linux/ckrm_mem_inline.h @@ -73,18 +73,11 @@ mem_class_get(ckrm_mem_res_t *cls) static inline void mem_class_put(ckrm_mem_res_t *cls) { - const char *name; if (cls && atomic_dec_and_test(&(cls->nr_users)) ) { - if (cls->core == NULL) { - name = "unknown"; - } else { - name = cls->core->name; - } - printk(KERN_DEBUG "freeing memclass %p of \n", cls, name); - + printk("freeing memclass %p of \n", cls, cls->core->name); // BUG_ON(ckrm_memclass_valid(cls)); - // kfree(cls); + //kfree(cls); } } diff --git a/include/linux/ckrm_sched.h b/include/linux/ckrm_sched.h index 3611c2d3e..62b3ba27a 100644 --- a/include/linux/ckrm_sched.h +++ b/include/linux/ckrm_sched.h @@ -293,7 +293,7 @@ void adjust_local_weight(void); #define CLASS_QUANTIZER 16 //shift from ns to increase class bonus #define PRIORITY_QUANTIZER 2 //controls how much a high prio task can borrow -#define CKRM_SHARE_ACCURACY 13 +#define CKRM_SHARE_ACCURACY 10 #define NSEC_PER_MS 1000000 #define NSEC_PER_JIFFIES (NSEC_PER_SEC/HZ) @@ -361,11 +361,7 @@ static inline int get_effective_prio(ckrm_lrq_t * lrq) int prio; prio = lrq->local_cvt >> CLASS_QUANTIZER; // cumulative usage -#ifndef URGENCY_SUPPORT -#warning "ACB removing urgency calculation from get_effective_prio" -#else prio += lrq->top_priority >> PRIORITY_QUANTIZER; // queue urgency -#endif return prio; } diff --git a/include/linux/ext2_fs.h b/include/linux/ext2_fs.h index 
cd252c8eb..7c6f650c9 100644 --- a/include/linux/ext2_fs.h +++ b/include/linux/ext2_fs.h @@ -196,13 +196,8 @@ struct ext2_group_desc #define EXT2_IUNLINK_FL 0x08000000 /* Immutable unlink */ #define EXT2_RESERVED_FL 0x80000000 /* reserved for ext2 lib */ -#ifdef CONFIG_VSERVER_LEGACY -#define EXT2_FL_USER_VISIBLE 0x0C03DFFF /* User visible flags */ -#define EXT2_FL_USER_MODIFIABLE 0x0C0380FF /* User modifiable flags */ -#else #define EXT2_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */ #define EXT2_FL_USER_MODIFIABLE 0x000380FF /* User modifiable flags */ -#endif /* * ioctl commands diff --git a/include/linux/ext3_fs.h b/include/linux/ext3_fs.h index 7fe32d0be..100fba908 100644 --- a/include/linux/ext3_fs.h +++ b/include/linux/ext3_fs.h @@ -189,13 +189,8 @@ struct ext3_group_desc #define EXT3_IUNLINK_FL 0x08000000 /* Immutable unlink */ #define EXT3_RESERVED_FL 0x80000000 /* reserved for ext3 lib */ -#ifdef CONFIG_VSERVER_LEGACY -#define EXT3_FL_USER_VISIBLE 0x0C03DFFF /* User visible flags */ -#define EXT3_FL_USER_MODIFIABLE 0x0C0380FF /* User modifiable flags */ -#else #define EXT3_FL_USER_VISIBLE 0x0003DFFF /* User visible flags */ #define EXT3_FL_USER_MODIFIABLE 0x000380FF /* User modifiable flags */ -#endif /* * Inode dynamic state flags diff --git a/include/linux/fs.h b/include/linux/fs.h index ece31a727..996b7378e 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -42,7 +42,7 @@ struct vfsmount; /* Fixed constants first: */ #undef NR_OPEN #define NR_OPEN (1024*1024) /* Absolute upper limit on fd num */ -#define INR_OPEN 4096 /* Initial setting for nfile rlimits */ +#define INR_OPEN 1024 /* Initial setting for nfile rlimits */ #define BLOCK_SIZE_BITS 10 #define BLOCK_SIZE (1< -DECLARE_LOCK_EXTERN(ip_pptp_lock); - -#define IP_CONNTR_PPTP PPTP_CONTROL_PORT - -#define PPTP_CONTROL_PORT 1723 - -#define PPTP_PACKET_CONTROL 1 -#define PPTP_PACKET_MGMT 2 - -#define PPTP_MAGIC_COOKIE 0x1a2b3c4d - -struct pptp_pkt_hdr { - __u16 packetLength; - __u16 
packetType;
- __u32 magicCookie;
-};
-
-/* PptpControlMessageType values */
-#define PPTP_START_SESSION_REQUEST 1
-#define PPTP_START_SESSION_REPLY 2
-#define PPTP_STOP_SESSION_REQUEST 3
-#define PPTP_STOP_SESSION_REPLY 4
-#define PPTP_ECHO_REQUEST 5
-#define PPTP_ECHO_REPLY 6
-#define PPTP_OUT_CALL_REQUEST 7
-#define PPTP_OUT_CALL_REPLY 8
-#define PPTP_IN_CALL_REQUEST 9
-#define PPTP_IN_CALL_REPLY 10
-#define PPTP_IN_CALL_CONNECT 11
-#define PPTP_CALL_CLEAR_REQUEST 12
-#define PPTP_CALL_DISCONNECT_NOTIFY 13
-#define PPTP_WAN_ERROR_NOTIFY 14
-#define PPTP_SET_LINK_INFO 15
-
-#define PPTP_MSG_MAX 15
-
-/* PptpGeneralError values */
-#define PPTP_ERROR_CODE_NONE 0
-#define PPTP_NOT_CONNECTED 1
-#define PPTP_BAD_FORMAT 2
-#define PPTP_BAD_VALUE 3
-#define PPTP_NO_RESOURCE 4
-#define PPTP_BAD_CALLID 5
-#define PPTP_REMOVE_DEVICE_ERROR 6
-
-struct PptpControlHeader {
- __u16 messageType;
- __u16 reserved;
-};
-
-/* FramingCapability Bitmap Values */
-#define PPTP_FRAME_CAP_ASYNC 0x1
-#define PPTP_FRAME_CAP_SYNC 0x2
-
-/* BearerCapability Bitmap Values */
-#define PPTP_BEARER_CAP_ANALOG 0x1
-#define PPTP_BEARER_CAP_DIGITAL 0x2
-
-struct PptpStartSessionRequest {
- __u16 protocolVersion;
- __u8 reserved1;
- __u8 reserved2;
- __u32 framingCapability;
- __u32 bearerCapability;
- __u16 maxChannels;
- __u16 firmwareRevision;
- __u8 hostName[64];
- __u8 vendorString[64];
-};
-
-/* PptpStartSessionResultCode Values */
-#define PPTP_START_OK 1
-#define PPTP_START_GENERAL_ERROR 2
-#define PPTP_START_ALREADY_CONNECTED 3
-#define PPTP_START_NOT_AUTHORIZED 4
-#define PPTP_START_UNKNOWN_PROTOCOL 5
-
-struct PptpStartSessionReply {
- __u16 protocolVersion;
- __u8 resultCode;
- __u8 generalErrorCode;
- __u32 framingCapability;
- __u32 bearerCapability;
- __u16 maxChannels;
- __u16 firmwareRevision;
- __u8 hostName[64];
- __u8 vendorString[64];
-};
-
-/* PptpStopReasons */
-#define PPTP_STOP_NONE 1
-#define PPTP_STOP_PROTOCOL 2
-#define PPTP_STOP_LOCAL_SHUTDOWN 3
-
-struct
PptpStopSessionRequest {
- __u8 reason;
-};
-
-/* PptpStopSessionResultCode */
-#define PPTP_STOP_OK 1
-#define PPTP_STOP_GENERAL_ERROR 2
-
-struct PptpStopSessionReply {
- __u8 resultCode;
- __u8 generalErrorCode;
-};
-
-struct PptpEchoRequest {
- __u32 identNumber;
-};
-
-/* PptpEchoReplyResultCode */
-#define PPTP_ECHO_OK 1
-#define PPTP_ECHO_GENERAL_ERROR 2
-
-struct PptpEchoReply {
- __u32 identNumber;
- __u8 resultCode;
- __u8 generalErrorCode;
- __u16 reserved;
-};
-
-/* PptpFramingType */
-#define PPTP_ASYNC_FRAMING 1
-#define PPTP_SYNC_FRAMING 2
-#define PPTP_DONT_CARE_FRAMING 3
-
-/* PptpCallBearerType */
-#define PPTP_ANALOG_TYPE 1
-#define PPTP_DIGITAL_TYPE 2
-#define PPTP_DONT_CARE_BEARER_TYPE 3
-
-struct PptpOutCallRequest {
- __u16 callID;
- __u16 callSerialNumber;
- __u32 minBPS;
- __u32 maxBPS;
- __u32 bearerType;
- __u32 framingType;
- __u16 packetWindow;
- __u16 packetProcDelay;
- __u16 reserved1;
- __u16 phoneNumberLength;
- __u16 reserved2;
- __u8 phoneNumber[64];
- __u8 subAddress[64];
-};
-
-/* PptpCallResultCode */
-#define PPTP_OUTCALL_CONNECT 1
-#define PPTP_OUTCALL_GENERAL_ERROR 2
-#define PPTP_OUTCALL_NO_CARRIER 3
-#define PPTP_OUTCALL_BUSY 4
-#define PPTP_OUTCALL_NO_DIAL_TONE 5
-#define PPTP_OUTCALL_TIMEOUT 6
-#define PPTP_OUTCALL_DONT_ACCEPT 7
-
-struct PptpOutCallReply {
- __u16 callID;
- __u16 peersCallID;
- __u8 resultCode;
- __u8 generalErrorCode;
- __u16 causeCode;
- __u32 connectSpeed;
- __u16 packetWindow;
- __u16 packetProcDelay;
- __u32 physChannelID;
-};
-
-struct PptpInCallRequest {
- __u16 callID;
- __u16 callSerialNumber;
- __u32 callBearerType;
- __u32 physChannelID;
- __u16 dialedNumberLength;
- __u16 dialingNumberLength;
- __u8 dialedNumber[64];
- __u8 dialingNumber[64];
- __u8 subAddress[64];
-};
-
-/* PptpInCallResultCode */
-#define PPTP_INCALL_ACCEPT 1
-#define PPTP_INCALL_GENERAL_ERROR 2
-#define PPTP_INCALL_DONT_ACCEPT 3
-
-struct PptpInCallReply {
- __u16 callID;
- __u16 peersCallID;
- __u8 resultCode;
- __u8
generalErrorCode; - __u16 packetWindow; - __u16 packetProcDelay; - __u16 reserved; -}; - -struct PptpInCallConnected { - __u16 peersCallID; - __u16 reserved; - __u32 connectSpeed; - __u16 packetWindow; - __u16 packetProcDelay; - __u32 callFramingType; -}; - -struct PptpClearCallRequest { - __u16 callID; - __u16 reserved; -}; - -struct PptpCallDisconnectNotify { - __u16 callID; - __u8 resultCode; - __u8 generalErrorCode; - __u16 causeCode; - __u16 reserved; - __u8 callStatistics[128]; -}; - -struct PptpWanErrorNotify { - __u16 peersCallID; - __u16 reserved; - __u32 crcErrors; - __u32 framingErrors; - __u32 hardwareOverRuns; - __u32 bufferOverRuns; - __u32 timeoutErrors; - __u32 alignmentErrors; -}; - -struct PptpSetLinkInfo { - __u16 peersCallID; - __u16 reserved; - __u32 sendAccm; - __u32 recvAccm; -}; - - -struct pptp_priv_data { - __u16 call_id; - __u16 mcall_id; - __u16 pcall_id; -}; - -union pptp_ctrl_union { - struct PptpStartSessionRequest sreq; - struct PptpStartSessionReply srep; - struct PptpStopSessionRequest streq; - struct PptpStopSessionReply strep; - struct PptpOutCallRequest ocreq; - struct PptpOutCallReply ocack; - struct PptpInCallRequest icreq; - struct PptpInCallReply icack; - struct PptpInCallConnected iccon; - struct PptpClearCallRequest clrreq; - struct PptpCallDisconnectNotify disc; - struct PptpWanErrorNotify wanerr; - struct PptpSetLinkInfo setlink; -}; - -#endif /* __KERNEL__ */ -#endif /* _CONNTRACK_PPTP_H */ diff --git a/include/linux/netfilter_ipv4/ip_conntrack_proto_gre.h b/include/linux/netfilter_ipv4/ip_conntrack_proto_gre.h deleted file mode 100644 index 07646857c..000000000 --- a/include/linux/netfilter_ipv4/ip_conntrack_proto_gre.h +++ /dev/null @@ -1,123 +0,0 @@ -#ifndef _CONNTRACK_PROTO_GRE_H -#define _CONNTRACK_PROTO_GRE_H -#include - -/* GRE PROTOCOL HEADER */ - -/* GRE Version field */ -#define GRE_VERSION_1701 0x0 -#define GRE_VERSION_PPTP 0x1 - -/* GRE Protocol field */ -#define GRE_PROTOCOL_PPTP 0x880B - -/* GRE Flags */ 
-#define GRE_FLAG_C 0x80 -#define GRE_FLAG_R 0x40 -#define GRE_FLAG_K 0x20 -#define GRE_FLAG_S 0x10 -#define GRE_FLAG_A 0x80 - -#define GRE_IS_C(f) ((f)&GRE_FLAG_C) -#define GRE_IS_R(f) ((f)&GRE_FLAG_R) -#define GRE_IS_K(f) ((f)&GRE_FLAG_K) -#define GRE_IS_S(f) ((f)&GRE_FLAG_S) -#define GRE_IS_A(f) ((f)&GRE_FLAG_A) - -/* GRE is a mess: Four different standards */ -struct gre_hdr { -#if defined(__LITTLE_ENDIAN_BITFIELD) - __u16 rec:3, - srr:1, - seq:1, - key:1, - routing:1, - csum:1, - version:3, - reserved:4, - ack:1; -#elif defined(__BIG_ENDIAN_BITFIELD) - __u16 csum:1, - routing:1, - key:1, - seq:1, - srr:1, - rec:3, - ack:1, - reserved:4, - version:3; -#else -#error "Adjust your defines" -#endif - __u16 protocol; -}; - -/* modified GRE header for PPTP */ -struct gre_hdr_pptp { - __u8 flags; /* bitfield */ - __u8 version; /* should be GRE_VERSION_PPTP */ - __u16 protocol; /* should be GRE_PROTOCOL_PPTP */ - __u16 payload_len; /* size of ppp payload, not inc. gre header */ - __u16 call_id; /* peer's call_id for this session */ - __u32 seq; /* sequence number. 
Present if S==1 */ - __u32 ack; /* seq number of highest packet recieved by */ - /* sender in this session */ -}; - - -/* this is part of ip_conntrack */ -struct ip_ct_gre { - unsigned int stream_timeout; - unsigned int timeout; -}; - -/* this is part of ip_conntrack_expect */ -struct ip_ct_gre_expect { - struct ip_ct_gre_keymap *keymap_orig, *keymap_reply; -}; - -#ifdef __KERNEL__ -struct ip_conntrack_expect; - -/* structure for original <-> reply keymap */ -struct ip_ct_gre_keymap { - struct list_head list; - - struct ip_conntrack_tuple tuple; -}; - - -/* add new tuple->key_reply pair to keymap */ -int ip_ct_gre_keymap_add(struct ip_conntrack_expect *exp, - struct ip_conntrack_tuple *t, - int reply); - -/* change an existing keymap entry */ -void ip_ct_gre_keymap_change(struct ip_ct_gre_keymap *km, - struct ip_conntrack_tuple *t); - -/* delete keymap entries */ -void ip_ct_gre_keymap_destroy(struct ip_conntrack_expect *exp); - - -/* get pointer to gre key, if present */ -static inline u_int32_t *gre_key(struct gre_hdr *greh) -{ - if (!greh->key) - return NULL; - if (greh->csum || greh->routing) - return (u_int32_t *) (greh+sizeof(*greh)+4); - return (u_int32_t *) (greh+sizeof(*greh)); -} - -/* get pointer ot gre csum, if present */ -static inline u_int16_t *gre_csum(struct gre_hdr *greh) -{ - if (!greh->csum) - return NULL; - return (u_int16_t *) (greh+sizeof(*greh)); -} - -#endif /* __KERNEL__ */ - -#endif /* _CONNTRACK_PROTO_GRE_H */ diff --git a/include/linux/netfilter_ipv4/ip_nat_pptp.h b/include/linux/netfilter_ipv4/ip_nat_pptp.h deleted file mode 100644 index eaf66c2e8..000000000 --- a/include/linux/netfilter_ipv4/ip_nat_pptp.h +++ /dev/null @@ -1,11 +0,0 @@ -/* PPTP constants and structs */ -#ifndef _NAT_PPTP_H -#define _NAT_PPTP_H - -/* conntrack private data */ -struct ip_nat_pptp { - u_int16_t pns_call_id; /* NAT'ed PNS call id */ - u_int16_t pac_call_id; /* NAT'ed PAC call id */ -}; - -#endif /* _NAT_PPTP_H */ diff --git a/include/linux/socket.h 
b/include/linux/socket.h
index 602d03b5d..4cd4850d7 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -269,9 +269,6 @@ struct ucred {
 #define SOL_NETBEUI 267
 #define SOL_LLC 268
-/* PlanetLab PL2525: reset the context ID of an existing socket */
-#define SO_SETXID SO_PEERCRED
-
 /* IPX options */
 #define IPX_TYPE 1
diff --git a/include/linux/vserver/inode.h b/include/linux/vserver/inode.h
index e19632d08..fc49aba6d 100644
--- a/include/linux/vserver/inode.h
+++ b/include/linux/vserver/inode.h
@@ -57,10 +57,6 @@ extern int vc_set_iattr_v0(uint32_t, void __user *);
 extern int vc_get_iattr(uint32_t, void __user *);
 extern int vc_set_iattr(uint32_t, void __user *);
-extern int vc_iattr_ioctl(struct dentry *de,
-			unsigned int cmd,
-			unsigned long arg);
-
 #endif /* __KERNEL__ */
 /* inode ioctls */
@@ -68,7 +64,4 @@ extern int vc_iattr_ioctl(struct dentry *de,
 #define FIOC_GETXFLG _IOR('x', 5, long)
 #define FIOC_SETXFLG _IOW('x', 6, long)
-#define FIOC_GETIATTR _IOR('x', 7, long)
-#define FIOC_SETIATTR _IOR('x', 8, long)
-
 #endif /* _VX_INODE_H */
diff --git a/include/net/sock.h b/include/net/sock.h
index a487663e0..0d3da11ec 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1086,10 +1086,8 @@ static inline int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	 * packet.
 	 */
 	if (inet_stream_ops.bind != inet_bind &&
-	    (int) sk->sk_xid > 0 && sk->sk_xid != skb->xid) {
-		err = -EPERM;
+	    (int) sk->sk_xid > 0 && sk->sk_xid != skb->xid)
 		goto out;
-	}
 	/* Cast skb->rcvbuf to unsigned... It's pointless, but reduces
 	   number of warnings when compiling with -W --ANK
diff --git a/kernel/ckrm/ckrm.c b/kernel/ckrm/ckrm.c
index f1cfb268c..2a611c1d0 100644
--- a/kernel/ckrm/ckrm.c
+++ b/kernel/ckrm/ckrm.c
@@ -444,7 +444,7 @@ ckrm_init_core_class(struct ckrm_classtype *clstype,
 	CLS_DEBUG("name %s => %p\n", name ? name : "default", dcore);
 	if ((dcore != clstype->default_class) && (!ckrm_is_core_valid(parent))){
-		printk(KERN_DEBUG "error not a valid parent %p\n", parent);
+		printk("error not a valid parent %p\n", parent);
 		return -EINVAL;
 	}
 #if 0
@@ -456,7 +456,7 @@ ckrm_init_core_class(struct ckrm_classtype *clstype,
 		    (void **)kmalloc(clstype->max_resid * sizeof(void *),
 				GFP_KERNEL);
 		if (dcore->res_class == NULL) {
-			printk(KERN_DEBUG "error no mem\n");
+			printk("error no mem\n");
 			return -ENOMEM;
 		}
 	}
@@ -532,10 +532,10 @@ void ckrm_free_core_class(struct ckrm_core_class *core)
 		  parent->name);
 	if (core->delayed) {
 		/* this core was marked as late */
-		printk(KERN_DEBUG "class <%s> finally deleted %lu\n", core->name, jiffies);
+		printk("class <%s> finally deleted %lu\n", core->name, jiffies);
 	}
 	if (ckrm_remove_child(core) == 0) {
-		printk(KERN_DEBUG "Core class removal failed. Chilren present\n");
+		printk("Core class removal failed. Chilren present\n");
 	}
 	for (i = 0; i < clstype->max_resid; i++) {
@@ -656,7 +656,7 @@ ckrm_register_res_ctlr(struct ckrm_classtype *clstype, ckrm_res_ctlr_t * rcbs)
 		 */
 		read_lock(&ckrm_class_lock);
 		list_for_each_entry(core, &clstype->classes, clslist) {
-			printk(KERN_INFO "CKRM .. create res clsobj for resouce <%s>"
+			printk("CKRM .. create res clsobj for resouce <%s>"
 			       "class <%s> par=%p\n", rcbs->res_name,
 			       core->name, core->hnode.parent);
 			ckrm_alloc_res_class(core, core->hnode.parent, resid);
@@ -833,7 +833,7 @@ int ckrm_unregister_event_set(struct ckrm_event_spec especs[])
 }
 #define ECC_PRINTK(fmt, args...) \
-// printk(KERN_DEBUG "%s: " fmt, __FUNCTION__ , ## args)
+// printk("%s: " fmt, __FUNCTION__ , ## args)
 void ckrm_invoke_event_cb_chain(enum ckrm_event ev, void *arg)
 {
@@ -978,7 +978,7 @@ void ckrm_cb_exit(struct task_struct *tsk)
 void __init ckrm_init(void)
 {
-	printk(KERN_DEBUG "CKRM Initialization\n");
+	printk("CKRM Initialization\n");
 	// register/initialize the Metatypes
@@ -996,7 +996,7 @@ void __init ckrm_init(void)
 #endif
 	// prepare init_task and then rely on inheritance of properties
 	ckrm_cb_newtask(&init_task);
-	printk(KERN_DEBUG "CKRM Initialization done\n");
+	printk("CKRM Initialization done\n");
 }
 EXPORT_SYMBOL(ckrm_register_engine);
diff --git a/kernel/ckrm/ckrm_cpu_class.c b/kernel/ckrm/ckrm_cpu_class.c
index 917875b18..09ea6ba80 100644
--- a/kernel/ckrm/ckrm_cpu_class.c
+++ b/kernel/ckrm/ckrm_cpu_class.c
@@ -318,7 +318,7 @@ static int ckrm_cpu_set_config(void *my_res, const char *cfgstr)
 	if (!cls)
 		return -EINVAL;
-	printk(KERN_DEBUG "ckrm_cpu config='%s'\n",cfgstr);
+	printk("ckrm_cpu config='%s'\n",cfgstr);
 	return 0;
 }
@@ -349,7 +349,7 @@ int __init init_ckrm_sched_res(void)
 	if (resid == -1) { /*not registered */
 		resid = ckrm_register_res_ctlr(clstype,&cpu_rcbs);
-		printk(KERN_DEBUG "........init_ckrm_sched_res , resid= %d\n",resid);
+		printk("........init_ckrm_sched_res , resid= %d\n",resid);
 	}
 	return 0;
 }
diff --git a/kernel/ckrm/ckrm_cpu_monitor.c b/kernel/ckrm/ckrm_cpu_monitor.c
index d8c199a20..259b9c9bb 100644
--- a/kernel/ckrm/ckrm_cpu_monitor.c
+++ b/kernel/ckrm/ckrm_cpu_monitor.c
@@ -929,12 +929,8 @@ void ckrm_cpu_monitor(int check_min)
 	if (update_max_demand(root_core) != 0)
 		goto outunlock;
-#ifndef ALLOC_SURPLUS_SUPPORT
-#warning "MEF taking out alloc_surplus"
-#else
 	if (alloc_surplus(root_core) != 0)
 		goto outunlock;
-#endif
 	adjust_local_weight();
@@ -963,7 +959,7 @@ static int ckrm_cpu_monitord(void *nothing)
 	}
 	cpu_monitor_pid = -1;
 	thread_exit = 2;
-	printk(KERN_DEBUG "cpu_monitord exit\n");
+	printk("cpu_monitord exit\n");
 	return 0;
 }
@@ -971,13 +967,13 @@ void ckrm_start_monitor(void)
 {
 	cpu_monitor_pid = kernel_thread(ckrm_cpu_monitord, 0, CLONE_KERNEL);
 	if (cpu_monitor_pid < 0) {
-		printk(KERN_DEBUG "ckrm_cpu_monitord for failed\n");
+		printk("ckrm_cpu_monitord for failed\n");
 	}
 }
 void ckrm_kill_monitor(void)
 {
-	printk(KERN_DEBUG "killing process %d\n", cpu_monitor_pid);
+	printk("killing process %d\n", cpu_monitor_pid);
 	if (cpu_monitor_pid > 0) {
 		thread_exit = 1;
 		while (thread_exit != 2) {
diff --git a/kernel/ckrm/ckrm_laq.c b/kernel/ckrm/ckrm_laq.c
index b64205a06..3271f10bd 100644
--- a/kernel/ckrm/ckrm_laq.c
+++ b/kernel/ckrm/ckrm_laq.c
@@ -477,7 +477,7 @@ int __init init_ckrm_laq_res(void)
 		resid = ckrm_register_res_ctlr(clstype, &laq_rcbs);
 		if (resid >= 0)
 			my_resid = resid;
-		printk(KERN_DEBUG "........init_ckrm_listen_aq_res -> %d\n", my_resid);
+		printk("........init_ckrm_listen_aq_res -> %d\n", my_resid);
 	}
 	return 0;
diff --git a/kernel/ckrm/ckrm_mem.c b/kernel/ckrm/ckrm_mem.c
index c6c594a96..4d38cda49 100644
--- a/kernel/ckrm/ckrm_mem.c
+++ b/kernel/ckrm/ckrm_mem.c
@@ -192,13 +192,11 @@ mem_res_free(void *my_res)
 		child_guarantee_changed(&parres->shares, res->shares.my_guarantee, 0);
 		child_maxlimit_changed_local(parres);
 	}
-	ckrm_mem_evaluate_all_pages();
-	res->core = NULL;
-
 	spin_lock(&ckrm_mem_lock);
 	list_del(&res->mcls_list);
 	spin_unlock(&ckrm_mem_lock);
 	mem_class_put(res);
+	ckrm_mem_evaluate_all_pages();
 	return;
 }
@@ -585,9 +583,6 @@ ckrm_get_reclaim_bits(unsigned int *flags, unsigned int *extract)
 void
 ckrm_at_limit(ckrm_mem_res_t *cls)
 {
-#ifndef AT_LIMIT_SUPPORT
-#warning "ckrm_at_limit disabled due to problems with memory hog tests"
-#else
 	struct zone *zone;
 	unsigned long now = jiffies;
@@ -611,7 +606,6 @@ ckrm_at_limit(ckrm_mem_res_t *cls)
 		wakeup_kswapd(zone);
 		break; // only once is enough
 	}
-#endif // AT_LIMIT_SUPPORT
 }
 static int unmapped = 0, changed = 0, unchanged = 0, maxnull = 0,
@@ -736,7 +730,7 @@ ckrm_mem_evaluate_all_pages()
 		}
 		spin_unlock_irq(&zone->lru_lock);
 	}
-	printk(KERN_DEBUG "all_pages: active %d inactive %d cleared %d\n",
+	printk("all_pages: active %d inactive %d cleared %d\n",
 			active, inactive, cleared);
 	spin_lock(&ckrm_mem_lock);
 	list_for_each_entry(res, &ckrm_memclass_list, mcls_list) {
@@ -746,7 +740,7 @@ ckrm_mem_evaluate_all_pages()
 			inact_cnt += res->nr_inactive[idx];
 			idx++;
 		}
-		printk(KERN_DEBUG "all_pages: %s: tmp_cnt %d; act_cnt %d inact_cnt %d\n",
+		printk("all_pages: %s: tmp_cnt %d; act_cnt %d inact_cnt %d\n",
 			res->core->name, res->tmp_cnt, act_cnt, inact_cnt);
 	}
 	spin_unlock(&ckrm_mem_lock);
diff --git a/kernel/ckrm/ckrm_numtasks.c b/kernel/ckrm/ckrm_numtasks.c
index 61517aee0..23b3549d4 100644
--- a/kernel/ckrm/ckrm_numtasks.c
+++ b/kernel/ckrm/ckrm_numtasks.c
@@ -453,7 +453,7 @@ static int numtasks_set_config(void *my_res, const char *cfgstr)
 	if (!res)
 		return -EINVAL;
-	printk(KERN_DEBUG "numtasks config='%s'\n", cfgstr);
+	printk("numtasks config='%s'\n", cfgstr);
 	return 0;
 }
@@ -505,7 +505,7 @@ int __init init_ckrm_numtasks_res(void)
 	if (resid == -1) {
 		resid = ckrm_register_res_ctlr(clstype, &numtasks_rcbs);
-		printk(KERN_DEBUG "........init_ckrm_numtasks_res -> %d\n", resid);
+		printk("........init_ckrm_numtasks_res -> %d\n", resid);
 		if (resid != -1) {
 			ckrm_numtasks_register(numtasks_get_ref_local,
 					numtasks_put_ref_local);
diff --git a/kernel/ckrm/ckrm_sockc.c b/kernel/ckrm/ckrm_sockc.c
index 8ccadfa39..7137dc2a9 100644
--- a/kernel/ckrm/ckrm_sockc.c
+++ b/kernel/ckrm/ckrm_sockc.c
@@ -470,7 +470,7 @@ sock_forced_reclassify(struct ckrm_core_class *target, const char *options)
 		return -EPERM;
 	if (id != 0)
 		return -EINVAL;
-	printk(KERN_DEBUG "sock_class: reclassify all not net implemented\n");
+	printk("sock_class: reclassify all not net implemented\n");
 	return 0;
 }
@@ -553,7 +553,7 @@ static void sock_reclassify_class(struct ckrm_sock_class *cls)
 void __init ckrm_meta_init_sockclass(void)
 {
-	printk(KERN_DEBUG "...... Initializing ClassType<%s> ........\n",
+	printk("...... Initializing ClassType<%s> ........\n",
 	       CT_sockclass.name);
 	// intialize the default class
 	ckrm_init_core_class(&CT_sockclass, class_core(&sockclass_dflt_class),
diff --git a/kernel/ckrm/ckrm_tasks.c b/kernel/ckrm/ckrm_tasks.c
new file mode 100644
index 000000000..ee539216e
--- /dev/null
+++ b/kernel/ckrm/ckrm_tasks.c
@@ -0,0 +1,519 @@
+/* ckrm_numtasks.c - "Number of tasks" resource controller for CKRM
+ *
+ * Copyright (C) Chandra Seetharaman, IBM Corp. 2003
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+/* Changes
+ *
+ * 31 Mar 2004: Created
+ *
+ */
+
+/*
+ * Code Description: TBD
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define TOTAL_NUM_TASKS (131072) // 128 K
+#define NUMTASKS_DEBUG
+#define NUMTASKS_NAME "numtasks"
+
+typedef struct ckrm_numtasks {
+	struct ckrm_core_class *core; // the core i am part of...
+	struct ckrm_core_class *parent; // parent of the core above.
+	struct ckrm_shares shares;
+	spinlock_t cnt_lock; // always grab parent's lock before child's
+	int cnt_guarantee; // num_tasks guarantee in local units
+	int cnt_unused; // has to borrow if more than this is needed
+	int cnt_limit; // no tasks over this limit.
+	atomic_t cnt_cur_alloc; // current alloc from self
+	atomic_t cnt_borrowed; // borrowed from the parent
+
+	int over_guarantee; // turn on/off when cur_alloc goes
+			// over/under guarantee
+
+	// internally maintained statictics to compare with max numbers
+	int limit_failures; // # failures as request was over the limit
+	int borrow_sucesses; // # successful borrows
+	int borrow_failures; // # borrow failures
+
+	// Maximum the specific statictics has reached.
+	int max_limit_failures;
+	int max_borrow_sucesses;
+	int max_borrow_failures;
+
+	// Total number of specific statistics
+	int tot_limit_failures;
+	int tot_borrow_sucesses;
+	int tot_borrow_failures;
+} ckrm_numtasks_t;
+
+struct ckrm_res_ctlr numtasks_rcbs;
+
+/* Initialize rescls values
+ * May be called on each rcfs unmount or as part of error recovery
+ * to make share values sane.
+ * Does not traverse hierarchy reinitializing children.
+ */
+static void numtasks_res_initcls_one(ckrm_numtasks_t * res)
+{
+	res->shares.my_guarantee = CKRM_SHARE_DONTCARE;
+	res->shares.my_limit = CKRM_SHARE_DONTCARE;
+	res->shares.total_guarantee = CKRM_SHARE_DFLT_TOTAL_GUARANTEE;
+	res->shares.max_limit = CKRM_SHARE_DFLT_MAX_LIMIT;
+	res->shares.unused_guarantee = CKRM_SHARE_DFLT_TOTAL_GUARANTEE;
+	res->shares.cur_max_limit = 0;
+
+	res->cnt_guarantee = CKRM_SHARE_DONTCARE;
+	res->cnt_unused = CKRM_SHARE_DONTCARE;
+	res->cnt_limit = CKRM_SHARE_DONTCARE;
+
+	res->over_guarantee = 0;
+
+	res->limit_failures = 0;
+	res->borrow_sucesses = 0;
+	res->borrow_failures = 0;
+
+	res->max_limit_failures = 0;
+	res->max_borrow_sucesses = 0;
+	res->max_borrow_failures = 0;
+
+	res->tot_limit_failures = 0;
+	res->tot_borrow_sucesses = 0;
+	res->tot_borrow_failures = 0;
+
+	atomic_set(&res->cnt_cur_alloc, 0);
+	atomic_set(&res->cnt_borrowed, 0);
+	return;
+}
+
+#if 0
+static void numtasks_res_initcls(void *my_res)
+{
+	ckrm_numtasks_t *res = my_res;
+
+	/* Write a version which propagates values all the way down
+	   and replace rcbs callback with that version */
+
+}
+#endif
+
+static int numtasks_get_ref_local(void *arg, int force)
+{
+	int rc, resid = numtasks_rcbs.resid;
+	ckrm_numtasks_t *res;
+	ckrm_core_class_t *core = arg;
+
+	if ((resid < 0) || (core == NULL))
+		return 1;
+
+	res = ckrm_get_res_class(core, resid, ckrm_numtasks_t);
+	if (res == NULL)
+		return 1;
+
+	atomic_inc(&res->cnt_cur_alloc);
+
+	rc = 1;
+	if (((res->parent) && (res->cnt_unused == CKRM_SHARE_DONTCARE)) ||
+	    (atomic_read(&res->cnt_cur_alloc) > res->cnt_unused)) {
+
+		rc = 0;
+		if (!force && (res->cnt_limit != CKRM_SHARE_DONTCARE) &&
+		    (atomic_read(&res->cnt_cur_alloc) > res->cnt_limit)) {
+			res->limit_failures++;
+			res->tot_limit_failures++;
+		} else if (res->parent != NULL) {
+			if ((rc =
+			     numtasks_get_ref_local(res->parent, force)) == 1) {
+				atomic_inc(&res->cnt_borrowed);
+				res->borrow_sucesses++;
+				res->tot_borrow_sucesses++;
+				res->over_guarantee = 1;
+			} else {
+				res->borrow_failures++;
+				res->tot_borrow_failures++;
+			}
+		} else {
+			rc = force;
+		}
+	} else if (res->over_guarantee) {
+		res->over_guarantee = 0;
+
+		if (res->max_limit_failures < res->limit_failures) {
+			res->max_limit_failures = res->limit_failures;
+		}
+		if (res->max_borrow_sucesses < res->borrow_sucesses) {
+			res->max_borrow_sucesses = res->borrow_sucesses;
+		}
+		if (res->max_borrow_failures < res->borrow_failures) {
+			res->max_borrow_failures = res->borrow_failures;
+		}
+		res->limit_failures = 0;
+		res->borrow_sucesses = 0;
+		res->borrow_failures = 0;
+	}
+
+	if (!rc) {
+		atomic_dec(&res->cnt_cur_alloc);
+	}
+	return rc;
+}
+
+static void numtasks_put_ref_local(void *arg)
+{
+	int resid = numtasks_rcbs.resid;
+	ckrm_numtasks_t *res;
+	ckrm_core_class_t *core = arg;
+
+	if ((resid == -1) || (core == NULL)) {
+		return;
+	}
+
+	res = ckrm_get_res_class(core, resid, ckrm_numtasks_t);
+	if (res == NULL)
+		return;
+	atomic_dec(&res->cnt_cur_alloc);
+	if (atomic_read(&res->cnt_borrowed) > 0) {
+		atomic_dec(&res->cnt_borrowed);
+		numtasks_put_ref_local(res->parent);
+	}
+	return;
+}
+
+static void *numtasks_res_alloc(struct ckrm_core_class *core,
+				struct ckrm_core_class *parent)
+{
+	ckrm_numtasks_t *res;
+
+	res = kmalloc(sizeof(ckrm_numtasks_t), GFP_ATOMIC);
+
+	if (res) {
+		memset(res, 0, sizeof(ckrm_numtasks_t));
+		res->core = core;
+		res->parent = parent;
+		numtasks_res_initcls_one(res);
+		res->cnt_lock = SPIN_LOCK_UNLOCKED;
+		if (parent == NULL) {
+			// I am part of root class. So set the max tasks
+			// to available default
+			res->cnt_guarantee = TOTAL_NUM_TASKS;
+			res->cnt_unused = TOTAL_NUM_TASKS;
+			res->cnt_limit = TOTAL_NUM_TASKS;
+		}
+		try_module_get(THIS_MODULE);
+	} else {
+		printk(KERN_ERR
+		       "numtasks_res_alloc: failed GFP_ATOMIC alloc\n");
+	}
+	return res;
+}
+
+/*
+ * No locking of this resource class object necessary as we are not
+ * supposed to be assigned (or used) when/after this function is called.
+ */
+static void numtasks_res_free(void *my_res)
+{
+	ckrm_numtasks_t *res = my_res, *parres, *childres;
+	ckrm_core_class_t *child = NULL;
+	int i, borrowed, maxlimit, resid = numtasks_rcbs.resid;
+
+	if (!res)
+		return;
+
+	// Assuming there will be no children when this function is called
+
+	parres = ckrm_get_res_class(res->parent, resid, ckrm_numtasks_t);
+
+	if (unlikely(atomic_read(&res->cnt_cur_alloc) != 0 ||
+		     atomic_read(&res->cnt_borrowed))) {
+		printk(KERN_ERR
+		       "numtasks_res_free: resource still alloc'd %p\n", res);
+		if ((borrowed = atomic_read(&res->cnt_borrowed)) > 0) {
+			for (i = 0; i < borrowed; i++) {
+				numtasks_put_ref_local(parres->core);
+			}
+		}
+	}
+	// return child's limit/guarantee to parent node
+	spin_lock(&parres->cnt_lock);
+	child_guarantee_changed(&parres->shares, res->shares.my_guarantee, 0);
+
+	// run thru parent's children and get the new max_limit of the parent
+	ckrm_lock_hier(parres->core);
+	maxlimit = 0;
+	while ((child = ckrm_get_next_child(parres->core, child)) != NULL) {
+		childres = ckrm_get_res_class(child, resid, ckrm_numtasks_t);
+		if (maxlimit < childres->shares.my_limit) {
+			maxlimit = childres->shares.my_limit;
+		}
+	}
+	ckrm_unlock_hier(parres->core);
+	if (parres->shares.cur_max_limit < maxlimit) {
+		parres->shares.cur_max_limit = maxlimit;
+	}
+
+	spin_unlock(&parres->cnt_lock);
+	kfree(res);
+	module_put(THIS_MODULE);
+	return;
+}
+
+/*
+ * Recalculate the guarantee and limit in real units... and propagate the
+ * same to children.
+ * Caller is responsible for protecting res and for the integrity of parres
+ */
+static void
+recalc_and_propagate(ckrm_numtasks_t * res, ckrm_numtasks_t * parres)
+{
+	ckrm_core_class_t *child = NULL;
+	ckrm_numtasks_t *childres;
+	int resid = numtasks_rcbs.resid;
+
+	if (parres) {
+		struct ckrm_shares *par = &parres->shares;
+		struct ckrm_shares *self = &res->shares;
+
+		// calculate cnt_guarantee and cnt_limit
+		//
+		if (parres->cnt_guarantee == CKRM_SHARE_DONTCARE) {
+			res->cnt_guarantee = CKRM_SHARE_DONTCARE;
+		} else if (par->total_guarantee) {
+			res->cnt_guarantee =
+			    (self->my_guarantee * parres->cnt_guarantee)
+			    / par->total_guarantee;
+		} else {
+			res->cnt_guarantee = 0;
+		}
+
+		if (parres->cnt_limit == CKRM_SHARE_DONTCARE) {
+			res->cnt_limit = CKRM_SHARE_DONTCARE;
+		} else if (par->max_limit) {
+			res->cnt_limit = (self->my_limit * parres->cnt_limit)
+			    / par->max_limit;
+		} else {
+			res->cnt_limit = 0;
+		}
+
+		// Calculate unused units
+		if (res->cnt_guarantee == CKRM_SHARE_DONTCARE) {
+			res->cnt_unused = CKRM_SHARE_DONTCARE;
+		} else if (self->total_guarantee) {
+			res->cnt_unused = (self->unused_guarantee *
+					   res->cnt_guarantee) /
+			    self->total_guarantee;
+		} else {
+			res->cnt_unused = 0;
+		}
+	}
+	// propagate to children
+	ckrm_lock_hier(res->core);
+	while ((child = ckrm_get_next_child(res->core, child)) != NULL) {
+		childres = ckrm_get_res_class(child, resid, ckrm_numtasks_t);
+
+		spin_lock(&childres->cnt_lock);
+		recalc_and_propagate(childres, res);
+		spin_unlock(&childres->cnt_lock);
+	}
+	ckrm_unlock_hier(res->core);
+	return;
+}
+
+static int numtasks_set_share_values(void *my_res, struct ckrm_shares *new)
+{
+	ckrm_numtasks_t *parres, *res = my_res;
+	struct ckrm_shares *cur = &res->shares, *par;
+	int rc = -EINVAL, resid = numtasks_rcbs.resid;
+
+	if (!res)
+		return rc;
+
+	if (res->parent) {
+		parres =
+		    ckrm_get_res_class(res->parent, resid, ckrm_numtasks_t);
+		spin_lock(&parres->cnt_lock);
+		spin_lock(&res->cnt_lock);
+		par = &parres->shares;
+	} else {
+		spin_lock(&res->cnt_lock);
+		par = NULL;
+		parres = NULL;
+	}
+
+	rc = set_shares(new, cur, par);
+
+	if ((rc == 0) && parres) {
+		// Calculate parent's unused units
+		if (parres->cnt_guarantee == CKRM_SHARE_DONTCARE) {
+			parres->cnt_unused = CKRM_SHARE_DONTCARE;
+		} else if (par->total_guarantee) {
+			parres->cnt_unused = (par->unused_guarantee *
+					      parres->cnt_guarantee) /
+			    par->total_guarantee;
+		} else {
+			parres->cnt_unused = 0;
+		}
+		recalc_and_propagate(res, parres);
+	}
+	spin_unlock(&res->cnt_lock);
+	if (res->parent) {
+		spin_unlock(&parres->cnt_lock);
+	}
+	return rc;
+}
+
+static int numtasks_get_share_values(void *my_res, struct ckrm_shares *shares)
+{
+	ckrm_numtasks_t *res = my_res;
+
+	if (!res)
+		return -EINVAL;
+	*shares = res->shares;
+	return 0;
+}
+
+static int numtasks_get_stats(void *my_res, struct seq_file *sfile)
+{
+	ckrm_numtasks_t *res = my_res;
+
+	if (!res)
+		return -EINVAL;
+
+	seq_printf(sfile, "Number of tasks resource:\n");
+	seq_printf(sfile, "Total Over limit failures: %d\n",
+		   res->tot_limit_failures);
+	seq_printf(sfile, "Total Over guarantee sucesses: %d\n",
+		   res->tot_borrow_sucesses);
+	seq_printf(sfile, "Total Over guarantee failures: %d\n",
+		   res->tot_borrow_failures);
+
+	seq_printf(sfile, "Maximum Over limit failures: %d\n",
+		   res->max_limit_failures);
+	seq_printf(sfile, "Maximum Over guarantee sucesses: %d\n",
+		   res->max_borrow_sucesses);
+	seq_printf(sfile, "Maximum Over guarantee failures: %d\n",
+		   res->max_borrow_failures);
+#ifdef NUMTASKS_DEBUG
+	seq_printf(sfile,
+		   "cur_alloc %d; borrowed %d; cnt_guar %d; cnt_limit %d "
+		   "unused_guarantee %d, cur_max_limit %d\n",
+		   atomic_read(&res->cnt_cur_alloc),
+		   atomic_read(&res->cnt_borrowed), res->cnt_guarantee,
+		   res->cnt_limit, res->shares.unused_guarantee,
+		   res->shares.cur_max_limit);
+#endif
+
+	return 0;
+}
+
+static int numtasks_show_config(void *my_res, struct seq_file *sfile)
+{
+	ckrm_numtasks_t *res = my_res;
+
+	if (!res)
+		return -EINVAL;
+
+	seq_printf(sfile, "res=%s,parameter=somevalue\n", NUMTASKS_NAME);
+	return 0;
+}
+
+static int numtasks_set_config(void *my_res, const char *cfgstr)
+{
+	ckrm_numtasks_t *res = my_res;
+
+	if (!res)
+		return -EINVAL;
+	printk("numtasks config='%s'\n", cfgstr);
+	return 0;
+}
+
+static void numtasks_change_resclass(void *task, void *old, void *new)
+{
+	ckrm_numtasks_t *oldres = old;
+	ckrm_numtasks_t *newres = new;
+
+	if (oldres != (void *)-1) {
+		struct task_struct *tsk = task;
+		if (!oldres) {
+			struct ckrm_core_class *old_core =
+			    &(tsk->parent->taskclass->core);
+			oldres =
+			    ckrm_get_res_class(old_core, numtasks_rcbs.resid,
+					       ckrm_numtasks_t);
+		}
+		numtasks_put_ref_local(oldres->core);
+	}
+	if (newres) {
+		(void)numtasks_get_ref_local(newres->core, 1);
+	}
+}
+
+struct ckrm_res_ctlr numtasks_rcbs = {
+	.res_name = NUMTASKS_NAME,
+	.res_hdepth = 1,
+	.resid = -1,
+	.res_alloc = numtasks_res_alloc,
+	.res_free = numtasks_res_free,
+	.set_share_values = numtasks_set_share_values,
+	.get_share_values = numtasks_get_share_values,
+	.get_stats = numtasks_get_stats,
+	.show_config = numtasks_show_config,
+	.set_config = numtasks_set_config,
+	.change_resclass = numtasks_change_resclass,
+};
+
+int __init init_ckrm_numtasks_res(void)
+{
+	struct ckrm_classtype *clstype;
+	int resid = numtasks_rcbs.resid;
+
+	clstype = ckrm_find_classtype_by_name("taskclass");
+	if (clstype == NULL) {
+		printk(KERN_INFO " Unknown ckrm classtype");
+		return -ENOENT;
+	}
+
+	if (resid == -1) {
+		resid = ckrm_register_res_ctlr(clstype, &numtasks_rcbs);
+		printk("........init_ckrm_numtasks_res -> %d\n", resid);
+		if (resid != -1) {
+			ckrm_numtasks_register(numtasks_get_ref_local,
+					       numtasks_put_ref_local);
+			numtasks_rcbs.classtype = clstype;
+		}
+	}
+	return 0;
+}
+
+void __exit exit_ckrm_numtasks_res(void)
+{
+	if (numtasks_rcbs.resid != -1) {
+		ckrm_numtasks_register(NULL, NULL);
+	}
+	ckrm_unregister_res_ctlr(&numtasks_rcbs);
+	numtasks_rcbs.resid = -1;
+}
+
+module_init(init_ckrm_numtasks_res)
+    module_exit(exit_ckrm_numtasks_res)
+
+    MODULE_LICENSE("GPL");
diff --git a/kernel/ckrm/ckrm_tasks_stub.c b/kernel/ckrm/ckrm_tasks_stub.c
new file mode 100644
index 000000000..179e6b5d6
--- /dev/null
+++ b/kernel/ckrm/ckrm_tasks_stub.c
@@ -0,0 +1,59 @@
+/* ckrm_tasks_stub.c - Stub file for ckrm_tasks modules
+ *
+ * Copyright (C) Chandra Seetharaman, IBM Corp. 2004
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+/* Changes
+ *
+ * 16 May 2004: Created
+ *
+ */
+
+#include
+#include
+#include
+
+static spinlock_t stub_lock = SPIN_LOCK_UNLOCKED;
+
+static get_ref_t real_get_ref = NULL;
+static put_ref_t real_put_ref = NULL;
+
+void ckrm_numtasks_register(get_ref_t gr, put_ref_t pr)
+{
+	spin_lock(&stub_lock);
+	real_get_ref = gr;
+	real_put_ref = pr;
+	spin_unlock(&stub_lock);
+}
+
+int numtasks_get_ref(void *arg, int force)
+{
+	int ret = 1;
+	spin_lock(&stub_lock);
+	if (real_get_ref) {
+		ret = (*real_get_ref) (arg, force);
+	}
+	spin_unlock(&stub_lock);
+	return ret;
+}
+
+void numtasks_put_ref(void *arg)
+{
+	spin_lock(&stub_lock);
+	if (real_put_ref) {
+		(*real_put_ref) (arg);
+	}
+	spin_unlock(&stub_lock);
+}
+
+EXPORT_SYMBOL(ckrm_numtasks_register);
+EXPORT_SYMBOL(numtasks_get_ref);
+EXPORT_SYMBOL(numtasks_put_ref);
diff --git a/kernel/ckrm/ckrm_tc.c b/kernel/ckrm/ckrm_tc.c
index af95644f2..23ebb3a20 100644
--- a/kernel/ckrm/ckrm_tc.c
+++ b/kernel/ckrm/ckrm_tc.c
@@ -318,7 +318,7 @@ static void cb_taskclass_fork(struct task_struct *tsk)
 		ckrm_task_unlock(tsk->parent);
 	}
 	if (!list_empty(&tsk->taskclass_link))
-		printk(KERN_WARNING "BUG in cb_fork.. tsk (%s:%d> already linked\n",
+		printk("BUG in cb_fork.. tsk (%s:%d> already linked\n",
 			tsk->comm, tsk->pid);
 	ckrm_set_taskclass(tsk, cls, NULL, CKRM_EVENT_FORK);
@@ -669,7 +669,7 @@ static int ckrm_free_task_class(struct ckrm_core_class *core)
 void __init ckrm_meta_init_taskclass(void)
 {
-	printk(KERN_DEBUG "...... Initializing ClassType<%s> ........\n",
+	printk("...... Initializing ClassType<%s> ........\n",
 	       CT_taskclass.name);
 	// intialize the default class
 	ckrm_init_core_class(&CT_taskclass, class_core(&taskclass_dflt_class),
@@ -737,7 +737,7 @@ void check_tasklist_sanity(struct ckrm_task_class *cls)
 		class_lock(core);
 		if (list_empty(&core->objlist)) {
 			class_lock(core);
-			printk(KERN_DEBUG "check_tasklist_sanity: class %s empty list\n",
+			printk("check_tasklist_sanity: class %s empty list\n",
 			       core->name);
 			return;
 		}
@@ -746,14 +746,14 @@ void check_tasklist_sanity(struct ckrm_task_class *cls)
 			    container_of(lh1, struct task_struct, taskclass_link);
 			if (count++ > 20000) {
-				printk(KERN_WARNING "list is CORRUPTED\n");
+				printk("list is CORRUPTED\n");
 				break;
 			}
 			if (tsk->taskclass != cls) {
 				const char *tclsname;
 				tclsname = (tsk->taskclass) ?
 					class_core(tsk->taskclass)->name:"NULL";
-				printk(KERN_WARNING "sanity: task %s:%d has ckrm_core "
+				printk("sanity: task %s:%d has ckrm_core "
 				       "|%s| but in list |%s|\n", tsk->comm,
 				       tsk->pid, tclsname, core->name);
 			}
@@ -767,7 +767,7 @@ void ckrm_debug_free_task_class(struct ckrm_task_class *tskcls)
 	struct task_struct *proc, *thread;
 	int count = 0;
-	printk(KERN_DEBUG "Analyze Error <%s> %d\n",
+	printk("Analyze Error <%s> %d\n",
 	       class_core(tskcls)->name,
 	       atomic_read(&(class_core(tskcls)->refcnt)));
@@ -779,7 +779,7 @@ void ckrm_debug_free_task_class(struct ckrm_task_class *tskcls)
 			const char *tclsname;
 			tclsname = (thread->taskclass) ?
 				class_core(thread->taskclass)->name :"NULL";
-			printk(KERN_DEBUG "%d thread=<%s:%d> -> <%s> <%lx>\n", count,
+			printk("%d thread=<%s:%d> -> <%s> <%lx>\n", count,
 			       thread->comm, thread->pid, tclsname,
 			       thread->flags & PF_EXITING);
 		}
@@ -787,7 +787,7 @@ void ckrm_debug_free_task_class(struct ckrm_task_class *tskcls)
 	class_unlock(class_core(tskcls));
 	read_unlock(&tasklist_lock);
-	printk(KERN_DEBUG "End Analyze Error <%s> %d\n",
+	printk("End Analyze Error <%s> %d\n",
 	       class_core(tskcls)->name,
 	       atomic_read(&(class_core(tskcls)->refcnt)));
 }
diff --git a/kernel/ckrm/rbce/bitvector.h b/kernel/ckrm/rbce/bitvector.h
index 098cc2327..4f53f9847 100644
--- a/kernel/ckrm/rbce/bitvector.h
+++ b/kernel/ckrm/rbce/bitvector.h
@@ -136,12 +136,12 @@ inline static void bitvector_print(int flag, bitvector_t * vec)
 		return;
 	}
 	if (vec == NULL) {
-		printk(KERN_DEBUG "v<0>-NULL\n");
+		printk("v<0>-NULL\n");
 		return;
 	}
-	printk(KERN_DEBUG "v<%d>-", sz = vec->size);
+	printk("v<%d>-", sz = vec->size);
 	for (i = 0; i < sz; i++) {
-		printk(KERN_DEBUG "%c", test_bit(i, vec->bits) ? '1' : '0');
+		printk("%c", test_bit(i, vec->bits) ? '1' : '0');
 	}
 	return;
 }
diff --git a/kernel/ckrm/rbce/rbce_fs.c b/kernel/ckrm/rbce/rbce_fs.c
index 187e7cdba..7c1dbf230 100644
--- a/kernel/ckrm/rbce/rbce_fs.c
+++ b/kernel/ckrm/rbce/rbce_fs.c
@@ -422,7 +422,7 @@ static struct inode_operations rbce_dir_inode_operations = {
 static void rbce_put_super(struct super_block *sb)
 {
 	module_put(THIS_MODULE);
-	printk(KERN_DEBUG "rbce_put_super called\n");
+	printk("rbce_put_super called\n");
 }
 static struct super_operations rbce_ops = {
diff --git a/kernel/ckrm/rbce/rbcemod.c b/kernel/ckrm/rbce/rbcemod.c
index 555ba0a4e..4d5f40aef 100644
--- a/kernel/ckrm/rbce/rbcemod.c
+++ b/kernel/ckrm/rbce/rbcemod.c
@@ -254,7 +254,7 @@ int rbcedebug = 0x00;
 #define DBG_RULE ( 0x20 )
 #define DBG_POLICY ( 0x40 )
-#define DPRINTK(x, y...) if (rbcedebug & (x)) printk(KERN_DEBUG y)
+#define DPRINTK(x, y...) if (rbcedebug & (x)) printk(y)
 // debugging selectively enabled through /proc/sys/debug/rbce
 static void print_context_vectors(void)
@@ -265,9 +265,9 @@ static void print_context_vectors(void)
 		return;
 	}
 	for (i = 0; i < NUM_TERM_MASK_VECTOR; i++) {
-		printk(KERN_DEBUG "%d: ", i);
+		printk("%d: ", i);
 		bitvector_print(DBG_OPTIMIZATION, gl_mask_vecs[i]);
-		printk(KERN_DEBUG "\n");
+		printk("\n");
 	}
 }
 #else
@@ -506,7 +506,7 @@ rbce_class_deletecb(const char *classname, void *classobj, int classtype)
 	}
 	notify_class_action(cls, 0);
 	cls->classobj = NULL;
-	list_for_each_entry(pos, &rules_list[classtype], link) {
+	list_for_each_entry(pos, &rules_list[cls->classtype], link) {
 		rule = (struct rbce_rule *)pos;
 		if (rule->target_class) {
 			if (!strcmp
@@ -1816,7 +1816,7 @@ static inline int valid_pdata(struct rbce_private_data *pdata)
 		}
 	}
 	spin_unlock(&pdata_lock);
-	printk(KERN_WARNING "INVALID/CORRUPT PDATA %p\n", pdata);
+	printk("INVALID/CORRUPT PDATA %p\n", pdata);
 	return 0;
 }
@@ -1829,7 +1829,7 @@ static inline void store_pdata(struct rbce_private_data *pdata)
 	while (i < MAX_PDATA) {
 		if (pdata_arr[pdata_next] == NULL) {
-			printk(KERN_DEBUG "storing %p at %d, count %d\n", pdata,
+			printk("storing %p at %d, count %d\n", pdata,
 			       pdata_next, pdata_count);
 			pdata_arr[pdata_next++] = pdata;
 			if (pdata_next == MAX_PDATA) {
@@ -1844,7 +1844,7 @@ static inline void store_pdata(struct rbce_private_data *pdata)
 		spin_unlock(&pdata_lock);
 	}
 	if (i == MAX_PDATA) {
-		printk(KERN_DEBUG "PDATA BUFFER FULL pdata_count %d pdata %p\n",
+		printk("PDATA BUFFER FULL pdata_count %d pdata %p\n",
 		       pdata_count, pdata);
 	}
 }
@@ -1856,7 +1856,7 @@ static inline void unstore_pdata(struct rbce_private_data *pdata)
 	spin_lock(&pdata_lock);
 	for (i = 0; i < MAX_PDATA; i++) {
 		if (pdata_arr[i] == pdata) {
-			printk(KERN_DEBUG "unstoring %p at %d, count %d\n", pdata,
+			printk("unstoring %p at %d, count %d\n", pdata,
 			       i, pdata_count);
 			pdata_arr[i] = NULL;
 			pdata_count--;
@@ -1866,7 +1866,7 @@ static inline void
unstore_pdata(struct rbce_private_data *pdata) } spin_unlock(&pdata_lock); if (i == MAX_PDATA) { - printk(KERN_DEBUG "pdata %p not found in the stored array\n", + printk("pdata %p not found in the stored array\n", pdata); } } @@ -1929,7 +1929,7 @@ static struct rbce_private_data *create_private_data(struct rbce_private_data // pdata->evaluate = src->evaluate; // if(src->app_tag) { // int len = strlen(src->app_tag)+1; - // printk(KERN_DEBUG "CREATE_PRIVATE: apptag %s len %d\n", + // printk("CREATE_PRIVATE: apptag %s len %d\n", // src->app_tag,len); // pdata->app_tag = kmalloc(len, GFP_ATOMIC); // if (pdata->app_tag) { @@ -2262,7 +2262,7 @@ void *rbce_tc_classify(enum ckrm_event event, ...) * [ CKRM_LATCHABLE_EVENTS .. CKRM_NONLATCHABLE_EVENTS ) */ - // printk(KERN_DEBUG "tc_classify %p:%d:%s '%s'\n",tsk,tsk->pid, + // printk("tc_classify %p:%d:%s '%s'\n",tsk,tsk->pid, // tsk->comm,event_names[event]); switch (event) { @@ -2314,7 +2314,7 @@ void *rbce_tc_classify(enum ckrm_event event, ...) break; } - // printk(KERN_DEBUG "tc_classify %p:%d:%s '%s' ==> %p\n",tsk,tsk->pid, + // printk("tc_classify %p:%d:%s '%s' ==> %p\n",tsk,tsk->pid, // tsk->comm,event_names[event],cls); return cls; @@ -2323,7 +2323,7 @@ void *rbce_tc_classify(enum ckrm_event event, ...) 
#ifndef RBCE_EXTENSION static void rbce_tc_notify(int event, void *core, struct task_struct *tsk) { - printk(KERN_DEBUG "tc_manual %p:%d:%s '%s'\n", tsk, tsk->pid, tsk->comm, + printk("tc_manual %p:%d:%s '%s'\n", tsk, tsk->pid, tsk->comm, event_names[event]); if (event != CKRM_EVENT_MANUAL) return; @@ -2409,10 +2409,10 @@ static void unregister_classtype_engines(void) while (ceptr->name) { if (*ceptr->clsvar >= 0) { - printk(KERN_DEBUG "ce unregister with <%s>\n",ceptr->name); + printk("ce unregister with <%s>\n",ceptr->name); while ((rc = ckrm_unregister_engine(ceptr->name)) == -EAGAIN) ; - printk(KERN_DEBUG "ce unregister with <%s> rc=%d\n",ceptr->name,rc); + printk("ce unregister with <%s> rc=%d\n",ceptr->name,rc); *ceptr->clsvar = -1; } ceptr++; @@ -2426,7 +2426,7 @@ static int register_classtype_engines(void) while (ceptr->name) { rc = ckrm_register_engine(ceptr->name, ceptr->cbs); - printk(KERN_DEBUG "ce register with <%s> typeId=%d\n",ceptr->name,rc); + printk("ce register with <%s> typeId=%d\n",ceptr->name,rc); if ((rc < 0) && (rc != -ENOENT)) { unregister_classtype_engines(); return (rc); @@ -2506,7 +2506,7 @@ int init_rbce(void) { int rc, i, line; - printk(KERN_DEBUG "<1>\nInstalling \'%s\' module\n", modname); + printk("<1>\nInstalling \'%s\' module\n", modname); for (i = 0; i < CKRM_MAX_CLASSTYPES; i++) { INIT_LIST_HEAD(&rules_list[i]); @@ -2555,7 +2555,7 @@ int init_rbce(void) exit_rbce_ext(); out: - printk(KERN_DEBUG "<1>%s: error installing rc=%d line=%d\n", __FUNCTION__, rc, + printk("<1>%s: error installing rc=%d line=%d\n", __FUNCTION__, rc, line); return rc; } @@ -2564,19 +2564,19 @@ void exit_rbce(void) { int i; - printk(KERN_DEBUG "<1>Removing \'%s\' module\n", modname); + printk("<1>Removing \'%s\' module\n", modname); stop_debug(); exit_rbce_ext(); // Print warnings if lists are not empty, which is a bug if (!list_empty(&class_list)) { - printk(KERN_DEBUG "exit_rbce: Class list is not empty\n"); + printk("exit_rbce: Class list is not 
empty\n"); } for (i = 0; i < CKRM_MAX_CLASSTYPES; i++) { if (!list_empty(&rules_list[i])) { - printk(KERN_DEBUG "exit_rbce: Rules list for classtype %d" + printk("exit_rbce: Rules list for classtype %d" " is not empty\n", i); } } diff --git a/kernel/ckrm/rbce/rbcemod_ext.c b/kernel/ckrm/rbce/rbcemod_ext.c index 3cae550f7..b0c6ee9aa 100644 --- a/kernel/ckrm/rbce/rbcemod_ext.c +++ b/kernel/ckrm/rbce/rbcemod_ext.c @@ -62,10 +62,10 @@ static int ukcc_fileop_notify(int rchan_id, { static int readers = 0; if (fileop == RELAY_FILE_OPEN) { - // printk(KERN_DEBUG "got fileop_notify RELAY_FILE_OPEN for file %p\n", + // printk("got fileop_notify RELAY_FILE_OPEN for file %p\n", // filp); if (readers) { - printk(KERN_DEBUG "only one client allowed, backoff .... \n"); + printk("only one client allowed, backoff .... \n"); return -EPERM; } if (!try_module_get(THIS_MODULE)) @@ -74,7 +74,7 @@ static int ukcc_fileop_notify(int rchan_id, client_attached(); } else if (fileop == RELAY_FILE_CLOSE) { - // printk(KERN_DEBUG "got fileop_notify RELAY_FILE_CLOSE for file %p\n", + // printk("got fileop_notify RELAY_FILE_CLOSE for file %p\n", // filp); client_detached(); readers--; @@ -109,10 +109,10 @@ static int create_ukcc_channel(void) channel_flags, &ukcc_callbacks, 0, 0, 0, 0, 0, 0, NULL, 0); if (ukcc_channel < 0) - printk(KERN_DEBUG "crbce: ukcc creation failed, errcode: %d\n", + printk("crbce: ukcc creation failed, errcode: %d\n", ukcc_channel); else - printk(KERN_DEBUG "crbce: ukcc created (%u KB)\n", + printk("crbce: ukcc created (%u KB)\n", UKCC_TOTAL_BUFFER_SIZE >> 10); return ukcc_channel; } @@ -144,9 +144,9 @@ static inline void close_ukcc_channel(void) (r),(l),-1,NULL) > 0); \ chan_state = chan_isok ? 
UKCC_OK : UKCC_STANDBY; \ if (chan_wasok && !chan_isok) { \ - printk(KERN_DEBUG "Channel stalled\n"); \ + printk("Channel stalled\n"); \ } else if (!chan_wasok && chan_isok) { \ - printk(KERN_DEBUG "Channel continues\n"); \ + printk("Channel continues\n"); \ } \ } while (0) @@ -288,7 +288,7 @@ send_task_record(struct task_struct *tsk, int event, return 0; pdata = RBCE_DATA(tsk); if (pdata == NULL) { - // printk(KERN_DEBUG "send [%d]<%s>: no pdata\n",tsk->pid,tsk->comm); + // printk("send [%d]<%s>: no pdata\n",tsk->pid,tsk->comm); return 0; } if (send_forced || (delta_mode == 0) @@ -384,7 +384,7 @@ static void send_task_data(void) rec_set_timehdr(&limrec, CRBCE_REC_DATA_DELIMITER, 0, 0); rec_send(&limrec); - // printk(KERN_DEBUG "send_task_data mode=%d t#=%d s#=%d\n", + // printk("send_task_data mode=%d t#=%d s#=%d\n", // delta_mode,taskcnt,sendcnt); } @@ -503,7 +503,7 @@ static void sample_task_data(unsigned long unused) } while_each_thread(proc, thread); read_unlock(&tasklist_lock); -// printk(KERN_DEBUG "sample_timer: run=%d wait=%d\n",run,wait); +// printk("sample_timer: run=%d wait=%d\n",run,wait); start_sample_timer(); } @@ -513,7 +513,7 @@ static void ukcc_cmd_deliver(int rchan_id, char *from, u32 len) struct crbce_cmd_done cmdret; int rc = 0; -// printk(KERN_DEBUG "ukcc_cmd_deliver: %d %d len=%d:%d\n",cmdrec->type, +// printk("ukcc_cmd_deliver: %d %d len=%d:%d\n",cmdrec->type, // cmdrec->cmd,cmdrec->len,len); cmdrec->len = len; // add this to reflection so the user doesn't @@ -578,20 +578,20 @@ static void ukcc_cmd_deliver(int rchan_id, char *from, u32 len) cmdret.hdr.cmd = cmdrec->cmd; cmdret.rc = rc; rec_send(&cmdret); -// printk(KERN_DEBUG "ukcc_cmd_deliver ACK: %d %d rc=%d %d\n",cmdret.hdr.type, +// printk("ukcc_cmd_deliver ACK: %d %d rc=%d %d\n",cmdret.hdr.type, // cmdret.hdr.cmd,rc,sizeof(cmdret)); } static void client_attached(void) { - printk(KERN_DEBUG "client [%d]<%s> attached to UKCC\n", current->pid, + printk("client [%d]<%s> attached to UKCC\n", 
current->pid, current->comm); relay_reset(ukcc_channel); } static void client_detached(void) { - printk(KERN_DEBUG "client [%d]<%s> detached to UKCC\n", current->pid, + printk("client [%d]<%s> detached to UKCC\n", current->pid, current->comm); chan_state = UKCC_STANDBY; stop_sample_timer(); diff --git a/kernel/ckrm/rbce/token.c b/kernel/ckrm/rbce/token.c index 32446fb2b..7bcdf5492 100644 --- a/kernel/ckrm/rbce/token.c +++ b/kernel/ckrm/rbce/token.c @@ -293,7 +293,7 @@ rules_parse(char *rule_defn, struct rbce_rule_term **rterms, int *term_mask) *term_mask = 0; } /* else { for (i = 0; i < nterms; i++) { - printk(KERN_DEBUG "token: i %d; op %d, operator %d, str %ld\n", + printk("token: i %d; op %d, operator %d, str %ld\n", i, terms[i].op, terms[i].operator, terms[i].u.id); } } */ diff --git a/kernel/ckrm_sched.c b/kernel/ckrm_sched.c index 5142b2eaa..1ca2611dc 100644 --- a/kernel/ckrm_sched.c +++ b/kernel/ckrm_sched.c @@ -37,18 +37,12 @@ static inline void check_inactive_class(ckrm_lrq_t * lrq,CVT_t cur_cvt) if (unlikely(! cur_cvt)) return; -#ifndef INTERACTIVE_BONUS_SUPPORT -#warning "ACB taking out interactive bonus calculation" - bonus = 0; -#else /* * Always leaving a small bonus for inactive classes * allows them to compete for cycles immediately when the become * active. 
This should improve interactive behavior */ bonus = INTERACTIVE_BONUS(lrq); -#endif - //cvt can't be negative if (cur_cvt > bonus) min_cvt = cur_cvt - bonus; @@ -83,11 +77,7 @@ static inline void check_inactive_class(ckrm_lrq_t * lrq,CVT_t cur_cvt) lrq->savings -= savings_used; unscale_cvt(savings_used,lrq); BUG_ON(lrq->local_cvt < savings_used); -#ifndef CVT_SAVINGS_SUPPORT -#warning "ACB taking out cvt saving" -#else lrq->local_cvt -= savings_used; -#endif } } diff --git a/kernel/itimer.c b/kernel/itimer.c index 5bf6c881c..6918cb746 100644 --- a/kernel/itimer.c +++ b/kernel/itimer.c @@ -68,9 +68,7 @@ void it_real_fn(unsigned long __data) struct task_struct * p = (struct task_struct *) __data; unsigned long interval; - if (send_group_sig_info(SIGALRM, SEND_SIG_PRIV, p)) - printk("*warning*: failed to send SIGALRM to %u\n", p->pid); - + send_group_sig_info(SIGALRM, SEND_SIG_PRIV, p); interval = p->it_real_incr; if (interval) { if (interval > (unsigned long) LONG_MAX) diff --git a/kernel/signal.c b/kernel/signal.c index e4282d2de..b3574b096 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -603,28 +603,17 @@ static int check_kill_permission(int sig, struct siginfo *info, struct task_struct *t) { int error = -EINVAL; - int user; - if (sig < 0 || sig > _NSIG) return error; - - user = (!info || - (info != SEND_SIG_PRIV && - info != SEND_SIG_FORCED && - SI_FROMUSER(info))); - error = -EPERM; - if (user && (sig != SIGCONT || - current->signal->session != t->signal->session) + if ((!info || ((unsigned long)info != 1 && + (unsigned long)info != 2 && SI_FROMUSER(info))) + && ((sig != SIGCONT) || + (current->signal->session != t->signal->session)) && (current->euid ^ t->suid) && (current->euid ^ t->uid) && (current->uid ^ t->suid) && (current->uid ^ t->uid) && !capable(CAP_KILL)) return error; - - error = -ESRCH; - if (user && !vx_check(vx_task_xid(t), VX_ADMIN|VX_IDENT)) - return error; - return security_task_kill(t, info, sig); } @@ -1066,6 +1055,9 @@ int 
group_send_sig_info(int sig, struct siginfo *info, struct task_struct *p) unsigned long flags; int ret; + if (!vx_check(vx_task_xid(p), VX_ADMIN|VX_WATCH|VX_IDENT)) + return -ESRCH; + ret = check_kill_permission(sig, info, p); if (!ret && sig && p->sighand) { spin_lock_irqsave(&p->sighand->siglock, flags); diff --git a/kernel/vserver/inode.c b/kernel/vserver/inode.c index 3e8120bd3..dda881895 100644 --- a/kernel/vserver/inode.c +++ b/kernel/vserver/inode.c @@ -170,37 +170,6 @@ int vc_set_iattr(uint32_t id, void __user *data) return ret; } -int vc_iattr_ioctl(struct dentry *de, unsigned int cmd, unsigned long arg) -{ - void __user *data = (void __user *)arg; - struct vcmd_ctx_iattr_v1 vc_data; - int ret; - - /* - * I don't think we need any dget/dput pairs in here as long as - * this function is always called from sys_ioctl i.e., de is - * a field of a struct file that is guaranteed not to be freed. - */ - if (cmd == FIOC_SETIATTR) { - if (!capable(CAP_SYS_ADMIN) || !capable(CAP_LINUX_IMMUTABLE)) - return -EPERM; - if (copy_from_user (&vc_data, data, sizeof(vc_data))) - return -EFAULT; - ret = __vc_set_iattr(de, - &vc_data.xid, &vc_data.flags, &vc_data.mask); - } - else { - if (!vx_check(0, VX_ADMIN)) - return -ENOSYS; - ret = __vc_get_iattr(de->d_inode, - &vc_data.xid, &vc_data.flags, &vc_data.mask); - } - - if (!ret && copy_to_user (data, &vc_data, sizeof(vc_data))) - ret = -EFAULT; - return ret; -} - #ifdef CONFIG_VSERVER_LEGACY #include diff --git a/mm/memory.c b/mm/memory.c index 0dfb74060..6c44ecca0 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1650,9 +1650,8 @@ retry: */ /* Only go through if we didn't race with anybody else... 
*/ if (pte_none(*page_table)) { - if (!PageReserved(new_page)) - //++mm->rss; - vx_rsspages_inc(mm); + if (!PageReserved(new_page)) + ++mm->rss; flush_icache_page(vma, new_page); entry = mk_pte(new_page, vma->vm_page_prot); if (write_access) diff --git a/mm/vmscan.c b/mm/vmscan.c index e01d5c98d..fa5a5e795 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -39,12 +39,6 @@ #include #include -#ifndef AT_LIMIT_SUPPORT -#warning "ckrm_at_limit disabled due to problems with memory hog tests -- seting ckrm_shrink_list_empty to true" -#undef ckrm_shrink_list_empty -#define ckrm_shrink_list_empty() (1) -#endif - /* possible outcome of pageout() */ typedef enum { /* failed to write page out, page is locked */ @@ -725,8 +719,8 @@ redo: list_add(&page->lru, &l_hold); ckrm_mem_dec_active(page); pgmoved++; - pgscanned++; - } + pgscanned++; + } if (!--nr_pass && ckrm_flags) { goto redo; } @@ -901,7 +895,7 @@ shrink_zone(struct zone *zone, struct scan_control *sc) } } -#if defined(CONFIG_CKRM_RES_MEM) && defined(AT_LIMIT_SUPPORT) +#ifdef CONFIG_CKRM_RES_MEM // This function needs to be given more thought. 
// Shrink the class to be at 90% of its limit static void @@ -1001,11 +995,6 @@ ckrm_shrink_classes(void) } #else - -#if defined(CONFIG_CKRM_RES_MEM) && !defined(AT_LIMIT_SUPPORT) -#warning "disabling ckrm_at_limit -- setting ckrm_shrink_classes to noop " -#endif - #define ckrm_shrink_classes() do { } while(0) #endif diff --git a/net/core/sock.c b/net/core/sock.c index 266397922..d5b2d9105 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -331,18 +331,6 @@ int sock_setsockopt(struct socket *sock, int level, int optname, clear_bit(SOCK_PASS_CRED, &sock->flags); break; - case SO_SETXID: - if (current->xid) { - ret = -EPERM; - break; - } - if (val < 0 || val > MAX_S_CONTEXT) { - ret = -EINVAL; - break; - } - sk->sk_xid = val; - break; - case SO_TIMESTAMP: sk->sk_rcvtstamp = valbool; if (valbool) diff --git a/net/ipv4/netfilter/ip_conntrack_pptp.c b/net/ipv4/netfilter/ip_conntrack_pptp.c deleted file mode 100644 index 29ab1a495..000000000 --- a/net/ipv4/netfilter/ip_conntrack_pptp.c +++ /dev/null @@ -1,712 +0,0 @@ -/* - * ip_conntrack_pptp.c - Version 2.0 - * - * Connection tracking support for PPTP (Point to Point Tunneling Protocol). - * PPTP is a a protocol for creating virtual private networks. - * It is a specification defined by Microsoft and some vendors - * working with Microsoft. PPTP is built on top of a modified - * version of the Internet Generic Routing Encapsulation Protocol. - * GRE is defined in RFC 1701 and RFC 1702. Documentation of - * PPTP can be found in RFC 2637 - * - * (C) 2000-2003 by Harald Welte - * - * Development of this code funded by Astaro AG (http://www.astaro.com/) - * - * Limitations: - * - We blindly assume that control connections are always - * established in PNS->PAC direction. 
This is a violation - * of RFFC2673 - * - * TODO: - finish support for multiple calls within one session - * (needs expect reservations in newnat) - * - testing of incoming PPTP calls - * - * Changes: - * 2002-02-05 - Version 1.3 - * - Call ip_conntrack_unexpect_related() from - * pptp_timeout_related() to destroy expectations in case - * CALL_DISCONNECT_NOTIFY or tcp fin packet was seen - * (Philip Craig ) - * - Add Version information at module loadtime - * 2002-02-10 - Version 1.6 - * - move to C99 style initializers - * - remove second expectation if first arrives - * 2004-10-22 - Version 2.0 - * - merge Mandrake's 2.6.x port with recent 2.6.x API changes - * - fix lots of linear skb assumptions from Mandrake's port - * - */ - -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include - -#define IP_CT_PPTP_VERSION "2.0" - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Harald Welte "); -MODULE_DESCRIPTION("Netfilter connection tracking helper module for PPTP"); - -DECLARE_LOCK(ip_pptp_lock); - -#if 0 -#include "ip_conntrack_pptp_priv.h" -#define DEBUGP(format, args...) printk(KERN_DEBUG "%s:%s: " format, __FILE__, __FUNCTION__, ## args) -#else -#define DEBUGP(format, args...) 
-#endif - -#define SECS *HZ -#define MINS * 60 SECS -#define HOURS * 60 MINS -#define DAYS * 24 HOURS - -#define PPTP_GRE_TIMEOUT (10 MINS) -#define PPTP_GRE_STREAM_TIMEOUT (5 DAYS) - -static int pptp_expectfn(struct ip_conntrack *ct) -{ - struct ip_conntrack *master; - struct ip_conntrack_expect *exp; - - DEBUGP("increasing timeouts\n"); - /* increase timeout of GRE data channel conntrack entry */ - ct->proto.gre.timeout = PPTP_GRE_TIMEOUT; - ct->proto.gre.stream_timeout = PPTP_GRE_STREAM_TIMEOUT; - - master = master_ct(ct); - if (!master) { - DEBUGP(" no master!!!\n"); - return 0; - } - - exp = ct->master; - if (!exp) { - DEBUGP("no expectation!!\n"); - return 0; - } - - DEBUGP("completing tuples with ct info\n"); - /* we can do this, since we're unconfirmed */ - if (ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u.gre.key == - htonl(master->help.ct_pptp_info.pac_call_id)) { - /* assume PNS->PAC */ - ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.gre.key = - htonl(master->help.ct_pptp_info.pns_call_id); - ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u.gre.key = - htonl(master->help.ct_pptp_info.pns_call_id); - } else { - /* assume PAC->PNS */ - ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.gre.key = - htonl(master->help.ct_pptp_info.pac_call_id); - ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u.gre.key = - htonl(master->help.ct_pptp_info.pac_call_id); - } - - /* delete other expectation */ - if (exp->expected_list.next != &exp->expected_list) { - struct ip_conntrack_expect *other_exp; - struct list_head *cur_item, *next; - - for (cur_item = master->sibling_list.next; - cur_item != &master->sibling_list; cur_item = next) { - next = cur_item->next; - other_exp = list_entry(cur_item, - struct ip_conntrack_expect, - expected_list); - /* remove only if occurred at same sequence number */ - if (other_exp != exp && other_exp->seq == exp->seq) { - DEBUGP("unexpecting other direction\n"); - ip_ct_gre_keymap_destroy(other_exp); - ip_conntrack_unexpect_related(other_exp); - } - } - } - 
- return 0; -} - -/* timeout GRE data connections */ -static int pptp_timeout_related(struct ip_conntrack *ct) -{ - struct list_head *cur_item, *next; - struct ip_conntrack_expect *exp; - - /* FIXME: do we have to lock something ? */ - for (cur_item = ct->sibling_list.next; - cur_item != &ct->sibling_list; cur_item = next) { - next = cur_item->next; - exp = list_entry(cur_item, struct ip_conntrack_expect, - expected_list); - - ip_ct_gre_keymap_destroy(exp); - if (!exp->sibling) { - ip_conntrack_unexpect_related(exp); - continue; - } - - DEBUGP("setting timeout of conntrack %p to 0\n", - exp->sibling); - exp->sibling->proto.gre.timeout = 0; - exp->sibling->proto.gre.stream_timeout = 0; - /* refresh_acct will not modify counters if skb == NULL */ - ip_ct_refresh_acct(exp->sibling, 0, NULL, 0); - } - - return 0; -} - -/* expect GRE connections (PNS->PAC and PAC->PNS direction) */ -static inline int -exp_gre(struct ip_conntrack *master, - u_int32_t seq, - u_int16_t callid, - u_int16_t peer_callid) -{ - struct ip_conntrack_tuple inv_tuple; - struct ip_conntrack_tuple exp_tuples[] = { - /* tuple in original direction, PNS->PAC */ - { .src = { .ip = master->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.ip, - .u = { .gre = { .key = htonl(ntohs(peer_callid)) } } - }, - .dst = { .ip = master->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip, - .u = { .gre = { .key = htonl(ntohs(callid)) } }, - .protonum = IPPROTO_GRE - }, - }, - /* tuple in reply direction, PAC->PNS */ - { .src = { .ip = master->tuplehash[IP_CT_DIR_REPLY].tuple.src.ip, - .u = { .gre = { .key = htonl(ntohs(callid)) } } - }, - .dst = { .ip = master->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip, - .u = { .gre = { .key = htonl(ntohs(peer_callid)) } }, - .protonum = IPPROTO_GRE - }, - } - }, *exp_tuple; - - for (exp_tuple = exp_tuples; exp_tuple < &exp_tuples[2]; exp_tuple++) { - struct ip_conntrack_expect *exp; - - exp = ip_conntrack_expect_alloc(); - if (exp == NULL) - return 1; - - memcpy(&exp->tuple, exp_tuple, 
sizeof(exp->tuple)); - - exp->mask.src.ip = 0xffffffff; - exp->mask.src.u.all = 0; - exp->mask.dst.u.all = 0; - exp->mask.dst.u.gre.key = 0xffffffff; - exp->mask.dst.ip = 0xffffffff; - exp->mask.dst.protonum = 0xffff; - - exp->seq = seq; - exp->expectfn = pptp_expectfn; - - exp->help.exp_pptp_info.pac_call_id = ntohs(callid); - exp->help.exp_pptp_info.pns_call_id = ntohs(peer_callid); - - DEBUGP("calling expect_related "); - DUMP_TUPLE_RAW(&exp->tuple); - - /* Add GRE keymap entries */ - if (ip_ct_gre_keymap_add(exp, &exp->tuple, 0) != 0) { - kfree(exp); - return 1; - } - - invert_tuplepr(&inv_tuple, &exp->tuple); - if (ip_ct_gre_keymap_add(exp, &inv_tuple, 1) != 0) { - ip_ct_gre_keymap_destroy(exp); - kfree(exp); - return 1; - } - - if (ip_conntrack_expect_related(exp, master) != 0) { - ip_ct_gre_keymap_destroy(exp); - kfree(exp); - DEBUGP("cannot expect_related()\n"); - return 1; - } - } - - return 0; -} - -static inline int -pptp_inbound_pkt(struct sk_buff *skb, - struct tcphdr *tcph, - unsigned int ctlhoff, - size_t datalen, - struct ip_conntrack *ct) -{ - struct PptpControlHeader _ctlh, *ctlh; - unsigned int reqlen; - union pptp_ctrl_union _pptpReq, *pptpReq; - struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; - u_int16_t msg, *cid, *pcid; - u_int32_t seq; - - ctlh = skb_header_pointer(skb, ctlhoff, sizeof(_ctlh), &_ctlh); - if (unlikely(!ctlh)) { - DEBUGP("error during skb_header_pointer\n"); - return NF_ACCEPT; - } - - reqlen = datalen - sizeof(struct pptp_pkt_hdr) - sizeof(_ctlh); - pptpReq = skb_header_pointer(skb, ctlhoff+sizeof(struct pptp_pkt_hdr), - reqlen, &_pptpReq); - if (unlikely(!pptpReq)) { - DEBUGP("error during skb_header_pointer\n"); - return NF_ACCEPT; - } - - msg = ntohs(ctlh->messageType); - DEBUGP("inbound control message %s\n", strMName[msg]); - - switch (msg) { - case PPTP_START_SESSION_REPLY: - if (reqlen < sizeof(_pptpReq.srep)) { - DEBUGP("%s: short packet\n", strMName[msg]); - break; - } - - /* server confirms new control 
session */ - if (info->sstate < PPTP_SESSION_REQUESTED) { - DEBUGP("%s without START_SESS_REQUEST\n", - strMName[msg]); - break; - } - if (pptpReq->srep.resultCode == PPTP_START_OK) - info->sstate = PPTP_SESSION_CONFIRMED; - else - info->sstate = PPTP_SESSION_ERROR; - break; - - case PPTP_STOP_SESSION_REPLY: - if (reqlen < sizeof(_pptpReq.strep)) { - DEBUGP("%s: short packet\n", strMName[msg]); - break; - } - - /* server confirms end of control session */ - if (info->sstate > PPTP_SESSION_STOPREQ) { - DEBUGP("%s without STOP_SESS_REQUEST\n", - strMName[msg]); - break; - } - if (pptpReq->strep.resultCode == PPTP_STOP_OK) - info->sstate = PPTP_SESSION_NONE; - else - info->sstate = PPTP_SESSION_ERROR; - break; - - case PPTP_OUT_CALL_REPLY: - if (reqlen < sizeof(_pptpReq.ocack)) { - DEBUGP("%s: short packet\n", strMName[msg]); - break; - } - - /* server accepted call, we now expect GRE frames */ - if (info->sstate != PPTP_SESSION_CONFIRMED) { - DEBUGP("%s but no session\n", strMName[msg]); - break; - } - if (info->cstate != PPTP_CALL_OUT_REQ && - info->cstate != PPTP_CALL_OUT_CONF) { - DEBUGP("%s without OUTCALL_REQ\n", strMName[msg]); - break; - } - if (pptpReq->ocack.resultCode != PPTP_OUTCALL_CONNECT) { - info->cstate = PPTP_CALL_NONE; - break; - } - - cid = &pptpReq->ocack.callID; - pcid = &pptpReq->ocack.peersCallID; - - info->pac_call_id = ntohs(*cid); - - if (htons(info->pns_call_id) != *pcid) { - DEBUGP("%s for unknown callid %u\n", - strMName[msg], ntohs(*pcid)); - break; - } - - DEBUGP("%s, CID=%X, PCID=%X\n", strMName[msg], - ntohs(*cid), ntohs(*pcid)); - - info->cstate = PPTP_CALL_OUT_CONF; - - seq = ntohl(tcph->seq) + sizeof(struct pptp_pkt_hdr) - + sizeof(struct PptpControlHeader) - + ((void *)pcid - (void *)pptpReq); - - if (exp_gre(ct, seq, *cid, *pcid) != 0) - printk("ip_conntrack_pptp: error during exp_gre\n"); - break; - - case PPTP_IN_CALL_REQUEST: - if (reqlen < sizeof(_pptpReq.icack)) { - DEBUGP("%s: short packet\n", strMName[msg]); - break; - } - 
- /* server tells us about incoming call request */ - if (info->sstate != PPTP_SESSION_CONFIRMED) { - DEBUGP("%s but no session\n", strMName[msg]); - break; - } - pcid = &pptpReq->icack.peersCallID; - DEBUGP("%s, PCID=%X\n", strMName[msg], ntohs(*pcid)); - info->cstate = PPTP_CALL_IN_REQ; - info->pac_call_id = ntohs(*pcid); - break; - - case PPTP_IN_CALL_CONNECT: - if (reqlen < sizeof(_pptpReq.iccon)) { - DEBUGP("%s: short packet\n", strMName[msg]); - break; - } - - /* server tells us about incoming call established */ - if (info->sstate != PPTP_SESSION_CONFIRMED) { - DEBUGP("%s but no session\n", strMName[msg]); - break; - } - if (info->sstate != PPTP_CALL_IN_REP - && info->sstate != PPTP_CALL_IN_CONF) { - DEBUGP("%s but never sent IN_CALL_REPLY\n", - strMName[msg]); - break; - } - - pcid = &pptpReq->iccon.peersCallID; - cid = &info->pac_call_id; - - if (info->pns_call_id != ntohs(*pcid)) { - DEBUGP("%s for unknown CallID %u\n", - strMName[msg], ntohs(*cid)); - break; - } - - DEBUGP("%s, PCID=%X\n", strMName[msg], ntohs(*pcid)); - info->cstate = PPTP_CALL_IN_CONF; - - /* we expect a GRE connection from PAC to PNS */ - seq = ntohl(tcph->seq) + sizeof(struct pptp_pkt_hdr) - + sizeof(struct PptpControlHeader) - + ((void *)pcid - (void *)pptpReq); - - if (exp_gre(ct, seq, *cid, *pcid) != 0) - printk("ip_conntrack_pptp: error during exp_gre\n"); - - break; - - case PPTP_CALL_DISCONNECT_NOTIFY: - if (reqlen < sizeof(_pptpReq.disc)) { - DEBUGP("%s: short packet\n", strMName[msg]); - break; - } - - /* server confirms disconnect */ - cid = &pptpReq->disc.callID; - DEBUGP("%s, CID=%X\n", strMName[msg], ntohs(*cid)); - info->cstate = PPTP_CALL_NONE; - - /* untrack this call id, unexpect GRE packets */ - pptp_timeout_related(ct); - break; - - case PPTP_WAN_ERROR_NOTIFY: - break; - - case PPTP_ECHO_REQUEST: - case PPTP_ECHO_REPLY: - /* I don't have to explain these ;) */ - break; - default: - DEBUGP("invalid %s (TY=%d)\n", (msg <= PPTP_MSG_MAX) - ? 
strMName[msg]:strMName[0], msg); - break; - } - - return NF_ACCEPT; - -} - -static inline int -pptp_outbound_pkt(struct sk_buff *skb, - struct tcphdr *tcph, - unsigned int ctlhoff, - size_t datalen, - struct ip_conntrack *ct) -{ - struct PptpControlHeader _ctlh, *ctlh; - unsigned int reqlen; - union pptp_ctrl_union _pptpReq, *pptpReq; - struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; - u_int16_t msg, *cid, *pcid; - - ctlh = skb_header_pointer(skb, ctlhoff, sizeof(_ctlh), &_ctlh); - if (!ctlh) - return NF_ACCEPT; - - reqlen = datalen - sizeof(struct pptp_pkt_hdr) - sizeof(_ctlh); - pptpReq = skb_header_pointer(skb, ctlhoff+sizeof(_ctlh), reqlen, - &_pptpReq); - if (!pptpReq) - return NF_ACCEPT; - - msg = ntohs(ctlh->messageType); - DEBUGP("outbound control message %s\n", strMName[msg]); - - switch (msg) { - case PPTP_START_SESSION_REQUEST: - /* client requests for new control session */ - if (info->sstate != PPTP_SESSION_NONE) { - DEBUGP("%s but we already have one", - strMName[msg]); - } - info->sstate = PPTP_SESSION_REQUESTED; - break; - case PPTP_STOP_SESSION_REQUEST: - /* client requests end of control session */ - info->sstate = PPTP_SESSION_STOPREQ; - break; - - case PPTP_OUT_CALL_REQUEST: - if (reqlen < sizeof(_pptpReq.ocreq)) { - DEBUGP("%s: short packet\n", strMName[msg]); - break; - } - - /* client initiating connection to server */ - if (info->sstate != PPTP_SESSION_CONFIRMED) { - DEBUGP("%s but no session\n", - strMName[msg]); - break; - } - info->cstate = PPTP_CALL_OUT_REQ; - /* track PNS call id */ - cid = &pptpReq->ocreq.callID; - DEBUGP("%s, CID=%X\n", strMName[msg], ntohs(*cid)); - info->pns_call_id = ntohs(*cid); - break; - case PPTP_IN_CALL_REPLY: - if (reqlen < sizeof(_pptpReq.icack)) { - DEBUGP("%s: short packet\n", strMName[msg]); - break; - } - - /* client answers incoming call */ - if (info->cstate != PPTP_CALL_IN_REQ - && info->cstate != PPTP_CALL_IN_REP) { - DEBUGP("%s without incall_req\n", - strMName[msg]); - break; - } - if 
(pptpReq->icack.resultCode != PPTP_INCALL_ACCEPT) { - info->cstate = PPTP_CALL_NONE; - break; - } - pcid = &pptpReq->icack.peersCallID; - if (info->pac_call_id != ntohs(*pcid)) { - DEBUGP("%s for unknown call %u\n", - strMName[msg], ntohs(*pcid)); - break; - } - DEBUGP("%s, CID=%X\n", strMName[msg], ntohs(*pcid)); - /* part two of the three-way handshake */ - info->cstate = PPTP_CALL_IN_REP; - info->pns_call_id = ntohs(pptpReq->icack.callID); - break; - - case PPTP_CALL_CLEAR_REQUEST: - /* client requests hangup of call */ - if (info->sstate != PPTP_SESSION_CONFIRMED) { - DEBUGP("CLEAR_CALL but no session\n"); - break; - } - /* FUTURE: iterate over all calls and check if - * call ID is valid. We don't do this without newnat, - * because we only know about last call */ - info->cstate = PPTP_CALL_CLEAR_REQ; - break; - case PPTP_SET_LINK_INFO: - break; - case PPTP_ECHO_REQUEST: - case PPTP_ECHO_REPLY: - /* I don't have to explain these ;) */ - break; - default: - DEBUGP("invalid %s (TY=%d)\n", (msg <= PPTP_MSG_MAX)? 
- strMName[msg]:strMName[0], msg); - /* unknown: no need to create GRE masq table entry */ - break; - } - - return NF_ACCEPT; -} - - -/* track caller id inside control connection, call expect_related */ -static int -conntrack_pptp_help(struct sk_buff *skb, - struct ip_conntrack *ct, enum ip_conntrack_info ctinfo) - -{ - struct pptp_pkt_hdr _pptph, *pptph; - - struct tcphdr _tcph, *tcph; - u_int32_t tcplen = skb->len - skb->nh.iph->ihl * 4; - u_int32_t datalen; - void *datalimit; - int dir = CTINFO2DIR(ctinfo); - struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; - unsigned int nexthdr_off; - - int oldsstate, oldcstate; - int ret; - - /* don't do any tracking before tcp handshake complete */ - if (ctinfo != IP_CT_ESTABLISHED - && ctinfo != IP_CT_ESTABLISHED+IP_CT_IS_REPLY) { - DEBUGP("ctinfo = %u, skipping\n", ctinfo); - return NF_ACCEPT; - } - - nexthdr_off = skb->nh.iph->ihl*4; - tcph = skb_header_pointer(skb, skb->nh.iph->ihl*4, sizeof(_tcph), - &_tcph); - if (!tcph) - return NF_ACCEPT; - - /* not a complete TCP header? */ - if (tcplen < sizeof(struct tcphdr) || tcplen < tcph->doff * 4) { - DEBUGP("tcplen = %u\n", tcplen); - return NF_ACCEPT; - } - - - datalen = tcplen - tcph->doff * 4; - - /* checksum invalid? 
*/ - if (tcp_v4_check(tcph, tcplen, skb->nh.iph->saddr, skb->nh.iph->daddr, - csum_partial((char *) tcph, tcplen, 0))) { - printk(KERN_NOTICE __FILE__ ": bad csum\n"); - /* W2K PPTP server sends TCP packets with wrong checksum :(( */ - //return NF_ACCEPT; - } - - if (tcph->fin || tcph->rst) { - DEBUGP("RST/FIN received, timeouting GRE\n"); - /* can't do this after real newnat */ - info->cstate = PPTP_CALL_NONE; - - /* untrack this call id, unexpect GRE packets */ - pptp_timeout_related(ct); - } - - nexthdr_off += tcph->doff*4; - pptph = skb_header_pointer(skb, skb->nh.iph->ihl*4 + tcph->doff*4, - sizeof(_pptph), &_pptph); - if (!pptph) { - DEBUGP("no full PPTP header, can't track\n"); - return NF_ACCEPT; - } - - datalimit = (void *) pptph + datalen; - - /* if it's not a control message we can't do anything with it */ - if (ntohs(pptph->packetType) != PPTP_PACKET_CONTROL || - ntohl(pptph->magicCookie) != PPTP_MAGIC_COOKIE) { - DEBUGP("not a control packet\n"); - return NF_ACCEPT; - } - - oldsstate = info->sstate; - oldcstate = info->cstate; - - LOCK_BH(&ip_pptp_lock); - - nexthdr_off += sizeof(_pptph); - /* FIXME: We just blindly assume that the control connection is always - * established from PNS->PAC. 
However, RFC makes no guarantee */ - if (dir == IP_CT_DIR_ORIGINAL) - /* client -> server (PNS -> PAC) */ - ret = pptp_outbound_pkt(skb, tcph, nexthdr_off, datalen, ct); - else - /* server -> client (PAC -> PNS) */ - ret = pptp_inbound_pkt(skb, tcph, nexthdr_off, datalen, ct); - DEBUGP("sstate: %d->%d, cstate: %d->%d\n", - oldsstate, info->sstate, oldcstate, info->cstate); - UNLOCK_BH(&ip_pptp_lock); - - return ret; -} - -/* control protocol helper */ -static struct ip_conntrack_helper pptp = { - .list = { NULL, NULL }, - .name = "pptp", - .flags = IP_CT_HELPER_F_REUSE_EXPECT, - .me = THIS_MODULE, - .max_expected = 2, - .timeout = 0, - .tuple = { .src = { .ip = 0, - .u = { .tcp = { .port = - __constant_htons(PPTP_CONTROL_PORT) } } - }, - .dst = { .ip = 0, - .u = { .all = 0 }, - .protonum = IPPROTO_TCP - } - }, - .mask = { .src = { .ip = 0, - .u = { .tcp = { .port = 0xffff } } - }, - .dst = { .ip = 0, - .u = { .all = 0 }, - .protonum = 0xffff - } - }, - .help = conntrack_pptp_help -}; - -/* ip_conntrack_pptp initialization */ -static int __init init(void) -{ - int retcode; - - DEBUGP(__FILE__ ": registering helper\n"); - if ((retcode = ip_conntrack_helper_register(&pptp))) { - printk(KERN_ERR "Unable to register conntrack application " - "helper for pptp: %d\n", retcode); - return -EIO; - } - - printk("ip_conntrack_pptp version %s loaded\n", IP_CT_PPTP_VERSION); - return 0; -} - -static void __exit fini(void) -{ - ip_conntrack_helper_unregister(&pptp); - printk("ip_conntrack_pptp version %s unloaded\n", IP_CT_PPTP_VERSION); -} - -module_init(init); -module_exit(fini); - -EXPORT_SYMBOL(ip_pptp_lock); diff --git a/net/ipv4/netfilter/ip_conntrack_pptp_priv.h b/net/ipv4/netfilter/ip_conntrack_pptp_priv.h deleted file mode 100644 index 6b52564e8..000000000 --- a/net/ipv4/netfilter/ip_conntrack_pptp_priv.h +++ /dev/null @@ -1,24 +0,0 @@ -#ifndef _IP_CT_PPTP_PRIV_H -#define _IP_CT_PPTP_PRIV_H - -/* PptpControlMessageType names */ -static const char *strMName[] = { - 
"UNKNOWN_MESSAGE", - "START_SESSION_REQUEST", - "START_SESSION_REPLY", - "STOP_SESSION_REQUEST", - "STOP_SESSION_REPLY", - "ECHO_REQUEST", - "ECHO_REPLY", - "OUT_CALL_REQUEST", - "OUT_CALL_REPLY", - "IN_CALL_REQUEST", - "IN_CALL_REPLY", - "IN_CALL_CONNECT", - "CALL_CLEAR_REQUEST", - "CALL_DISCONNECT_NOTIFY", - "WAN_ERROR_NOTIFY", - "SET_LINK_INFO" -}; - -#endif diff --git a/net/ipv4/netfilter/ip_conntrack_proto_gre.c b/net/ipv4/netfilter/ip_conntrack_proto_gre.c deleted file mode 100644 index 013f759cc..000000000 --- a/net/ipv4/netfilter/ip_conntrack_proto_gre.c +++ /dev/null @@ -1,349 +0,0 @@ -/* - * ip_conntrack_proto_gre.c - Version 2.0 - * - * Connection tracking protocol helper module for GRE. - * - * GRE is a generic encapsulation protocol, which is generally not very - * suited for NAT, as it has no protocol-specific part as port numbers. - * - * It has an optional key field, which may help us distinguishing two - * connections between the same two hosts. - * - * GRE is defined in RFC 1701 and RFC 1702, as well as RFC 2784 - * - * PPTP is built on top of a modified version of GRE, and has a mandatory - * field called "CallID", which serves us for the same purpose as the key - * field in plain GRE. 
- * - * Documentation about PPTP can be found in RFC 2637 - * - * (C) 2000-2004 by Harald Welte - * - * Development of this code funded by Astaro AG (http://www.astaro.com/) - * - */ - -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -DECLARE_RWLOCK(ip_ct_gre_lock); -#define ASSERT_READ_LOCK(x) MUST_BE_READ_LOCKED(&ip_ct_gre_lock) -#define ASSERT_WRITE_LOCK(x) MUST_BE_WRITE_LOCKED(&ip_ct_gre_lock) - -#include -#include -#include -#include - -#include -#include - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Harald Welte "); -MODULE_DESCRIPTION("netfilter connection tracking protocol helper for GRE"); - -/* shamelessly stolen from ip_conntrack_proto_udp.c */ -#define GRE_TIMEOUT (30*HZ) -#define GRE_STREAM_TIMEOUT (180*HZ) - -#if 0 -#define DEBUGP(format, args...) printk(KERN_DEBUG "%s:%s: " format, __FILE__, __FUNCTION__, ## args) -#define DUMP_TUPLE_GRE(x) printk("%u.%u.%u.%u:0x%x -> %u.%u.%u.%u:0x%x\n", \ - NIPQUAD((x)->src.ip), ntohl((x)->src.u.gre.key), \ - NIPQUAD((x)->dst.ip), ntohl((x)->dst.u.gre.key)) -#else -#define DEBUGP(x, args...) 
-#define DUMP_TUPLE_GRE(x) -#endif - -/* GRE KEYMAP HANDLING FUNCTIONS */ -static LIST_HEAD(gre_keymap_list); - -static inline int gre_key_cmpfn(const struct ip_ct_gre_keymap *km, - const struct ip_conntrack_tuple *t) -{ - return ((km->tuple.src.ip == t->src.ip) && - (km->tuple.dst.ip == t->dst.ip) && - (km->tuple.dst.protonum == t->dst.protonum) && - (km->tuple.dst.u.all == t->dst.u.all)); -} - -/* look up the source key for a given tuple */ -static u_int32_t gre_keymap_lookup(struct ip_conntrack_tuple *t) -{ - struct ip_ct_gre_keymap *km; - u_int32_t key; - - READ_LOCK(&ip_ct_gre_lock); - km = LIST_FIND(&gre_keymap_list, gre_key_cmpfn, - struct ip_ct_gre_keymap *, t); - if (!km) { - READ_UNLOCK(&ip_ct_gre_lock); - return 0; - } - - key = km->tuple.src.u.gre.key; - READ_UNLOCK(&ip_ct_gre_lock); - - return key; -} - -/* add a single keymap entry, associate with specified expect */ -int ip_ct_gre_keymap_add(struct ip_conntrack_expect *exp, - struct ip_conntrack_tuple *t, int reply) -{ - struct ip_ct_gre_keymap *km; - - km = kmalloc(sizeof(*km), GFP_ATOMIC); - if (!km) - return -1; - - /* initializing list head should be sufficient */ - memset(km, 0, sizeof(*km)); - - memcpy(&km->tuple, t, sizeof(*t)); - - if (!reply) - exp->proto.gre.keymap_orig = km; - else - exp->proto.gre.keymap_reply = km; - - DEBUGP("adding new entry %p: ", km); - DUMP_TUPLE_GRE(&km->tuple); - - WRITE_LOCK(&ip_ct_gre_lock); - list_append(&gre_keymap_list, km); - WRITE_UNLOCK(&ip_ct_gre_lock); - - return 0; -} - -/* change the tuple of a keymap entry (used by nat helper) */ -void ip_ct_gre_keymap_change(struct ip_ct_gre_keymap *km, - struct ip_conntrack_tuple *t) -{ - if (!km) - { - printk(KERN_WARNING - "NULL GRE conntrack keymap change requested\n"); - return; - } - - DEBUGP("changing entry %p to: ", km); - DUMP_TUPLE_GRE(t); - - WRITE_LOCK(&ip_ct_gre_lock); - memcpy(&km->tuple, t, sizeof(km->tuple)); - WRITE_UNLOCK(&ip_ct_gre_lock); -} - -/* destroy the keymap entries associated with 
specified expect */ -void ip_ct_gre_keymap_destroy(struct ip_conntrack_expect *exp) -{ - DEBUGP("entering for exp %p\n", exp); - WRITE_LOCK(&ip_ct_gre_lock); - if (exp->proto.gre.keymap_orig) { - DEBUGP("removing %p from list\n", exp->proto.gre.keymap_orig); - list_del(&exp->proto.gre.keymap_orig->list); - kfree(exp->proto.gre.keymap_orig); - exp->proto.gre.keymap_orig = NULL; - } - if (exp->proto.gre.keymap_reply) { - DEBUGP("removing %p from list\n", exp->proto.gre.keymap_reply); - list_del(&exp->proto.gre.keymap_reply->list); - kfree(exp->proto.gre.keymap_reply); - exp->proto.gre.keymap_reply = NULL; - } - WRITE_UNLOCK(&ip_ct_gre_lock); -} - - -/* PUBLIC CONNTRACK PROTO HELPER FUNCTIONS */ - -/* invert gre part of tuple */ -static int gre_invert_tuple(struct ip_conntrack_tuple *tuple, - const struct ip_conntrack_tuple *orig) -{ - tuple->dst.u.gre.key = orig->src.u.gre.key; - tuple->src.u.gre.key = orig->dst.u.gre.key; - - return 1; -} - -/* gre hdr info to tuple */ -static int gre_pkt_to_tuple(const struct sk_buff *skb, - unsigned int dataoff, - struct ip_conntrack_tuple *tuple) -{ - struct gre_hdr _grehdr, *grehdr; - struct gre_hdr_pptp _pgrehdr, *pgrehdr; - u_int32_t srckey; - - grehdr = skb_header_pointer(skb, dataoff, sizeof(_grehdr), &_grehdr); - /* PPTP header is variable length, only need up to the call_id field */ - pgrehdr = skb_header_pointer(skb, dataoff, 8, &_pgrehdr); - - if (!grehdr || !pgrehdr) - return 0; - - switch (grehdr->version) { - case GRE_VERSION_1701: - if (!grehdr->key) { - DEBUGP("Can't track GRE without key\n"); - return 0; - } - tuple->dst.u.gre.key = *(gre_key(grehdr)); - break; - - case GRE_VERSION_PPTP: - if (ntohs(grehdr->protocol) != GRE_PROTOCOL_PPTP) { - DEBUGP("GRE_VERSION_PPTP but unknown proto\n"); - return 0; - } - tuple->dst.u.gre.key = htonl(ntohs(pgrehdr->call_id)); - break; - - default: - printk(KERN_WARNING "unknown GRE version %hu\n", - grehdr->version); - return 0; - } - - srckey = gre_keymap_lookup(tuple); - - 
tuple->src.u.gre.key = srckey; -#if 0 - DEBUGP("found src key %x for tuple ", ntohl(srckey)); - DUMP_TUPLE_GRE(tuple); -#endif - - return 1; -} - -/* print gre part of tuple */ -static unsigned int gre_print_tuple(char *buffer, - const struct ip_conntrack_tuple *tuple) -{ - return sprintf(buffer, "srckey=0x%x dstkey=0x%x ", - ntohl(tuple->src.u.gre.key), - ntohl(tuple->dst.u.gre.key)); -} - -/* print private data for conntrack */ -static unsigned int gre_print_conntrack(char *buffer, - const struct ip_conntrack *ct) -{ - return sprintf(buffer, "timeout=%u, stream_timeout=%u ", - (ct->proto.gre.timeout / HZ), - (ct->proto.gre.stream_timeout / HZ)); -} - -/* Returns verdict for packet, and may modify conntrack */ -static int gre_packet(struct ip_conntrack *ct, - const struct sk_buff *skb, - enum ip_conntrack_info conntrackinfo) -{ - /* If we've seen traffic both ways, this is a GRE connection. - * Extend timeout. */ - if (ct->status & IPS_SEEN_REPLY) { - ip_ct_refresh_acct(ct, conntrackinfo, skb, - ct->proto.gre.stream_timeout); - /* Also, more likely to be important, and not a probe. */ - set_bit(IPS_ASSURED_BIT, &ct->status); - } else - ip_ct_refresh_acct(ct, conntrackinfo, skb, - ct->proto.gre.timeout); - - return NF_ACCEPT; -} - -/* Called when a new connection for this protocol found. */ -static int gre_new(struct ip_conntrack *ct, - const struct sk_buff *skb) -{ - DEBUGP(": "); - DUMP_TUPLE_GRE(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple); - - /* initialize to sane value. Ideally a conntrack helper - * (e.g. 
in case of pptp) is increasing them */ - ct->proto.gre.stream_timeout = GRE_STREAM_TIMEOUT; - ct->proto.gre.timeout = GRE_TIMEOUT; - - return 1; -} - -/* Called when a conntrack entry has already been removed from the hashes - * and is about to be deleted from memory */ -static void gre_destroy(struct ip_conntrack *ct) -{ - struct ip_conntrack_expect *master = ct->master; - - DEBUGP(" entering\n"); - - if (!master) { - DEBUGP("no master exp for ct %p\n", ct); - return; - } - - ip_ct_gre_keymap_destroy(master); -} - -/* protocol helper struct */ -static struct ip_conntrack_protocol gre = { - .proto = IPPROTO_GRE, - .name = "gre", - .pkt_to_tuple = gre_pkt_to_tuple, - .invert_tuple = gre_invert_tuple, - .print_tuple = gre_print_tuple, - .print_conntrack = gre_print_conntrack, - .packet = gre_packet, - .new = gre_new, - .destroy = gre_destroy, - .exp_matches_pkt = NULL, - .me = THIS_MODULE -}; - -/* ip_conntrack_proto_gre initialization */ -static int __init init(void) -{ - int retcode; - - if ((retcode = ip_conntrack_protocol_register(&gre))) { - printk(KERN_ERR "Unable to register conntrack protocol " - "helper for gre: %d\n", retcode); - return -EIO; - } - - return 0; -} - -static void __exit fini(void) -{ - struct list_head *pos, *n; - - /* delete all keymap entries */ - WRITE_LOCK(&ip_ct_gre_lock); - list_for_each_safe(pos, n, &gre_keymap_list) { - DEBUGP("deleting keymap %p at module unload time\n", pos); - list_del(pos); - kfree(pos); - } - WRITE_UNLOCK(&ip_ct_gre_lock); - - ip_conntrack_protocol_unregister(&gre); -} - -EXPORT_SYMBOL(ip_ct_gre_keymap_add); -EXPORT_SYMBOL(ip_ct_gre_keymap_change); -EXPORT_SYMBOL(ip_ct_gre_keymap_destroy); - -module_init(init); -module_exit(fini); diff --git a/net/ipv4/netfilter/ip_nat_pptp.c b/net/ipv4/netfilter/ip_nat_pptp.c deleted file mode 100644 index 2bbb815e9..000000000 --- a/net/ipv4/netfilter/ip_nat_pptp.c +++ /dev/null @@ -1,477 +0,0 @@ -/* - * ip_nat_pptp.c - Version 2.0 - * - * NAT support for PPTP (Point to Point 
Tunneling Protocol). - * PPTP is a a protocol for creating virtual private networks. - * It is a specification defined by Microsoft and some vendors - * working with Microsoft. PPTP is built on top of a modified - * version of the Internet Generic Routing Encapsulation Protocol. - * GRE is defined in RFC 1701 and RFC 1702. Documentation of - * PPTP can be found in RFC 2637 - * - * (C) 2000-2004 by Harald Welte - * - * Development of this code funded by Astaro AG (http://www.astaro.com/) - * - * TODO: - Support for multiple calls within one session - * (needs netfilter newnat code) - * - NAT to a unique tuple, not to TCP source port - * (needs netfilter tuple reservation) - * - * Changes: - * 2002-02-10 - Version 1.3 - * - Use ip_nat_mangle_tcp_packet() because of cloned skb's - * in local connections (Philip Craig ) - * - add checks for magicCookie and pptp version - * - make argument list of pptp_{out,in}bound_packet() shorter - * - move to C99 style initializers - * - print version number at module loadtime - * 2003-09-22 - Version 1.5 - * - use SNATed tcp sourceport as callid, since we get called before - * TCP header is mangled (Philip Craig ) - * 2004-10-22 - Version 2.0 - * - kernel 2.6.x version - * - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#define IP_NAT_PPTP_VERSION "2.0" - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Harald Welte "); -MODULE_DESCRIPTION("Netfilter NAT helper module for PPTP"); - - -#if 0 -#include "ip_conntrack_pptp_priv.h" -#define DEBUGP(format, args...) printk(KERN_DEBUG __FILE__ ":" __FUNCTION__ \ - ": " format, ## args) -#else -#define DEBUGP(format, args...) 
-#endif - -static unsigned int -pptp_nat_expected(struct sk_buff **pskb, - unsigned int hooknum, - struct ip_conntrack *ct, - struct ip_nat_info *info) -{ - struct ip_conntrack *master = master_ct(ct); - struct ip_nat_multi_range mr; - struct ip_ct_pptp_master *ct_pptp_info; - struct ip_nat_pptp *nat_pptp_info; - u_int32_t newip, newcid; - int ret; - - IP_NF_ASSERT(info); - IP_NF_ASSERT(master); - IP_NF_ASSERT(!(info->initialized & (1 << HOOK2MANIP(hooknum)))); - - DEBUGP("we have a connection!\n"); - - LOCK_BH(&ip_pptp_lock); - ct_pptp_info = &master->help.ct_pptp_info; - nat_pptp_info = &master->nat.help.nat_pptp_info; - - /* need to alter GRE tuple because conntrack expectfn() used 'wrong' - * (unmanipulated) values */ - if (HOOK2MANIP(hooknum) == IP_NAT_MANIP_DST) { - DEBUGP("completing tuples with NAT info \n"); - /* we can do this, since we're unconfirmed */ - if (ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u.gre.key == - htonl(ct_pptp_info->pac_call_id)) { - /* assume PNS->PAC */ - ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.gre.key = - htonl(nat_pptp_info->pns_call_id); - ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u.gre.key = - htonl(nat_pptp_info->pns_call_id); - newip = master->tuplehash[IP_CT_DIR_REPLY].tuple.src.ip; - newcid = htonl(nat_pptp_info->pac_call_id); - } else { - /* assume PAC->PNS */ - ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.gre.key = - htonl(nat_pptp_info->pac_call_id); - ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u.gre.key = - htonl(nat_pptp_info->pac_call_id); - newip = master->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.ip; - newcid = htonl(nat_pptp_info->pns_call_id); - } - } else { - if (ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u.gre.key == - htonl(ct_pptp_info->pac_call_id)) { - /* assume PNS->PAC */ - newip = master->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip; - newcid = htonl(ct_pptp_info->pns_call_id); - } - else { - /* assume PAC->PNS */ - newip = master->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip; - newcid = 
htonl(ct_pptp_info->pac_call_id); - } - } - - mr.rangesize = 1; - mr.range[0].flags = IP_NAT_RANGE_MAP_IPS | IP_NAT_RANGE_PROTO_SPECIFIED; - mr.range[0].min_ip = mr.range[0].max_ip = newip; - mr.range[0].min = mr.range[0].max = - ((union ip_conntrack_manip_proto ) { newcid }); - DEBUGP("change ip to %u.%u.%u.%u\n", - NIPQUAD(newip)); - DEBUGP("change key to 0x%x\n", ntohl(newcid)); - ret = ip_nat_setup_info(ct, &mr, hooknum); - - UNLOCK_BH(&ip_pptp_lock); - - return ret; - -} - -/* outbound packets == from PNS to PAC */ -static inline unsigned int -pptp_outbound_pkt(struct sk_buff **pskb, - struct ip_conntrack *ct, - enum ip_conntrack_info ctinfo, - struct ip_conntrack_expect *exp) - -{ - struct iphdr *iph = (*pskb)->nh.iph; - struct tcphdr *tcph = (void *) iph + iph->ihl*4; - struct pptp_pkt_hdr *pptph = (struct pptp_pkt_hdr *) - ((void *)tcph + tcph->doff*4); - - struct PptpControlHeader *ctlh; - union pptp_ctrl_union *pptpReq; - struct ip_ct_pptp_master *ct_pptp_info = &ct->help.ct_pptp_info; - struct ip_nat_pptp *nat_pptp_info = &ct->nat.help.nat_pptp_info; - - u_int16_t msg, *cid = NULL, new_callid; - - /* FIXME: size checks !!! */ - ctlh = (struct PptpControlHeader *) ((void *) pptph + sizeof(*pptph)); - pptpReq = (void *) ((void *) ctlh + sizeof(*ctlh)); - - new_callid = htons(ct_pptp_info->pns_call_id); - - switch (msg = ntohs(ctlh->messageType)) { - case PPTP_OUT_CALL_REQUEST: - cid = &pptpReq->ocreq.callID; - /* FIXME: ideally we would want to reserve a call ID - * here. current netfilter NAT core is not able to do - * this :( For now we use TCP source port. This breaks - * multiple calls within one control session */ - - /* save original call ID in nat_info */ - nat_pptp_info->pns_call_id = ct_pptp_info->pns_call_id; - - /* don't use tcph->source since we are at a DSTmanip - * hook (e.g. 
PREROUTING) and pkt is not mangled yet */ - new_callid = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u.tcp.port; - - /* save new call ID in ct info */ - ct_pptp_info->pns_call_id = ntohs(new_callid); - break; - case PPTP_IN_CALL_REPLY: - cid = &pptpReq->icreq.callID; - break; - case PPTP_CALL_CLEAR_REQUEST: - cid = &pptpReq->clrreq.callID; - break; - default: - DEBUGP("unknown outbound packet 0x%04x:%s\n", msg, - (msg <= PPTP_MSG_MAX)? strMName[msg]:strMName[0]); - /* fall through */ - - case PPTP_SET_LINK_INFO: - /* only need to NAT in case PAC is behind NAT box */ - case PPTP_START_SESSION_REQUEST: - case PPTP_START_SESSION_REPLY: - case PPTP_STOP_SESSION_REQUEST: - case PPTP_STOP_SESSION_REPLY: - case PPTP_ECHO_REQUEST: - case PPTP_ECHO_REPLY: - /* no need to alter packet */ - return NF_ACCEPT; - } - - IP_NF_ASSERT(cid); - - DEBUGP("altering call id from 0x%04x to 0x%04x\n", - ntohs(*cid), ntohs(new_callid)); - - /* mangle packet */ - ip_nat_mangle_tcp_packet(pskb, ct, ctinfo, (void *)cid - (void *)pptph, - sizeof(new_callid), (char *)&new_callid, - sizeof(new_callid)); - - return NF_ACCEPT; -} - -/* inbound packets == from PAC to PNS */ -static inline unsigned int -pptp_inbound_pkt(struct sk_buff **pskb, - struct ip_conntrack *ct, - enum ip_conntrack_info ctinfo, - struct ip_conntrack_expect *oldexp) -{ - struct iphdr *iph = (*pskb)->nh.iph; - struct tcphdr *tcph = (void *) iph + iph->ihl*4; - struct pptp_pkt_hdr *pptph = (struct pptp_pkt_hdr *) - ((void *)tcph + tcph->doff*4); - - struct PptpControlHeader *ctlh; - union pptp_ctrl_union *pptpReq; - struct ip_ct_pptp_master *ct_pptp_info = &ct->help.ct_pptp_info; - struct ip_nat_pptp *nat_pptp_info = &ct->nat.help.nat_pptp_info; - - u_int16_t msg, new_cid = 0, new_pcid, *pcid = NULL, *cid = NULL; - u_int32_t old_dst_ip; - - struct ip_conntrack_tuple t, inv_t; - struct ip_conntrack_tuple *orig_t, *reply_t; - - /* FIXME: size checks !!! 
*/ - ctlh = (struct PptpControlHeader *) ((void *) pptph + sizeof(*pptph)); - pptpReq = (void *) ((void *) ctlh + sizeof(*ctlh)); - - new_pcid = htons(nat_pptp_info->pns_call_id); - - switch (msg = ntohs(ctlh->messageType)) { - case PPTP_OUT_CALL_REPLY: - pcid = &pptpReq->ocack.peersCallID; - cid = &pptpReq->ocack.callID; - if (!oldexp) { - DEBUGP("outcall but no expectation\n"); - break; - } - old_dst_ip = oldexp->tuple.dst.ip; - t = oldexp->tuple; - invert_tuplepr(&inv_t, &t); - - /* save original PAC call ID in nat_info */ - nat_pptp_info->pac_call_id = ct_pptp_info->pac_call_id; - - /* alter expectation */ - orig_t = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple; - reply_t = &ct->tuplehash[IP_CT_DIR_REPLY].tuple; - if (t.src.ip == orig_t->src.ip && t.dst.ip == orig_t->dst.ip) { - /* expectation for PNS->PAC direction */ - t.src.u.gre.key = htonl(nat_pptp_info->pns_call_id); - t.dst.u.gre.key = htonl(ct_pptp_info->pac_call_id); - inv_t.src.ip = reply_t->src.ip; - inv_t.dst.ip = reply_t->dst.ip; - inv_t.src.u.gre.key = htonl(nat_pptp_info->pac_call_id); - inv_t.dst.u.gre.key = htonl(ct_pptp_info->pns_call_id); - } else { - /* expectation for PAC->PNS direction */ - t.src.u.gre.key = htonl(nat_pptp_info->pac_call_id); - t.dst.u.gre.key = htonl(ct_pptp_info->pns_call_id); - inv_t.src.ip = orig_t->src.ip; - inv_t.dst.ip = orig_t->dst.ip; - inv_t.src.u.gre.key = htonl(nat_pptp_info->pns_call_id); - inv_t.dst.u.gre.key = htonl(ct_pptp_info->pac_call_id); - } - - if (!ip_conntrack_change_expect(oldexp, &t)) { - DEBUGP("successfully changed expect\n"); - } else { - DEBUGP("can't change expect\n"); - } - ip_ct_gre_keymap_change(oldexp->proto.gre.keymap_orig, &t); - ip_ct_gre_keymap_change(oldexp->proto.gre.keymap_reply, &inv_t); - break; - case PPTP_IN_CALL_CONNECT: - pcid = &pptpReq->iccon.peersCallID; - if (!oldexp) - break; - old_dst_ip = oldexp->tuple.dst.ip; - t = oldexp->tuple; - - /* alter expectation, no need for callID */ - if (t.dst.ip == 
ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip) { - /* expectation for PNS->PAC direction */ - t.src.ip = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip; - } else { - /* expectation for PAC->PNS direction */ - t.dst.ip = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip; - } - - if (!ip_conntrack_change_expect(oldexp, &t)) { - DEBUGP("successfully changed expect\n"); - } else { - DEBUGP("can't change expect\n"); - } - break; - case PPTP_IN_CALL_REQUEST: - /* only need to nat in case PAC is behind NAT box */ - break; - case PPTP_WAN_ERROR_NOTIFY: - pcid = &pptpReq->wanerr.peersCallID; - break; - case PPTP_CALL_DISCONNECT_NOTIFY: - pcid = &pptpReq->disc.callID; - break; - - default: - DEBUGP("unknown inbound packet %s\n", - (msg <= PPTP_MSG_MAX)? strMName[msg]:strMName[0]); - /* fall through */ - - case PPTP_START_SESSION_REQUEST: - case PPTP_START_SESSION_REPLY: - case PPTP_STOP_SESSION_REQUEST: - case PPTP_STOP_SESSION_REPLY: - case PPTP_ECHO_REQUEST: - case PPTP_ECHO_REPLY: - /* no need to alter packet */ - return NF_ACCEPT; - } - - /* mangle packet */ - IP_NF_ASSERT(pcid); - DEBUGP("altering peer call id from 0x%04x to 0x%04x\n", - ntohs(*pcid), ntohs(new_pcid)); - ip_nat_mangle_tcp_packet(pskb, ct, ctinfo, (void *)pcid - (void *)pptph, - sizeof(new_pcid), (char *)&new_pcid, - sizeof(new_pcid)); - - if (new_cid) { - IP_NF_ASSERT(cid); - DEBUGP("altering call id from 0x%04x to 0x%04x\n", - ntohs(*cid), ntohs(new_cid)); - ip_nat_mangle_tcp_packet(pskb, ct, ctinfo, - (void *)cid - (void *)pptph, - sizeof(new_cid), (char *)&new_cid, - sizeof(new_cid)); - } - - /* great, at least we don't need to resize packets */ - return NF_ACCEPT; -} - - -static unsigned int tcp_help(struct ip_conntrack *ct, - struct ip_conntrack_expect *exp, - struct ip_nat_info *info, - enum ip_conntrack_info ctinfo, - unsigned int hooknum, struct sk_buff **pskb) -{ - struct iphdr *iph = (*pskb)->nh.iph; - struct tcphdr *tcph = (void *) iph + iph->ihl*4; - unsigned int datalen = (*pskb)->len - iph->ihl*4 
- tcph->doff*4; - struct pptp_pkt_hdr *pptph; - - int dir; - - DEBUGP("entering\n"); - - /* Only mangle things once: DST for original direction - and SRC for reply direction. */ - dir = CTINFO2DIR(ctinfo); - if (!((HOOK2MANIP(hooknum) == IP_NAT_MANIP_SRC - && dir == IP_CT_DIR_ORIGINAL) - || (HOOK2MANIP(hooknum) == IP_NAT_MANIP_DST - && dir == IP_CT_DIR_REPLY))) { - DEBUGP("Not touching dir %s at hook %s\n", - dir == IP_CT_DIR_ORIGINAL ? "ORIG" : "REPLY", - hooknum == NF_IP_POST_ROUTING ? "POSTROUTING" - : hooknum == NF_IP_PRE_ROUTING ? "PREROUTING" - : hooknum == NF_IP_LOCAL_OUT ? "OUTPUT" - : hooknum == NF_IP_LOCAL_IN ? "INPUT" : "???"); - return NF_ACCEPT; - } - - /* if packet is too small, just skip it */ - if (datalen < sizeof(struct pptp_pkt_hdr)+ - sizeof(struct PptpControlHeader)) { - DEBUGP("pptp packet too short\n"); - return NF_ACCEPT; - } - - pptph = (struct pptp_pkt_hdr *) ((void *)tcph + tcph->doff*4); - - /* if it's not a control message, we can't handle it */ - if (ntohs(pptph->packetType) != PPTP_PACKET_CONTROL || - ntohl(pptph->magicCookie) != PPTP_MAGIC_COOKIE) { - DEBUGP("not a pptp control packet\n"); - return NF_ACCEPT; - } - - LOCK_BH(&ip_pptp_lock); - - if (dir == IP_CT_DIR_ORIGINAL) { - /* reuqests sent by client to server (PNS->PAC) */ - pptp_outbound_pkt(pskb, ct, ctinfo, exp); - } else { - /* response from the server to the client (PAC->PNS) */ - pptp_inbound_pkt(pskb, ct, ctinfo, exp); - } - - UNLOCK_BH(&ip_pptp_lock); - - return NF_ACCEPT; -} - -/* nat helper struct for control connection */ -static struct ip_nat_helper pptp_tcp_helper = { - .list = { NULL, NULL }, - .name = "pptp", - .flags = IP_NAT_HELPER_F_ALWAYS, - .me = THIS_MODULE, - .tuple = { .src = { .ip = 0, - .u = { .tcp = { .port = - __constant_htons(PPTP_CONTROL_PORT) } - } - }, - .dst = { .ip = 0, - .u = { .all = 0 }, - .protonum = IPPROTO_TCP - } - }, - - .mask = { .src = { .ip = 0, - .u = { .tcp = { .port = 0xFFFF } } - }, - .dst = { .ip = 0, - .u = { .all = 0 }, - 
.protonum = 0xFFFF - } - }, - .help = tcp_help, - .expect = pptp_nat_expected -}; - - -static int __init init(void) -{ - DEBUGP("%s: registering NAT helper\n", __FILE__); - if (ip_nat_helper_register(&pptp_tcp_helper)) { - printk(KERN_ERR "Unable to register NAT application helper " - "for pptp\n"); - return -EIO; - } - - printk("ip_nat_pptp version %s loaded\n", IP_NAT_PPTP_VERSION); - return 0; -} - -static void __exit fini(void) -{ - DEBUGP("cleanup_module\n" ); - ip_nat_helper_unregister(&pptp_tcp_helper); - printk("ip_nat_pptp version %s unloaded\n", IP_NAT_PPTP_VERSION); -} - -module_init(init); -module_exit(fini); diff --git a/net/ipv4/netfilter/ip_nat_proto_gre.c b/net/ipv4/netfilter/ip_nat_proto_gre.c deleted file mode 100644 index 5691a102a..000000000 --- a/net/ipv4/netfilter/ip_nat_proto_gre.c +++ /dev/null @@ -1,210 +0,0 @@ -/* - * ip_nat_proto_gre.c - Version 2.0 - * - * NAT protocol helper module for GRE. - * - * GRE is a generic encapsulation protocol, which is generally not very - * suited for NAT, as it has no protocol-specific part as port numbers. - * - * It has an optional key field, which may help us distinguishing two - * connections between the same two hosts. - * - * GRE is defined in RFC 1701 and RFC 1702, as well as RFC 2784 - * - * PPTP is built on top of a modified version of GRE, and has a mandatory - * field called "CallID", which serves us for the same purpose as the key - * field in plain GRE. - * - * Documentation about PPTP can be found in RFC 2637 - * - * (C) 2000-2004 by Harald Welte - * - * Development of this code funded by Astaro AG (http://www.astaro.com/) - * - */ - -#include -#include -#include -#include -#include -#include -#include - -MODULE_LICENSE("GPL"); -MODULE_AUTHOR("Harald Welte "); -MODULE_DESCRIPTION("Netfilter NAT protocol helper module for GRE"); - -#if 0 -#define DEBUGP(format, args...) printk(KERN_DEBUG __FILE__ ":" __FUNCTION__ \ - ": " format, ## args) -#else -#define DEBUGP(x, args...) 
-#endif
-
-/* is key in given range between min and max */
-static int
-gre_in_range(const struct ip_conntrack_tuple *tuple,
-	     enum ip_nat_manip_type maniptype,
-	     const union ip_conntrack_manip_proto *min,
-	     const union ip_conntrack_manip_proto *max)
-{
-	u_int32_t key;
-
-	if (maniptype == IP_NAT_MANIP_SRC)
-		key = tuple->src.u.gre.key;
-	else
-		key = tuple->dst.u.gre.key;
-
-	return ntohl(key) >= ntohl(min->gre.key)
-		&& ntohl(key) <= ntohl(max->gre.key);
-}
-
-/* generate unique tuple ... */
-static int
-gre_unique_tuple(struct ip_conntrack_tuple *tuple,
-		 const struct ip_nat_range *range,
-		 enum ip_nat_manip_type maniptype,
-		 const struct ip_conntrack *conntrack)
-{
-	u_int32_t min, i, range_size;
-	u_int32_t key = 0, *keyptr;
-
-	if (maniptype == IP_NAT_MANIP_SRC)
-		keyptr = &tuple->src.u.gre.key;
-	else
-		keyptr = &tuple->dst.u.gre.key;
-
-	if (!(range->flags & IP_NAT_RANGE_PROTO_SPECIFIED)) {
-		DEBUGP("%p: NATing GRE PPTP\n", conntrack);
-		min = 1;
-		range_size = 0xffff;
-	} else {
-		min = ntohl(range->min.gre.key);
-		range_size = ntohl(range->max.gre.key) - min + 1;
-	}
-
-	DEBUGP("min = %u, range_size = %u\n", min, range_size);
-
-	for (i = 0; i < range_size; i++, key++) {
-		*keyptr = htonl(min + key % range_size);
-		if (!ip_nat_used_tuple(tuple, conntrack))
-			return 1;
-	}
-
-	DEBUGP("%p: no NAT mapping\n", conntrack);
-
-	return 0;
-}
-
-/* manipulate a GRE packet according to maniptype */
-static int
-gre_manip_pkt(struct sk_buff **pskb,
-	      unsigned int hdroff,
-	      const struct ip_conntrack_manip *manip,
-	      enum ip_nat_manip_type maniptype)
-{
-	struct gre_hdr *greh;
-	struct gre_hdr_pptp *pgreh;
-
-	if (!skb_ip_make_writable(pskb, hdroff + sizeof(*pgreh)))
-		return 0;
-
-	greh = (void *)(*pskb)->data + hdroff;
-	pgreh = (struct gre_hdr_pptp *) greh;
-
-	/* we only have destination manip of a packet, since 'source key'
-	 * is not present in the packet itself */
-	if (maniptype == IP_NAT_MANIP_DST) {
-		/* key manipulation is always dest */
-		switch (greh->version) {
-		case 0:
-			if (!greh->key) {
-				DEBUGP("can't nat GRE w/o key\n");
-				break;
-			}
-			if (greh->csum) {
-				/* FIXME: Never tested this code... */
-				*(gre_csum(greh)) =
-					ip_nat_cheat_check(~*(gre_key(greh)),
-							manip->u.gre.key,
-							*(gre_csum(greh)));
-			}
-			*(gre_key(greh)) = manip->u.gre.key;
-			break;
-		case GRE_VERSION_PPTP:
-			DEBUGP("call_id -> 0x%04x\n",
-			       ntohl(manip->u.gre.key));
-			pgreh->call_id = htons(ntohl(manip->u.gre.key));
-			break;
-		default:
-			DEBUGP("can't nat unknown GRE version\n");
-			return 0;
-			break;
-		}
-	}
-	return 1;
-}
-
-/* print out a nat tuple */
-static unsigned int
-gre_print(char *buffer,
-	  const struct ip_conntrack_tuple *match,
-	  const struct ip_conntrack_tuple *mask)
-{
-	unsigned int len = 0;
-
-	if (mask->src.u.gre.key)
-		len += sprintf(buffer + len, "srckey=0x%x ",
-				ntohl(match->src.u.gre.key));
-
-	if (mask->dst.u.gre.key)
-		len += sprintf(buffer + len, "dstkey=0x%x ",
-				ntohl(match->src.u.gre.key));
-
-	return len;
-}
-
-/* print a range of keys */
-static unsigned int
-gre_print_range(char *buffer, const struct ip_nat_range *range)
-{
-	if (range->min.gre.key != 0
-	    || range->max.gre.key != 0xFFFF) {
-		if (range->min.gre.key == range->max.gre.key)
-			return sprintf(buffer, "key 0x%x ",
-					ntohl(range->min.gre.key));
-		else
-			return sprintf(buffer, "keys 0x%u-0x%u ",
-					ntohl(range->min.gre.key),
-					ntohl(range->max.gre.key));
-	} else
-		return 0;
-}
-
-/* nat helper struct */
-static struct ip_nat_protocol gre = {
-	.name		= "GRE",
-	.protonum	= IPPROTO_GRE,
-	.manip_pkt	= gre_manip_pkt,
-	.in_range	= gre_in_range,
-	.unique_tuple	= gre_unique_tuple,
-	.print		= gre_print,
-	.print_range	= gre_print_range
-};
-
-static int __init init(void)
-{
-	if (ip_nat_protocol_register(&gre))
-		return -EIO;
-
-	return 0;
-}
-
-static void __exit fini(void)
-{
-	ip_nat_protocol_unregister(&gre);
-}
-
-module_init(init);
-module_exit(fini);
diff --git a/scripts/kernel-2.6-planetlab.spec b/scripts/kernel-2.6-planetlab.spec
index 4e2be569b..d6c459453 100644
--- a/scripts/kernel-2.6-planetlab.spec
+++ b/scripts/kernel-2.6-planetlab.spec
@@ -22,7 +22,7 @@ Summary: The Linux kernel (the core of the Linux operating system)
 %define kversion 2.6.%{sublevel}
 %define rpmversion 2.6.%{sublevel}
 %define rhbsys %([ -r /etc/beehive-root ] && echo || echo .`whoami`)
-%define release 1.521.2.6.planetlab%{?date:.%{date}}
+%define release 1.521.1.planetlab%{?date:.%{date}}
 %define signmodules 0
 
 %define KVERREL %{PACKAGE_VERSION}-%{PACKAGE_RELEASE}
@@ -62,11 +62,6 @@ Summary: The Linux kernel (the core of the Linux operating system)
 #
 %define kernel_prereq fileutils, module-init-tools, initscripts >= 5.83, mkinitrd >= 3.5.5
 
-Vendor: PlanetLab
-Packager: PlanetLab Central
-Distribution: PlanetLab 3.0
-URL: http://cvs.planet-lab.org/cvs/linux-2.6
-
 Name: kernel
 Group: System Environment/Kernel
 License: GPLv2
@@ -178,19 +173,6 @@ Group: System Environment/Kernel
 %description uml
 This package includes a user mode version of the Linux kernel.
 
-%package vserver
-Summary: A placeholder RPM that provides kernel and kernel-drm
-
-Group: System Environment/Kernel
-Provides: kernel = %{version}
-Provides: kernel-drm = 4.3.0
-
-%description vserver
-VServers do not require and cannot use kernels, but some RPMs have
-implicit or explicit dependencies on the "kernel" package
-(e.g. tcpdump). This package installs no files but provides the
-necessary dependencies to make rpm and yum happy.
-
 %prep
 
 %setup -n linux-%{kversion}
@@ -258,7 +240,7 @@ BuildKernel() {
         grep "__crc_$i\$" System.map >> $RPM_BUILD_ROOT/boot/System.map-$KernelVer ||:
     done
     rm -f exported
-#    install -m 644 init/kerntypes.o $RPM_BUILD_ROOT/boot/Kerntypes-$KernelVer
+    install -m 644 init/kerntypes.o $RPM_BUILD_ROOT/boot/Kerntypes-$KernelVer
     install -m 644 .config $RPM_BUILD_ROOT/boot/config-$KernelVer
     rm -f System.map
     cp arch/*/boot/bzImage $RPM_BUILD_ROOT/%{image_install_path}/vmlinuz-$KernelVer
@@ -429,7 +411,7 @@ fi
 # make some useful links
 pushd /boot > /dev/null ; {
 	ln -sf System.map-%{KVERREL} System.map
-#	ln -sf Kerntypes-%{KVERREL} Kerntypes
+	ln -sf Kerntypes-%{KVERREL} Kerntypes
 	ln -sf config-%{KVERREL} config
 	ln -sf initrd-%{KVERREL}.img initrd-boot
 	ln -sf vmlinuz-%{KVERREL} kernel-boot
@@ -468,7 +450,7 @@ fi
 %files
 %defattr(-,root,root)
 /%{image_install_path}/vmlinuz-%{KVERREL}
-#/boot/Kerntypes-%{KVERREL}
+/boot/Kerntypes-%{KVERREL}
 /boot/System.map-%{KVERREL}
 /boot/config-%{KVERREL}
 %dir /lib/modules/%{KVERREL}
@@ -481,7 +463,7 @@ fi
 %files smp
 %defattr(-,root,root)
 /%{image_install_path}/vmlinuz-%{KVERREL}smp
-#/boot/Kerntypes-%{KVERREL}smp
+/boot/Kerntypes-%{KVERREL}smp
 /boot/System.map-%{KVERREL}smp
 /boot/config-%{KVERREL}smp
 %dir /lib/modules/%{KVERREL}smp
@@ -511,11 +493,6 @@ fi
 /usr/share/doc/kernel-doc-%{kversion}/Documentation/*
 %endif
 
-
-%files vserver
-%defattr(-,root,root)
-# no files
-
 %changelog
 * Thu Sep 16 2004 Mark Huang
 - merge to Fedora Core 2 2.6.8-1.521