1 diff -Nurp linux-2.6.22-680/Documentation/web100/locking.txt linux-2.6.22-690/Documentation/web100/locking.txt
2 --- linux-2.6.22-680/Documentation/web100/locking.txt 1970-01-01 01:00:00.000000000 +0100
3 +++ linux-2.6.22-690/Documentation/web100/locking.txt 2008-11-14 21:20:17.000000000 +0100
5 +Web100 Locking Model for Linux 2.4
6 +John Heffner <jheffner@psc.edu>
12 +The connection entries are kept linked together simultaneously in a table
13 +and in a list. Only entries in these structures can be looked up. To
14 +protect these lookup structures, we have a single global reader-writer
15 +spinlock, web100_linkage_lock. Since we grab the lock both from user space
16 +and in the bottom half, we must do a [read/write]_lock_bh. As this disables
17 +the local BHs, this lock should *not* be held for very long.
22 +The statistics are protected by the sock's lock. Any code modifying or
23 +reading the statistics should hold the sock lock while doing so. We assume
24 +that once the socket is gone the statistics are no longer modified, so
25 +readers need not hold any lock in that case.
28 +3. Statistics Destruction
30 +A statistics structure keeps a count of the number of references to it,
31 +wc_users. When a lookup is performed, the reference count should be
32 +incremented (while the linkage lock is held) by calling web100_stats_use.
33 +When the reference is no longer needed, decrement the count by calling
34 +web100_stats_unuse. The latter function will free the statistics when there
35 +are no remaining references. The lookup structures keep one reference. The
36 +sock also keeps one, since the sock may be destroyed before it ever enters
37 +the ESTABLISHED state.
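The reference-counting rule described above can be sketched in user space as follows. This is only an illustration under stated assumptions: struct demo_stats, demo_stats_create, demo_stats_use and demo_stats_unuse are hypothetical stand-ins for the kernel structure and for web100_stats_use/web100_stats_unuse, and the sketch does no locking, whereas the kernel takes the reference while the linkage lock is held.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for the kernel statistics structure. */
struct demo_stats {
	int users;	/* plays the role of wc_users */
	int *freed;	/* demo only: set to 1 when the structure is released */
};

/* Allocate a structure with no references yet. */
static struct demo_stats *demo_stats_create(int *freed_flag)
{
	struct demo_stats *s = malloc(sizeof(*s));

	if (s == NULL)
		return NULL;
	s->users = 0;
	s->freed = freed_flag;
	*freed_flag = 0;
	return s;
}

/* Take a reference. The kernel does this with the linkage lock held. */
static void demo_stats_use(struct demo_stats *s)
{
	s->users++;
}

/* Drop a reference; the last one to go frees the structure. */
static void demo_stats_unuse(struct demo_stats *s)
{
	assert(s->users > 0);
	if (--s->users == 0) {
		*s->freed = 1;
		free(s);
	}
}
```

With one reference held by the lookup structures, one by the sock, and one by a reader, the structure is released only after all three have called the unuse function, in whatever order that happens.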
38 diff -Nurp linux-2.6.22-680/Documentation/web100/proc_interface.txt linux-2.6.22-690/Documentation/web100/proc_interface.txt
39 --- linux-2.6.22-680/Documentation/web100/proc_interface.txt 1970-01-01 01:00:00.000000000 +0100
40 +++ linux-2.6.22-690/Documentation/web100/proc_interface.txt 2008-11-14 21:20:17.000000000 +0100
42 +WEB100 proc interface notes
43 +===========================
45 +The web100 modifications to the kernel collect information about the
46 +state of a TCP transfer in a kernel data structure that is linked
47 +out of the "sock" TCP structure in sock.h. Please see
48 +"include/net/web100_stats.h" for the structure definition.
50 +The API for this structure is provided through the /proc interface.
51 +This document provides a brief description of this interface. Please
52 +see fs/proc/web100.c for source code.
54 +First, the kernel creates the /proc/web100 directory and the file
55 +/proc/web100/header at system boot time.
57 +Each new TCP connection is assigned a unique, unchanging number
58 +(similar to a pid), and its directory name is that number as ASCII
59 +decimal. These directories persist for about sixty seconds after the
60 +connection is terminated (goes into a CLOSED or TIME_WAIT state). The
61 +connection stats will not change after the connection is terminated.
62 +(So a connection whose state variable is TIME_WAIT is not necessarily
63 +still in TIME_WAIT.) It should be noted that what is meant by a
64 +"connection" here is actually one side of a connection. If a
65 +connection is created from the local host to the local host, two
66 +connection IDs will be created.
68 +An application that reads from the proc interface must expect the
69 +directories and their files to disappear at any time (they are removed at
70 +interrupt level). So if a file open fails on a file you just looked up
71 +(say, with glob), that's probably normal and the program should handle it
72 +gracefully.
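A minimal sketch of that advice, assuming a POSIX user-space reader: the helper below globs for files and silently skips any entry that has vanished by the time it is opened. The name count_openable is hypothetical and the pattern is supplied by the caller; nothing here is part of the Web100 userland library.

```c
#include <glob.h>
#include <stdio.h>

/*
 * Count the files matching `pattern` that can still be opened.
 * An entry that disappears between glob() and fopen() is skipped,
 * because that is an expected race rather than an error.
 * Returns -1 only if glob() fails for a reason other than "no match".
 */
static int count_openable(const char *pattern)
{
	glob_t g;
	size_t i;
	int opened = 0;
	int rc = glob(pattern, 0, NULL, &g);

	if (rc != 0) {
		globfree(&g);
		return rc == GLOB_NOMATCH ? 0 : -1;
	}

	for (i = 0; i < g.gl_pathc; i++) {
		FILE *f = fopen(g.gl_pathv[i], "r");

		if (f == NULL)
			continue;	/* connection went away; not an error */
		opened++;
		fclose(f);
	}

	globfree(&g);
	return opened;
}
```

A caller might pass "/proc/web100/*/spec-ascii" and treat a result smaller than the number of matched paths as normal.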
74 +Another seemingly strange thing that can happen is that stats for multiple
75 +connections with the same four-tuple can show up. At most one of these
76 +connections may be in a state other than CLOSED or TIME_WAIT. This behavior
77 +is correct, and should be handled as such.
79 +The algorithms governing the connection numbers are not yet final.
80 +Currently, for simplification, it is only possible to have 32768
83 +Inside each connection directory is an identical set of files. One is
84 +spec-ascii, which contains the connection four-tuple in human-readable
85 +format. One can, for example, see all outgoing ssh connections by executing
86 +"grep ':22$' /proc/web100/*/spec-ascii" from the command prompt.
88 +The remaining files provide access to states of TCP-KIS variables in
89 +local host byte-order. Since the number, names, and contents of these
90 +files can and will change with releases, they are described in a
91 +header file -- /proc/web100/header. A file named spec, which contains the
92 +variables describing the connection's four-tuple, should be present
95 +The header file is in human-readable format as follows:
99 + <varname> <offset> <type>
100 + <varname> <offset> <type>
105 +The filename is the name of the file inside each connection directory. (The
106 +/ is prepended to make it clear it is a new file, not a new variable in the
107 +previous file. There is also an empty line before each filename.) Each
108 +file has an arbitrary number of variables, and there are an arbitrary number
109 +of files. The type is an integer, and is currently defined something like:
112 + WEB100_TYPE_INTEGER,
113 + WEB100_TYPE_INTEGER32,
114 + WEB100_TYPE_IP_ADDRESS,
115 + WEB100_TYPE_COUNTER32,
116 + WEB100_TYPE_GAUGE32,
117 + WEB100_TYPE_UNSIGNED32,
118 + WEB100_TYPE_TIME_TICKS,
119 + WEB100_TYPE_COUNTER64,
120 + WEB100_TYPE_UNSIGNED16
123 +in the kernel source file fs/proc/web100.c. These correspond to
124 +SMIv2 data types (RFC 2578).
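As a rough illustration of the header layout described above, the following sketch parses one line of /proc/web100/header at a time. The names struct hdr_var and parse_header_line are hypothetical. It assumes whitespace-separated fields and accepts an optional fourth integer, since header_read() in fs/proc/web100.c also prints the variable length after the type. The first line of the file is a version string and should be skipped by the caller.

```c
#include <stdio.h>

/* One parsed entry from the header file (hypothetical layout). */
struct hdr_var {
	char name[64];
	int offset;
	int type;
	int len;	/* -1 when the line carries no length field */
};

/*
 * Parse a single header line.
 * Returns 2 for a "/filename" line (the name is stored without the slash),
 * 1 for a variable line, and 0 for anything else (for example a blank line).
 */
static int parse_header_line(const char *line, struct hdr_var *v)
{
	int n;

	v->offset = 0;
	v->type = 0;
	v->len = -1;

	if (line[0] == '/')
		return sscanf(line + 1, "%63s", v->name) == 1 ? 2 : 0;

	n = sscanf(line, "%63s %d %d %d",
		   v->name, &v->offset, &v->type, &v->len);
	return n >= 3 ? 1 : 0;
}
```

A reader would typically remember the most recent "/filename" line and attach each following variable to that file until the next one appears.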
126 +To read variables, seek to the appropriate offset, then read the appropriate
127 +amount of data. (Length is implied by the type.) Multiple variables may be
128 +read with a single read, and will be read atomically when doing so.
129 +Currently, all variables are readable, but this may not be true in the
132 +To write variables, seek to the appropriate offset, and write the
133 +appropriate amount of data. Only a single variable may be written at one
134 +time. If variables must be atomically written, a variable should be used as
135 +a flag to signal that the write is done, and the kernel code depending on
136 +the variables should be written to handle this.
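The seek-then-read pattern described above might look like this from user space. It is a sketch only: read_var32 and read_var32_path are hypothetical helpers, the variable is assumed to be one of the 32-bit types, and the offset is assumed to have been taken from /proc/web100/header. Values are returned in host byte order, as the text states.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read one 32-bit variable located `offset` bytes into an open file.
 * Returns 0 on success, -1 on a seek failure, read failure or short read.
 */
static int read_var32(int fd, off_t offset, uint32_t *out)
{
	ssize_t n;

	if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
		return -1;
	n = read(fd, out, sizeof(*out));
	return n == (ssize_t)sizeof(*out) ? 0 : -1;
}

/* Convenience wrapper: open the file, read one variable, close it again. */
static int read_var32_path(const char *path, off_t offset, uint32_t *out)
{
	int fd = open(path, O_RDONLY);
	int rc;

	if (fd < 0)
		return -1;	/* the connection may already be gone */
	rc = read_var32(fd, offset, out);
	close(fd);
	return rc;
}
```

Writing a variable would follow the same shape with write() in place of read(), one variable per call, as the paragraph above requires.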
138 +See: http://www.web100.org
139 +Please send comments to prog@web100.org
141 +John Heffner, Matt Mathis, R. Reddy
142 +August 2000, Jan 2001
144 diff -Nurp linux-2.6.22-680/Documentation/web100/sysctl.txt linux-2.6.22-690/Documentation/web100/sysctl.txt
145 --- linux-2.6.22-680/Documentation/web100/sysctl.txt 1970-01-01 01:00:00.000000000 +0100
146 +++ linux-2.6.22-690/Documentation/web100/sysctl.txt 2008-11-14 21:20:17.000000000 +0100
148 +Web100 sysctl variables
149 +John Heffner <jheffner@psc.edu>
152 +net.ipv4.WAD_FloydAIMD
153 + This value is used for WAD_FloydAIMD by a connection when its KIS
154 + variable is 0. This variable requires that private extensions be
158 + This value is used for WAD_IFQ by a connection when its KIS
159 + variable is 0. This variable requires that Net100 extensions be
162 +net.ipv4.WAD_MaxBurst
163 + This value is used for WAD_MaxBurst by a connection when its KIS
164 + variable is 0. This variable requires that Net100 extensions be
167 +net.ipv4.web100_fperms
168 + Sets the file permissions of the files in /proc/web100/*/
171 + Sets the group of the files in /proc/web100/*/
172 diff -Nurp linux-2.6.22-680/fs/proc/Makefile linux-2.6.22-690/fs/proc/Makefile
173 --- linux-2.6.22-680/fs/proc/Makefile 2007-07-09 01:32:17.000000000 +0200
174 +++ linux-2.6.22-690/fs/proc/Makefile 2008-11-14 21:20:17.000000000 +0100
175 @@ -15,3 +15,4 @@ proc-$(CONFIG_PROC_KCORE) += kcore.o
176 proc-$(CONFIG_PROC_VMCORE) += vmcore.o
177 proc-$(CONFIG_PROC_DEVICETREE) += proc_devtree.o
178 proc-$(CONFIG_PRINTK) += kmsg.o
179 +proc-$(CONFIG_WEB100_STATS) += web100.o
180 diff -Nurp linux-2.6.22-680/fs/proc/root.c linux-2.6.22-690/fs/proc/root.c
181 --- linux-2.6.22-680/fs/proc/root.c 2008-11-12 17:40:22.000000000 +0100
182 +++ linux-2.6.22-690/fs/proc/root.c 2008-11-14 21:20:17.000000000 +0100
183 @@ -84,6 +84,10 @@ void __init proc_root_init(void)
184 proc_bus = proc_mkdir("bus", NULL);
188 +#ifdef CONFIG_WEB100_STATS
189 + proc_web100_init();
193 static int proc_root_getattr(struct vfsmount *mnt, struct dentry *dentry, struct kstat *stat
194 diff -Nurp linux-2.6.22-680/fs/proc/web100.c linux-2.6.22-690/fs/proc/web100.c
195 --- linux-2.6.22-680/fs/proc/web100.c 1970-01-01 01:00:00.000000000 +0100
196 +++ linux-2.6.22-690/fs/proc/web100.c 2008-11-14 21:20:17.000000000 +0100
201 + * Copyright (C) 2001 Matt Mathis <mathis@psc.edu>
202 + * Copyright (C) 2001 John Heffner <jheffner@psc.edu>
204 + * The Web 100 project. See http://www.web100.org
206 + * This program is free software; you can redistribute it and/or
207 + * modify it under the terms of the GNU General Public License
208 + * as published by the Free Software Foundation; either version
209 + * 2 of the License, or (at your option) any later version.
213 +#include <linux/proc_fs.h>
214 +#include <net/sock.h>
215 +#include <net/tcp.h>
216 +#include <net/web100.h>
217 +#include <linux/init.h>
218 +#include <linux/sysctl.h>
219 +#include <linux/mount.h>
221 +#define WEB100MIB_BLOCK_SIZE (PAGE_SIZE - 1024)
223 +extern __u32 sysctl_wmem_default;
224 +extern __u32 sysctl_wmem_max;
226 +struct proc_dir_entry *proc_web100_dir;
227 +static struct proc_dir_entry *proc_web100_header;
231 + * Web100 variable reading/writing
234 +enum web100_connection_inos {
235 + PROC_CONN_SPEC_ASCII = 1,
240 + PROC_CONN_HIGH_INO /* Keep at the end */
244 + WEB100_TYPE_INTEGER = 0,
245 + WEB100_TYPE_INTEGER32,
246 + WEB100_TYPE_INET_ADDRESS_IPV4,
247 + WEB100_TYPE_IP_ADDRESS = WEB100_TYPE_INET_ADDRESS_IPV4, /* Deprecated */
248 + WEB100_TYPE_COUNTER32,
249 + WEB100_TYPE_GAUGE32,
250 + WEB100_TYPE_UNSIGNED32,
251 + WEB100_TYPE_TIME_TICKS,
252 + WEB100_TYPE_COUNTER64,
253 + WEB100_TYPE_INET_PORT_NUMBER,
254 + WEB100_TYPE_UNSIGNED16 = WEB100_TYPE_INET_PORT_NUMBER, /* Deprecated */
255 + WEB100_TYPE_INET_ADDRESS,
256 + WEB100_TYPE_INET_ADDRESS_IPV6,
260 +typedef int (*web100_rwfunc_t)(void *buf, struct web100stats *stats,
261 + struct web100_var *vp);
263 +/* The printed variable description should look something like this (in ASCII):
264 + * varname offset type
265 + * where offset is the offset into the file.
272 + web100_rwfunc_t read;
273 + unsigned long read_data; /* read handler-specific data */
275 + web100_rwfunc_t write;
276 + unsigned long write_data; /* write handler-specific data */
278 + struct web100_var *next;
281 +struct web100_file {
287 + struct web100_var *first_var;
290 +#define F(name,ino,perm) { sizeof (name) - 1, (name), (ino), (perm), NULL }
291 +static struct web100_file web100_file_arr[] = {
292 + F("spec-ascii", PROC_CONN_SPEC_ASCII, S_IFREG | S_IRUGO),
293 + F("spec", PROC_CONN_SPEC, S_IFREG | S_IRUGO),
294 + F("read", PROC_CONN_READ, 0),
295 + F("test", PROC_CONN_TEST, 0),
296 + F("tune", PROC_CONN_TUNE, 0),
299 +#define WEB100_FILE_ARR_SIZE (sizeof (web100_file_arr) / sizeof (struct web100_file))
301 +/* This works only if the array is built in the correct order. */
302 +static inline struct web100_file *web100_file_lookup(int ino) {
303 + return &web100_file_arr[ino - 1];
306 +static void add_var(struct web100_file *file, char *name, int type,
307 + web100_rwfunc_t read, unsigned long read_data,
308 + web100_rwfunc_t write, unsigned long write_data)
310 + struct web100_var *var;
312 + /* Again, assuming add_var is only called at init. */
313 + if ((var = kmalloc(sizeof (struct web100_var), GFP_KERNEL)) == NULL)
314 + panic("No memory available for Web100 var.\n");
319 + case WEB100_TYPE_INET_PORT_NUMBER:
322 + case WEB100_TYPE_INTEGER:
323 + case WEB100_TYPE_INTEGER32:
324 + case WEB100_TYPE_COUNTER32:
325 + case WEB100_TYPE_GAUGE32:
326 + case WEB100_TYPE_UNSIGNED32:
327 + case WEB100_TYPE_TIME_TICKS:
330 + case WEB100_TYPE_COUNTER64:
333 + case WEB100_TYPE_INET_ADDRESS:
337 + printk("Web100: Warning: Adding variable of unknown type.\n");
342 + var->read_data = read_data;
344 + var->write = write;
345 + var->write_data = write_data;
347 + var->next = file->first_var;
348 + file->first_var = var;
353 + * proc filesystem routines
356 +static struct inode *proc_web100_make_inode(struct super_block *sb, int ino)
358 + struct inode *inode;
360 + inode = new_inode(sb);
364 + inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
365 + inode->i_ino = ino;
374 +static inline ino_t ino_from_cid(int cid)
376 + return (cid << 8) | 0x80000000;
379 +static inline ino_t ino_from_parts(ino_t dir_ino, __u16 low_ino)
381 + return (dir_ino & ~0xff) | low_ino;
384 +static inline int cid_from_ino(ino_t ino)
386 + return (ino & 0x7fffff00) >> 8;
389 +static inline int low_from_ino(ino_t ino)
394 +static int connection_file_open(struct inode *inode, struct file *file)
396 + int cid = cid_from_ino(inode->i_ino);
397 + struct web100stats *stats;
399 + read_lock_bh(&web100_linkage_lock);
400 + stats = web100stats_lookup(cid);
401 + if (stats == NULL || stats->wc_dead) {
402 + read_unlock_bh(&web100_linkage_lock);
405 + web100_stats_use(stats);
406 + read_unlock_bh(&web100_linkage_lock);
411 +static int connection_file_release(struct inode *inode, struct file *file)
413 + int cid = cid_from_ino(inode->i_ino);
414 + struct web100stats *stats;
416 + read_lock_bh(&web100_linkage_lock);
417 + stats = web100stats_lookup(cid);
418 + if (stats == NULL) {
419 + read_unlock_bh(&web100_linkage_lock);
422 + read_unlock_bh(&web100_linkage_lock);
423 + web100_stats_unuse(stats);
428 +/** /proc/web100/<connection>/<binary variable files> **/
429 +static ssize_t connection_file_rw(int read, struct file *file,
430 + char *buf, size_t nbytes, loff_t *ppos)
432 + int low_ino = low_from_ino(file->f_dentry->d_inode->i_ino);
433 + int cid = cid_from_ino(file->f_dentry->d_inode->i_ino);
434 + struct web100stats *stats;
435 + struct web100_file *fp;
436 + struct web100_var *vp;
440 + web100_rwfunc_t rwfunc;
443 + /* We're only going to let them read one page at a time.
444 + * We shouldn't ever read more than a page, anyway, though.
446 + if (nbytes > PAGE_SIZE)
447 + nbytes = PAGE_SIZE;
449 + if (!access_ok(read ? VERIFY_WRITE : VERIFY_READ, buf, nbytes))
452 + if ((page = (char *)__get_free_page(GFP_KERNEL)) == NULL)
456 + if (copy_from_user(page, buf, nbytes))
460 + fp = web100_file_lookup(low_ino);
462 + printk("Unregistered Web100 file.\n");
466 + read_lock_bh(&web100_linkage_lock);
467 + stats = web100stats_lookup(cid);
468 + read_unlock_bh(&web100_linkage_lock);
472 + lock_sock(stats->wc_sk);
474 + /* TODO: seek in constant time, not linear. -JWH */
477 + vp = fp->first_var;
478 + while (vp && nbytes > n) {
483 + if (pos == *ppos) {
484 + if (vp->len > nbytes - n)
490 + rwfunc = vp->write;
491 + if (rwfunc == NULL) {
496 + err = rwfunc(page + n, stats, vp);
507 + release_sock(stats->wc_sk);
510 + if (copy_to_user(buf, page, n))
513 + free_page((unsigned long)page);
518 + release_sock(stats->wc_sk);
523 +static ssize_t connection_file_read(struct file *file,
524 + char *buf, size_t nbytes, loff_t *ppos)
526 + return connection_file_rw(1, file, buf, nbytes, ppos);
529 +static ssize_t connection_file_write(struct file *file,
530 + const char *buf, size_t nbytes, loff_t *ppos)
532 + return connection_file_rw(0, file, (char *)buf, nbytes, ppos);
535 +static struct file_operations connection_file_fops = {
536 + open: connection_file_open,
537 + release: connection_file_release,
538 + read: connection_file_read,
539 + write: connection_file_write
543 +static size_t v6addr_str(char *dest, short *addr)
545 + int start = -1, end = -1;
549 + /* Find longest subsequence of 0's in addr */
550 + for (i = 0; i < 8; i++) {
551 + if (addr[i] == 0) {
552 + for (j = i + 1; j < 8 && addr[j] == 0; j++);
553 + if (j - i > end - start) {
560 + if (end - start == 1)
564 + for (i = 0; i < 8; i++) {
566 + pos += sprintf(dest + pos, ":");
568 + pos += sprintf(dest + pos, ":");
569 + i += end - start - 1;
571 + pos += sprintf(dest + pos, "%hx", ntohs(addr[i]));
578 +/** /proc/web100/<connection>/spec-ascii **/
579 +static ssize_t connection_spec_ascii_read(struct file * file, char * buf,
580 + size_t nbytes, loff_t *ppos)
582 + __u32 local_addr, remote_addr;
583 + __u16 local_port, remote_port;
585 + struct web100stats *stats;
586 + struct web100directs *vars;
593 + cid = cid_from_ino(file->f_dentry->d_parent->d_inode->i_ino);
595 + read_lock_bh(&web100_linkage_lock);
596 + stats = web100stats_lookup(cid);
597 + read_unlock_bh(&web100_linkage_lock);
600 + vars = &stats->wc_vars;
602 + if (vars->LocalAddressType == WC_ADDRTYPE_IPV4) {
603 + /* These values should not change while stats are linked.
604 + * We don't need to lock the sock. */
605 + local_addr = ntohl(vars->LocalAddress.v4addr);
606 + remote_addr = ntohl(vars->RemAddress.v4addr);
607 + local_port = vars->LocalPort;
608 + remote_port = vars->RemPort;
610 + len = sprintf(tmpbuf, "%d.%d.%d.%d:%d %d.%d.%d.%d:%d\n",
611 + (local_addr >> 24) & 0xff,
612 + (local_addr >> 16) & 0xff,
613 + (local_addr >> 8) & 0xff,
616 + (remote_addr >> 24) & 0xff,
617 + (remote_addr >> 16) & 0xff,
618 + (remote_addr >> 8) & 0xff,
619 + remote_addr & 0xff,
621 + } else if (vars->LocalAddressType == WC_ADDRTYPE_IPV6) {
622 + local_port = vars->LocalPort;
623 + remote_port = vars->RemPort;
625 + len += v6addr_str(tmpbuf + len, (short *)&vars->LocalAddress.v6addr.addr);
626 + len += sprintf(tmpbuf + len, ".%d ", local_port);
627 + len += v6addr_str(tmpbuf + len, (short *)&vars->RemAddress.v6addr.addr);
628 + len += sprintf(tmpbuf + len, ".%d\n", remote_port);
630 + printk(KERN_ERR "connection_spec_ascii_read: LocalAddressType invalid\n");
634 + len = len > nbytes ? nbytes : len;
635 + if (copy_to_user(buf, tmpbuf, len))
641 +static struct file_operations connection_spec_ascii_fops = {
642 + open: connection_file_open,
643 + release: connection_file_release,
644 + read: connection_spec_ascii_read
648 +/** /proc/web100/<connection>/ **/
649 +static int connection_dir_readdir(struct file *filp,
650 + void *dirent, filldir_t filldir)
653 + struct inode *inode = filp->f_dentry->d_inode;
654 + struct web100_file *p;
659 + if (filldir(dirent, ".", 1, i, inode->i_ino, DT_DIR) < 0)
665 + if (filldir(dirent, "..", 2, i, proc_web100_dir->low_ino, DT_DIR) < 0)
672 + if (i >= WEB100_FILE_ARR_SIZE)
674 + p = &web100_file_arr[i];
676 + if (filldir(dirent, p->name, p->len, filp->f_pos,
677 + ino_from_parts(inode->i_ino, p->low_ino),
678 + p->mode >> 12) < 0)
688 +static struct dentry *connection_dir_lookup(struct inode *dir,
689 + struct dentry *dentry, struct nameidata *nd)
691 + struct inode *inode;
692 + struct web100_file *p;
693 + struct web100stats *stats;
697 + for (p = &web100_file_arr[0]; p->name; p++) {
698 + if (p->len != dentry->d_name.len)
700 + if (!memcmp(dentry->d_name.name, p->name, p->len))
704 + return ERR_PTR(-ENOENT);
706 + read_lock_bh(&web100_linkage_lock);
707 + if ((stats = web100stats_lookup(cid_from_ino(dir->i_ino))) == NULL) {
708 + read_unlock_bh(&web100_linkage_lock);
709 + printk("connection_dir_lookup: stats == NULL\n");
710 + return ERR_PTR(-ENOENT);
712 + uid = sock_i_uid(stats->wc_sk);
713 + read_unlock_bh(&web100_linkage_lock);
715 + inode = proc_web100_make_inode(dir->i_sb, ino_from_parts(dir->i_ino, p->low_ino));
717 + return ERR_PTR(-ENOMEM);
718 + inode->i_mode = p->mode ? p->mode : S_IFREG | sysctl_web100_fperms;
719 + inode->i_uid = uid;
720 + inode->i_gid = sysctl_web100_gid;
722 + switch (p->low_ino) {
723 + case PROC_CONN_SPEC_ASCII:
724 + inode->i_fop = &connection_spec_ascii_fops;
726 + case PROC_CONN_SPEC:
727 + case PROC_CONN_READ:
728 + case PROC_CONN_TEST:
729 + case PROC_CONN_TUNE:
730 + inode->i_fop = &connection_file_fops;
733 + printk("Web100: impossible type (%d)\n", p->low_ino);
735 + return ERR_PTR(-EINVAL);
738 + d_add(dentry, inode);
742 +static struct inode_operations connection_dir_iops = {
743 + .lookup = connection_dir_lookup
746 +static struct file_operations connection_dir_fops = {
747 + .readdir = connection_dir_readdir
751 +/** /proc/web100/header **/
752 +static ssize_t header_read(struct file * file, char * buf,
753 + size_t nbytes, loff_t *ppos)
758 + struct web100_file *fp;
759 + struct web100_var *vp;
764 + /* We will assume the variable description list will not change
765 + * after init. (True at least right now.) Otherwise, we would have
766 + * to have a lock on it.
769 + if ((tmpbuf = (char *)__get_free_page(GFP_KERNEL)) == NULL)
772 + offset = sprintf(tmpbuf, "%s\n", web100_version_string);
774 + for (i = 0; i < WEB100_FILE_ARR_SIZE; i++) {
775 + int file_offset = 0;
777 + if ((fp = &web100_file_arr[i]) == NULL)
780 + if (fp->first_var == NULL)
783 + offset += sprintf(tmpbuf + offset, "\n/%s\n", fp->name);
785 + vp = fp->first_var;
787 + if (offset > WEB100MIB_BLOCK_SIZE) {
790 + n = min(offset, min_t(loff_t, nbytes, len - *ppos));
791 + if (copy_to_user(buf, tmpbuf + max_t(loff_t, *ppos - len + offset, 0), n))
803 + offset += sprintf(tmpbuf + offset, "%s %d %d %d\n",
804 + vp->name, file_offset, vp->type, vp->len);
805 + file_offset += vp->len;
812 + n = min(offset, min_t(loff_t, nbytes, len - *ppos));
813 + if (copy_to_user(buf, tmpbuf + max_t(loff_t, *ppos - len + offset, 0), n))
815 + if (nbytes <= len - *ppos) {
828 + free_page((unsigned long)tmpbuf);
832 +static struct file_operations header_file_operations = {
837 +/** /proc/web100/ **/
838 +#define FIRST_CONNECTION_ENTRY 256
839 +#define NUMBUF_LEN 11
841 +static int get_connection_list(int pos, int *cids, int max)
843 + struct web100stats *stats;
846 + pos -= FIRST_CONNECTION_ENTRY;
849 + read_lock_bh(&web100_linkage_lock);
851 + stats = web100stats_first;
852 + while (stats && n < max) {
853 + if (!stats->wc_dead) {
855 + cids[n++] = stats->wc_cid;
860 + stats = stats->wc_next;
863 + read_unlock_bh(&web100_linkage_lock);
868 +static int cid_to_str(int cid, char *buf)
872 + if (cid == 0) { /* a special case */
876 + for (len = 0; len < NUMBUF_LEN - 1 && tmp > 0; len++)
880 + for (i = 0; i < len; i++) {
881 + buf[len - i - 1] = '0' + (cid % 10);
889 +static int web100_dir_readdir(struct file *filp,
890 + void *dirent, filldir_t filldir)
897 + char name[NUMBUF_LEN];
900 + if (filp->f_pos < FIRST_CONNECTION_ENTRY) {
901 + if ((err = proc_readdir(filp, dirent, filldir)) < 0)
903 + filp->f_pos = FIRST_CONNECTION_ENTRY;
905 + n_conns = WEB100_MAX_CONNS * 2;
908 + cids = kmalloc(n_conns * sizeof (int), GFP_KERNEL);
909 + } while (cids == NULL && n_conns > 0);
912 + n = get_connection_list(filp->f_pos, cids, n_conns);
914 + for (i = 0; i < n; i++) {
915 + ino = ino_from_cid(cids[i]);
916 + len = cid_to_str(cids[i], name);
917 + if (filldir(dirent, name, len, filp->f_pos,
918 + ino, DT_DIR) < 0) {
929 +static inline struct dentry *web100_dir_dent(void)
933 + qstr.name = "web100";
935 + qstr.hash = full_name_hash(qstr.name, qstr.len);
937 + return d_lookup(proc_mnt->mnt_sb->s_root, &qstr);
940 +void web100_proc_nlink_update(nlink_t nlink)
942 + struct dentry *dent;
944 + dent = web100_dir_dent();
946 + dent->d_inode->i_nlink = nlink;
950 +int web100_proc_dointvec_update(ctl_table *ctl, int write, struct file *filp,
951 + void *buffer, size_t *lenp, loff_t *ppos)
957 + struct dentry *web100_dent, *conn_dent, *dent;
958 + struct inode *inode;
959 + struct web100_file *p;
960 + char name[NUMBUF_LEN];
962 + if ((err = proc_dointvec(ctl, write, filp, buffer, lenp, ppos)) != 0)
965 + if ((web100_dent = web100_dir_dent()) == NULL)
968 + if ((cids = kmalloc(WEB100_MAX_CONNS * sizeof (int), GFP_KERNEL)) == NULL)
970 + n = get_connection_list(FIRST_CONNECTION_ENTRY, cids, WEB100_MAX_CONNS);
971 + for (i = 0; i < n; i++) {
972 + qstr.len = cid_to_str(cids[i], name);
974 + qstr.hash = full_name_hash(qstr.name, qstr.len);
975 + if ((conn_dent = d_lookup(web100_dent, &qstr)) != NULL) {
976 + for (p = &web100_file_arr[0]; p->name; p++) {
977 + qstr.name = p->name;
979 + qstr.hash = full_name_hash(qstr.name, qstr.len);
980 + if ((dent = d_lookup(conn_dent, &qstr)) != NULL) {
981 + inode = dent->d_inode;
982 + if ((inode->i_mode = p->mode) == 0)
983 + inode->i_mode = S_IFREG | sysctl_web100_fperms;
984 + inode->i_gid = sysctl_web100_gid;
997 +static int web100_proc_connection_revalidate(struct dentry *dentry, struct nameidata *nd)
1001 + if (dentry->d_inode == NULL)
1003 + read_lock_bh(&web100_linkage_lock);
1004 + if (web100stats_lookup(cid_from_ino(dentry->d_inode->i_ino)) == NULL) {
1008 + read_unlock_bh(&web100_linkage_lock);
1013 +static struct dentry_operations web100_dir_dentry_operations = {
1014 + d_revalidate: web100_proc_connection_revalidate
1017 +static struct dentry *web100_dir_lookup(struct inode *dir,
1018 + struct dentry *dentry, struct nameidata *nd)
1024 + struct inode *inode;
1025 + unsigned long ino;
1026 + struct web100stats *stats;
1028 + if (proc_lookup(dir, dentry, nd) == NULL)
1032 + name = (char *)(dentry->d_name.name);
1033 + len = dentry->d_name.len;
1034 + if (len <= 0) /* I don't think this can happen */
1035 + return ERR_PTR(-EINVAL);
1036 + while (len-- > 0) {
1041 + if (c > 9 || c < 0 || (cid == 0 && len != 0) || cid >= WEB100_MAX_CONNS) {
1047 + return ERR_PTR(-ENOENT);
1049 + read_lock_bh(&web100_linkage_lock);
1050 + stats = web100stats_lookup(cid);
1051 + if (stats == NULL || stats->wc_dead) {
1052 + read_unlock_bh(&web100_linkage_lock);
1053 + return ERR_PTR(-ENOENT);
1055 + read_unlock_bh(&web100_linkage_lock);
1057 + ino = ino_from_cid(cid);
1058 + inode = proc_web100_make_inode(dir->i_sb, ino);
1059 + if (inode == NULL)
1060 + return ERR_PTR(-ENOMEM);
1061 + inode->i_nlink = 2;
1062 + inode->i_mode = S_IFDIR | S_IRUGO | S_IXUGO;
1063 + inode->i_flags |= S_IMMUTABLE; /* ? */
1064 + inode->i_op = &connection_dir_iops;
1065 + inode->i_fop = &connection_dir_fops;
1067 + dentry->d_op = &web100_dir_dentry_operations;
1068 + d_add(dentry, inode);
1072 +static struct file_operations web100_dir_fops = {
1073 + .readdir = web100_dir_readdir
1076 +static struct inode_operations web100_dir_iops = {
1077 + .lookup = web100_dir_lookup
1082 + * Read/write handlers
1085 +/* A read handler for reading directly from the stats */
1086 +/* read_data is the byte offset into struct web100stats */
1087 +static int read_stats(void *buf, struct web100stats *stats,
1088 + struct web100_var *vp)
1090 + memcpy(buf, (char *)stats + vp->read_data, vp->len);
1095 +/* A write handler for writing directly to the stats */
1096 +/* write_data is a byte offset into struct web100stats */
1097 +static int write_stats(void *buf, struct web100stats *stats,
1098 + struct web100_var *vp)
1100 + memcpy((char *)stats + vp->write_data, buf, vp->len);
1105 +int read_LimCwnd(void *buf, struct web100stats *stats, struct web100_var *vp)
1107 + struct tcp_sock *tp = tcp_sk(stats->wc_sk);
1108 + __u32 tmp = (__u32)(tp->snd_cwnd_clamp * tp->mss_cache);
1110 + memcpy(buf, &tmp, 4);
1115 +int write_LimCwnd(void *buf, struct web100stats *stats, struct web100_var *vp)
1117 + struct tcp_sock *tp = tcp_sk(stats->wc_sk);
1119 + tp->snd_cwnd_clamp = min(*(__u32 *)buf / tp->mss_cache, 65535U);
1124 +int write_LimRwin(void *buf, struct web100stats *stats, struct web100_var *vp)
1126 + __u32 val = *(__u32 *)buf;
1127 + struct tcp_sock *tp = tcp_sk(stats->wc_sk);
1129 + stats->wc_vars.LimRwin = tp->window_clamp =
1130 + min(val, 65535U << tp->rx_opt.rcv_wscale);
1135 +int write_Sndbuf(void *buf, struct web100stats *stats, struct web100_var *vp)
1138 + struct sock *sk = stats->wc_sk;
1140 + memcpy(&val, buf, sizeof (int));
1142 + sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
1143 + sk->sk_sndbuf = max_t(int, SOCK_MIN_SNDBUF, min_t(int, sysctl_wmem_max, val));
1144 + sk->sk_write_space(sk);
1149 +int write_Rcvbuf(void *buf, struct web100stats *stats, struct web100_var *vp)
1152 + struct sock *sk = stats->wc_sk;
1154 + memcpy(&val, buf, sizeof (int));
1156 + sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
1157 + sk->sk_rcvbuf = max_t(int, SOCK_MIN_RCVBUF, min_t(int, sysctl_rmem_max, val));
1162 +int write_State(void *buf, struct web100stats *stats, struct web100_var *vp)
1165 + struct sock *sk = stats->wc_sk;
1167 + memcpy(&val, buf, sizeof (int));
1168 + if (val != 12) /* deleteTCB, RFC 2012 */
1170 + sk->sk_prot->disconnect(sk, 0);
1175 +extern __u32 sysctl_wmem_default;
1176 +extern __u32 sysctl_rmem_default;
1178 +/* A read handler for reading directly from the sk */
1179 +/* read_data is a byte offset into the sk */
1180 +static int read_sk(void *buf, struct web100stats *stats,
1181 + struct web100_var *vp)
1183 + /* Fill data with 0's if the connection is gone. */
1184 + if (stats->wc_sk == NULL)
1185 + memset(buf, 0, vp->len);
1187 + memcpy(buf, (char *)(stats->wc_sk) + vp->read_data, vp->len);
1192 +static int write_sk(void *buf, struct web100stats *stats, struct web100_var *vp)
1194 + if (stats->wc_sk == NULL)
1197 + memcpy((char *)(stats->wc_sk) + vp->write_data, buf, vp->len);
1202 +__u64 web100_mono_time(void)
1205 + struct timespec now;
1207 + do_posix_clock_monotonic_gettime(&now);
1209 + return 1000000ULL * (__u64)now.tv_sec + now.tv_nsec / 1000;
1211 + struct timeval now;
1212 + static struct timeval before;
1214 + do_gettimeofday(&now);
1216 + /* assure monotonic, no matter what */
1217 + if ((now.tv_sec > before.tv_sec) ||
1218 + ((now.tv_sec == before.tv_sec) && (now.tv_usec > before.tv_usec))) {
1222 + if (before.tv_usec >= 1000000) {
1223 + before.tv_usec -= 1000000;
1228 + return (1000000ULL * (__u64)before.tv_sec + before.tv_usec);
1232 +/* A read handler to get the low part of the current time in usec */
1233 +static int read_now(void *buf, struct web100stats *stats,
1234 + struct web100_var *vp)
1238 + val = web100_mono_time();
1239 + val -= stats->wc_start_monotime;
1240 + memcpy(buf, (char *)&val, vp->len);
1245 +#ifdef CONFIG_WEB100_NET100
1246 +static int write_mss(void *buf, struct web100stats *stats, struct web100_var *vp)
1248 + struct sock *sk = stats->wc_sk;
1249 + struct tcp_sock *tp;
1250 + __u32 val = *(__u32 *)buf;
1256 + if (val > tp->mss_cache)
1261 + tp->mss_cache = val;
1262 + web100_update_mss(tp);
1267 +static int write_CwndAdjust(void *buf, struct web100stats *stats, struct web100_var *vp)
1269 + struct sock *sk = stats->wc_sk;
1270 + struct tcp_sock *tp;
1276 + memcpy(&stats->wc_vars.WAD_CwndAdjust, buf, 4);
1277 + tp->snd_ssthresh = min_t(__u32, tp->snd_ssthresh,
1278 + tp->snd_cwnd + stats->wc_vars.WAD_CwndAdjust);
1285 +static int rw_noop(void *buf, struct web100stats *stats, struct web100_var *vp)
1295 +void __init proc_web100_init(void)
1297 + /* Set up the proc files. */
1298 + proc_web100_dir = proc_mkdir("web100", NULL);
1299 + proc_web100_dir->proc_iops = &web100_dir_iops;
1300 + proc_web100_dir->proc_fops = &web100_dir_fops;
1302 + proc_web100_header = create_proc_entry("header", S_IFREG | S_IRUGO,
1304 + proc_web100_header->proc_fops = &header_file_operations;
1306 + /* Set up the contents of the proc files. */
1307 +#define OFFSET_IN(type,var) ((unsigned long)(&(((type *)NULL)->var)))
1308 +#define OFFSET_ST(field) ((unsigned long)(&(((struct web100stats *)NULL)->wc_vars.field)))
1309 +#define OFFSET_SK(field) ((unsigned long)(&(((struct sock *)NULL)->field)))
1310 +#define OFFSET_TP(field) ((unsigned long)(&(tcp_sk(NULL)->field)))
1312 +#define ADD_RO_STATSVAR(ino,name,type) \
1313 +add_var(web100_file_lookup(ino), #name, type, \
1314 + read_stats, OFFSET_ST(name), NULL, 0)
1316 +#define ADD_RO_STATSRENAME(ino,name,type,var) \
1317 +add_var(web100_file_lookup(ino), name, type, \
1318 + read_stats, OFFSET_ST(var), NULL, 0)
1320 +#define ADD_RO_STATSVAR_DEP(ino,name,type) \
1321 +add_var(web100_file_lookup(ino), "_" #name, type, \
1322 + read_stats, OFFSET_ST(name), NULL, 0)
1324 +#define ADD_WO_STATSVAR(ino,name,type) \
1325 +add_var(web100_file_lookup(ino), #name, type, NULL, 0, \
1326 + write_stats, OFFSET_ST(name))
1328 +#define ADD_WO_STATSVAR_DEP(ino,name,type) \
1329 +add_var(web100_file_lookup(ino), "_" #name, type, NULL, 0, \
1330 + write_stats, OFFSET_ST(name))
1332 +#define ADD_RW_STATSVAR(ino,name,type) \
1333 +add_var(web100_file_lookup(ino), #name, type, \
1334 + read_stats, OFFSET_ST(name), \
1335 + write_stats, OFFSET_ST(name))
1337 +#define ADD_RW_STATSVAR_DEP(ino,name,type) \
1338 +add_var(web100_file_lookup(ino), "_" #name, type, \
1339 + read_stats, OFFSET_ST(name), \
1340 + write_stats, OFFSET_ST(name))
1342 +#define ADD_RO_SKVAR(ino,name,type,var) \
1343 +add_var(web100_file_lookup(ino), #name, type, \
1344 + read_sk, OFFSET_SK(var), NULL, 0)
1346 +#define ADD_RW_SKVAR(ino,name,type,var) \
1347 +add_var(web100_file_lookup(ino), #name, type, \
1348 + read_sk, OFFSET_SK(var), write_sk, OFFSET_SK(var))
1350 +#define ADD_RO_TPVAR(ino,name,type,var) \
1351 +add_var(web100_file_lookup(ino), #name, type, \
1352 + read_sk, OFFSET_TP(var), NULL, 0)
1354 +#define ADD_NOOP(ino,name,type) \
1355 +add_var(web100_file_lookup(ino), #name, type, \
1356 + rw_noop, 0, rw_noop, 0)
1359 + ADD_RO_STATSVAR(PROC_CONN_SPEC, LocalAddressType, WEB100_TYPE_INTEGER);
1360 + ADD_RO_STATSVAR(PROC_CONN_SPEC, LocalAddress, WEB100_TYPE_INET_ADDRESS);
1361 + ADD_RO_STATSVAR(PROC_CONN_SPEC, LocalPort, WEB100_TYPE_INET_PORT_NUMBER);
1362 + ADD_RO_STATSVAR(PROC_CONN_SPEC, RemAddress, WEB100_TYPE_INET_ADDRESS);
1363 + ADD_RO_STATSVAR(PROC_CONN_SPEC, RemPort, WEB100_TYPE_INET_PORT_NUMBER);
1364 + ADD_RO_STATSRENAME(PROC_CONN_SPEC, "_RemoteAddress", WEB100_TYPE_INET_ADDRESS, RemAddress);
1365 + ADD_RO_STATSRENAME(PROC_CONN_SPEC, "_RemotePort", WEB100_TYPE_INET_PORT_NUMBER, RemPort);
1369 + ADD_RO_STATSVAR(PROC_CONN_READ, State, WEB100_TYPE_INTEGER);
1370 + ADD_RO_STATSVAR(PROC_CONN_READ, SACKEnabled, WEB100_TYPE_INTEGER);
1371 + ADD_RO_STATSVAR(PROC_CONN_READ, TimestampsEnabled, WEB100_TYPE_INTEGER);
1372 + ADD_RO_STATSVAR(PROC_CONN_READ, NagleEnabled, WEB100_TYPE_INTEGER);
1373 + ADD_RO_STATSVAR(PROC_CONN_READ, ECNEnabled, WEB100_TYPE_INTEGER);
1374 + ADD_RO_STATSVAR(PROC_CONN_READ, SndWinScale, WEB100_TYPE_INTEGER);
1375 + ADD_RO_STATSVAR(PROC_CONN_READ, RcvWinScale, WEB100_TYPE_INTEGER);
1378 + ADD_RO_STATSVAR(PROC_CONN_READ, ActiveOpen, WEB100_TYPE_INTEGER);
1379 + ADD_RO_STATSVAR(PROC_CONN_READ, MSSRcvd, WEB100_TYPE_GAUGE32);
1380 + ADD_RO_STATSVAR(PROC_CONN_READ, WinScaleRcvd, WEB100_TYPE_INTEGER);
1381 + ADD_RO_STATSVAR(PROC_CONN_READ, WinScaleSent, WEB100_TYPE_INTEGER);
1384 + ADD_RO_STATSVAR(PROC_CONN_READ, PktsOut, WEB100_TYPE_COUNTER32);
1385 + ADD_RO_STATSVAR(PROC_CONN_READ, DataPktsOut, WEB100_TYPE_COUNTER32);
1386 + ADD_RO_STATSVAR_DEP(PROC_CONN_READ, AckPktsOut, WEB100_TYPE_COUNTER32);
1387 + ADD_RO_STATSVAR(PROC_CONN_READ, DataBytesOut, WEB100_TYPE_COUNTER64);
1388 + ADD_RO_STATSVAR(PROC_CONN_READ, PktsIn, WEB100_TYPE_COUNTER32);
1389 + ADD_RO_STATSVAR(PROC_CONN_READ, DataPktsIn, WEB100_TYPE_COUNTER32);
1390 + ADD_RO_STATSVAR_DEP(PROC_CONN_READ, AckPktsIn, WEB100_TYPE_COUNTER32);
1391 + ADD_RO_STATSVAR(PROC_CONN_READ, DataBytesIn, WEB100_TYPE_COUNTER64);
1392 + ADD_RO_STATSVAR(PROC_CONN_READ, SndUna, WEB100_TYPE_COUNTER32);
1393 + ADD_RO_STATSVAR(PROC_CONN_READ, SndNxt, WEB100_TYPE_UNSIGNED32);
1394 + ADD_RO_STATSVAR(PROC_CONN_READ, SndMax, WEB100_TYPE_COUNTER32);
1395 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_snd_una", WEB100_TYPE_COUNTER32, SndUna);
1396 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_snd_nxt", WEB100_TYPE_COUNTER32, SndNxt);
1397 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_snd_max", WEB100_TYPE_COUNTER32, SndMax);
1398 + ADD_RO_STATSVAR(PROC_CONN_READ, ThruBytesAcked, WEB100_TYPE_COUNTER64);
1399 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_ThruBytesSent", WEB100_TYPE_COUNTER64, ThruBytesAcked);
1400 + ADD_RO_STATSVAR(PROC_CONN_READ, SndISS, WEB100_TYPE_COUNTER32);
1401 + ADD_RO_STATSVAR_DEP(PROC_CONN_READ, SendWraps, WEB100_TYPE_COUNTER32);
1402 + ADD_RO_STATSVAR(PROC_CONN_READ, RcvNxt, WEB100_TYPE_COUNTER32);
1403 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_rcv_nxt", WEB100_TYPE_COUNTER32, RcvNxt);
1404 + ADD_RO_STATSVAR(PROC_CONN_READ, ThruBytesReceived, WEB100_TYPE_COUNTER64);
1405 + ADD_RO_STATSVAR(PROC_CONN_READ, RecvISS, WEB100_TYPE_COUNTER32);
1406 + ADD_RO_STATSVAR_DEP(PROC_CONN_READ, RecvWraps, WEB100_TYPE_COUNTER32);
1407 + ADD_RO_STATSVAR_DEP(PROC_CONN_READ, StartTime, WEB100_TYPE_INTEGER32);
1408 + ADD_RO_STATSVAR(PROC_CONN_READ, StartTimeSec, WEB100_TYPE_INTEGER32);
1409 + ADD_RO_STATSVAR(PROC_CONN_READ, StartTimeUsec, WEB100_TYPE_INTEGER32);
1410 + add_var(web100_file_lookup(PROC_CONN_READ), "Duration", WEB100_TYPE_COUNTER64, read_now, 0, NULL, 0);
1411 + add_var(web100_file_lookup(PROC_CONN_READ), "_CurrTime", WEB100_TYPE_COUNTER64, read_now, 0, NULL, 0);
1413 + /* SENDER CONGESTION */
1414 + ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTransSender", WEB100_TYPE_COUNTER32, SndLimTrans[WC_SNDLIM_SENDER]);
1415 + ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimBytesSender", WEB100_TYPE_COUNTER64, SndLimBytes[WC_SNDLIM_SENDER]);
1416 + ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTimeSender", WEB100_TYPE_COUNTER32, SndLimTime[WC_SNDLIM_SENDER]);
1417 + ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTransCwnd", WEB100_TYPE_COUNTER32, SndLimTrans[WC_SNDLIM_CWND]);
1418 + ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimBytesCwnd", WEB100_TYPE_COUNTER64, SndLimBytes[WC_SNDLIM_CWND]);
1419 + ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTimeCwnd", WEB100_TYPE_COUNTER32, SndLimTime[WC_SNDLIM_CWND]);
1420 + ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTransRwin", WEB100_TYPE_COUNTER32, SndLimTrans[WC_SNDLIM_RWIN]);
1421 + ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimBytesRwin", WEB100_TYPE_COUNTER64, SndLimBytes[WC_SNDLIM_RWIN]);
1422 + ADD_RO_STATSRENAME(PROC_CONN_READ, "SndLimTimeRwin", WEB100_TYPE_COUNTER32, SndLimTime[WC_SNDLIM_RWIN]);
1423 + ADD_RO_STATSVAR(PROC_CONN_READ, SlowStart, WEB100_TYPE_COUNTER32);
1424 + ADD_RO_STATSVAR(PROC_CONN_READ, CongAvoid, WEB100_TYPE_COUNTER32);
1425 + ADD_RO_STATSVAR(PROC_CONN_READ, CongestionSignals, WEB100_TYPE_COUNTER32);
1426 + ADD_RO_STATSVAR(PROC_CONN_READ, OtherReductions, WEB100_TYPE_COUNTER32);
1427 + ADD_RO_STATSVAR(PROC_CONN_READ, X_OtherReductionsCV, WEB100_TYPE_COUNTER32);
1428 + ADD_RO_STATSVAR(PROC_CONN_READ, X_OtherReductionsCM, WEB100_TYPE_COUNTER32);
1429 + ADD_RO_STATSVAR(PROC_CONN_READ, CongestionOverCount, WEB100_TYPE_COUNTER32);
1430 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_Recoveries", WEB100_TYPE_COUNTER32, CongestionSignals);
1431 + ADD_RO_STATSVAR(PROC_CONN_READ, CurCwnd, WEB100_TYPE_GAUGE32);
1432 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_CurrentCwnd", WEB100_TYPE_GAUGE32, CurCwnd);
1433 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxCwnd, WEB100_TYPE_GAUGE32);
1434 + ADD_RO_STATSVAR(PROC_CONN_READ, CurSsthresh, WEB100_TYPE_GAUGE32);
1435 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_CurrentSsthresh", WEB100_TYPE_GAUGE32, CurSsthresh);
1436 + add_var(web100_file_lookup(PROC_CONN_READ), "LimCwnd", WEB100_TYPE_GAUGE32, read_LimCwnd, 0, NULL, 0);
1437 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxSsthresh, WEB100_TYPE_GAUGE32);
1438 + ADD_RO_STATSVAR(PROC_CONN_READ, MinSsthresh, WEB100_TYPE_GAUGE32);
1440 + /* SENDER PATH MODEL */
1441 + ADD_RO_STATSVAR(PROC_CONN_READ, FastRetran, WEB100_TYPE_COUNTER32);
1442 + ADD_RO_STATSVAR(PROC_CONN_READ, Timeouts, WEB100_TYPE_COUNTER32);
1443 + ADD_RO_STATSVAR(PROC_CONN_READ, SubsequentTimeouts, WEB100_TYPE_COUNTER32);
1444 + ADD_RO_STATSVAR(PROC_CONN_READ, CurTimeoutCount, WEB100_TYPE_GAUGE32);
1445 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_CurrTimeoutCount", WEB100_TYPE_GAUGE32, CurTimeoutCount);
1446 + ADD_RO_STATSVAR(PROC_CONN_READ, AbruptTimeouts, WEB100_TYPE_COUNTER32);
1447 + ADD_RO_STATSVAR(PROC_CONN_READ, PktsRetrans, WEB100_TYPE_COUNTER32);
1448 + ADD_RO_STATSVAR(PROC_CONN_READ, BytesRetrans, WEB100_TYPE_COUNTER32);
1449 + ADD_RO_STATSVAR(PROC_CONN_READ, DupAcksIn, WEB100_TYPE_COUNTER32);
1450 + ADD_RO_STATSVAR(PROC_CONN_READ, SACKsRcvd, WEB100_TYPE_COUNTER32);
1451 + ADD_RO_STATSVAR(PROC_CONN_READ, SACKBlocksRcvd, WEB100_TYPE_COUNTER32);
1452 + ADD_RO_STATSVAR(PROC_CONN_READ, PreCongSumCwnd, WEB100_TYPE_COUNTER32);
1453 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_SumCwndAtCong", WEB100_TYPE_COUNTER32, PreCongSumCwnd);
1454 + ADD_RO_STATSVAR(PROC_CONN_READ, PreCongSumRTT, WEB100_TYPE_COUNTER32);
1455 + ADD_RO_STATSVAR_DEP(PROC_CONN_READ, PreCongCountRTT, WEB100_TYPE_COUNTER32);
1456 + ADD_RO_STATSVAR(PROC_CONN_READ, PostCongSumRTT, WEB100_TYPE_COUNTER32);
1457 + ADD_RO_STATSVAR(PROC_CONN_READ, PostCongCountRTT, WEB100_TYPE_COUNTER32);
1458 + ADD_RO_STATSVAR(PROC_CONN_READ, ECERcvd, WEB100_TYPE_COUNTER32);
1459 + ADD_RO_STATSVAR(PROC_CONN_READ, SendStall, WEB100_TYPE_COUNTER32);
1460 + ADD_RO_STATSVAR(PROC_CONN_READ, QuenchRcvd, WEB100_TYPE_COUNTER32);
1461 + ADD_RO_STATSVAR(PROC_CONN_READ, RetranThresh, WEB100_TYPE_GAUGE32);
1462 + ADD_RO_STATSVAR(PROC_CONN_READ, NonRecovDA, WEB100_TYPE_COUNTER32);
1463 + ADD_RO_STATSVAR(PROC_CONN_READ, AckAfterFR, WEB100_TYPE_COUNTER32);
1464 + ADD_RO_STATSVAR(PROC_CONN_READ, DSACKDups, WEB100_TYPE_COUNTER32);
1465 + ADD_RO_STATSVAR(PROC_CONN_READ, SampleRTT, WEB100_TYPE_GAUGE32);
1466 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_SampledRTT", WEB100_TYPE_GAUGE32, SampleRTT);
1467 + ADD_RO_STATSVAR(PROC_CONN_READ, SmoothedRTT, WEB100_TYPE_GAUGE32);
1468 + ADD_RO_STATSVAR(PROC_CONN_READ, RTTVar, WEB100_TYPE_GAUGE32);
1469 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxRTT, WEB100_TYPE_GAUGE32);
1470 + ADD_RO_STATSVAR(PROC_CONN_READ, MinRTT, WEB100_TYPE_GAUGE32);
1471 + ADD_RO_STATSVAR(PROC_CONN_READ, SumRTT, WEB100_TYPE_COUNTER64);
1472 + ADD_RO_STATSVAR(PROC_CONN_READ, CountRTT, WEB100_TYPE_COUNTER32);
1473 + ADD_RO_STATSVAR(PROC_CONN_READ, CurRTO, WEB100_TYPE_GAUGE32);
1474 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_CurrentRTO", WEB100_TYPE_GAUGE32, CurRTO);
1475 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxRTO, WEB100_TYPE_GAUGE32);
1476 + ADD_RO_STATSVAR(PROC_CONN_READ, MinRTO, WEB100_TYPE_GAUGE32);
1477 + ADD_RO_STATSVAR(PROC_CONN_READ, CurMSS, WEB100_TYPE_GAUGE32);
1478 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_CurrentMSS", WEB100_TYPE_GAUGE32, CurMSS);
1479 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxMSS, WEB100_TYPE_GAUGE32);
1480 + ADD_RO_STATSVAR(PROC_CONN_READ, MinMSS, WEB100_TYPE_GAUGE32);
1482 + /* SENDER BUFFER */
1483 +#define PROC_CONN_XTEST PROC_CONN_READ /* lazy */
1484 + ADD_RO_SKVAR(PROC_CONN_READ, _Sndbuf, WEB100_TYPE_GAUGE32, sk_sndbuf);
1485 + ADD_RO_SKVAR(PROC_CONN_READ, X_Sndbuf, WEB100_TYPE_GAUGE32, sk_sndbuf);
1486 + ADD_RO_SKVAR(PROC_CONN_READ, X_Rcvbuf, WEB100_TYPE_GAUGE32, sk_rcvbuf);
1487 + ADD_RO_STATSVAR(PROC_CONN_READ, CurRetxQueue, WEB100_TYPE_GAUGE32);
1488 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_CurRetranQueue", WEB100_TYPE_GAUGE32, CurRetxQueue);
1489 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxRetxQueue, WEB100_TYPE_GAUGE32);
1490 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_MaxRetranQueue", WEB100_TYPE_GAUGE32, MaxRetxQueue);
1491 + ADD_RO_STATSVAR(PROC_CONN_READ, CurAppWQueue, WEB100_TYPE_GAUGE32);
1492 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxAppWQueue, WEB100_TYPE_GAUGE32);
1494 + /* SENDER BUFFER TUNING - See below */
1496 + /* LOCAL RECEIVER */
1497 + ADD_RO_STATSVAR(PROC_CONN_READ, CurRwinSent, WEB100_TYPE_GAUGE32);
1498 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_CurrentRwinSent", WEB100_TYPE_GAUGE32, CurRwinSent);
1499 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxRwinSent, WEB100_TYPE_GAUGE32);
1500 + ADD_RO_STATSVAR(PROC_CONN_READ, MinRwinSent, WEB100_TYPE_GAUGE32);
1501 + ADD_RO_STATSVAR(PROC_CONN_READ, LimRwin, WEB100_TYPE_GAUGE32);
1502 + ADD_RO_STATSVAR(PROC_CONN_READ, DupAcksOut, WEB100_TYPE_COUNTER32);
1503 + ADD_RO_SKVAR(PROC_CONN_READ, _Rcvbuf, WEB100_TYPE_GAUGE32, sk_rcvbuf);
1504 + ADD_RO_STATSVAR(PROC_CONN_READ, CurReasmQueue, WEB100_TYPE_GAUGE32);
1505 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxReasmQueue, WEB100_TYPE_GAUGE32);
1506 + ADD_RO_STATSVAR(PROC_CONN_READ, CurAppRQueue, WEB100_TYPE_GAUGE32);
1507 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxAppRQueue, WEB100_TYPE_GAUGE32);
1508 + ADD_RO_TPVAR(PROC_CONN_XTEST, X_rcv_ssthresh, WEB100_TYPE_GAUGE32, rcv_ssthresh);
1509 + ADD_RO_TPVAR(PROC_CONN_XTEST, X_wnd_clamp, WEB100_TYPE_GAUGE32, window_clamp);
1510 + ADD_RO_STATSVAR(PROC_CONN_XTEST, X_dbg1, WEB100_TYPE_GAUGE32);
1511 + ADD_RO_STATSVAR(PROC_CONN_XTEST, X_dbg2, WEB100_TYPE_GAUGE32);
1512 + ADD_RO_STATSVAR(PROC_CONN_XTEST, X_dbg3, WEB100_TYPE_GAUGE32);
1513 + ADD_RO_STATSVAR(PROC_CONN_XTEST, X_dbg4, WEB100_TYPE_GAUGE32);
1515 + /* OBSERVED RECEIVER */
1516 + ADD_RO_STATSVAR(PROC_CONN_READ, CurRwinRcvd, WEB100_TYPE_GAUGE32);
1517 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_CurrentRwinRcvd", WEB100_TYPE_GAUGE32, CurRwinRcvd);
1518 + ADD_RO_STATSVAR(PROC_CONN_READ, MaxRwinRcvd, WEB100_TYPE_GAUGE32);
1519 + ADD_RO_STATSVAR(PROC_CONN_READ, MinRwinRcvd, WEB100_TYPE_GAUGE32);
1521 + /* CONNECTION ID */
1522 + ADD_RO_STATSVAR(PROC_CONN_READ, LocalAddressType, WEB100_TYPE_INTEGER);
1523 + ADD_RO_STATSVAR(PROC_CONN_READ, LocalAddress, WEB100_TYPE_INET_ADDRESS);
1524 + ADD_RO_STATSVAR(PROC_CONN_READ, LocalPort, WEB100_TYPE_INET_PORT_NUMBER);
1525 + ADD_RO_STATSVAR(PROC_CONN_READ, RemAddress, WEB100_TYPE_INET_ADDRESS);
1526 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_RemoteAddress", WEB100_TYPE_INET_ADDRESS, RemAddress);
1527 + ADD_RO_STATSVAR(PROC_CONN_READ, RemPort, WEB100_TYPE_INET_PORT_NUMBER);
1528 + ADD_RO_STATSRENAME(PROC_CONN_READ, "_RemotePort", WEB100_TYPE_INET_PORT_NUMBER, RemPort);
1530 + ADD_RO_STATSVAR(PROC_CONN_READ, X_RcvRTT, WEB100_TYPE_GAUGE32);
1533 + add_var(web100_file_lookup(PROC_CONN_TUNE), "LimCwnd",
1534 + WEB100_TYPE_GAUGE32, read_LimCwnd, 0,
1535 + write_LimCwnd, 0);
1536 + add_var(web100_file_lookup(PROC_CONN_TUNE), "LimRwin",
1537 + WEB100_TYPE_GAUGE32, read_stats, OFFSET_ST(LimRwin),
1538 + write_LimRwin, 0);
1539 + add_var(web100_file_lookup(PROC_CONN_TUNE), "X_Sndbuf",
1540 + WEB100_TYPE_GAUGE32, read_sk, OFFSET_SK(sk_sndbuf),
1542 + add_var(web100_file_lookup(PROC_CONN_TUNE), "X_Rcvbuf",
1543 + WEB100_TYPE_GAUGE32, read_sk, OFFSET_SK(sk_rcvbuf),
1545 + add_var(web100_file_lookup(PROC_CONN_TUNE), "State",
1546 + WEB100_TYPE_INTEGER, read_stats, OFFSET_ST(State),
1548 +#ifdef CONFIG_WEB100_NET100
1549 + add_var(web100_file_lookup(PROC_CONN_TUNE), "CurMSS",
1550 + WEB100_TYPE_GAUGE32, read_stats, OFFSET_ST(CurMSS),
1554 +#ifdef CONFIG_WEB100_NET100
1555 + ADD_RW_STATSVAR(PROC_CONN_TUNE, WAD_IFQ, WEB100_TYPE_GAUGE32);
1556 + ADD_RW_STATSVAR(PROC_CONN_TUNE, WAD_MaxBurst, WEB100_TYPE_GAUGE32);
1557 + ADD_RW_STATSVAR(PROC_CONN_TUNE, WAD_MaxSsthresh, WEB100_TYPE_GAUGE32);
1558 + ADD_RW_STATSVAR(PROC_CONN_TUNE, WAD_NoAI, WEB100_TYPE_INTEGER);
1559 + add_var(web100_file_lookup(PROC_CONN_TUNE), "WAD_CwndAdjust",
1560 + WEB100_TYPE_INTEGER32, read_stats, OFFSET_ST(WAD_CwndAdjust),
1561 + write_CwndAdjust, 0);
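
The registration macros in proc_web100_init() bind a variable name to a handler and a byte offset into the statistics structure. The user-space sketch below illustrates that pattern only; it is not the patch's add_var() implementation. Every demo_* name is hypothetical, and it uses the standard offsetof() where the patch casts a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for struct web100directs / struct web100stats. */
struct demo_vars {
	uint32_t PktsOut;
	uint32_t CurCwnd;
};

struct demo_stats {
	int cid;
	struct demo_vars wc_vars;
};

/* A read handler copies one variable, found at a byte offset, into buf. */
typedef int (*demo_read_fn)(void *buf, const struct demo_stats *stats,
			    size_t off);

struct demo_var {
	const char *name;
	demo_read_fn read;
	size_t offset;
};

/* Offset of a wc_vars field from the start of the enclosing structure. */
#define DEMO_OFFSET_ST(field) \
	(offsetof(struct demo_stats, wc_vars) + offsetof(struct demo_vars, field))

/* Stringify the field name, as ADD_RO_STATSVAR does with #name. */
#define DEMO_RO_STATSVAR(field) \
	{ #field, demo_read_stats, DEMO_OFFSET_ST(field) }

static int demo_read_stats(void *buf, const struct demo_stats *stats,
			   size_t off)
{
	memcpy(buf, (const char *)stats + off, sizeof(uint32_t));
	return (int)sizeof(uint32_t);
}

static const struct demo_var demo_table[] = {
	DEMO_RO_STATSVAR(PktsOut),
	DEMO_RO_STATSVAR(CurCwnd),
};

/* Look a variable up by name and read it; returns 0, or -1 if unknown. */
static int demo_lookup_and_read(const struct demo_stats *stats,
				const char *name, uint32_t *out)
{
	size_t i;

	assert(stats && name && out);
	for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
		if (strcmp(demo_table[i].name, name) == 0) {
			demo_table[i].read(out, stats, demo_table[i].offset);
			return 0;
		}
	}
	return -1;
}
```

Because each entry stores only a name, a handler and an offset, one generic handler can serve every plain 32-bit variable, which is presumably why the patch needs custom handlers only for a few variables such as LimCwnd.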
1564 diff -Nurp linux-2.6.22-680/include/linux/netlink.h linux-2.6.22-690/include/linux/netlink.h
1565 --- linux-2.6.22-680/include/linux/netlink.h 2008-11-12 17:40:01.000000000 +0100
1566 +++ linux-2.6.22-690/include/linux/netlink.h 2008-11-14 21:20:17.000000000 +0100
1568 /* leave room for NETLINK_DM (DM Events) */
1569 #define NETLINK_SCSITRANSPORT 18 /* SCSI Transports */
1570 #define NETLINK_ECRYPTFS 19
1571 +#define NETLINK_WEB100 29
1573 #define MAX_LINKS 32
1575 diff -Nurp linux-2.6.22-680/include/linux/proc_fs.h linux-2.6.22-690/include/linux/proc_fs.h
1576 --- linux-2.6.22-680/include/linux/proc_fs.h 2008-11-12 17:40:23.000000000 +0100
1577 +++ linux-2.6.22-690/include/linux/proc_fs.h 2008-11-14 21:20:17.000000000 +0100
1578 @@ -97,6 +97,10 @@ extern spinlock_t proc_subdir_lock;
1579 extern void proc_root_init(void);
1580 extern void proc_misc_init(void);
1582 +#ifdef CONFIG_WEB100_STATS
1583 +extern void proc_web100_init(void);
1588 void proc_flush_task(struct task_struct *task);
1589 diff -Nurp linux-2.6.22-680/include/linux/sysctl.h linux-2.6.22-690/include/linux/sysctl.h
1590 --- linux-2.6.22-680/include/linux/sysctl.h 2008-11-12 17:41:48.000000000 +0100
1591 +++ linux-2.6.22-690/include/linux/sysctl.h 2008-11-14 21:21:18.000000000 +0100
1592 @@ -448,7 +448,15 @@ enum
1593 NET_IPV4_ICMP_IPOD_ENABLED,
1594 NET_IPV4_ICMP_IPOD_HOST,
1595 NET_IPV4_ICMP_IPOD_MASK,
1596 - NET_IPV4_ICMP_IPOD_KEY
1597 + NET_IPV4_ICMP_IPOD_KEY,
1599 +#ifdef CONFIG_WEB100_NET100
1601 + NET_IPV4_WAD_MAX_BURST,
1603 +#ifdef CONFIG_WEB100_STATS
1604 + NET_IPV4_WEB100_FPERMS,
1605 + NET_IPV4_WEB100_GID,
1609 @@ -971,6 +979,10 @@ extern int proc_doulongvec_minmax(ctl_ta
1610 void __user *, size_t *, loff_t *);
1611 extern int proc_doulongvec_ms_jiffies_minmax(ctl_table *table, int,
1612 struct file *, void __user *, size_t *, loff_t *);
1613 +#ifdef CONFIG_WEB100_STATS
1614 +extern int web100_proc_dointvec_update(ctl_table *, int, struct file *,
1615 + void *, size_t *, loff_t *);
1618 extern int do_sysctl (int __user *name, int nlen,
1619 void __user *oldval, size_t __user *oldlenp,
1620 diff -Nurp linux-2.6.22-680/include/linux/tcp.h linux-2.6.22-690/include/linux/tcp.h
1621 --- linux-2.6.22-680/include/linux/tcp.h 2007-07-09 01:32:17.000000000 +0200
1622 +++ linux-2.6.22-690/include/linux/tcp.h 2008-11-14 21:20:17.000000000 +0100
1623 @@ -402,6 +402,10 @@ struct tcp_sock {
1624 /* TCP MD5 Signagure Option information */
1625 struct tcp_md5sig_info *md5sig_info;
1628 +#ifdef CONFIG_WEB100_STATS
1629 + struct web100stats *tcp_stats;
1633 static inline struct tcp_sock *tcp_sk(const struct sock *sk)
1634 diff -Nurp linux-2.6.22-680/include/net/tcp.h linux-2.6.22-690/include/net/tcp.h
1635 --- linux-2.6.22-680/include/net/tcp.h 2008-11-12 17:40:02.000000000 +0100
1636 +++ linux-2.6.22-690/include/net/tcp.h 2008-11-14 21:20:17.000000000 +0100
1639 #include <linux/seq_file.h>
1641 +#include <net/web100.h>
1643 extern struct inet_hashinfo tcp_hashinfo;
1645 extern atomic_t tcp_orphan_count;
1646 @@ -232,6 +234,14 @@ extern int sysctl_tcp_base_mss;
1647 extern int sysctl_tcp_workaround_signed_windows;
1648 extern int sysctl_tcp_slow_start_after_idle;
1649 extern int sysctl_tcp_max_ssthresh;
1650 +#ifdef CONFIG_WEB100_NET100
1651 +extern int sysctl_WAD_IFQ;
1652 +extern int sysctl_WAD_MaxBurst;
1654 +#ifdef CONFIG_WEB100_STATS
1655 +extern int sysctl_web100_fperms;
1656 +extern int sysctl_web100_gid;
1659 extern atomic_t tcp_memory_allocated;
1660 extern atomic_t tcp_sockets_allocated;
1661 @@ -755,6 +765,9 @@ extern __u32 tcp_init_cwnd(struct tcp_so
1663 static __inline__ __u32 tcp_max_burst(const struct tcp_sock *tp)
1665 +#ifdef CONFIG_WEB100_NET100
1666 + return (NET100_WAD(tp, WAD_MaxBurst, sysctl_WAD_MaxBurst));
1671 @@ -902,6 +915,8 @@ static inline void tcp_set_state(struct
1673 int oldstate = sk->sk_state;
1675 + WEB100_VAR_SET(tcp_sk(sk), State, web100_state(state));
1678 case TCP_ESTABLISHED:
1679 if (oldstate != TCP_ESTABLISHED)
1680 diff -Nurp linux-2.6.22-680/include/net/web100.h linux-2.6.22-690/include/net/web100.h
1681 --- linux-2.6.22-680/include/net/web100.h 1970-01-01 01:00:00.000000000 +0100
1682 +++ linux-2.6.22-690/include/net/web100.h 2008-11-14 21:20:17.000000000 +0100
1685 + * include/net/web100.h
1687 + * Copyright (C) 2001 Matt Mathis <mathis@psc.edu>
1688 + * Copyright (C) 2001 John Heffner <jheffner@psc.edu>
1690 + * The Web 100 project. See http://www.web100.org
1692 + * This program is free software; you can redistribute it and/or
1693 + * modify it under the terms of the GNU General Public License
1694 + * as published by the Free Software Foundation; either version
1695 + * 2 of the License, or (at your option) any later version.
1702 +#include <net/sock.h>
1703 +#include <net/web100_stats.h>
1704 +#include <linux/tcp.h>
1706 +#ifdef CONFIG_WEB100_STATS
1708 +#define WEB100_MAX_CONNS (1<<15)
1710 +#define WEB100_DELAY_MAX HZ
1713 +#define WC_NL_TYPE_CONNECT 0
1714 +#define WC_NL_TYPE_DISCONNECT 1
1716 +struct web100_netlink_msg {
1721 +/* The syntax of this version string is subject to future changes */
1722 +extern char *web100_version_string;
1724 +/* Stats structures */
1725 +extern struct web100stats *web100stats_arr[];
1726 +extern struct web100stats *web100stats_first;
1728 +/* For locking the creation and destruction of stats structures. */
1729 +extern rwlock_t web100_linkage_lock;
1731 +/* For /proc/web100 */
1732 +extern struct web100stats *web100stats_lookup(int cid);
1734 +/* For the TCP code */
1735 +extern int web100_stats_create(struct sock *sk);
1736 +extern void web100_stats_destroy(struct web100stats *stats);
1737 +extern void web100_stats_free(struct web100stats *stats);
1738 +extern void web100_stats_establish(struct sock *sk);
1740 +extern void web100_tune_sndbuf_ack(struct sock *sk);
1741 +extern void web100_tune_sndbuf_snd(struct sock *sk);
1742 +extern void web100_tune_rcvbuf(struct sock *sk);
1744 +extern void web100_update_snd_nxt(struct tcp_sock *tp);
1745 +extern void web100_update_snd_una(struct tcp_sock *tp);
1746 +extern void web100_update_rtt(struct sock *sk, unsigned long rtt_sample);
1747 +extern void web100_update_timeout(struct sock *sk);
1748 +extern void web100_update_mss(struct tcp_sock *tp);
1749 +extern void web100_update_cwnd(struct tcp_sock *tp);
1750 +extern void web100_update_rwin_rcvd(struct tcp_sock *tp);
1751 +extern void web100_update_sndlim(struct tcp_sock *tp, int why);
1752 +extern void web100_update_rcv_nxt(struct tcp_sock *tp);
1753 +extern void web100_update_rwin_sent(struct tcp_sock *tp);
1754 +extern void web100_update_congestion(struct tcp_sock *tp, int why);
1755 +extern void web100_update_segsend(struct sock *sk, int len, int pcount,
1756 + __u32 seq, __u32 end_seq, int flags);
1757 +extern void web100_update_segrecv(struct tcp_sock *tp, struct sk_buff *skb);
1758 +extern void web100_update_rcvbuf(struct sock *sk, int rcvbuf);
1759 +extern void web100_update_writeq(struct sock *sk);
1760 +extern void web100_update_recvq(struct sock *sk);
1761 +extern void web100_update_ofoq(struct sock *sk);
1763 +extern void web100_stats_init(void);
1765 +/* For the IP code */
1766 +extern int web100_delay_output(struct sk_buff *skb, int (*output)(struct sk_buff *));
1768 +extern __u64 web100_mono_time(void);
1770 +/* You may have to hold web100_linkage_lock here to prevent
1771 + stats from disappearing. */
1772 +static inline void web100_stats_use(struct web100stats *stats)
1774 + atomic_inc(&stats->wc_users);
1777 +/* You MUST NOT hold web100_linkage_lock here. */
1778 +static inline void web100_stats_unuse(struct web100stats *stats)
1780 + if (atomic_dec_and_test(&stats->wc_users))
1781 + web100_stats_free(stats);
1784 +/* A mapping between Linux and Web100 states. This could easily just
1786 +static inline int web100_state(int state)
1789 + case TCP_ESTABLISHED: return WC_STATE_ESTABLISHED;
1790 + case TCP_SYN_SENT: return WC_STATE_SYNSENT;
1791 + case TCP_SYN_RECV: return WC_STATE_SYNRECEIVED;
1792 + case TCP_FIN_WAIT1: return WC_STATE_FINWAIT1;
1793 + case TCP_FIN_WAIT2: return WC_STATE_FINWAIT2;
1794 + case TCP_TIME_WAIT: return WC_STATE_TIMEWAIT;
1795 + case TCP_CLOSE: return WC_STATE_CLOSED;
1796 + case TCP_CLOSE_WAIT: return WC_STATE_CLOSEWAIT;
1797 + case TCP_LAST_ACK: return WC_STATE_LASTACK;
1798 + case TCP_LISTEN: return WC_STATE_LISTEN;
1799 + case TCP_CLOSING: return WC_STATE_CLOSING;
1800 + default: return 0;
1804 +#endif /* CONFIG_WEB100_STATS */
1806 +#endif /* _WEB100_H */
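
web100_stats_use() and web100_stats_unuse() implement a plain reference count: the last holder to drop its reference frees the structure. The sketch below is a user-space illustration using C11 atomics rather than the kernel's atomic_t; the demo_ref* names and the freed_flag field are hypothetical and exist only so a caller can observe when the free happens.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct web100stats and its wc_users count. */
struct demo_ref {
	atomic_int users;
	int *freed_flag;	/* set to 1 when the object is released */
};

/* Create an object that starts with one reference (the creator's). */
static struct demo_ref *demo_ref_create(int *freed_flag)
{
	struct demo_ref *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	atomic_init(&s->users, 1);
	s->freed_flag = freed_flag;
	return s;
}

/* Take an additional reference, as web100_stats_use() does. */
static void demo_ref_use(struct demo_ref *s)
{
	assert(s != NULL);
	atomic_fetch_add(&s->users, 1);
}

/* Drop a reference; free the object when the last one goes away. */
static void demo_ref_unuse(struct demo_ref *s)
{
	assert(s != NULL);
	/* atomic_fetch_sub returns the previous value: 1 means we were last. */
	if (atomic_fetch_sub(&s->users, 1) == 1) {
		if (s->freed_flag)
			*s->freed_flag = 1;
		free(s);
	}
}
```

In the kernel code the increment is meant to happen while web100_linkage_lock is held, so that a lookup cannot race with destruction; this sketch leaves that lock out.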
1807 diff -Nurp linux-2.6.22-680/include/net/web100_stats.h linux-2.6.22-690/include/net/web100_stats.h
1808 --- linux-2.6.22-680/include/net/web100_stats.h 1970-01-01 01:00:00.000000000 +0100
1809 +++ linux-2.6.22-690/include/net/web100_stats.h 2008-11-14 21:20:17.000000000 +0100
1812 + * include/net/web100_stats.h
1814 + * Copyright (C) 2001 Matt Mathis <mathis@psc.edu>
1815 + * Copyright (C) 2001 John Heffner <jheffner@psc.edu>
1816 + * Copyright (C) 2000 Jeff Semke <semke@psc.edu>
1818 + * The Web 100 project. See http://www.web100.org
1820 + * This program is free software; you can redistribute it and/or
1821 + * modify it under the terms of the GNU General Public License
1822 + * as published by the Free Software Foundation; either version
1823 + * 2 of the License, or (at your option) any later version.
1827 +/* TODO: make sure that the time duration states below include:
1828 + Congestion Avoidance, Slow Start, Timeouts, Idle Application, and
1829 + Window Limited cases */
1830 +/* TODO: Consider adding sysctl variable to enable/disable WC stats updates.
1831 + Probably should still create stats structures if compiled with WC support,
1832 + even if sysctl(wc) is turned off. That would allow the stats to be updated
1833 + if the sysctl(wc) is turned back on. */
1834 +/* TODO: Add all variables needed to do user-level auto-tuning, including
1835 + writeable parameters */
1838 +#ifndef _WEB100_STATS_H
1839 +#define _WEB100_STATS_H
1841 +enum wc_sndlim_states {
1842 + WC_SNDLIM_NONE = -1,
1846 + WC_SNDLIM_STARTUP,
1847 + WC_SNDLIM_NSTATES /* Keep at end */
1850 +#ifndef CONFIG_WEB100_STATS
1852 +#define WEB100_VAR_INC(tp,var) do {} while (0)
1853 +#define WEB100_VAR_DEC(tp,var) do {} while (0)
1854 +#define WEB100_VAR_SET(tp,var,val) do {} while (0)
1855 +#define WEB100_VAR_ADD(tp,var,val) do {} while (0)
1856 +#define WEB100_UPDATE_FUNC(tp,func) do {} while (0)
1857 +#define NET100_WAD(tp, var, def) (def)
1859 +#else /* CONFIG_WEB100_STATS */ /* { */
1861 +#include <linux/spinlock.h>
1863 +#define WEB100_CHECK(tp,expr) \
1864 + do { if ((tp)->tcp_stats) (expr); } while (0)
1865 +#define WEB100_VAR_INC(tp,var) \
1866 + WEB100_CHECK(tp, ((tp)->tcp_stats->wc_vars.var)++)
1867 +#define WEB100_VAR_DEC(tp,var) \
1868 + WEB100_CHECK(tp, ((tp)->tcp_stats->wc_vars.var)--)
1869 +#define WEB100_VAR_ADD(tp,var,val) \
1870 + WEB100_CHECK(tp, ((tp)->tcp_stats->wc_vars.var) += (val))
1871 +#define WEB100_VAR_SET(tp,var,val) \
1872 + WEB100_CHECK(tp, ((tp)->tcp_stats->wc_vars.var) = (val))
1873 +#define WEB100_UPDATE_FUNC(tp,func) \
1874 + WEB100_CHECK(tp, func)
1875 +#ifdef CONFIG_WEB100_NET100
1876 +#define NET100_WAD(tp, var, def) \
1877 + (((tp)->tcp_stats && (tp)->tcp_stats->wc_vars.var) ? (tp)->tcp_stats->wc_vars.var : (def))
1879 +#define NET100_WAD(tp, var, def) (def)
1882 +/* SMIv2 types - RFC 1902 */
1883 +typedef __s32 INTEGER;
1884 +typedef INTEGER Integer32;
1885 +typedef __u32 IpAddress;
1886 +typedef __u32 Counter32;
1887 +typedef __u32 Unsigned32;
1888 +typedef Unsigned32 Gauge32;
1889 +typedef __u32 TimeTicks;
1890 +typedef __u64 Counter64;
1891 +typedef __u16 Unsigned16;
1893 +/* New inet address types specified in INET-ADDRESS-MIB */
1894 +typedef Unsigned16 InetPortNumber;
1896 + WC_ADDRTYPE_UNKNOWN = 0,
1899 + WC_ADDRTYPE_DNS = 16
1901 +typedef IpAddress InetAddresIPv4;
1907 + InetAddresIPv4 v4addr;
1908 + InetAddresIPv6 v6addr;
1912 + truthValueTrue = 1,
1913 + truthValueFalse = 2
1917 + WC_STATE_CLOSED = 1,
1920 + WC_STATE_SYNRECEIVED,
1921 + WC_STATE_ESTABLISHED,
1922 + WC_STATE_FINWAIT1,
1923 + WC_STATE_FINWAIT2,
1924 + WC_STATE_CLOSEWAIT,
1927 + WC_STATE_TIMEWAIT,
1931 +enum wc_stunemodes {
1932 + WC_STUNEMODE_DEFAULT = 0, /* OS native */
1933 + WC_STUNEMODE_SETSOCKOPT, /* OS native setsockopt() */
1934 + WC_STUNEMODE_FIXED, /* Manual via the web100 API */
1935 + WC_STUNEMODE_AUTO,
1936 + WC_STUNEMODE_EXP1,
1940 +enum wc_rtunemodes {
1941 + WC_RTUNEMODE_DEFAULT = 0,
1942 + WC_RTUNEMODE_SETSOCKOPT,
1943 + WC_RTUNEMODE_FIXED,
1944 + WC_RTUNEMODE_AUTO,
1945 + WC_RTUNEMODE_EXP1,
1950 + WC_BUFMODE_OS = 0,
1951 + WC_BUFMODE_WEB100,
1955 + WC_SE_BELOW_DATA_WINDOW = 1,
1956 + WC_SE_ABOVE_DATA_WINDOW,
1957 + WC_SE_BELOW_ACK_WINDOW,
1958 + WC_SE_ABOVE_ACK_WINDOW,
1959 + WC_SE_BELOW_TSW_WINDOW,
1960 + WC_SE_ABOVE_TSW_WINDOW,
1961 + WC_SE_DATA_CHECKSUM
1966 + * Variables that can be read and written directly.
1968 + * Should contain most variables from TCP-KIS 0.1. Commented fields are
1969 + * either not implemented or have handlers and do not need struct storage.
1971 +struct web100directs {
1974 + TruthValue SACKEnabled;
1975 + TruthValue TimestampsEnabled;
1976 + TruthValue NagleEnabled;
1977 + TruthValue ECNEnabled;
1978 + Integer32 SndWinScale;
1979 + Integer32 RcvWinScale;
1982 + INTEGER ActiveOpen;
1983 + /* Gauge32 MSSSent; */
1985 + Integer32 WinScaleRcvd;
1986 + Integer32 WinScaleSent;
1987 + /* INTEGER SACKokSent; */
1988 + /* INTEGER SACKokRcvd; */
1989 + /* INTEGER TimestampSent; */
1990 + /* INTEGER TimestampRcvd; */
1993 + Counter32 PktsOut;
1994 + Counter32 DataPktsOut;
1995 + Counter32 AckPktsOut; /* DEPRECATED */
1996 + Counter64 DataBytesOut;
1998 + Counter32 DataPktsIn;
1999 + Counter32 AckPktsIn; /* DEPRECATED */
2000 + Counter64 DataBytesIn;
2001 + /* Counter32 SoftErrors; */
2002 + /* INTEGER SoftErrorReason; */
2004 + Unsigned32 SndNxt;
2006 + Counter64 ThruBytesAcked;
2007 + Counter32 SndISS; /* SndInitial */
2008 + Counter32 SendWraps; /* DEPRECATED */
2010 + Counter64 ThruBytesReceived;
2011 + Counter32 RecvISS; /* RecInitial */
2012 + Counter32 RecvWraps; /* DEPRECATED */
2013 + /* Counter64 Duration; */
2014 + Integer32 StartTime; /* DEPRECATED */
2015 + Integer32 StartTimeSec;
2016 + Integer32 StartTimeUsec;
2018 + /* SENDER CONGESTION */
2019 + Counter32 SndLimTrans[WC_SNDLIM_NSTATES];
2020 + Counter32 SndLimTime[WC_SNDLIM_NSTATES];
2021 + Counter64 SndLimBytes[WC_SNDLIM_NSTATES];
2022 + Counter32 SlowStart;
2023 + Counter32 CongAvoid;
2024 + Counter32 CongestionSignals;
2025 + Counter32 OtherReductions;
2026 + Counter32 X_OtherReductionsCV;
2027 + Counter32 X_OtherReductionsCM;
2028 + Counter32 CongestionOverCount;
2031 + /* Gauge32 LimCwnd; */
2032 + Gauge32 CurSsthresh;
2033 + Gauge32 MaxSsthresh;
2034 + Gauge32 MinSsthresh;
2036 + /* SENDER PATH MODEL */
2037 + Counter32 FastRetran;
2038 + Counter32 Timeouts;
2039 + Counter32 SubsequentTimeouts;
2040 + Gauge32 CurTimeoutCount;
2041 + Counter32 AbruptTimeouts;
2042 + Counter32 PktsRetrans;
2043 + Counter32 BytesRetrans;
2044 + Counter32 DupAcksIn;
2045 + Counter32 SACKsRcvd;
2046 + Counter32 SACKBlocksRcvd;
2047 + Counter32 PreCongSumCwnd;
2048 + Counter32 PreCongSumRTT;
2049 + Counter32 PreCongCountRTT; /* DEPRECATED */
2050 + Counter32 PostCongSumRTT;
2051 + Counter32 PostCongCountRTT;
2052 + /* Counter32 ECNsignals; */
2053 + Counter32 ECERcvd;
2054 + Counter32 SendStall;
2055 + Counter32 QuenchRcvd;
2056 + Gauge32 RetranThresh;
2057 + /* Counter32 SndDupAckEpisodes; */
2058 + /* Counter64 SumBytesReordered; */
2059 + Counter32 NonRecovDA;
2060 + Counter32 AckAfterFR;
2061 + Counter32 DSACKDups;
2062 + Gauge32 SampleRTT;
2063 + Gauge32 SmoothedRTT;
2068 + Counter32 CountRTT;
2076 + /* LOCAL SENDER BUFFER */
2077 + Gauge32 CurRetxQueue;
2078 + Gauge32 MaxRetxQueue;
2079 + Gauge32 CurAppWQueue;
2080 + Gauge32 MaxAppWQueue;
2082 + /* LOCAL RECEIVER */
2083 + Gauge32 CurRwinSent;
2084 + Gauge32 MaxRwinSent;
2085 + Gauge32 MinRwinSent;
2086 + Integer32 LimRwin;
2087 + /* Counter32 DupAckEpisodes; */
2088 + Counter32 DupAcksOut;
2089 + /* Counter32 CERcvd; */
2090 + /* Counter32 ECNSent; */
2091 + /* Counter32 ECNNonceRcvd; */
2092 + Gauge32 CurReasmQueue;
2093 + Gauge32 MaxReasmQueue;
2094 + Gauge32 CurAppRQueue;
2095 + Gauge32 MaxAppRQueue;
2096 + Gauge32 X_rcv_ssthresh;
2097 + Gauge32 X_wnd_clamp;
2103 + /* OBSERVED RECEIVER */
2104 + Gauge32 CurRwinRcvd;
2105 + Gauge32 MaxRwinRcvd;
2106 + Gauge32 MinRwinRcvd;
2108 + /* CONNECTION ID */
2109 + InetAddressType LocalAddressType;
2110 + InetAddress LocalAddress;
2111 + InetPortNumber LocalPort;
2112 + /* InetAddressType RemAddressType; */
2113 + InetAddress RemAddress;
2114 + InetPortNumber RemPort;
2115 + /* Integer32 IdId; */
2119 +#ifdef CONFIG_WEB100_NET100
2120 + /* support for the NET100 Work Around Daemon (WAD) */
2122 + Gauge32 WAD_MaxBurst;
2123 + Gauge32 WAD_MaxSsthresh;
2125 + Integer32 WAD_CwndAdjust;
2129 +struct web100stats {
2132 + struct sock *wc_sk;
2134 + atomic_t wc_users;
2137 + struct web100stats *wc_next;
2138 + struct web100stats *wc_prev;
2140 + struct web100stats *wc_hash_next;
2141 + struct web100stats *wc_hash_prev;
2143 + struct web100stats *wc_death_next;
2146 + __u64 wc_limstate_bytes;
2147 + __u64 wc_limstate_time;
2149 + __u64 wc_start_monotime;
2151 + struct web100directs wc_vars;
2154 +#endif /* CONFIG_WEB100_STATS */ /* } */
2156 +#endif /*_WEB100_STATS_H */
2157 diff -Nurp linux-2.6.22-680/net/ipv4/Kconfig linux-2.6.22-690/net/ipv4/Kconfig
2158 --- linux-2.6.22-680/net/ipv4/Kconfig 2008-11-12 17:40:46.000000000 +0100
2159 +++ linux-2.6.22-690/net/ipv4/Kconfig 2008-11-14 21:20:17.000000000 +0100
2160 @@ -658,6 +658,70 @@ config TCP_MD5SIG
2165 + bool "IP: Web100 networking enhancements"
2166 + depends on INET && EXPERIMENTAL
2170 +config WEB100_STATS
2171 + bool "Web100: Extended TCP statistics"
2174 + Support for the Web100 implementation of the TCP extended statistics
2175 + MIB (see http://www.web100.org/mib/).
2179 +config WEB100_FPERMS
2180 + int "Web100: Default file permissions"
2181 + depends on WEB100_STATS
2184 + This controls the default file permission bits on the Web100
2185 + files in /proc/web100. This value can be changed at runtime using
2186 + the sysctl variable net.ipv4.web100_fperms. Unless all users on
2187 + the system are trusted, it is safest to limit both readability
2188 + and writability to trusted users.
2190 + Due to limitations of the kernel config scripts, this is a decimal
2191 + value rather than octal. Some common values:
2193 + 384 = 0600 = rw-------
2194 + 416 = 0640 = rw-r-----
2195 + 432 = 0660 = rw-rw----
2196 + 436 = 0664 = rw-rw-r--
2197 + 438 = 0666 = rw-rw-rw-
2200 + int "Web100: Default gid"
2201 + depends on WEB100_STATS
2204 + This will be the default group of the Web100 files in /proc/web100.
2205 + It may be useful to create a "web100" group on your system, and set
2206 + CONFIG_WEB100_FPERMS (above) with special group permissions. This
2207 + value can be changed at runtime using the sysctl variable
2208 + net.ipv4.web100_gid.
2210 +config WEB100_NET100
2211 + bool "Web100: Net100 extensions"
2212 + depends on WEB100_STATS
2214 + Enables certain "Net100" extensions to TCP that are controlled by
2215 + writable MIB variables. These controls may be particularly useful
2216 + for specially tuning a flow on a long fast network.
2220 +config WEB100_NETLINK
2221 + bool "Web100: Netlink event notification service"
2224 + Required by the Net100 Work Around Daemon (WAD).
2228 source "net/ipv4/ipvs/Kconfig"
2231 diff -Nurp linux-2.6.22-680/net/ipv4/Makefile linux-2.6.22-690/net/ipv4/Makefile
2232 --- linux-2.6.22-680/net/ipv4/Makefile 2007-07-09 01:32:17.000000000 +0200
2233 +++ linux-2.6.22-690/net/ipv4/Makefile 2008-11-14 21:20:17.000000000 +0100
2234 @@ -29,6 +29,7 @@ obj-$(CONFIG_INET_TUNNEL) += tunnel4.o
2235 obj-$(CONFIG_INET_XFRM_MODE_TRANSPORT) += xfrm4_mode_transport.o
2236 obj-$(CONFIG_INET_XFRM_MODE_TUNNEL) += xfrm4_mode_tunnel.o
2237 obj-$(CONFIG_IP_PNP) += ipconfig.o
2238 +obj-$(CONFIG_WEB100_STATS) += web100_stats.o
2239 obj-$(CONFIG_IP_ROUTE_MULTIPATH_RR) += multipath_rr.o
2240 obj-$(CONFIG_IP_ROUTE_MULTIPATH_RANDOM) += multipath_random.o
2241 obj-$(CONFIG_IP_ROUTE_MULTIPATH_WRANDOM) += multipath_wrandom.o
2242 diff -Nurp linux-2.6.22-680/net/ipv4/sysctl_net_ipv4.c linux-2.6.22-690/net/ipv4/sysctl_net_ipv4.c
2243 --- linux-2.6.22-680/net/ipv4/sysctl_net_ipv4.c 2008-11-12 17:40:46.000000000 +0100
2244 +++ linux-2.6.22-690/net/ipv4/sysctl_net_ipv4.c 2008-11-14 21:20:17.000000000 +0100
2245 @@ -862,6 +862,42 @@ ctl_table ipv4_table[] = {
2247 .proc_handler = &proc_dointvec,
2249 +#ifdef CONFIG_WEB100_NET100
2251 + .ctl_name = NET_IPV4_WAD_IFQ,
2252 + .procname = "WAD_IFQ",
2253 + .data = &sysctl_WAD_IFQ,
2254 + .maxlen = sizeof(int),
2256 + .proc_handler = &proc_dointvec,
2259 + .ctl_name = NET_IPV4_WAD_MAX_BURST,
2260 + .procname = "WAD_MaxBurst",
2261 + .data = &sysctl_WAD_MaxBurst,
2262 + .maxlen = sizeof(int),
2264 + .proc_handler = &proc_dointvec,
2267 +#ifdef CONFIG_WEB100_STATS
2269 + .ctl_name = NET_IPV4_WEB100_FPERMS,
2270 + .procname = "web100_fperms",
2271 + .data = &sysctl_web100_fperms,
2272 + .maxlen = sizeof(int),
2274 + .proc_handler = &web100_proc_dointvec_update,
2277 + .ctl_name = NET_IPV4_WEB100_GID,
2278 + .procname = "web100_gid",
2279 + .data = &sysctl_web100_gid,
2280 + .maxlen = sizeof(int),
2282 + .proc_handler = &web100_proc_dointvec_update,
2288 diff -Nurp linux-2.6.22-680/net/ipv4/tcp.c linux-2.6.22-690/net/ipv4/tcp.c
2289 --- linux-2.6.22-680/net/ipv4/tcp.c 2008-11-12 17:40:30.000000000 +0100
2290 +++ linux-2.6.22-690/net/ipv4/tcp.c 2008-11-14 21:20:17.000000000 +0100
2291 @@ -285,6 +285,16 @@ EXPORT_SYMBOL(sysctl_tcp_mem);
2292 EXPORT_SYMBOL(sysctl_tcp_rmem);
2293 EXPORT_SYMBOL(sysctl_tcp_wmem);
2295 +#ifdef CONFIG_WEB100_NET100
2296 +int sysctl_WAD_IFQ = 0;
2297 +int sysctl_WAD_MaxBurst = 3;
2298 +EXPORT_SYMBOL(sysctl_WAD_MaxBurst);
2300 +#ifdef CONFIG_WEB100_STATS
2301 +int sysctl_web100_fperms = CONFIG_WEB100_FPERMS;
2302 +int sysctl_web100_gid = CONFIG_WEB100_GID;
2305 atomic_t tcp_memory_allocated; /* Current allocated memory. */
2306 atomic_t tcp_sockets_allocated; /* Current number of TCP sockets. */
2308 @@ -596,8 +606,12 @@ new_segment:
2310 set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
2314 tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
2315 +#ifdef CONFIG_WEB100_STATS
2316 + web100_update_writeq(sk);
2320 if ((err = sk_stream_wait_memory(sk, &timeo)) != 0)
2322 @@ -842,8 +856,12 @@ new_segment:
2324 set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
2328 tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);
2329 +#ifdef CONFIG_WEB100_STATS
2330 + web100_update_writeq(sk);
2334 if ((err = sk_stream_wait_memory(sk, &timeo)) != 0)
2336 @@ -1191,6 +1209,9 @@ int tcp_recvmsg(struct kiocb *iocb, stru
2337 BUG_TRAP(flags & MSG_PEEK);
2339 } while (skb != (struct sk_buff *)&sk->sk_receive_queue);
2340 +#ifdef CONFIG_WEB100_STATS
2341 + web100_update_recvq(sk);
2344 /* Well, if we have backlog, try to process it now yet. */
2346 @@ -1838,6 +1859,7 @@ static int do_tcp_setsockopt(struct sock
2348 tp->nonagle &= ~TCP_NAGLE_OFF;
2350 + WEB100_VAR_SET(tp, NagleEnabled, !tp->nonagle);
2354 @@ -1860,6 +1882,7 @@ static int do_tcp_setsockopt(struct sock
2355 tp->nonagle |= TCP_NAGLE_PUSH;
2356 tcp_push_pending_frames(sk);
2358 + WEB100_VAR_SET(tp, NagleEnabled, !tp->nonagle);
2362 @@ -2507,6 +2530,10 @@ void __init tcp_init(void)
2363 tcp_hashinfo.ehash_size, tcp_hashinfo.bhash_size);
2365 tcp_register_congestion_control(&tcp_reno);
2367 +#ifdef CONFIG_WEB100_STATS
2368 + web100_stats_init();
2372 EXPORT_SYMBOL(tcp_close);
2373 diff -Nurp linux-2.6.22-680/net/ipv4/tcp_cong.c linux-2.6.22-690/net/ipv4/tcp_cong.c
2374 --- linux-2.6.22-680/net/ipv4/tcp_cong.c 2007-07-09 01:32:17.000000000 +0200
2375 +++ linux-2.6.22-690/net/ipv4/tcp_cong.c 2008-11-14 21:20:17.000000000 +0100
2376 @@ -297,7 +297,8 @@ void tcp_slow_start(struct tcp_sock *tp)
2379 if (sysctl_tcp_max_ssthresh > 0 && tp->snd_cwnd > sysctl_tcp_max_ssthresh)
2380 - cnt = sysctl_tcp_max_ssthresh >> 1; /* limited slow start */
2381 + /* limited slow start */
2382 + cnt = NET100_WAD(tp, WAD_MaxSsthresh, sysctl_tcp_max_ssthresh) >> 1;
2384 cnt = tp->snd_cwnd; /* exponential increase */
2386 @@ -333,8 +334,10 @@ void tcp_reno_cong_avoid(struct sock *sk
2389 /* In "safe" area, increase. */
2390 - if (tp->snd_cwnd <= tp->snd_ssthresh)
2391 + if (tp->snd_cwnd <= tp->snd_ssthresh) {
2393 + WEB100_VAR_INC(tp, SlowStart);
2396 /* In dangerous area, increase slowly. */
2397 else if (sysctl_tcp_abc) {
2398 @@ -346,6 +349,7 @@ void tcp_reno_cong_avoid(struct sock *sk
2399 if (tp->snd_cwnd < tp->snd_cwnd_clamp)
2402 + WEB100_VAR_INC(tp, CongAvoid);
2404 /* In theory this is tp->snd_cwnd += 1 / tp->snd_cwnd */
2405 if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
2406 @@ -354,6 +358,7 @@ void tcp_reno_cong_avoid(struct sock *sk
2407 tp->snd_cwnd_cnt = 0;
2410 + WEB100_VAR_INC(tp, CongAvoid);
2413 EXPORT_SYMBOL_GPL(tcp_reno_cong_avoid);
2414 diff -Nurp linux-2.6.22-680/net/ipv4/tcp_input.c linux-2.6.22-690/net/ipv4/tcp_input.c
2415 --- linux-2.6.22-680/net/ipv4/tcp_input.c 2008-11-12 17:40:03.000000000 +0100
2416 +++ linux-2.6.22-690/net/ipv4/tcp_input.c 2008-11-14 21:20:17.000000000 +0100
2417 @@ -415,6 +415,7 @@ static void tcp_rcv_rtt_update(struct tc
2419 if (tp->rcv_rtt_est.rtt != new_sample)
2420 tp->rcv_rtt_est.rtt = new_sample;
2421 + WEB100_VAR_SET(tp, X_RcvRTT, ((1000000*tp->rcv_rtt_est.rtt)/HZ)>>3);
2424 static inline void tcp_rcv_rtt_measure(struct tcp_sock *tp)
2425 @@ -553,6 +554,7 @@ static void tcp_event_data_recv(struct s
2427 if (skb->len >= 128)
2428 tcp_grow_window(sk, skb);
2429 + WEB100_UPDATE_FUNC(tp, web100_update_rcv_nxt(tp));
2432 /* Called to compute a smoothed rtt estimate. The data fed to this
2433 @@ -790,6 +792,7 @@ void tcp_enter_cwr(struct sock *sk, cons
2435 tcp_set_ca_state(sk, TCP_CA_CWR);
2437 + WEB100_UPDATE_FUNC(tp, web100_update_congestion(tp, 0));
2440 /* Initialize metrics on socket. */
2441 @@ -815,6 +818,7 @@ static void tcp_init_metrics(struct sock
2442 tp->reordering != dst_metric(dst, RTAX_REORDERING)) {
2443 tp->rx_opt.sack_ok &= ~2;
2444 tp->reordering = dst_metric(dst, RTAX_REORDERING);
2445 + WEB100_VAR_SET(tp, RetranThresh, tp->reordering);
2448 if (dst_metric(dst, RTAX_RTT) == 0)
2449 @@ -871,6 +875,7 @@ static void tcp_update_reordering(struct
2450 struct tcp_sock *tp = tcp_sk(sk);
2451 if (metric > tp->reordering) {
2452 tp->reordering = min(TCP_MAX_REORDERING, metric);
2453 + WEB100_VAR_SET(tp, RetranThresh, tp->reordering);
2455 /* This exciting event is worth to be remembered. 8) */
2457 @@ -961,6 +966,9 @@ tcp_sacktag_write_queue(struct sock *sk,
2459 int first_sack_index;
2461 + WEB100_VAR_INC(tp, SACKsRcvd);
2462 + WEB100_VAR_ADD(tp, SACKBlocksRcvd, num_sacks);
2464 if (!tp->sacked_out)
2465 tp->fackets_out = 0;
2466 prior_fackets = tp->fackets_out;
2467 @@ -980,6 +988,9 @@ tcp_sacktag_write_queue(struct sock *sk,
2468 NET_INC_STATS_BH(LINUX_MIB_TCPDSACKOFORECV);
2471 + if (found_dup_sack)
2472 + WEB100_VAR_INC(tp, DSACKDups);
2474 /* D-SACK for already forgotten data...
2475 * Do dumb counting. */
2476 if (found_dup_sack &&
2477 @@ -1472,6 +1483,8 @@ void tcp_enter_loss(struct sock *sk, int
2478 struct sk_buff *skb;
2481 + WEB100_UPDATE_FUNC(tp, web100_update_congestion(tp, 0));
2483 /* Reduce ssthresh if it has not yet been made inside this window. */
2484 if (icsk->icsk_ca_state <= TCP_CA_Disorder || tp->snd_una == tp->high_seq ||
2485 (icsk->icsk_ca_state == TCP_CA_Loss && !icsk->icsk_retransmits)) {
2486 @@ -1511,6 +1524,7 @@ void tcp_enter_loss(struct sock *sk, int
2488 tp->reordering = min_t(unsigned int, tp->reordering,
2489 sysctl_tcp_reordering);
2490 + WEB100_VAR_SET(tp, RetranThresh, tp->reordering);
2491 tcp_set_ca_state(sk, TCP_CA_Loss);
2492 tp->high_seq = tp->snd_nxt;
2493 TCP_ECN_queue_cwr(tp);
2494 @@ -1845,8 +1859,19 @@ static void tcp_update_scoreboard(struct
2496 static inline void tcp_moderate_cwnd(struct tcp_sock *tp)
2498 +#ifdef CONFIG_WEB100_STATS
2500 + u32 t = tcp_packets_in_flight(tp) + tcp_max_burst(tp);
2501 + if (t < tp->snd_cwnd) {
2503 + WEB100_VAR_INC(tp, OtherReductions);
2504 + WEB100_VAR_INC(tp, X_OtherReductionsCM);
2508 tp->snd_cwnd = min(tp->snd_cwnd,
2509 tcp_packets_in_flight(tp)+tcp_max_burst(tp));
2511 tp->snd_cwnd_stamp = tcp_time_stamp;
2514 @@ -1929,6 +1954,7 @@ static void tcp_undo_cwr(struct sock *sk
2516 tcp_moderate_cwnd(tp);
2517 tp->snd_cwnd_stamp = tcp_time_stamp;
2518 + WEB100_VAR_INC(tp, CongestionOverCount); /* XXX This is wrong. -JWH */
2520 /* There is something screwy going on with the retrans hints after
2522 @@ -2271,6 +2297,8 @@ tcp_fastretrans_alert(struct sock *sk, u
2523 tp->bytes_acked = 0;
2524 tp->snd_cwnd_cnt = 0;
2525 tcp_set_ca_state(sk, TCP_CA_Recovery);
2526 + WEB100_UPDATE_FUNC(tp, web100_update_congestion(tp, 0));
2527 + WEB100_VAR_INC(tp, FastRetran); /* WEB100_XXX */
2530 if (do_lost || tcp_head_timedout(sk))
2531 @@ -2303,6 +2331,7 @@ static void tcp_ack_saw_tstamp(struct so
2532 const __u32 seq_rtt = tcp_time_stamp - tp->rx_opt.rcv_tsecr;
2533 tcp_rtt_estimator(sk, seq_rtt);
2535 + WEB100_UPDATE_FUNC(tp, web100_update_rtt(sk, seq_rtt));
2536 inet_csk(sk)->icsk_backoff = 0;
2539 @@ -2323,6 +2352,7 @@ static void tcp_ack_no_tstamp(struct soc
2541 tcp_rtt_estimator(sk, seq_rtt);
2543 + WEB100_UPDATE_FUNC(tcp_sk(sk), web100_update_rtt(sk, seq_rtt));
2544 inet_csk(sk)->icsk_backoff = 0;
2547 @@ -2342,6 +2372,27 @@ static void tcp_cong_avoid(struct sock *
2548 u32 in_flight, int good)
2550 const struct inet_connection_sock *icsk = inet_csk(sk);
2551 +#ifdef CONFIG_WEB100_STATS
2552 + struct tcp_sock *tp = tcp_sk(sk);
2553 + struct web100stats *stats = tp->tcp_stats;
2554 + struct web100directs *vars = &stats->wc_vars;
2556 + if (tp->snd_cwnd > tp->snd_cwnd_clamp) {
2561 +#ifdef CONFIG_WEB100_NET100
2562 + if (vars->WAD_NoAI) {
2563 + tp->snd_cwnd += vars->WAD_CwndAdjust;
2564 + vars->WAD_CwndAdjust = 0;
2565 + tp->snd_cwnd_stamp = tcp_time_stamp;
2566 + tp->snd_cwnd = min(tp->snd_cwnd, (__u32)tp->snd_cwnd_clamp);
2572 icsk->icsk_ca_ops->cong_avoid(sk, ack, rtt, in_flight, good);
2573 tcp_sk(sk)->snd_cwnd_stamp = tcp_time_stamp;
2575 @@ -2620,10 +2671,12 @@ static int tcp_ack_update_window(struct
2576 tp->max_window = nwin;
2577 tcp_sync_mss(sk, inet_csk(sk)->icsk_pmtu_cookie);
2579 + WEB100_UPDATE_FUNC(tp, web100_update_rwin_rcvd(tp));
2584 + WEB100_UPDATE_FUNC(tp, web100_update_snd_una(tp));
2588 @@ -2804,6 +2857,7 @@ static int tcp_ack(struct sock *sk, stru
2590 tcp_update_wl(tp, ack, ack_seq);
2592 + WEB100_UPDATE_FUNC(tp, web100_update_snd_una(tp));
2593 flag |= FLAG_WIN_UPDATE;
2595 tcp_ca_event(sk, CA_EVENT_FAST_ACK);
2596 @@ -2820,8 +2874,10 @@ static int tcp_ack(struct sock *sk, stru
2597 if (TCP_SKB_CB(skb)->sacked)
2598 flag |= tcp_sacktag_write_queue(sk, skb, prior_snd_una);
2600 - if (TCP_ECN_rcv_ecn_echo(tp, tcp_hdr(skb)))
2601 + if (TCP_ECN_rcv_ecn_echo(tp, tcp_hdr(skb))) {
2603 + WEB100_VAR_INC(tp, ECERcvd);
2606 tcp_ca_event(sk, CA_EVENT_SLOW_ACK);
2608 @@ -3240,6 +3296,8 @@ static void tcp_send_dupack(struct sock
2610 struct tcp_sock *tp = tcp_sk(sk);
2612 + WEB100_VAR_INC(tp, DupAcksOut);
2614 if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
2615 before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
2616 NET_INC_STATS_BH(LINUX_MIB_DELAYEDACKLOST);
2617 @@ -3499,6 +3557,10 @@ queue_and_out:
2619 tcp_fast_path_check(sk);
2621 +#ifdef CONFIG_WEB100_STATS
2622 + web100_update_recvq(sk);
2627 else if (!sock_flag(sk, SOCK_DEAD))
2628 @@ -3557,6 +3619,9 @@ drop:
2629 SOCK_DEBUG(sk, "out of order segment: rcv_next %X seq %X - %X\n",
2630 tp->rcv_nxt, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
2632 +#ifdef CONFIG_WEB100_STATS
2633 + web100_update_recvq(sk);
2635 sk_stream_set_owner_r(skb, sk);
2637 if (!skb_peek(&tp->out_of_order_queue)) {
2638 @@ -3851,6 +3916,8 @@ void tcp_cwnd_application_limited(struct
2639 if (win_used < tp->snd_cwnd) {
2640 tp->snd_ssthresh = tcp_current_ssthresh(sk);
2641 tp->snd_cwnd = (tp->snd_cwnd + win_used) >> 1;
2642 + WEB100_VAR_INC(tp, OtherReductions);
2643 + WEB100_VAR_INC(tp, X_OtherReductionsCV);
2645 tp->snd_cwnd_used = 0;
2647 @@ -4323,6 +4390,9 @@ int tcp_rcv_established(struct sock *sk,
2648 tp->rcv_nxt = TCP_SKB_CB(skb)->end_seq;
2651 +#ifdef CONFIG_WEB100_STATS
2652 + web100_update_recvq(sk);
2654 tcp_event_data_recv(sk, skb);
2656 if (TCP_SKB_CB(skb)->ack_seq != tp->snd_una) {
2657 @@ -4529,6 +4599,9 @@ static int tcp_rcv_synsent_state_process
2658 tp->copied_seq = tp->rcv_nxt;
2660 tcp_set_state(sk, TCP_ESTABLISHED);
2661 +#ifdef CONFIG_WEB100_STATS
2662 + web100_stats_establish(sk);
2665 security_inet_conn_established(sk, skb);
2667 @@ -4780,6 +4853,9 @@ int tcp_rcv_state_process(struct sock *s
2669 tcp_set_state(sk, TCP_ESTABLISHED);
2670 sk->sk_state_change(sk);
2671 +#ifdef CONFIG_WEB100_STATS
2672 + web100_stats_establish(sk);
2675 /* Note, that this wakeup is only for marginal
2676 * crossed SYN case. Passively open sockets
2677 diff -Nurp linux-2.6.22-680/net/ipv4/tcp_ipv4.c linux-2.6.22-690/net/ipv4/tcp_ipv4.c
2678 --- linux-2.6.22-680/net/ipv4/tcp_ipv4.c 2008-11-12 17:40:30.000000000 +0100
2679 +++ linux-2.6.22-690/net/ipv4/tcp_ipv4.c 2008-11-14 21:20:17.000000000 +0100
2680 @@ -266,6 +266,10 @@ int tcp_v4_connect(struct sock *sk, stru
2684 + WEB100_VAR_SET(tp, SndISS, tp->write_seq);
2685 + WEB100_VAR_SET(tp, SndMax, tp->write_seq);
2686 + WEB100_VAR_SET(tp, SndNxt, tp->write_seq);
2687 + WEB100_VAR_SET(tp, SndUna, tp->write_seq);
2689 inet->id = tp->write_seq ^ jiffies;
2691 @@ -399,6 +403,7 @@ void tcp_v4_err(struct sk_buff *skb, u32
2694 case ICMP_SOURCE_QUENCH:
2695 + WEB100_VAR_INC(tp, QuenchRcvd);
2696 /* Just silently ignore these. */
2698 case ICMP_PARAMETERPROB:
2699 @@ -1433,6 +1438,13 @@ struct sock *tcp_v4_syn_recv_sock(struct
2700 newsk = tcp_create_openreq_child(sk, req, skb);
2703 +#ifdef CONFIG_WEB100_STATS
2704 + if (web100_stats_create(newsk)) {
2708 + tcp_sk(newsk)->tcp_stats->wc_vars.LocalAddressType = WC_ADDRTYPE_IPV4;
2711 newsk->sk_gso_type = SKB_GSO_TCPV4;
2712 sk_setup_caps(newsk, dst);
2713 @@ -1675,6 +1687,7 @@ process:
2716 bh_lock_sock_nested(sk);
2717 + WEB100_UPDATE_FUNC(tcp_sk(sk), web100_update_segrecv(tcp_sk(sk), skb));
2719 if (!sock_owned_by_user(sk)) {
2720 #ifdef CONFIG_NET_DMA
2721 @@ -1691,6 +1704,7 @@ process:
2724 sk_add_backlog(sk, skb);
2725 + WEB100_UPDATE_FUNC(tcp_sk(sk), web100_update_cwnd(tcp_sk(sk)));
2729 @@ -1882,6 +1896,16 @@ static int tcp_v4_init_sock(struct sock
2730 sk->sk_sndbuf = sysctl_tcp_wmem[1];
2731 sk->sk_rcvbuf = sysctl_tcp_rmem[1];
2733 +#ifdef CONFIG_WEB100_STATS
2736 + if ((err = web100_stats_create(sk))) {
2739 + tcp_sk(sk)->tcp_stats->wc_vars.LocalAddressType = WC_ADDRTYPE_IPV4;
2743 atomic_inc(&tcp_sockets_allocated);
2746 @@ -1922,6 +1946,10 @@ int tcp_v4_destroy_sock(struct sock *sk)
2747 if (inet_csk(sk)->icsk_bind_hash)
2748 inet_put_port(&tcp_hashinfo, sk);
2750 +#ifdef CONFIG_WEB100_STATS
2751 + web100_stats_destroy(tcp_sk(sk)->tcp_stats);
2755 * If sendmsg cached page exists, toss it.
2757 diff -Nurp linux-2.6.22-680/net/ipv4/tcp_minisocks.c linux-2.6.22-690/net/ipv4/tcp_minisocks.c
2758 --- linux-2.6.22-680/net/ipv4/tcp_minisocks.c 2008-11-12 17:40:30.000000000 +0100
2759 +++ linux-2.6.22-690/net/ipv4/tcp_minisocks.c 2008-11-14 21:20:17.000000000 +0100
2760 @@ -336,6 +336,8 @@ void tcp_time_wait(struct sock *sk, int
2764 + WEB100_VAR_SET(tp, State, WC_STATE_TIMEWAIT);
2766 /* Linkage updates. */
2767 __inet_twsk_hashdance(tw, sk, &tcp_hashinfo);
2769 diff -Nurp linux-2.6.22-680/net/ipv4/tcp_output.c linux-2.6.22-690/net/ipv4/tcp_output.c
2770 --- linux-2.6.22-680/net/ipv4/tcp_output.c 2008-11-12 17:40:03.000000000 +0100
2771 +++ linux-2.6.22-690/net/ipv4/tcp_output.c 2008-11-14 21:20:17.000000000 +0100
2772 @@ -67,6 +67,7 @@ static void update_send_head(struct sock
2774 tcp_advance_send_head(sk, skb);
2775 tp->snd_nxt = TCP_SKB_CB(skb)->end_seq;
2776 + WEB100_UPDATE_FUNC(tp, web100_update_snd_nxt(tp));
2777 tcp_packets_out_inc(sk, skb);
2780 @@ -250,6 +251,7 @@ static u16 tcp_select_window(struct sock
2782 tp->rcv_wnd = new_win;
2783 tp->rcv_wup = tp->rcv_nxt;
2784 + WEB100_UPDATE_FUNC(tp, web100_update_rwin_sent(tp));
2786 /* Make sure we do not exceed the maximum possible
2788 @@ -544,11 +546,32 @@ static int tcp_transmit_skb(struct sock
2789 if (after(tcb->end_seq, tp->snd_nxt) || tcb->seq == tcb->end_seq)
2790 TCP_INC_STATS(TCP_MIB_OUTSEGS);
2792 +#ifdef CONFIG_WEB100_STATS
2794 + /* If the skb isn't cloned, we can't reference it after
2795 + * calling queue_xmit, so copy everything we need here. */
2796 + int len = skb->len;
2797 + int pcount = tcp_skb_pcount(skb);
2798 + __u32 seq = TCP_SKB_CB(skb)->seq;
2799 + __u32 end_seq = TCP_SKB_CB(skb)->end_seq;
2800 + int flags = TCP_SKB_CB(skb)->flags;
2802 err = icsk->icsk_af_ops->queue_xmit(skb, 0);
2803 + if (likely(err == 0))
2804 + WEB100_UPDATE_FUNC(tp, web100_update_segsend(sk, len, pcount,
2805 + seq, end_seq, flags));
2808 + err = icsk->icsk_af_ops->queue_xmit(skb, 0);
2810 if (likely(err <= 0))
2813 +#ifdef CONFIG_WEB100_NET100
2814 + if (!NET100_WAD(tp, WAD_IFQ, sysctl_WAD_IFQ))
2816 tcp_enter_cwr(sk, 1);
2817 + WEB100_VAR_INC(tp, SendStall);
2819 return net_xmit_eval(err);
2821 @@ -868,6 +891,7 @@ unsigned int tcp_sync_mss(struct sock *s
2822 if (icsk->icsk_mtup.enabled)
2823 mss_now = min(mss_now, tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_low));
2824 tp->mss_cache = mss_now;
2825 + WEB100_UPDATE_FUNC(tp, web100_update_mss(tp));
2829 @@ -1062,21 +1086,22 @@ static inline int tcp_snd_wnd_test(struc
2830 * should be put on the wire right now. If so, it returns the number of
2831 * packets allowed by the congestion window.
2833 -static unsigned int tcp_snd_test(struct sock *sk, struct sk_buff *skb,
2834 +static int tcp_snd_wait(struct sock *sk, struct sk_buff *skb,
2835 unsigned int cur_mss, int nonagle)
2837 struct tcp_sock *tp = tcp_sk(sk);
2838 - unsigned int cwnd_quota;
2841 tcp_init_tso_segs(sk, skb, cur_mss);
2843 if (!tcp_nagle_test(tp, skb, cur_mss, nonagle))
2845 + return -WC_SNDLIM_SENDER;
2847 cwnd_quota = tcp_cwnd_test(tp, skb);
2849 - !tcp_snd_wnd_test(tp, skb, cur_mss))
2852 + return -WC_SNDLIM_CWND;
2853 + if (!tcp_snd_wnd_test(tp, skb, cur_mss))
2854 + return -WC_SNDLIM_RWIN;
2858 @@ -1087,10 +1112,10 @@ int tcp_may_send_now(struct sock *sk)
2859 struct sk_buff *skb = tcp_send_head(sk);
2862 - tcp_snd_test(sk, skb, tcp_current_mss(sk, 1),
2863 + tcp_snd_wait(sk, skb, tcp_current_mss(sk, 1),
2864 (tcp_skb_is_last(sk, skb) ?
2867 + tp->nonagle)) > 0);
2870 /* Trim TSO SKB to LEN bytes, put the remaining data into a new packet
2871 @@ -1357,6 +1382,7 @@ static int tcp_write_xmit(struct sock *s
2872 unsigned int tso_segs, sent_pkts;
2875 + int why = WC_SNDLIM_NONE;
2877 /* If we are closed, the bytes will have to remain here.
2878 * In time closedown will finish, we empty the write queue and all
2879 @@ -1381,21 +1407,30 @@ static int tcp_write_xmit(struct sock *s
2882 cwnd_quota = tcp_cwnd_test(tp, skb);
2884 + if (!cwnd_quota) {
2885 + why = WC_SNDLIM_CWND;
2889 - if (unlikely(!tcp_snd_wnd_test(tp, skb, mss_now)))
2890 + if (unlikely(!tcp_snd_wnd_test(tp, skb, mss_now))) {
2891 + why = WC_SNDLIM_RWIN;
2895 if (tso_segs == 1) {
2896 if (unlikely(!tcp_nagle_test(tp, skb, mss_now,
2897 (tcp_skb_is_last(sk, skb) ?
2898 - nonagle : TCP_NAGLE_PUSH))))
2899 + nonagle : TCP_NAGLE_PUSH)))) {
2900 + why = WC_SNDLIM_SENDER;
2904 - if (tcp_tso_should_defer(sk, skb))
2905 + if (tcp_tso_should_defer(sk, skb)) {
2906 + /* XXX: is this sender or cwnd? */
2907 + why = WC_SNDLIM_SENDER;
2914 @@ -1416,8 +1451,10 @@ static int tcp_write_xmit(struct sock *s
2916 TCP_SKB_CB(skb)->when = tcp_time_stamp;
2918 - if (unlikely(tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC)))
2919 + if (unlikely(tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC))) {
2920 + why = WC_SNDLIM_SENDER;
2924 /* Advance the send_head. This one is sent out.
2925 * This call will increment packets_out.
2926 @@ -1427,6 +1464,9 @@ static int tcp_write_xmit(struct sock *s
2927 tcp_minshall_update(tp, mss_now, skb);
2930 + if (why == WC_SNDLIM_NONE)
2931 + why = WC_SNDLIM_SENDER;
2932 + WEB100_UPDATE_FUNC(tp, web100_update_sndlim(tp, why));
2934 if (likely(sent_pkts)) {
2935 tcp_cwnd_validate(sk);
2936 @@ -1457,14 +1497,15 @@ void tcp_push_one(struct sock *sk, unsig
2938 struct tcp_sock *tp = tcp_sk(sk);
2939 struct sk_buff *skb = tcp_send_head(sk);
2940 - unsigned int tso_segs, cwnd_quota;
2941 + unsigned int tso_segs;
2944 BUG_ON(!skb || skb->len < mss_now);
2946 tso_segs = tcp_init_tso_segs(sk, skb, mss_now);
2947 - cwnd_quota = tcp_snd_test(sk, skb, mss_now, TCP_NAGLE_PUSH);
2948 + cwnd_quota = tcp_snd_wait(sk, skb, mss_now, TCP_NAGLE_PUSH);
2950 - if (likely(cwnd_quota)) {
2951 + if (likely(cwnd_quota > 0)) {
2955 @@ -1483,8 +1524,10 @@ void tcp_push_one(struct sock *sk, unsig
2958 if (skb->len > limit &&
2959 - unlikely(tso_fragment(sk, skb, limit, mss_now)))
2960 + unlikely(tso_fragment(sk, skb, limit, mss_now))) {
2961 + WEB100_UPDATE_FUNC(tp, web100_update_sndlim(tp, WC_SNDLIM_SENDER));
2965 /* Send it out now. */
2966 TCP_SKB_CB(skb)->when = tcp_time_stamp;
2967 @@ -1493,7 +1536,11 @@ void tcp_push_one(struct sock *sk, unsig
2968 update_send_head(sk, skb);
2969 tcp_cwnd_validate(sk);
2972 + WEB100_UPDATE_FUNC(tp, web100_update_sndlim(tp, WC_SNDLIM_SENDER));
2975 + WEB100_UPDATE_FUNC(tp, web100_update_sndlim(tp, -cwnd_quota));
2979 @@ -1610,6 +1657,9 @@ u32 __tcp_select_window(struct sock *sk)
2980 window = free_space;
2983 + WEB100_VAR_SET(tp, X_dbg3, free_space);
2984 + WEB100_VAR_SET(tp, X_dbg2, mss);
2985 + WEB100_VAR_SET(tp, X_dbg1, window);
2989 @@ -2248,6 +2298,7 @@ static void tcp_connect_init(struct sock
2991 tcp_init_wl(tp, tp->write_seq, 0);
2992 tp->snd_una = tp->write_seq;
2993 + WEB100_VAR_SET(tp, SndUna, tp->snd_una);
2994 tp->snd_sml = tp->write_seq;
2997 @@ -2299,6 +2350,7 @@ int tcp_connect(struct sock *sk)
2998 * in order to make this packet get counted in tcpOutSegs.
3000 tp->snd_nxt = tp->write_seq;
3001 + WEB100_UPDATE_FUNC(tp, web100_update_snd_nxt(tp));
3002 tp->pushed_seq = tp->write_seq;
3003 TCP_INC_STATS(TCP_MIB_ACTIVEOPENS);
3005 diff -Nurp linux-2.6.22-680/net/ipv4/tcp_timer.c linux-2.6.22-690/net/ipv4/tcp_timer.c
3006 --- linux-2.6.22-680/net/ipv4/tcp_timer.c 2007-07-09 01:32:17.000000000 +0200
3007 +++ linux-2.6.22-690/net/ipv4/tcp_timer.c 2008-11-14 21:20:17.000000000 +0100
3008 @@ -332,6 +332,7 @@ static void tcp_retransmit_timer(struct
3009 NET_INC_STATS_BH(LINUX_MIB_TCPTIMEOUTS);
3012 + WEB100_UPDATE_FUNC(tp, web100_update_timeout(sk));
3014 if (tcp_use_frto(sk)) {
3016 @@ -367,6 +368,7 @@ static void tcp_retransmit_timer(struct
3017 * the 120 second clamps though!
3019 icsk->icsk_backoff++;
3020 + WEB100_VAR_SET(tcp_sk(sk), CurTimeoutCount, icsk->icsk_backoff);
3021 icsk->icsk_retransmits++;
3024 diff -Nurp linux-2.6.22-680/net/ipv4/web100_stats.c linux-2.6.22-690/net/ipv4/web100_stats.c
3025 --- linux-2.6.22-680/net/ipv4/web100_stats.c 1970-01-01 01:00:00.000000000 +0100
3026 +++ linux-2.6.22-690/net/ipv4/web100_stats.c 2008-11-14 21:20:17.000000000 +0100
3029 + * net/ipv4/web100_stats.c
3031 + * Copyright (C) 2001 Matt Mathis <mathis@psc.edu>
3032 + * Copyright (C) 2001 John Heffner <jheffner@psc.edu>
3033 + * Copyright (C) 2000 Jeffrey Semke <semke@psc.edu>
3035 + * The Web 100 project. See http://www.web100.org
3037 + * Functions for creating, destroying, and updating the Web100
3038 + * statistics structure.
3040 + * This program is free software; you can redistribute it and/or
3041 + * modify it under the terms of the GNU General Public License
3042 + * as published by the Free Software Foundation; either version
3043 + * 2 of the License, or (at your option) any later version.
3047 +#include <linux/types.h>
3048 +#include <linux/socket.h>
3049 +#include <net/web100.h>
3050 +#include <net/tcp.h>
3051 +#include <linux/string.h>
3052 +#include <linux/proc_fs.h>
3053 +#include <asm/atomic.h>
3055 +#define WC_INF32 0xffffffff
3057 +#define WC_DEATH_SLOTS 8
3058 +#define WC_PERSIST_TIME 60
3060 +/* BEWARE: The release process updates the version string */
3061 +char *web100_version_string = "2.5.17 200710051837"
3062 +#ifdef CONFIG_WEB100_NET100
3067 +static void death_cleanup(unsigned long dummy);
3069 +/* Global stats reader-writer lock */
3070 +rwlock_t web100_linkage_lock = RW_LOCK_UNLOCKED;
3072 +/* Data structures for tying together stats */
3073 +static int web100stats_next_cid;
3074 +static int web100stats_conn_num;
3075 +static int web100stats_htsize;
3076 +struct web100stats **web100stats_ht;
3077 +struct web100stats *web100stats_first = NULL;
3079 +static struct web100stats *death_slots[WC_DEATH_SLOTS];
3080 +static int cur_death_slot;
3081 +static spinlock_t death_lock = SPIN_LOCK_UNLOCKED;
3082 +static struct timer_list stats_persist_timer = TIMER_INITIALIZER(death_cleanup, 0, 0);
3083 +static int ndeaths;
3085 +#ifdef CONFIG_WEB100_NETLINK
3086 +static struct sock *web100_nlsock;
3089 +extern struct proc_dir_entry *proc_web100_dir;
3093 + * Structural maintenance
3096 +static inline int web100stats_hash(int cid)
3098 + return cid % web100stats_htsize;
3101 +struct web100stats *web100stats_lookup(int cid)
3103 + struct web100stats *stats;
3105 + /* Let's ensure safety here. It's not too expensive and may change. */
3106 + if (cid < 0 || cid >= WEB100_MAX_CONNS)
3109 + stats = web100stats_ht[web100stats_hash(cid)];
3110 + while (stats && stats->wc_cid != cid)
3111 + stats = stats->wc_hash_next;
3115 +/* This will get really slow as the cid space fills. This can be done
3116 + * better, but it's just not worth it right now.
3117 + * The caller must hold the lock.
3119 +static int get_next_cid(void)
3123 + if (web100stats_conn_num >= WEB100_MAX_CONNS)
3126 + i = web100stats_next_cid;
3128 + if (web100stats_lookup(i) == NULL)
3130 + i = (i + 1) % WEB100_MAX_CONNS;
3131 + } while (i != web100stats_next_cid);
3132 + web100stats_next_cid = (i + 1) % WEB100_MAX_CONNS;
3137 +static void stats_link(struct web100stats *stats)
3141 + write_lock_bh(&web100_linkage_lock);
3143 + if ((stats->wc_cid = get_next_cid()) < 0) {
3144 + write_unlock_bh(&web100_linkage_lock);
3148 + hash = web100stats_hash(stats->wc_cid);
3149 + stats->wc_hash_next = web100stats_ht[hash];
3150 + stats->wc_hash_prev = NULL;
3151 + if (web100stats_ht[hash])
3152 + web100stats_ht[hash]->wc_hash_prev = stats;
3153 + web100stats_ht[hash] = stats;
3155 + stats->wc_next = web100stats_first;
3156 + stats->wc_prev = NULL;
3157 + if (web100stats_first)
3158 + web100stats_first->wc_prev = stats;
3159 + web100stats_first = stats;
3161 + web100stats_conn_num++;
3162 + proc_web100_dir->nlink = web100stats_conn_num + 2;
3164 + write_unlock_bh(&web100_linkage_lock);
3167 +static void stats_unlink(struct web100stats *stats)
3171 + write_lock_bh(&web100_linkage_lock);
3173 + hash = web100stats_hash(stats->wc_cid);
3174 + if (stats->wc_hash_next)
3175 + stats->wc_hash_next->wc_hash_prev = stats->wc_hash_prev;
3176 + if (stats->wc_hash_prev)
3177 + stats->wc_hash_prev->wc_hash_next = stats->wc_hash_next;
3178 + if (stats == web100stats_ht[hash])
3179 + web100stats_ht[hash] = stats->wc_hash_next ?
3180 + stats->wc_hash_next :
3181 + stats->wc_hash_prev;
3183 + if (stats->wc_next)
3184 + stats->wc_next->wc_prev = stats->wc_prev;
3185 + if (stats->wc_prev)
3186 + stats->wc_prev->wc_next = stats->wc_next;
3187 + if (stats == web100stats_first)
3188 + web100stats_first = stats->wc_next ? stats->wc_next :
3191 + web100stats_conn_num--;
3192 + proc_web100_dir->nlink = web100stats_conn_num + 2;
3194 + write_unlock_bh(&web100_linkage_lock);
3197 +static void stats_persist(struct web100stats *stats)
3199 + spin_lock_bh(&death_lock);
3201 + stats->wc_death_next = death_slots[cur_death_slot];
3202 + death_slots[cur_death_slot] = stats;
3203 + if (ndeaths <= 0) {
3204 + stats_persist_timer.expires = jiffies + WC_PERSIST_TIME * HZ / WC_DEATH_SLOTS;
3205 + add_timer(&stats_persist_timer);
3209 + spin_unlock_bh(&death_lock);
3212 +static void death_cleanup(unsigned long dummy)
3214 + struct web100stats *stats, *next;
3216 + spin_lock_bh(&death_lock);
3218 + cur_death_slot = (cur_death_slot + 1) % WC_DEATH_SLOTS;
3219 + stats = death_slots[cur_death_slot];
3221 + stats->wc_dead = 1;
3223 + next = stats->wc_death_next;
3224 + web100_stats_unuse(stats);
3227 + death_slots[cur_death_slot] = NULL;
3229 + if (ndeaths > 0) {
3230 + stats_persist_timer.expires = jiffies + WC_PERSIST_TIME * HZ / WC_DEATH_SLOTS;
3231 + add_timer(&stats_persist_timer);
3234 + spin_unlock_bh(&death_lock);
3238 +/* Tom Dunigan's (slightly modified) netlink code. Notifies listening apps
3239 + * of Web100 events.
3241 + * NOTE: we are currently squatting on netlink family 10 (NETLINK_WEB100) in
3242 + * include/linux/netlink.h
3245 +#ifdef CONFIG_WEB100_NETLINK
3246 +void web100_netlink_event(int type, int cid)
3248 + struct web100_netlink_msg *msg;
3249 + struct sk_buff *tmpskb;
3251 + if (web100_nlsock == NULL)
3254 + if ((tmpskb = alloc_skb((sizeof (struct web100_netlink_msg)), GFP_ATOMIC)) == NULL) {
3255 + printk(KERN_INFO "web100_netlink_event: alloc_skb failure\n");
3259 + skb_put(tmpskb, sizeof (struct web100_netlink_msg));
3260 + msg = (struct web100_netlink_msg *)tmpskb->data;
3263 + netlink_broadcast(web100_nlsock, tmpskb, 0, ~0, GFP_ATOMIC);
3265 +#endif /* CONFIG_WEB100_NETLINK */
3267 +extern __u32 sysctl_wmem_default;
3268 +extern __u32 sysctl_rmem_default;
3270 +/* Called whenever a TCP/IPv4 sock is created.
3271 + * net/ipv4/tcp_ipv4.c: tcp_v4_syn_recv_sock,
3272 + * tcp_v4_init_sock
3273 + * Allocates a stats structure and initializes values.
3275 +int web100_stats_create(struct sock *sk)
3277 + struct web100stats *stats;
3278 + struct web100directs *vars;
3279 + struct tcp_sock *tp = tcp_sk(sk);
3280 + struct timeval tv;
3282 + if ((stats = kmalloc(sizeof (struct web100stats), gfp_any())) == NULL)
3284 + tp->tcp_stats = stats;
3285 + vars = &stats->wc_vars;
3287 + memset(stats, 0, sizeof (struct web100stats));
3289 + stats->wc_cid = -1;
3290 + stats->wc_sk = sk;
3291 + atomic_set(&stats->wc_users, 0);
3293 + stats->wc_limstate = WC_SNDLIM_STARTUP;
3294 + stats->wc_limstate_time = web100_mono_time();
3296 + vars->NagleEnabled = !(tp->nonagle);
3297 + vars->ActiveOpen = !in_interrupt();
3299 + vars->SndUna = tp->snd_una;
3300 + vars->SndNxt = tp->snd_nxt;
3301 + vars->SndMax = tp->snd_nxt;
3302 + vars->SndISS = tp->snd_nxt;
3304 + do_gettimeofday(&tv);
3305 + vars->StartTime = tv.tv_sec * 10 + tv.tv_usec / 100000;
3306 + vars->StartTimeSec = tv.tv_sec;
3307 + vars->StartTimeUsec = tv.tv_usec;
3308 + stats->wc_start_monotime = web100_mono_time();
3310 + vars->MinRTT = vars->MinRTO = vars->MinMSS = vars->MinRwinRcvd =
3311 + vars->MinRwinSent = vars->MinSsthresh = WC_INF32;
3313 + vars->LimRwin = tp->window_clamp;
3316 + web100_stats_use(stats);
3321 +void web100_stats_destroy(struct web100stats *stats)
3323 + /* Attribute final sndlim time. */
3324 + web100_update_sndlim(tcp_sk(stats->wc_sk), stats->wc_limstate);
3326 + if (stats->wc_cid >= 0) {
3327 +#ifdef CONFIG_WEB100_NETLINK
3328 + web100_netlink_event(WC_NL_TYPE_DISCONNECT, stats->wc_cid);
3330 + stats_persist(stats);
3332 + web100_stats_unuse(stats);
3336 +/* Do not call directly. Called from web100_stats_unuse(). */
3337 +void web100_stats_free(struct web100stats *stats)
3339 + if (stats->wc_cid >= 0) {
3340 + stats_unlink(stats);
3342 + sock_put(stats->wc_sk);
3346 +extern __u32 sysctl_wmem_default;
3347 +extern __u32 sysctl_rmem_default;
3349 +/* Called when a connection enters the ESTABLISHED state, and has all its
3350 + * state initialized.
3351 + * net/ipv4/tcp_input.c: tcp_rcv_state_process,
3352 + * tcp_rcv_synsent_state_process
3353 + * Here we link the statistics structure in so it is visible in the /proc
3354 + * fs, and do some final init.
3356 +void web100_stats_establish(struct sock *sk)
3358 + struct inet_sock *inet = inet_sk(sk);
3359 + struct tcp_sock *tp = tcp_sk(sk);
3360 + struct web100stats *stats = tp->tcp_stats;
3361 + struct web100directs *vars = &stats->wc_vars;
3363 + if (stats == NULL)
3366 + /* Let's set these here, since they can't change once the
3367 + * connection is established.
3369 + vars->LocalPort = inet->num;
3370 + vars->RemPort = ntohs(inet->dport);
3372 + if (vars->LocalAddressType == WC_ADDRTYPE_IPV4) {
3373 + vars->LocalAddress.v4addr = inet->rcv_saddr;
3374 + vars->RemAddress.v4addr = inet->daddr;
3376 +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
3377 + else if (vars->LocalAddressType == WC_ADDRTYPE_IPV6) {
3378 + memcpy(&vars->LocalAddress.v6addr.addr, &(inet6_sk(sk)->saddr), 16);
3379 + memcpy(&vars->RemAddress.v6addr.addr, &(inet6_sk(sk)->daddr), 16);
3383 + printk(KERN_ERR "Web100: LocalAddressType not valid.\n");
3385 + vars->LocalAddress.v6addr.type = vars->RemAddress.v6addr.type = vars->LocalAddressType;
3387 + vars->SACKEnabled = tp->rx_opt.sack_ok;
3388 + vars->TimestampsEnabled = tp->rx_opt.tstamp_ok;
3389 +#ifdef CONFIG_INET_ECN
3390 + vars->ECNEnabled = tp->ecn_flags & TCP_ECN_OK;
3393 + if (tp->rx_opt.wscale_ok) {
3394 + vars->WinScaleRcvd = tp->rx_opt.snd_wscale;
3395 + vars->WinScaleSent = tp->rx_opt.rcv_wscale;
3397 + vars->WinScaleRcvd = -1;
3398 + vars->WinScaleSent = -1;
3400 + vars->SndWinScale = vars->WinScaleRcvd;
3401 + vars->RcvWinScale = vars->WinScaleSent;
3403 + vars->CurCwnd = tp->snd_cwnd * tp->mss_cache;
3404 + vars->CurSsthresh = tp->snd_ssthresh * tp->mss_cache;
3405 + web100_update_cwnd(tp);
3406 + web100_update_rwin_rcvd(tp);
3407 + web100_update_rwin_sent(tp);
3409 + vars->RecvISS = vars->RcvNxt = tp->rcv_nxt;
3411 + vars->RetranThresh = tp->reordering;
3413 + vars->LimRwin = min_t(__u32, vars->LimRwin, 65535U << tp->rx_opt.rcv_wscale);
3415 + stats_link(stats);
3417 + web100_update_sndlim(tp, WC_SNDLIM_SENDER);
3419 +#ifdef CONFIG_WEB100_NETLINK
3420 + web100_netlink_event(WC_NL_TYPE_CONNECT, stats->wc_cid);
3425 + * Statistics update functions
3428 +void web100_update_snd_nxt(struct tcp_sock *tp)
3430 + struct web100stats *stats = tp->tcp_stats;
3432 + if (after(tp->snd_nxt, stats->wc_vars.SndMax)) {
3433 + if (before(stats->wc_vars.SndMax, stats->wc_vars.SndISS) &&
3434 + after(tp->snd_nxt, stats->wc_vars.SndISS))
3435 + stats->wc_vars.SendWraps++;
3436 + stats->wc_vars.SndMax = tp->snd_nxt;
3438 + stats->wc_vars.SndNxt = tp->snd_nxt;
3441 +void web100_update_snd_una(struct tcp_sock *tp)
3443 + struct web100stats *stats = tp->tcp_stats;
3445 + stats->wc_vars.ThruBytesAcked += (__u32)(tp->snd_una - stats->wc_vars.SndUna);
3446 + stats->wc_vars.SndUna = tp->snd_una;
3449 +void web100_update_rtt(struct sock *sk, unsigned long rtt_sample)
3451 + struct web100stats *stats = tcp_sk(sk)->tcp_stats;
3452 + unsigned long rtt_sample_msec = rtt_sample * 1000 / HZ;
3455 + stats->wc_vars.SampleRTT = rtt_sample_msec;
3457 + if (rtt_sample_msec > stats->wc_vars.MaxRTT)
3458 + stats->wc_vars.MaxRTT = rtt_sample_msec;
3459 + if (rtt_sample_msec < stats->wc_vars.MinRTT)
3460 + stats->wc_vars.MinRTT = rtt_sample_msec;
3462 + stats->wc_vars.CountRTT++;
3463 + stats->wc_vars.SumRTT += rtt_sample_msec;
3465 + if (stats->wc_vars.PreCongCountRTT != stats->wc_vars.PostCongCountRTT) {
3466 + stats->wc_vars.PostCongCountRTT++;
3467 + stats->wc_vars.PostCongSumRTT += rtt_sample_msec;
3470 + /* srtt is stored as 8 * the smoothed estimate */
3471 + stats->wc_vars.SmoothedRTT =
3472 + (tcp_sk(sk)->srtt >> 3) * 1000 / HZ;
3474 + rto = inet_csk(sk)->icsk_rto * 1000 / HZ;
3475 + if (rto > stats->wc_vars.MaxRTO)
3476 + stats->wc_vars.MaxRTO = rto;
3477 + if (rto < stats->wc_vars.MinRTO)
3478 + stats->wc_vars.MinRTO = rto;
3479 + stats->wc_vars.CurRTO = rto;
3481 + stats->wc_vars.CurTimeoutCount = 0;
3483 + stats->wc_vars.RTTVar = (tcp_sk(sk)->rttvar >> 2) * 1000 / HZ;
3486 +void web100_update_timeout(struct sock *sk) {
3487 + struct web100stats *stats = tcp_sk(sk)->tcp_stats;
3489 + stats->wc_vars.CurTimeoutCount++;
3490 + if (inet_csk(sk)->icsk_backoff)
3491 + stats->wc_vars.SubsequentTimeouts++;
3493 + stats->wc_vars.Timeouts++;
3494 + if (inet_csk(sk)->icsk_ca_state == TCP_CA_Open)
3495 + stats->wc_vars.AbruptTimeouts++;
3498 +void web100_update_mss(struct tcp_sock *tp)
3500 + struct web100stats *stats = tp->tcp_stats;
3501 + int mss = tp->mss_cache;
3503 + stats->wc_vars.CurMSS = mss;
3504 + if (mss > stats->wc_vars.MaxMSS)
3505 + stats->wc_vars.MaxMSS = mss;
3506 + if (mss < stats->wc_vars.MinMSS)
3507 + stats->wc_vars.MinMSS = mss;
3510 +void web100_update_cwnd(struct tcp_sock *tp)
3512 + struct web100stats *stats = tp->tcp_stats;
3513 + __u16 mss = tp->mss_cache;
3518 + printk("Web100: web100_update_cwnd: mss == 0\n");
3522 + cwnd = min(WC_INF32 / mss, tp->snd_cwnd) * mss;
3523 + stats->wc_vars.CurCwnd = cwnd;
3524 + if (cwnd > stats->wc_vars.MaxCwnd)
3525 + stats->wc_vars.MaxCwnd = cwnd;
3527 + ssthresh = min(WC_INF32 / mss, tp->snd_ssthresh) * mss;
3528 + stats->wc_vars.CurSsthresh = ssthresh;
3530 + /* Discard initial ssthresh set at infinity. */
3531 + if (tp->snd_ssthresh >= 0x7fffffff) {
3534 + if (ssthresh > stats->wc_vars.MaxSsthresh)
3535 + stats->wc_vars.MaxSsthresh = ssthresh;
3536 + if (ssthresh < stats->wc_vars.MinSsthresh)
3537 + stats->wc_vars.MinSsthresh = ssthresh;
3540 +void web100_update_rwin_rcvd(struct tcp_sock *tp)
3542 + struct web100stats *stats = tp->tcp_stats;
3543 + __u32 win = tp->snd_wnd;
3545 + stats->wc_vars.CurRwinRcvd = win;
3546 + if (win > stats->wc_vars.MaxRwinRcvd)
3547 + stats->wc_vars.MaxRwinRcvd = win;
3548 + if (win < stats->wc_vars.MinRwinRcvd)
3549 + stats->wc_vars.MinRwinRcvd = win;
3552 +void web100_update_rwin_sent(struct tcp_sock *tp)
3554 + struct web100stats *stats = tp->tcp_stats;
3555 + __u32 win = tp->rcv_wnd;
3557 + /* Update our advertised window. */
3558 + stats->wc_vars.CurRwinSent = win;
3559 + if (win > stats->wc_vars.MaxRwinSent)
3560 + stats->wc_vars.MaxRwinSent = win;
3561 + if (win < stats->wc_vars.MinRwinSent)
3562 + stats->wc_vars.MinRwinSent = win;
3566 +/* TODO: change this to a generic state machine instrument */
3567 +static void web100_state_update(struct tcp_sock *tp, int why, __u64 bytes)
3569 + struct web100stats *stats = tp->tcp_stats;
3572 + now = web100_mono_time();
3573 + stats->wc_vars.SndLimTime[stats->wc_limstate] += now - stats->wc_limstate_time;
3574 + stats->wc_limstate_time = now;
3576 + stats->wc_vars.SndLimBytes[why] += bytes - stats->wc_limstate_bytes;
3577 + stats->wc_limstate_bytes = bytes;
3579 + if (stats->wc_limstate != why) {
3580 + stats->wc_limstate = why;
3581 + stats->wc_vars.SndLimTrans[why]++;
3585 +void web100_update_sndlim(struct tcp_sock *tp, int why)
3587 + struct web100stats *stats = tp->tcp_stats;
3590 + printk("web100_update_sndlim: BUG: why < 0\n");
3594 + web100_state_update(tp, why, stats->wc_vars.DataBytesOut);
3595 + /* future instruments on other sender bottlenecks here... */
3596 + /* if (!why) { why = ??? } */
3597 + /* web100_state_update(tp, why, stats->wc_vars.DataBytesOut); */
3600 +void web100_update_congestion(struct tcp_sock *tp, int why_dummy)
3602 + struct web100stats *stats = tp->tcp_stats;
3604 + stats->wc_vars.CongestionSignals++;
3605 + stats->wc_vars.PreCongSumCwnd += stats->wc_vars.CurCwnd;
3607 + /* This may require more control flags */
3608 + stats->wc_vars.PreCongCountRTT++;
3609 + stats->wc_vars.PreCongSumRTT += stats->wc_vars.SampleRTT;
3612 +/* Called from tcp_transmit_skb, whenever we push a segment onto the wire.
3614 +void web100_update_segsend(struct sock *sk, int len, int pcount,
3615 + __u32 seq, __u32 end_seq, int flags)
3617 + struct web100stats *stats = tcp_sk(sk)->tcp_stats;
3619 + /* We know we're sending a segment. */
3620 + stats->wc_vars.PktsOut += pcount;
3622 + /* We know the ack seq is rcv_nxt. web100_XXX bug compatible */
3623 + web100_update_rcv_nxt(tcp_sk(sk));
3625 + /* A pure ACK contains no data; everything else is data. */
3627 + stats->wc_vars.DataPktsOut += pcount;
3628 + stats->wc_vars.DataBytesOut += len;
3630 + stats->wc_vars.AckPktsOut++;
3633 + /* Check for retransmission. */
3634 + if (flags & TCPCB_FLAG_SYN) {
3635 + if (inet_csk(sk)->icsk_retransmits)
3636 + stats->wc_vars.PktsRetrans++;
3637 + } else if (before(seq, stats->wc_vars.SndMax)) {
3638 + stats->wc_vars.PktsRetrans += pcount;
3639 + stats->wc_vars.BytesRetrans += end_seq - seq;
3643 +void web100_update_segrecv(struct tcp_sock *tp, struct sk_buff *skb)
3645 + struct web100directs *vars = &tp->tcp_stats->wc_vars;
3646 + struct tcphdr *th = tcp_hdr(skb);
3649 + if (skb->len == th->doff*4) {
3650 + vars->AckPktsIn++;
3651 + if (TCP_SKB_CB(skb)->ack_seq == tp->snd_una)
3652 + vars->DupAcksIn++;
3654 + vars->DataPktsIn++;
3655 + vars->DataBytesIn += skb->len - th->doff*4;
3659 +void web100_update_rcv_nxt(struct tcp_sock *tp)
3661 + struct web100stats *stats = tp->tcp_stats;
3663 + if (before(stats->wc_vars.RcvNxt, stats->wc_vars.RecvISS) &&
3664 + after(tp->rcv_nxt, stats->wc_vars.RecvISS))
3665 + stats->wc_vars.RecvWraps++;
3666 + stats->wc_vars.ThruBytesReceived += (__u32) (tp->rcv_nxt - stats->wc_vars.RcvNxt); /* XXX */
3667 + stats->wc_vars.RcvNxt = tp->rcv_nxt;
3670 +void web100_update_writeq(struct sock *sk)
3672 + struct tcp_sock *tp = tcp_sk(sk);
3673 + struct web100directs *vars = &tp->tcp_stats->wc_vars;
3674 + int len = tp->write_seq - vars->SndMax;
3676 + vars->CurAppWQueue = len;
3677 + if (len > vars->MaxAppWQueue)
3678 + vars->MaxAppWQueue = len;
3681 +void web100_update_recvq(struct sock *sk)
3683 + struct tcp_sock *tp = tcp_sk(sk);
3684 + struct web100directs *vars = &tp->tcp_stats->wc_vars;
3685 + int len1 = tp->rcv_nxt - tp->copied_seq;
3687 + vars->CurAppRQueue = len1;
3688 + if (vars->MaxAppRQueue < len1)
3689 + vars->MaxAppRQueue = len1;
3691 +#if 0 /* FIXME!! */
3692 + vars->CurReasmQueue = len2;
3693 + if (vars->MaxReasmQueue < len2)
3694 + vars->MaxReasmQueue = len2;
3699 +void __init web100_stats_init()
3703 + memset(death_slots, 0, sizeof (death_slots));
3705 + web100stats_htsize = tcp_hashinfo.ehash_size;
3706 + for (order = 0; (1UL << order) * PAGE_SIZE < web100stats_htsize *
3707 + sizeof (struct web100stats *); order++)
3709 + printk("Web100: initializing hash table of size %d (order %d)\n",
3710 + web100stats_htsize, order);
3711 + if ((web100stats_ht = (struct web100stats **)__get_free_pages(GFP_ATOMIC, order)) == NULL)
3712 + panic("Failed to allocate Web100 stats hash table.\n");
3713 + memset(web100stats_ht, 0, web100stats_htsize * sizeof (struct web100stats *));
3715 +#ifdef CONFIG_WEB100_NETLINK
3716 + if ((web100_nlsock = netlink_kernel_create(NETLINK_WEB100, 0, NULL, NULL, NULL)) == NULL)
3717 + printk(KERN_ERR "web100_stats_init(): cannot initialize netlink socket\n");
3720 + printk("Web100 %s: Initialization successful\n", web100_version_string);
3723 +#ifdef CONFIG_IPV6_MODULE
3724 +EXPORT_SYMBOL(web100_stats_create);
3725 +EXPORT_SYMBOL(web100_stats_destroy);
3726 +EXPORT_SYMBOL(web100_update_segrecv);
3727 +EXPORT_SYMBOL(web100_update_cwnd);
3728 +EXPORT_SYMBOL(web100_update_writeq);
3730 diff -Nurp linux-2.6.22-680/net/ipv6/tcp_ipv6.c linux-2.6.22-690/net/ipv6/tcp_ipv6.c
3731 --- linux-2.6.22-680/net/ipv6/tcp_ipv6.c 2008-11-12 17:40:30.000000000 +0100
3732 +++ linux-2.6.22-690/net/ipv6/tcp_ipv6.c 2008-11-14 21:20:17.000000000 +0100
3733 @@ -312,6 +312,11 @@ static int tcp_v6_connect(struct sock *s
3737 + WEB100_VAR_SET(tp, SndISS, tp->write_seq);
3738 + WEB100_VAR_SET(tp, SndMax, tp->write_seq);
3739 + WEB100_VAR_SET(tp, SndNxt, tp->write_seq);
3740 + WEB100_VAR_SET(tp, SndUna, tp->write_seq);
3742 err = tcp_connect(sk);
3745 @@ -1441,6 +1446,13 @@ static struct sock * tcp_v6_syn_recv_soc
3746 newsk = tcp_create_openreq_child(sk, req, skb);
3749 +#ifdef CONFIG_WEB100_STATS
3750 + if (web100_stats_create(newsk)) {
3754 + tcp_sk(newsk)->tcp_stats->wc_vars.LocalAddressType = WC_ADDRTYPE_IPV6;
3758 * No need to charge this sock to the relevant IPv6 refcnt debug socks
3759 @@ -1754,6 +1766,7 @@ process:
3762 bh_lock_sock_nested(sk);
3763 + WEB100_UPDATE_FUNC(tcp_sk(sk), web100_update_segrecv(tcp_sk(sk), skb));
3765 if (!sock_owned_by_user(sk)) {
3766 #ifdef CONFIG_NET_DMA
3767 @@ -1768,6 +1781,7 @@ process:
3770 sk_add_backlog(sk, skb);
3771 + WEB100_UPDATE_FUNC(tcp_sk(sk), web100_update_cwnd(tcp_sk(sk)));
3775 @@ -1946,6 +1960,16 @@ static int tcp_v6_init_sock(struct sock
3776 sk->sk_sndbuf = sysctl_tcp_wmem[1];
3777 sk->sk_rcvbuf = sysctl_tcp_rmem[1];
3779 +#ifdef CONFIG_WEB100_STATS
3782 + if ((err = web100_stats_create(sk))) {
3785 + tcp_sk(sk)->tcp_stats->wc_vars.LocalAddressType = WC_ADDRTYPE_IPV6;
3789 atomic_inc(&tcp_sockets_allocated);