X-Git-Url: http://git.onelab.eu/?a=blobdiff_plain;f=Documentation%2Fnetworking%2Fip-sysctl.txt;h=713c283c25b747e3796925465f659eeb0e3af330;hb=refs%2Fheads%2Fvserver;hp=26364d06ae927f0262e4cbccda03d1ad2cca35ba;hpb=76828883507a47dae78837ab5dec5a5b4513c667;p=linux-2.6.git

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 26364d06a..713c283c2 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -101,53 +101,67 @@ inet_peer_gc_maxtime - INTEGER
 
 TCP variables:
 
+somaxconn - INTEGER
+	Limit of socket listen() backlog, known in userspace as SOMAXCONN.
+	Defaults to 128. See also tcp_max_syn_backlog for additional tuning
+	for TCP sockets.
+
 tcp_abc - INTEGER
-	Controls Appropriate Byte Count defined in RFC3465. If set to
-	0 then does congestion avoid once per ack. 1 is conservative
-	value, and 2 is more agressive.
+	Controls Appropriate Byte Count (ABC) defined in RFC3465.
+	ABC is a way of increasing congestion window (cwnd) more slowly
+	in response to partial acknowledgments.
+	Possible values are:
+		0 increase cwnd once per acknowledgment (no ABC)
+		1 increase cwnd once per acknowledgment of full sized segment
+		2 allow increase cwnd by two if acknowledgment is
+		  of two segments to compensate for delayed acknowledgments.
+	Default: 0 (off)
 
-tcp_syn_retries - INTEGER
-	Number of times initial SYNs for an active TCP connection attempt
-	will be retransmitted. Should not be higher than 255. Default value
-	is 5, which corresponds to ~180seconds.
+tcp_abort_on_overflow - BOOLEAN
+	If listening service is too slow to accept new connections,
+	reset them. Default state is FALSE. It means that if overflow
+	occurred due to a burst, connection will recover. Enable this
+	option _only_ if you are really sure that listening daemon
+	cannot be tuned to accept connections faster. Enabling this
+	option can harm clients of your server.
 
-tcp_synack_retries - INTEGER
-	Number of times SYNACKs for a passive TCP connection attempt will
-	be retransmitted. Should not be higher than 255. Default value
-	is 5, which corresponds to ~180seconds.
+tcp_adv_win_scale - INTEGER
+	Count buffering overhead as bytes/2^tcp_adv_win_scale
+	(if tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale),
+	if it is <= 0.
+	Default: 2
 
-tcp_keepalive_time - INTEGER
-	How often TCP sends out keepalive messages when keepalive is enabled.
-	Default: 2hours.
+tcp_allowed_congestion_control - STRING
+	Show/set the congestion control choices available to non-privileged
+	processes. The list is a subset of those listed in
+	tcp_available_congestion_control.
+	Default is "reno" and the default setting (tcp_congestion_control).
 
-tcp_keepalive_probes - INTEGER
-	How many keepalive probes TCP sends out, until it decides that the
-	connection is broken. Default value: 9.
+tcp_app_win - INTEGER
+	Reserve max(window/2^tcp_app_win, mss) of window for application
+	buffer. Value 0 is special, it means that nothing is reserved.
+	Default: 31
 
-tcp_keepalive_intvl - INTEGER
-	How frequently the probes are send out. Multiplied by
-	tcp_keepalive_probes it is time to kill not responding connection,
-	after probes started. Default value: 75sec i.e. connection
-	will be aborted after ~11 minutes of retries.
+tcp_available_congestion_control - STRING
+	Shows the available congestion control choices that are registered.
+	More congestion control algorithms may be available as modules,
+	but not loaded.
 
-tcp_retries1 - INTEGER
-	How many times to retry before deciding that something is wrong
-	and it is necessary to report this suspicion to network layer.
-	Minimal RFC value is 3, it is default, which corresponds
-	to ~3sec-8min depending on RTO.
+tcp_congestion_control - STRING
+	Set the congestion control algorithm to be used for new
+	connections. The algorithm "reno" is always available, but
+	additional choices may be available based on kernel configuration.
+	Default is set as part of kernel configuration.
 
-tcp_retries2 - INTEGER
-	How may times to retry before killing alive TCP connection.
-	RFC1122 says that the limit should be longer than 100 sec.
-	It is too small number. Default value 15 corresponds to ~13-30min
-	depending on RTO.
+tcp_dsack - BOOLEAN
+	Allows TCP to send "duplicate" SACKs.
 
-tcp_orphan_retries - INTEGER
-	How may times to retry before killing TCP connection, closed
-	by our side. Default value 7 corresponds to ~50sec-16min
-	depending on RTO. If you machine is loaded WEB server,
-	you should think about lowering this value, such sockets
-	may consume significant resources. Cf. tcp_max_orphans.
+tcp_ecn - BOOLEAN
+	Enable Explicit Congestion Notification in TCP.
+
+tcp_fack - BOOLEAN
+	Enable FACK congestion avoidance and fast retransmission.
+	The value is not used, if tcp_sack is not enabled.
 
 tcp_fin_timeout - INTEGER
 	Time to hold socket in state FIN-WAIT-2, if it was closed
@@ -160,24 +174,33 @@ tcp_fin_timeout - INTEGER
 	because they eat maximum 1.5K of memory, but they tend
 	to live longer. Cf. tcp_max_orphans.
 
-tcp_max_tw_buckets - INTEGER
-	Maximal number of timewait sockets held by system simultaneously.
-	If this number is exceeded time-wait socket is immediately destroyed
-	and warning is printed. This limit exists only to prevent
-	simple DoS attacks, you _must_ not lower the limit artificially,
-	but rather increase it (probably, after increasing installed memory),
-	if network conditions require more than default value.
+tcp_frto - BOOLEAN
+	Enables F-RTO, an enhanced recovery algorithm for TCP retransmission
+	timeouts. It is particularly beneficial in wireless environments
+	where packet loss is typically due to random radio interference
+	rather than intermediate router congestion.
 
-tcp_tw_recycle - BOOLEAN
-	Enable fast recycling TIME-WAIT sockets. Default value is 0.
-	It should not be changed without advice/request of technical
-	experts.
+tcp_keepalive_time - INTEGER
+	How often TCP sends out keepalive messages when keepalive is enabled.
+	Default: 2hours.
 
-tcp_tw_reuse - BOOLEAN
-	Allow to reuse TIME-WAIT sockets for new connections when it is
-	safe from protocol viewpoint. Default value is 0.
-	It should not be changed without advice/request of technical
-	experts.
+tcp_keepalive_probes - INTEGER
+	How many keepalive probes TCP sends out, until it decides that the
+	connection is broken. Default value: 9.
+
+tcp_keepalive_intvl - INTEGER
+	How frequently the probes are send out. Multiplied by
+	tcp_keepalive_probes it is time to kill not responding connection,
+	after probes started. Default value: 75sec i.e. connection
+	will be aborted after ~11 minutes of retries.
+
+tcp_low_latency - BOOLEAN
+	If set, the TCP stack makes decisions that prefer lower
+	latency as opposed to higher throughput. By default, this
+	option is not set meaning that higher throughput is preferred.
+	An example of an application where this default should be
+	changed would be a Beowulf compute cluster.
+	Default: 0
 
 tcp_max_orphans - INTEGER
 	Maximal number of TCP sockets not attached to any user file handle,
@@ -191,41 +214,6 @@ tcp_max_orphans - INTEGER
 	more aggressively. Let me to remind again: each orphan eats
 	up to ~64K of unswappable memory.
 
-tcp_abort_on_overflow - BOOLEAN
-	If listening service is too slow to accept new connections,
-	reset them. Default state is FALSE. It means that if overflow
-	occurred due to a burst, connection will recover. Enable this
-	option _only_ if you are really sure that listening daemon
-	cannot be tuned to accept connections faster. Enabling this
-	option can harm clients of your server.
-
-tcp_syncookies - BOOLEAN
-	Only valid when the kernel was compiled with CONFIG_SYNCOOKIES
-	Send out syncookies when the syn backlog queue of a socket
-	overflows. This is to prevent against the common 'syn flood attack'
-	Default: FALSE
-
-	Note, that syncookies is fallback facility.
-	It MUST NOT be used to help highly loaded servers to stand
-	against legal connection rate. If you see synflood warnings
-	in your logs, but investigation shows that they occur
-	because of overload with legal connections, you should tune
-	another parameters until this warning disappear.
-	See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.
-
-	syncookies seriously violate TCP protocol, do not allow
-	to use TCP extensions, can result in serious degradation
-	of some services (f.e. SMTP relaying), visible not by you,
-	but your clients and relays, contacting you. While you see
-	synflood warnings in logs not being really flooded, your server
-	is seriously misconfigured.
-
-tcp_stdurg - BOOLEAN
-	Use the Host requirements interpretation of the TCP urg pointer field.
-	Most hosts use the older BSD interpretation, so if you turn this on
-	Linux might not communicate correctly with them.
-	Default: FALSE
-
 tcp_max_syn_backlog - INTEGER
 	Maximal number of remembered connection requests, which are
 	still did not receive an acknowledgment from connecting client.
@@ -233,24 +221,34 @@ tcp_max_syn_backlog - INTEGER
 	and 128 for low memory machines. If server suffers of overload,
 	try to increase this number.
 
-tcp_window_scaling - BOOLEAN
-	Enable window scaling as defined in RFC1323.
+tcp_max_tw_buckets - INTEGER
+	Maximal number of timewait sockets held by system simultaneously.
+	If this number is exceeded time-wait socket is immediately destroyed
+	and warning is printed. This limit exists only to prevent
+	simple DoS attacks, you _must_ not lower the limit artificially,
+	but rather increase it (probably, after increasing installed memory),
+	if network conditions require more than default value.
 
-tcp_timestamps - BOOLEAN
-	Enable timestamps as defined in RFC1323.
+tcp_mem - vector of 3 INTEGERs: min, pressure, max
+	min: below this number of pages TCP is not bothered about its
+	memory appetite.
 
-tcp_sack - BOOLEAN
-	Enable select acknowledgments (SACKS).
+	pressure: when amount of memory allocated by TCP exceeds this number
+	of pages, TCP moderates its memory consumption and enters memory
+	pressure mode, which is exited when memory consumption falls
+	under "min".
 
-tcp_fack - BOOLEAN
-	Enable FACK congestion avoidance and fast retransmission.
-	The value is not used, if tcp_sack is not enabled.
+	max: number of pages allowed for queueing by all TCP sockets.
 
-tcp_dsack - BOOLEAN
-	Allows TCP to send "duplicate" SACKs.
+	Defaults are calculated at boot time from amount of available
+	memory.
 
-tcp_ecn - BOOLEAN
-	Enable Explicit Congestion Notification in TCP.
+tcp_orphan_retries - INTEGER
+	How may times to retry before killing TCP connection, closed
+	by our side. Default value 7 corresponds to ~50sec-16min
+	depending on RTO. If you machine is loaded WEB server,
+	you should think about lowering this value, such sockets
+	may consume significant resources. Cf. tcp_max_orphans.
 
 tcp_reordering - INTEGER
 	Maximal reordering of packets in a TCP stream.
@@ -261,20 +259,23 @@ tcp_retrans_collapse - BOOLEAN
 	On retransmit try to send bigger packets to work around bugs in
 	certain TCP stacks.
 
-tcp_wmem - vector of 3 INTEGERs: min, default, max
-	min: Amount of memory reserved for send buffers for TCP socket.
-	Each TCP socket has rights to use it due to fact of its birth.
-	Default: 4K
+tcp_retries1 - INTEGER
+	How many times to retry before deciding that something is wrong
+	and it is necessary to report this suspicion to network layer.
+	Minimal RFC value is 3, it is default, which corresponds
+	to ~3sec-8min depending on RTO.
 
-	default: Amount of memory allowed for send buffers for TCP socket
-	by default. This value overrides net.core.wmem_default used
-	by other protocols, it is usually lower than net.core.wmem_default.
-	Default: 16K
+tcp_retries2 - INTEGER
+	How may times to retry before killing alive TCP connection.
+	RFC1122 says that the limit should be longer than 100 sec.
+	It is too small number. Default value 15 corresponds to ~13-30min
+	depending on RTO.
 
-	max: Maximal amount of memory allowed for automatically selected
-	send buffers for TCP socket. This value does not override
-	net.core.wmem_max, "static" selection via SO_SNDBUF does not use this.
-	Default: 128K
+tcp_rfc1337 - BOOLEAN
+	If set, the TCP stack behaves conforming to RFC1337. If unset,
+	we are not conforming to RFC, but prevent TCP TIME_WAIT
+	assassination.
+	Default: 0
 
 tcp_rmem - vector of 3 INTEGERs: min, default, max
 	min: Minimal size of receive buffer used by TCP sockets.
@@ -293,67 +294,133 @@ tcp_rmem - vector of 3 INTEGERs: min, default, max
 	net.core.rmem_max, "static" selection via SO_RCVBUF does not use this.
 	Default: 87380*2 bytes.
 
-tcp_mem - vector of 3 INTEGERs: min, pressure, max
-	low: below this number of pages TCP is not bothered about its
-	memory appetite.
+tcp_sack - BOOLEAN
+	Enable select acknowledgments (SACKS).
 
-	pressure: when amount of memory allocated by TCP exceeds this number
-	of pages, TCP moderates its memory consumption and enters memory
-	pressure mode, which is exited when memory consumption falls
-	under "low".
+tcp_slow_start_after_idle - BOOLEAN
+	If set, provide RFC2861 behavior and time out the congestion
+	window after an idle period. An idle period is defined at
+	the current RTO. If unset, the congestion window will not
+	be timed out after an idle period.
+	Default: 1
 
-	high: number of pages allowed for queueing by all TCP sockets.
+tcp_stdurg - BOOLEAN
+	Use the Host requirements interpretation of the TCP urg pointer field.
+	Most hosts use the older BSD interpretation, so if you turn this on
+	Linux might not communicate correctly with them.
+	Default: FALSE
 
-	Defaults are calculated at boot time from amount of available
-	memory.
+tcp_synack_retries - INTEGER
+	Number of times SYNACKs for a passive TCP connection attempt will
+	be retransmitted. Should not be higher than 255. Default value
+	is 5, which corresponds to ~180seconds.
 
-tcp_app_win - INTEGER
-	Reserve max(window/2^tcp_app_win, mss) of window for application
-	buffer. Value 0 is special, it means that nothing is reserved.
-	Default: 31
+tcp_syncookies - BOOLEAN
+	Only valid when the kernel was compiled with CONFIG_SYNCOOKIES
+	Send out syncookies when the syn backlog queue of a socket
+	overflows. This is to prevent against the common 'syn flood attack'
+	Default: FALSE
 
-tcp_adv_win_scale - INTEGER
-	Count buffering overhead as bytes/2^tcp_adv_win_scale
-	(if tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale),
-	if it is <= 0.
-	Default: 2
+	Note, that syncookies is fallback facility.
+	It MUST NOT be used to help highly loaded servers to stand
+	against legal connection rate. If you see synflood warnings
+	in your logs, but investigation shows that they occur
+	because of overload with legal connections, you should tune
+	another parameters until this warning disappear.
+	See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.
 
-tcp_rfc1337 - BOOLEAN
-	If set, the TCP stack behaves conforming to RFC1337. If unset,
-	we are not conforming to RFC, but prevent TCP TIME_WAIT
-	assassination.
-	Default: 0
+	syncookies seriously violate TCP protocol, do not allow
+	to use TCP extensions, can result in serious degradation
+	of some services (f.e. SMTP relaying), visible not by you,
+	but your clients and relays, contacting you. While you see
+	synflood warnings in logs not being really flooded, your server
+	is seriously misconfigured.
 
-tcp_low_latency - BOOLEAN
-	If set, the TCP stack makes decisions that prefer lower
-	latency as opposed to higher throughput. By default, this
-	option is not set meaning that higher throughput is preferred.
-	An example of an application where this default should be
-	changed would be a Beowulf compute cluster.
-	Default: 0
+tcp_syn_retries - INTEGER
+	Number of times initial SYNs for an active TCP connection attempt
+	will be retransmitted. Should not be higher than 255. Default value
+	is 5, which corresponds to ~180seconds.
+
+tcp_timestamps - BOOLEAN
+	Enable timestamps as defined in RFC1323.
 
 tcp_tso_win_divisor - INTEGER
-       This allows control over what percentage of the congestion window
-       can be consumed by a single TSO frame.
-       The setting of this parameter is a choice between burstiness and
-       building larger TSO frames.
-       Default: 3
+	This allows control over what percentage of the congestion window
+	can be consumed by a single TSO frame.
+	The setting of this parameter is a choice between burstiness and
+	building larger TSO frames.
+	Default: 3
 
-tcp_frto - BOOLEAN
-	Enables F-RTO, an enhanced recovery algorithm for TCP retransmission
-	timeouts. It is particularly beneficial in wireless environments
-	where packet loss is typically due to random radio interference
-	rather than intermediate router congestion.
+tcp_tw_recycle - BOOLEAN
+	Enable fast recycling TIME-WAIT sockets. Default value is 0.
+	It should not be changed without advice/request of technical
+	experts.
 
-tcp_congestion_control - STRING
-	Set the congestion control algorithm to be used for new
-	connections. The algorithm "reno" is always available, but
-	additional choices may be available based on kernel configuration.
+tcp_tw_reuse - BOOLEAN
+	Allow to reuse TIME-WAIT sockets for new connections when it is
+	safe from protocol viewpoint. Default value is 0.
+	It should not be changed without advice/request of technical
+	experts.
 
-somaxconn - INTEGER
-	Limit of socket listen() backlog, known in userspace as SOMAXCONN.
-	Defaults to 128. See also tcp_max_syn_backlog for additional tuning
-	for TCP sockets.
+tcp_window_scaling - BOOLEAN
+	Enable window scaling as defined in RFC1323.
+
+tcp_wmem - vector of 3 INTEGERs: min, default, max
+	min: Amount of memory reserved for send buffers for TCP socket.
+	Each TCP socket has rights to use it due to fact of its birth.
+	Default: 4K
+
+	default: Amount of memory allowed for send buffers for TCP socket
+	by default. This value overrides net.core.wmem_default used
+	by other protocols, it is usually lower than net.core.wmem_default.
+	Default: 16K
+
+	max: Maximal amount of memory allowed for automatically selected
+	send buffers for TCP socket. This value does not override
+	net.core.wmem_max, "static" selection via SO_SNDBUF does not use this.
+	Default: 128K
+
+tcp_workaround_signed_windows - BOOLEAN
+	If set, assume no receipt of a window scaling option means the
+	remote TCP is broken and treats the window as a signed quantity.
+	If unset, assume the remote TCP is not broken even if we do
+	not receive a window scaling option from them.
+	Default: 0
+
+CIPSOv4 Variables:
+
+cipso_cache_enable - BOOLEAN
+	If set, enable additions to and lookups from the CIPSO label mapping
+	cache. If unset, additions are ignored and lookups always result in a
+	miss. However, regardless of the setting the cache is still
+	invalidated when required when means you can safely toggle this on and
+	off and the cache will always be "safe".
+	Default: 1
+
+cipso_cache_bucket_size - INTEGER
+	The CIPSO label cache consists of a fixed size hash table with each
+	hash bucket containing a number of cache entries. This variable limits
+	the number of entries in each hash bucket; the larger the value the
+	more CIPSO label mappings that can be cached. When the number of
+	entries in a given hash bucket reaches this limit adding new entries
+	causes the oldest entry in the bucket to be removed to make room.
+	Default: 10
+
+cipso_rbm_optfmt - BOOLEAN
+	Enable the "Optimized Tag 1 Format" as defined in section 3.4.2.6 of
+	the CIPSO draft specification (see Documentation/netlabel for details).
+	This means that when set the CIPSO tag will be padded with empty
+	categories in order to make the packet data 32-bit aligned.
+	Default: 0
+
+cipso_rbm_structvalid - BOOLEAN
+	If set, do a very strict check of the CIPSO option when
+	ip_options_compile() is called. If unset, relax the checks done during
+	ip_options_compile(). Either way is "safe" as errors are caught else
+	where in the CIPSO processing code but setting this to 0 (False) should
+	result in less work (i.e. it should be faster) but could cause problems
+	with other implementations that require strict checking.
+	Default: 0
 
 IP Variables:
 
@@ -440,7 +507,7 @@ icmp_errors_use_inbound_ifaddr - BOOLEAN
 
 	Note that if no primary address exists for the interface selected,
 	then the primary address of the first non-loopback interface that
-	has one will be used regarldess of this setting.
+	has one will be used regardless of this setting.
 
 	Default: 0
 
@@ -619,6 +686,11 @@ arp_ignore - INTEGER
 	The max value from conf/{all,interface}/arp_ignore is used
 	when ARP request is received on the {interface}
 
+arp_accept - BOOLEAN
+	Define behavior when gratuitous arp replies are received:
+	0 - drop gratuitous arp frames
+	1 - accept gratuitous arp frames
+
 app_solicit - INTEGER
 	The maximum number of probes to send to the user space ARP daemon
 	via netlink before dropping back to multicast probes (see
@@ -705,6 +777,9 @@ conf/all/forwarding - BOOLEAN
 
 	This referred to as global forwarding.
 
+proxy_ndp - BOOLEAN
+	Do proxy ndp.
+
 conf/interface/*:
 	Change special settings per interface.
 
@@ -717,18 +792,54 @@ accept_ra - BOOLEAN
 	Functional default: enabled if local forwarding is disabled.
 	                    disabled if local forwarding is enabled.
 
+accept_ra_defrtr - BOOLEAN
+	Learn default router in Router Advertisement.
+
+	Functional default: enabled if accept_ra is enabled.
+	                    disabled if accept_ra is disabled.
+
+accept_ra_pinfo - BOOLEAN
+	Learn Prefix Information in Router Advertisement.
+
+	Functional default: enabled if accept_ra is enabled.
+	                    disabled if accept_ra is disabled.
+
+accept_ra_rt_info_max_plen - INTEGER
+	Maximum prefix length of Route Information in RA.
+
+	Route Information w/ prefix larger than or equal to this
+	variable shall be ignored.
+
+	Functional default: 0 if accept_ra_rtr_pref is enabled.
+	                    -1 if accept_ra_rtr_pref is disabled.
+
+accept_ra_rtr_pref - BOOLEAN
+	Accept Router Preference in RA.
+
+	Functional default: enabled if accept_ra is enabled.
+	                    disabled if accept_ra is disabled.
+
 accept_redirects - BOOLEAN
 	Accept Redirects.
 
 	Functional default: enabled if local forwarding is disabled.
 	                    disabled if local forwarding is enabled.
 
+accept_source_route - INTEGER
+	Accept source routing (routing extension header).
+
+	> 0: Accept routing header.
+	= 0: Accept only routing header type 2.
+	< 0: Do not accept routing header.
+
+	Default: 0
+
 autoconf - BOOLEAN
 	Autoconfigure addresses using Prefix Information in Router
 	Advertisements.
 
-	Functional default: enabled if accept_ra is enabled.
-	                    disabled if accept_ra is disabled.
+	Functional default: enabled if accept_ra_pinfo is enabled.
+	                    disabled if accept_ra_pinfo is disabled.
 
 dad_transmits - INTEGER
 	The amount of Duplicate Address Detection probes to send.
@@ -771,6 +882,12 @@ mtu - INTEGER
 	Default Maximum Transfer Unit
 	Default: 1280 (IPv6 required minimum)
 
+router_probe_interval - INTEGER
+	Minimum interval (in seconds) between Router Probing described
+	in RFC4191.
+
+	Default: 60
+
 router_solicitation_delay - INTEGER
 	Number of seconds to wait after interface is brought up
 	before sending Router Solicitations.
@@ -878,4 +995,3 @@ no_cong_thresh FIXME
 slot_timeout FIXME
 warn_noreply_time FIXME
 
-$Id: ip-sysctl.txt,v 1.20 2001/12/13 09:00:18 davem Exp $
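
Two entries moved by this diff contain small formulas: tcp_adv_win_scale (buffering overhead) and tcp_keepalive_intvl (probe interval times tcp_keepalive_probes). The Python sketch below only illustrates that arithmetic and is not kernel code. The function names are hypothetical, integer division is an assumption about rounding, and the inputs are the defaults quoted in the text plus the 87380-byte figure from the tcp_rmem context line.

```python
def buffering_overhead(nbytes: int, adv_win_scale: int) -> int:
    """Buffering overhead as worded in the tcp_adv_win_scale entry.

    Assumption: integer division; the entry does not state the rounding.
    """
    if adv_win_scale > 0:
        return nbytes // (2 ** adv_win_scale)
    # adv_win_scale <= 0: bytes - bytes / 2^(-tcp_adv_win_scale)
    return nbytes - nbytes // (2 ** (-adv_win_scale))


def keepalive_abort_seconds(probes: int, intvl_seconds: int) -> int:
    """Seconds of unanswered probing before the connection is dropped."""
    return probes * intvl_seconds


if __name__ == "__main__":
    # Default tcp_adv_win_scale = 2 counts a quarter of the buffer as overhead.
    print(buffering_overhead(87380, 2))          # 21845
    # Defaults: 9 probes, 75 seconds apart, i.e. the "~11 minutes" in the text.
    print(keepalive_abort_seconds(9, 75) / 60)   # 11.25
```

With tcp_adv_win_scale set to 0 the second branch yields zero overhead, which follows directly from the formula as written.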