X-Git-Url: http://git.onelab.eu/?a=blobdiff_plain;f=doc%2Fss.sgml;fp=doc%2Fss.sgml;h=0b1b53353f12d8ebf172f13d7ec5d2cdb377505c;hb=4f7d3057911ebc5df78113896cb02e1b4ffbfabe;hp=0000000000000000000000000000000000000000;hpb=8a59994861a17eb92c11553d88631757ee8e63c3;p=iproute2.git diff --git a/doc/ss.sgml b/doc/ss.sgml new file mode 100644 index 0000000..0b1b533 --- /dev/null +++ b/doc/ss.sgml @@ -0,0 +1,525 @@ + + +
+ +SS Utility: Quick Intro +<author>Alexey Kuznetosv, <tt/kuznet@ms2.inr.ac.ru/ +<date>some_negative_number, 20 Sep 2001 +<abstract> +<tt/ss/ is one another utility to investigate sockets. +Functionally it is NOT better than <tt/netstat/ combined +with some perl/awk scripts and though it is surely faster +it is not enough to make it much better. :-) +So, stop reading this now and do not waste your time. +Well, certainly, it proposes some functionality, which current +netstat is still not able to do, but surely will soon. +</abstract> + +<sect>Why? + +<p> <tt>/proc</tt> interface is inadequate, unfortunately. +When amount of sockets is enough large, <tt/netstat/ or even +plain <tt>cat /proc/net/tcp/</tt> cause nothing but pains and curses. +In linux-2.4 the desease became worse: even if amount +of sockets is small reading <tt>/proc/net/tcp/</tt> is slow enough. + +This utility presents a new approach, which is supposed to scale +well. I am not going to describe technical details here and +will concentrate on description of the command. +The only important thing to say is that it is not so bad idea +to load module <tt/tcp_diag/, which can be found in directory +<tt/Modules/ of <tt/iproute2/. If you do not make this <tt/ss/ +will work, but it falls back to <tt>/proc</tt> and becomes slow +like <tt/netstat/, well, a bit faster yet (see section "Some numbers"). + +<sect>Old news + +<p> +In the simplest form <tt/ss/ is equivalent to netstat +with some small deviations. + +<itemize> +<item><tt/ss -t -a/ dumps all TCP sockets +<item><tt/ss -u -a/ dumps all UDP sockets +<item><tt/ss -w -a/ dumps all RAW sockets +<item><tt/ss -x -a/ dumps all UNIX sockets +</itemize> + +<p> +Option <tt/-o/ shows TCP timers state. +Option <tt/-e/ shows some extended information. +Etc. etc. etc. Seems, all the options of netstat related to sockets +are supported. Though not AX.25 and other bizarres. :-) +If someone wants, he can make support for decnet and ipx. +Some rudimentary support for them is already present in iproute2 libutils, +and I will be glad to see these new members. + +<p> +However, standard functionality is a bit different: + +<p> +The first: without option <tt/-a/ sockets in states +<tt/TIME-WAIT/ and <tt/SYN-RECV/ are skipped too. +It is more reasonable default, I think. + +<p> +The second: format of UNIX sockets is different. It coincides +with tcp/udp. Though standard kernel still does not allow to +see write/read queues and peer address of connected UNIX sockets, +the patch doing this exists. + +<p> +The third: default is to dump only TCP sockets, rather than all of the types. + +<p> +The next: by default it does not resolve numeric host addresses (like <tt/ip/)! +Resolving is enabled with option <tt/-r/. Service names, usually stored +in local files, are resolved by default. Also, if service database +does not contain references to a port, <tt/ss/ queries system +<tt/rpcbind/. RPC services are prefixed with <tt/rpc./ +Resolution of services may be suppressed with option <tt/-n/. + +<p> +It does not accept "long" options (I dislike them, sorry). +So, address family is given with family identifier following +option <tt/-f/ to be algined to iproute2 conventions. +Mostly, it is to allow option parser to parse +addresses correctly, but as side effect it really limits dumping +to sockets supporting only given family. Option <tt/-A/ followed +by list of socket tables to dump is also supported. +Logically, id of socket table is different of _address_ family, which is +another point of incompatibility. So, id is one of +<tt/all/, <tt/tcp/, <tt/udp/, +<tt/raw/, <tt/inet/, <tt/unix/, <tt/packet/, <tt/netlink/. See? +Well, <tt/inet/ is just abbreviation for <tt/tcp|udp|raw/ +and it is not difficult to guess that <tt/packet/ allows +to look at packet sockets. Actually, there are also some other abbreviations, +f.e. <tt/unix_dgram/ selects only datagram UNIX sockets. + +<p> +The next: well, I still do not know. :-) + + + + +<sect>Time to talk about new functionality. + +<p>It is builtin filtering of socket lists. + +<sect1> Filtering by state. + +<p> +<tt/ss/ allows to filter socket states, using keywords +<tt/state/ and <tt/exclude/, followed by some state +identifier. + +<p> +State identifier are standard TCP state names (not listed, +they are useless for you if you already do not know them) +or abbreviations: + +<itemize> +<item><tt/all/ - for all the states +<item><tt/bucket/ - for TCP minisockets (<tt/TIME-WAIT|SYN-RECV/) +<item><tt/big/ - all except for minisockets +<item><tt/connected/ - not closed and not listening +<item><tt/synchronized/ - connected and not <tt/SYN-SENT/ +</itemize> + +<p> + F.e. to dump all tcp sockets except <tt/SYN-RECV/: + +<tscreen><verb> + ss exclude SYN-RECV +</verb></tscreen> + +<p> + If neither <tt/state/ nor <tt/exclude/ directives + are present, + state filter defaults to <tt/all/ with option <tt/-a/ + or to <tt/all/, + excluding listening, syn-recv, time-wait and closed sockets. + +<sect1> Filtering by addresses and ports. + +<p> +Option list may contain address/port filter. +It is boolean expression which consists of boolean operation +<tt/or/, <tt/and/, <tt/not/ and predicates. +Actually, all the flavors of names for boolean operations are eaten: +<tt/&/, <tt/&&/, <tt/|/, <tt/||/, <tt/!/, but do not forget +about special sense given to these symbols by unix shells and escape +them correctly, when used from command line. + +<p> +Predicates may be of the folowing kinds: + +<itemize> +<item>A. Address/port match, where address is checked against mask + and port is either wildcard or exact. It is one of: + +<tscreen><verb> + dst prefix:port + src prefix:port + src unix:STRING + src link:protocol:ifindex + src nl:channel:pid +</verb></tscreen> + + Both prefix and port may be absent or replaced with <tt/*/, + which means wildcard. UNIX socket use more powerful scheme + matching to socket names by shell wildcards. Also, prefixes + unix: and link: may be omitted, if address family is evident + from context (with option <tt/-x/ or with <tt/-f unix/ + or with <tt/unix/ keyword) + +<p> + F.e. + +<tscreen><verb> + dst 10.0.0.1 + dst 10.0.0.1: + dst 10.0.0.1/32: + dst 10.0.0.1:* +</verb></tscreen> + are equivalent and mean socket connected to + any port on host 10.0.0.1 + +<tscreen><verb> + dst 10.0.0.0/24:22 +</verb></tscreen> + sockets connected to port 22 on network + 10.0.0.0...255. + +<p> + Note that port separated of address with colon, which creates + troubles with IPv6 addresses. Generally, we interpret the last + colon as splitting port. To allow to give IPv6 addresses, + trick like used in IPv6 HTTP URLs may be used: + +<tscreen><verb> + dst [::1] +</verb></tscreen> + are sockets connected to ::1 on any port + +<p> + Another way is <tt/dst ::1/128/. / helps to understand that + colon is part of IPv6 address. + +<p> + Now we can add another alias for <tt/dst 10.0.0.1/: + <tt/dst [10.0.0.1]/. :-) + +<p> Address may be a DNS name. In this case all the addresses are looked + up (in all the address families, if it is not limited by option <tt/-f/ + or special address prefix <tt/inet:/, <tt/inet6/) and resulting + expression is <tt/or/ over all of them. + +<item> B. Port expressions: +<tscreen><verb> + dport >= :1024 + dport != :22 + sport < :32000 +</verb></tscreen> + etc. + + All the relations: <tt/</, <tt/>/, <tt/=/, <tt/>=/, <tt/=/, <tt/==/, + <tt/!=/, <tt/eq/, <tt/ge/, <tt/lt/, <tt/ne/... + Use variant which you like more, but not forget to escape special + characters when typing them in command line. :-) + + Note that port number syntactically coincides to the case A! + You may even add an IP address, but it will not participate + incomparison, except for <tt/==/ and <tt/!=/, which are equivalent + to corresponding predicates of type A. F.e. +<p> +<tt/dst 10.0.0.1:22/ + is equivalent to <tt/dport eq 10.0.0.1:22/ + and + <tt/not dst 10.0.0.1:22/ is equivalent to + <tt/dport neq 10.0.0.1:22/ + +<item>C. Keyword <tt/autobound/. It matches to sockets bound automatically + on local system. + +</itemize> + + +<sect> Examples + +<p> +<itemize> +<item>1. List all the tcp sockets in state <tt/FIN-WAIT-1/ for our apache + to network 193.233.7/24 and look at their timers: + +<tscreen><verb> + ss -o state fin-wait-1 \( sport = :http or sport = :https \) \ + dst 193.233.7/24 +</verb></tscreen> + + Oops, forgot to say that missing logical operation is + equivalent to <tt/and/. + +<item> 2. Well, now look at the rest... + +<tscreen><verb> + ss -o excl fin-wait-1 + ss state fin-wait-1 \( sport neq :http and sport neq :https \) \ + or not dst 193.233.7/24 +</verb></tscreen> + + Note that we have to do _two_ calls of ss to do this. + State match is always anded to address/port match. + The reason for this is purely technical: ss does fast skip of + not matching states before parsing addresses and I consider the + ability to skip fastly gobs of time-wait and syn-recv sockets + as more important than logical generality. + +<item> 3. So, let's look at all our sockets using autobound ports: + +<tscreen><verb> + ss -a -A all autobound +</verb></tscreen> + + +<item> 4. And eventually find all the local processes connected + to local X servers: + +<tscreen><verb> + ss -xp dst "/tmp/.X11-unix/*" +</verb></tscreen> + + Pardon, this does not work with current kernel, patching is required. + But we still can look at server side: + +<tscreen><verb> + ss -x src "/tmp/.X11-unix/*" +</verb></tscreen> + +</itemize> + + +<sect> Returning to ground: real manual + +<p> +<sect1> Command arguments + +<p> General format of arguments to <tt/ss/ is: + +<tscreen><verb> + ss [ OPTIONS ] [ STATE-FILTER ] [ ADDRESS-FILTER ] +</verb></tscreen> + +<sect2><tt/OPTIONS/ +<p> <tt/OPTIONS/ is list of single letter options, using common unix +conventions. + +<itemize> +<item><tt/-h/ - show help page +<item><tt/-?/ - the same, of course +<item><tt/-v/, <tt/-V/ - print version of <tt/ss/ and exit +<item><tt/-s/ - print summary statistics. This option does not parse +socket lists obtaining summary from various sources. It is useful +when amount of sockets is so huge that parsing <tt>/proc/net/tcp</tt> +is painful. +<item><tt/-D FILE/ - do not display anything, just dump raw information +about TCP sockets to <tt/FILE/ after applying filters. If <tt/FILE/ is <tt/-/ +<tt/stdout/ is used. +<item><tt/-F FILE/ - read continuation of filter from <tt/FILE/. +Each line of <tt/FILE/ is interpreted like single command line option. +If <tt/FILE/ is <tt/-/ <tt/stdin/ is used. +<item><tt/-r/ - try to resolve numeric address/ports +<item><tt/-n/ - do not try to resolve ports +<item><tt/-o/ - show some optional information, f.e. TCP timers +<item><tt/-i/ - show some infomration specific to TCP (RTO, congestion +window, slow start threshould etc.) +<item><tt/-e/ - show even more optional information +<item><tt/-m/ - show extended information on memory used by the socket. +It is available only with <tt/tcp_diag/ enabled. +<item><tt/-p/ - show list of processes owning the socket +<item><tt/-f FAMILY/ - default address family used for parsing addresses. + Also this option limits listing to sockets supporting + given address family. Currently the following families + are supported: <tt/unix/, <tt/inet/, <tt/inet6/, <tt/link/, + <tt/netlink/. +<item><tt/-4/ - alias for <tt/-f inet/ +<item><tt/-6/ - alias for <tt/-f inet6/ +<item><tt/-0/ - alias for <tt/-f link/ +<item><tt/-A LIST-OF-TABLES/ - list of socket tables to dump, separated + by commas. The following identifiers are understood: + <tt/all/, <tt/inet/, <tt/tcp/, <tt/udp/, <tt/raw/, + <tt/unix/, <tt/packet/, <tt/netlink/, <tt/unix_dgram/, + <tt/unix_stream/, <tt/packet_raw/, <tt/packet_dgram/. +<item><tt/-x/ - alias for <tt/-A unix/ +<item><tt/-t/ - alias for <tt/-A tcp/ +<item><tt/-u/ - alias for <tt/-A udp/ +<item><tt/-w/ - alias for <tt/-A raw/ +<item><tt/-a/ - show sockets of all the states. By default sockets + in states <tt/LISTEN/, <tt/TIME-WAIT/, <tt/SYN_RECV/ + and <tt/CLOSE/ are skipped. +<item><tt/-l/ - show only sockets in state <tt/LISTEN/ +</itemize> + +<sect2><tt/STATE-FILTER/ + +<p><tt/STATE-FILTER/ allows to construct arbitrary set of +states to match. Its syntax is sequence of keywords <tt/state/ +and <tt/exclude/ followed by identifier of state. +Available identifiers are: + +<p> +<itemize> +<item> All standard TCP states: <tt/established/, <tt/syn-sent/, +<tt/syn-recv/, <tt/fin-wait-1/, <tt/fin-wait-2/, <tt/time-wait/, +<tt/closed/, <tt/close-wait/, <tt/last-ack/, <tt/listen/ and <tt/closing/. + +<item><tt/all/ - for all the states +<item><tt/connected/ - all the states except for <tt/listen/ and <tt/closed/ +<item><tt/synchronized/ - all the <tt/connected/ states except for +<tt/syn-sent/ +<item><tt/bucket/ - states, which are maintained as minisockets, i.e. +<tt/time-wait/ and <tt/syn-recv/. +<item><tt/big/ - opposite to <tt/bucket/ +</itemize> + +<sect2><tt/ADDRESS_FILTER/ + +<p><tt/ADDRESS_FILTER/ is boolean expression with operations <tt/and/, <tt/or/ +and <tt/not/, which can be abbreviated in C style f.e. as <tt/&/, +<tt/&&/. + +<p> +Predicates check socket addresses, both local and remote. +There are the following kinds of predicates: + +<itemize> +<item> <tt/dst ADDRESS_PATTERN/ - matches remote address and port +<item> <tt/src ADDRESS_PATTERN/ - matches local address and port +<item> <tt/dport RELOP PORT/ - compares remote port to a number +<item> <tt/sport RELOP PORT/ - compares local port to a number +<item> <tt/autobound/ - checks that socket is bound to an ephemeral + port +</itemize> + +<p><tt/RELOP/ is some of <tt/<=/, <tt/>=/, <tt/==/ etc. +To make this more convinient for use in unix shell, alphabetic +FORTRAN-like notations <tt/le/, <tt/gt/ etc. are accepted as well. + +<p>The format and semantics of <tt/ADDRESS_PATTERN/ depends on address +family. + +<itemize> +<item><tt/inet/ - <tt/ADDRESS_PATTERN/ consists of IP prefix, optionally +followed by colon and port. If prefix or port part is absent or replaced +with <tt/*/, this means wildcard match. +<item><tt/inet6/ - The same as <tt/inet/, only prefix refers to an IPv6 +address. Unlike <tt/inet/ colon becomes ambiguous, so that <tt/ss/ allows +to use scheme, like used in URLs, where address is suppounded with +<tt/[/ ... <tt/]/. +<item><tt/unix/ - <tt/ADDRESS_PATTERN/ is shell-style wildcard. +<item><tt/packet/ - format looks like <tt/inet/, only interface index +stays instead of port and link layer protocol id instead of address. +<item><tt/netlink/ - format looks like <tt/inet/, only socket pid +stays instead of port and netlink channel instead of address. +</itemize> + +<p><tt/PORT/ is syntactically <tt/ADDRESS_PATTERN/ with wildcard +address part. Certainly, it is undefined for UNIX sockets. + +<sect1> Environment variables + +<p> +<tt/ss/ allows to change source of information using various +environment variables: + +<p> +<itemize> +<item> <tt/PROC_SLABINFO/ to override <tt>/proc/slabinfo</tt> +<item> <tt/PROC_NET_TCP/ to override <tt>/proc/net/tcp</tt> +<item> <tt/PROC_NET_UDP/ to override <tt>/proc/net/udp</tt> +<item> etc. +</itemize> + +<p> +Variable <tt/PROC_ROOT/ allows to change root of all the <tt>/proc/</tt> +hierarchy. + +<p> +Variable <tt/TCPDIAG_FILE/ prescribes to open a file instead of +requesting kernel to dump information about TCP sockets. + + +<p> This option is used mainly to investigate bug reports, +when dumps of files usually found in <tt>/proc/</tt> are recevied +by e-mail. + +<sect1> Output format + +<p>Six columns. The first is <tt/Netid/, it denotes socket type and +transport protocol, when it is ambiguous: <tt/tcp/, <tt/udp/, <tt/raw/, +<tt/u_str/ is abbreviation for <tt/unix_stream/, <tt/u_dgr/ for UNIX +datagram sockets, <tt/nl/ for netlink, <tt/p_raw/ and <tt/p_dgr/ for +raw and datagram packet sockets. This column is optional, it will +be hidden, if filter selects an unique netid. + +<p> +The second column is <tt/State/. Socket state is displayed here. +The names are standard TCP names, except for <tt/UNCONN/, which +cannot happen for TCP, but normal for not connected sockets +of another types. Again, this column can be hidden. + +<p> +Then two columns (<tt/Recv-Q/ and <tt/Send-Q/) showing amount of data +queued for receive and transmit. + +<p> +And the last two columns display local address and port of the socket +and its peer address, if the socket is connected. + +<p> +If options <tt/-o/, <tt/-e/ or <tt/-p/ were given, options are +displayed not in fixed positions but separated by spaces pairs: +<tt/option:value/. If value is not a single number, it is presented +as list of values, enclosed to <tt/(/ ... <tt/)/ and separated with +commas. F.e. + +<tscreen><verb> + timer:(keepalive,111min,0) +</verb></tscreen> +is typical format for TCP timer (option <tt/-o/). + +<tscreen><verb> + users:((X,113,3)) +</verb></tscreen> +is typical for list of users (option <tt/-p/). + + +<sect>Some numbers + +<p> +Well, let us use <tt/pidentd/ and a tool <tt/ibench/ to measure +its performance. It is 30 requests per second here. Nothing to test, +it is too slow. OK, let us patch pidentd with patch from directory +Patches. After this it handles about 4300 requests per second +and becomes handy tool to pollute socket tables with lots of timewait +buckets. + +<p> +So, each test starts from pollution tables with 30000 sockets +and then doing full dump of the table piped to wc and measuring +timings with time: + +<p>Results: + +<itemize> +<item> <tt/netstat -at/ - 15.6 seconds +<item> <tt/ss -atr/, but without <tt/tcp_diag/ - 5.4 seconds +<item> <tt/ss -atr/ with <tt/tcp_diag/ - 0.47 seconds +</itemize> + +No comments. Though one comment is necessary, most of time +without <tt/tcp_diag/ is wasted inside kernel with completely +blocked networking. More than 10 seconds, yes. <tt/tcp_diag/ +does the same work for 100 milliseconds of system time. + +</article>