Diffstat (limited to 'doc/netperf.texi')
-rw-r--r-- | doc/netperf.texi | 4150
1 files changed, 4150 insertions, 0 deletions
diff --git a/doc/netperf.texi b/doc/netperf.texi
new file mode 100644
index 0000000..6b32e60
--- /dev/null
+++ b/doc/netperf.texi
@@ -0,0 +1,4150 @@
+\input texinfo @c -*-texinfo-*-
+@c %**start of header
+@setfilename netperf.info
+@settitle Care and Feeding of Netperf 2.6.X
+@c %**end of header
+
+@copying
+This is Rick Jones' feeble attempt at a Texinfo-based manual for the
+netperf benchmark.
+
+Copyright @copyright{} 2005-2012 Hewlett-Packard Company
+@quotation
+Permission is granted to copy, distribute and/or modify this document
+per the terms of the netperf source license, a copy of which can be
+found in the file @file{COPYING} of the basic netperf distribution.
+@end quotation
+@end copying
+
+@titlepage
+@title Care and Feeding of Netperf
+@subtitle Versions 2.6.0 and Later
+@author Rick Jones @email{rick.jones2@@hp.com}
+@c this is here to start the copyright page
+@page
+@vskip 0pt plus 1filll
+@insertcopying
+@end titlepage
+
+@c begin with a table of contents
+@contents
+
+@ifnottex
+@node Top, Introduction, (dir), (dir)
+@top Netperf Manual
+
+@insertcopying
+@end ifnottex
+
+@menu
+* Introduction:: An introduction to netperf - what it is and what it is not.
+* Installing Netperf:: How to go about installing netperf.
+* The Design of Netperf::
+* Global Command-line Options::
+* Using Netperf to Measure Bulk Data Transfer::
+* Using Netperf to Measure Request/Response ::
+* Using Netperf to Measure Aggregate Performance::
+* Using Netperf to Measure Bidirectional Transfer::
+* The Omni Tests::
+* Other Netperf Tests::
+* Address Resolution::
+* Enhancing Netperf::
+* Netperf4::
+* Concept Index::
+* Option Index::
+@end menu
+
+@node Introduction, Installing Netperf, Top, Top
+@chapter Introduction
+
+@cindex Introduction
+
+Netperf is a benchmark that can be used to measure various aspects of
+networking performance. The primary foci are bulk (aka
+unidirectional) data transfer and request/response performance using
+either TCP or UDP and the Berkeley Sockets interface. As of this
+writing, the tests available either unconditionally or conditionally
+include:
+
+@itemize @bullet
+@item
+TCP and UDP unidirectional transfer and request/response over IPv4 and
+IPv6 using the Sockets interface.
+@item
+TCP and UDP unidirectional transfer and request/response over IPv4
+using the XTI interface.
+@item
+Link-level unidirectional transfer and request/response using the DLPI
+interface.
+@item
+Unix domain sockets
+@item
+SCTP unidirectional transfer and request/response over IPv4 and IPv6
+using the sockets interface.
+@end itemize
+
+While not every revision of netperf will work on every platform
+listed, the intention is that at least some version of netperf will
+work on the following platforms:
+
+@itemize @bullet
+@item
+Unix - at least all the major variants.
+@item
+Linux
+@item
+Windows
+@item
+Others
+@end itemize
+
+Netperf is maintained and informally supported primarily by Rick
+Jones, who can perhaps be best described as Netperf Contributing
+Editor. Non-trivial and very appreciated assistance comes from others
+in the network performance community, who are too numerous to mention
+here. While it is often used by them, netperf is NOT supported via any
+of the formal Hewlett-Packard support channels. You should feel free
+to make enhancements and modifications to netperf to suit your
+nefarious porpoises, so long as you stay within the guidelines of the
+netperf copyright.
+If you feel so inclined, you can send your changes to
+@email{netperf-feedback@@netperf.org,netperf-feedback} for possible
+inclusion into subsequent versions of netperf.
+
+It is the Contributing Editor's belief that the netperf license walks
+like open source and talks like open source. However, the license was
+never submitted for ``certification'' as an open source license. If
+you would prefer to make contributions to a networking benchmark using
+a certified open source license, please consider netperf4, which is
+distributed under the terms of the GPLv2.
+
+The @email{netperf-talk@@netperf.org,netperf-talk} mailing list is
+available to discuss the care and feeding of netperf with others who
+share your interest in network performance benchmarking. The
+netperf-talk mailing list is a closed list (to deal with spam) and you
+must first subscribe by sending email to
+@email{netperf-talk-request@@netperf.org,netperf-talk-request}.
+
+
+@menu
+* Conventions::
+@end menu
+
+@node Conventions, , Introduction, Introduction
+@section Conventions
+
+A @dfn{sizespec} is a one or two item, comma-separated list used as an
+argument to a command-line option that can set one or two related
+netperf parameters. If you wish to set both parameters to separate
+values, items should be separated by a comma:
+
+@example
+parameter1,parameter2
+@end example
+
+If you wish to set the first parameter without altering the value of
+the second from its default, you should follow the first item with a
+comma:
+
+@example
+parameter1,
+@end example
+
+
+Likewise, precede the item with a comma if you wish to set only the
+second parameter:
+
+@example
+,parameter2
+@end example
+
+An item with no commas:
+
+@example
+parameter1and2
+@end example
+
+will set both parameters to the same value. This last mode is one of
+the most frequently used.
+
+There is another variant of the comma-separated, two-item list called
+an @dfn{optionspec} which is like a sizespec with the exception that a
+single item with no comma:
+
+@example
+parameter1
+@end example
+
+will only set the value of the first parameter and will leave the
+second parameter at its default value.
+
+Netperf has two types of command-line options. The first are global
+command-line options. They are essentially any option not tied to a
+particular test or group of tests. An example of a global
+command-line option is the one which sets the test type - @option{-t}.
+
+The second type of options are test-specific options. These are
+options which are only applicable to a particular test or set of
+tests. An example of a test-specific option would be the send socket
+buffer size for a TCP_STREAM test.
+
+Global command-line options are specified first with test-specific
+options following after a @code{--} as in:
+
+@example
+netperf <global> -- <test-specific>
+@end example
+
+
+@node Installing Netperf, The Design of Netperf, Introduction, Top
+@chapter Installing Netperf
+
+@cindex Installation
+
+Netperf's primary form of distribution is source code. This allows
+installation on systems other than those to which the authors have
+ready access and thus the ability to create binaries. There are two
+styles of netperf installation. The first runs the netperf server
+program - netserver - as a child of inetd. This requires the
+installer to have sufficient privileges to edit the files
+@file{/etc/services} and @file{/etc/inetd.conf} or their
+platform-specific equivalents.
+
+The second style is to run netserver as a standalone daemon.
+This second method does not require edit privileges on
+@file{/etc/services} and @file{/etc/inetd.conf} but does mean you
+must remember to run the netserver program explicitly after every
+system reboot.
+
+This manual assumes that those wishing to measure networking
+performance already know how to use anonymous FTP and/or a web
+browser. It is also expected that you have at least a passing
+familiarity with the networking protocols and interfaces involved. In
+all honesty, if you do not have such familiarity, likely as not you
+have some experience to gain before attempting network performance
+measurements. The excellent texts by authors such as Stevens, Fenner
+and Rudoff and/or Stallings would be good starting points. There are
+likely other excellent sources out there as well.
+
+@menu
+* Getting Netperf Bits::
+* Installing Netperf Bits::
+* Verifying Installation::
+@end menu
+
+@node Getting Netperf Bits, Installing Netperf Bits, Installing Netperf, Installing Netperf
+@section Getting Netperf Bits
+
+Gzipped tar files of netperf sources can be retrieved via
+@uref{ftp://ftp.netperf.org/netperf,anonymous FTP}
+for ``released'' versions of the bits. Pre-release versions of the
+bits can be retrieved via anonymous FTP from the
+@uref{ftp://ftp.netperf.org/netperf/experimental,experimental} subdirectory.
+
+For convenience and ease of remembering, a link to the download site
+is provided via the
+@uref{http://www.netperf.org/, NetperfPage}.
+
+The bits corresponding to each discrete release of netperf are
+@uref{http://www.netperf.org/svn/netperf2/tags,tagged} for retrieval
+via subversion. For example, there is a tag for the first version
+corresponding to this version of the manual -
+@uref{http://www.netperf.org/svn/netperf2/tags/netperf-2.6.0,netperf
+2.6.0}. Those wishing to be on the bleeding edge of netperf
+development can use subversion to grab the
+@uref{http://www.netperf.org/svn/netperf2/trunk,top of trunk}. When
+fixing bugs or making enhancements, patches against the top-of-trunk
+are preferred.
+
+There are likely other places around the Internet from which one can
+download netperf bits. These may be simple mirrors of the main
+netperf site, or they may be local variants on netperf. As with
+anything one downloads from the Internet, take care to make sure it is
+what you really wanted and isn't some malicious Trojan or whatnot.
+Caveat downloader.
+
+As a general rule, binaries of netperf and netserver are not
+distributed from ftp.netperf.org. From time to time a kind soul or
+souls has packaged netperf as a Debian package available via the
+apt-get mechanism or as an RPM. I would be most interested in
+learning how to enhance the makefiles to make that easier for people.
+
+@node Installing Netperf Bits, Verifying Installation, Getting Netperf Bits, Installing Netperf
+@section Installing Netperf
+
+Once you have downloaded the tar file of netperf sources onto your
+system(s), it is necessary to unpack the tar file, cd to the netperf
+directory, run configure and then make. Most of the time it should be
+sufficient to just:
+
+@example
+gzcat netperf-<version>.tar.gz | tar xf -
+cd netperf-<version>
+./configure
+make
+make install
+@end example
+
+Most of the ``usual'' configure script options should be present
+dealing with where to install binaries and whatnot.
+@example
+./configure --help
+@end example
+should list all of those and more.
+You may find the @code{--prefix} option helpful in deciding where the
+binaries and such will be put during the @code{make install}.
+
+@vindex --enable-cpuutil, Configure
+If the netperf configure script does not know how to automagically
+detect which CPU utilization mechanism to use on your platform, you
+may want to add a @code{--enable-cpuutil=mumble} option to the
+configure command. If you have knowledge and/or experience to
+contribute to that area, feel free to contact
+@email{netperf-feedback@@netperf.org}.
+
+@vindex --enable-xti, Configure
+@vindex --enable-unixdomain, Configure
+@vindex --enable-dlpi, Configure
+@vindex --enable-sctp, Configure
+Similarly, if you want tests using the XTI interface, Unix Domain
+Sockets, DLPI or SCTP it will be necessary to add one or more
+@code{--enable-[xti|unixdomain|dlpi|sctp]=yes} options to the configure
+command. As of this writing, the configure script will not include
+those tests automagically.
+
+@vindex --enable-omni, Configure
+Starting with version 2.5.0, netperf began migrating most of the
+``classic'' netperf tests found in @file{src/nettest_bsd.c} to the
+so-called ``omni'' tests (aka ``two routines to run them all'') found
+in @file{src/nettest_omni.c}. This migration enables a number of new
+features such as greater control over what output is included, and new
+things to output. The ``omni'' test is enabled by default in 2.5.0
+and a number of the classic tests are migrated - you can tell if a
+test has been migrated from the presence of @code{MIGRATED} in the
+test banner. If you encounter problems with either the omni or
+migrated tests, please first attempt to obtain resolution via
+@email{netperf-talk@@netperf.org} or
+@email{netperf-feedback@@netperf.org}. If that is unsuccessful, you
+can add a @code{--enable-omni=no} to the configure command and the
+omni tests will not be compiled-in and the classic tests will not be
+migrated.
+
+Starting with version 2.5.0, netperf includes the ``burst mode''
+functionality in a default compilation of the bits. If you encounter
+problems with this, please first attempt to obtain help via
+@email{netperf-talk@@netperf.org} or
+@email{netperf-feedback@@netperf.org}. If that is unsuccessful, you
+can add a @code{--enable-burst=no} to the configure command and the
+burst mode functionality will not be compiled-in.
+
+On some platforms, it may be necessary to precede the configure
+command with a CFLAGS and/or LIBS variable as the netperf configure
+script is not yet smart enough to set them itself. Whenever possible,
+these requirements will be found in @file{README.@var{platform}}
+files. Expertise and assistance in making that more automagic in the
+configure script would be most welcome.
+
+@cindex Limiting Bandwidth
+@cindex Bandwidth Limitation
+@vindex --enable-intervals, Configure
+@vindex --enable-histogram, Configure
+Other optional configure-time settings include
+@code{--enable-intervals=yes} to give netperf the ability to ``pace''
+its _STREAM tests and @code{--enable-histogram=yes} to have netperf
+keep a histogram of interesting times. Each of these will have some
+effect on the measured result. If your system supports
+@code{gethrtime()} the effect of the histogram measurement should be
+minimized but probably still measurable.
+For example, the histogram of a netperf TCP_RR test will be of the
+individual transaction times:
+@example
+netperf -t TCP_RR -H lag -v 2
+TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET : histogram
+Local /Remote
+Socket Size Request Resp. Elapsed Trans.
+Send Recv Size Size Time Rate
+bytes Bytes bytes bytes secs. per sec
+
+16384 87380 1 1 10.00 3538.82
+32768 32768
+Alignment Offset
+Local Remote Local Remote
+Send Recv Send Recv
+ 8 0 0 0
+Histogram of request/response times
+UNIT_USEC : 0: 0: 0: 0: 0: 0: 0: 0: 0: 0
+TEN_USEC : 0: 0: 0: 0: 0: 0: 0: 0: 0: 0
+HUNDRED_USEC : 0: 34480: 111: 13: 12: 6: 9: 3: 4: 7
+UNIT_MSEC : 0: 60: 50: 51: 44: 44: 72: 119: 100: 101
+TEN_MSEC : 0: 105: 0: 0: 0: 0: 0: 0: 0: 0
+HUNDRED_MSEC : 0: 0: 0: 0: 0: 0: 0: 0: 0: 0
+UNIT_SEC : 0: 0: 0: 0: 0: 0: 0: 0: 0: 0
+TEN_SEC : 0: 0: 0: 0: 0: 0: 0: 0: 0: 0
+>100_SECS: 0
+HIST_TOTAL: 35391
+@end example
+
+The histogram you see above is basically a base-10 log histogram where
+we can see that most of the transaction times were on the order of one
+hundred to one hundred ninety-nine microseconds, but they were
+occasionally as long as ten to nineteen milliseconds.
+
+The @option{--enable-demo=yes} configure option will cause code to be
+included to report interim results during a test run. The rate at
+which interim results are reported can then be controlled via the
+global @option{-D} option. Here is an example of @option{-D} output:
+
+@example
+$ src/netperf -D 1.35 -H tardy.hpl.hp.com -f M
+MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com (15.9.116.144) port 0 AF_INET : demo
+Interim result: 5.41 MBytes/s over 1.35 seconds ending at 1308789765.848
+Interim result: 11.07 MBytes/s over 1.36 seconds ending at 1308789767.206
+Interim result: 16.00 MBytes/s over 1.36 seconds ending at 1308789768.566
+Interim result: 20.66 MBytes/s over 1.36 seconds ending at 1308789769.922
+Interim result: 22.74 MBytes/s over 1.36 seconds ending at 1308789771.285
+Interim result: 23.07 MBytes/s over 1.36 seconds ending at 1308789772.647
+Interim result: 23.77 MBytes/s over 1.37 seconds ending at 1308789774.016
+Recv Send Send
+Socket Socket Message Elapsed
+Size Size Size Time Throughput
+bytes bytes bytes secs. MBytes/sec
+
+ 87380 16384 16384 10.06 17.81
+@end example
+
+Notice how the units of the interim results track those requested by
+the @option{-f} option. Also notice that sometimes the interval will
+be longer than the value specified in the @option{-D} option. This is
+normal and stems from how demo mode is implemented not by relying on
+interval timers or frequent calls to get the current time, but by
+calculating how many units of work must be performed to take at least
+the desired interval.
+
+Those familiar with this option in earlier versions of netperf will
+note the addition of the ``ending at'' text. This is the time as
+reported by a @code{gettimeofday()} call (or its emulation) with a
+@code{NULL} timezone pointer. This addition is intended to make it
+easier to insert interim results into an
+@uref{http://oss.oetiker.ch/rrdtool/doc/rrdtool.en.html,rrdtool}
+Round-Robin Database (RRD). A likely bug-riddled example of doing so
+can be found in @file{doc/examples/netperf_interim_to_rrd.sh}. The
+time is reported out to milliseconds rather than microseconds because
+that is the most rrdtool understands as of the time of this writing.
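+
+As a compressed, likely oversimplified sketch of that idea (not the
+distributed script - the RRD name, its prior creation with a suitable
+step, and an rrdtool recent enough to accept fractional timestamps
+are all assumptions here), one could pipe the interim results into
+rrdtool's stdin-reading mode:
+
+@example
+netperf -D 1 -H remotehost | \
+  awk '/Interim result:/ @{ print "update netperf.rrd " $NF ":" $3; fflush() @}' | \
+  rrdtool -
+@end example
+
+Here @code{$NF} is the ``ending at'' timestamp and @code{$3} is the
+interim throughput; the @code{fflush()} is there so the updates are
+not held hostage by awk's pipe buffering.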
+
+As of this writing, a @code{make install} will not actually update the
+files @file{/etc/services} and/or @file{/etc/inetd.conf} or their
+platform-specific equivalents. It remains necessary to perform that
+bit of installation magic by hand. Patches to the makefile sources to
+effect an automagic editing of the necessary files to have netperf
+installed as a child of inetd would be most welcome.
+
+Starting the netserver as a standalone daemon should be as easy as:
+@example
+$ netserver
+Starting netserver at port 12865
+Starting netserver at hostname 0.0.0.0 port 12865 and family 0
+@end example
+
+Over time the specifics of the messages netserver prints to the screen
+may change but the gist will remain the same.
+
+If the compilation of netperf or netserver happens to fail, feel free
+to contact @email{netperf-feedback@@netperf.org} or join and ask in
+@email{netperf-talk@@netperf.org}. However, it is quite important
+that you include the actual compilation errors and perhaps even the
+configure log in your email. Otherwise, it will be that much more
+difficult for someone to assist you.
+
+@node Verifying Installation, , Installing Netperf Bits, Installing Netperf
+@section Verifying Installation
+
+Basically, once netperf is installed and netserver is configured as a
+child of inetd, or launched as a standalone daemon, simply typing:
+@example
+netperf
+@end example
+should result in output similar to the following:
+@example
+$ netperf
+TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET
+Recv Send Send
+Socket Socket Message Elapsed
+Size Size Size Time Throughput
+bytes bytes bytes secs. 10^6bits/sec
+
+ 87380 16384 16384 10.00 2997.84
+@end example
+
+
+@node The Design of Netperf, Global Command-line Options, Installing Netperf, Top
+@chapter The Design of Netperf
+
+@cindex Design of Netperf
+
+Netperf is designed around a basic client-server model. There are
+two executables - netperf and netserver. Generally you will only
+execute the netperf program, with the netserver program being invoked
+by the remote system's inetd or having been previously started as its
+own standalone daemon.
+
+When you execute netperf it will establish a ``control connection'' to
+the remote system. This connection will be used to pass test
+configuration information and results to and from the remote system.
+Regardless of the type of test to be run, the control connection will
+be a TCP connection using BSD sockets. The control connection can use
+either IPv4 or IPv6.
+
+Once the control connection is up and the configuration information
+has been passed, a separate ``data'' connection will be opened for the
+measurement itself using the APIs and protocols appropriate for the
+specified test. When the test is completed, the data connection will
+be torn-down and results from the netserver will be passed-back via the
+control connection and combined with netperf's result for display to
+the user.
+
+Netperf places no traffic on the control connection while a test is in
+progress. Certain TCP options, such as SO_KEEPALIVE, if set as your
+systems' default, may put packets out on the control connection while
+a test is in progress. Generally speaking this will have no effect on
+the results.
+
+@menu
+* CPU Utilization::
+@end menu
+
+@node CPU Utilization, , The Design of Netperf, The Design of Netperf
+@section CPU Utilization
+@cindex CPU Utilization
+
+CPU utilization is an important, and alas all-too-infrequently
+reported, component of networking performance. Unfortunately, it can
+be one of the most difficult metrics to measure accurately and
+portably. Netperf will do its level best to report accurate
+CPU utilization figures, but some combinations of processor, OS and
+configuration may make that difficult.
+
+CPU utilization in netperf is reported as a value between 0 and 100%
+regardless of the number of CPUs involved. In addition to CPU
+utilization, netperf will report a metric called a @dfn{service
+demand}. The service demand is the normalization of CPU utilization
+and work performed. For a _STREAM test it is the microseconds of CPU
+time consumed to transfer one KB (K == 1024) of data. For a _RR test
+it is the microseconds of CPU time consumed processing a single
+transaction. For both CPU utilization and service demand, lower is
+better.
+
+Service demand can be particularly useful when trying to gauge the
+effect of a performance change. It is essentially a measure of
+efficiency, with smaller values being more efficient and thus
+``better.''
+
+Netperf is coded to be able to use one of several, generally
+platform-specific CPU utilization measurement mechanisms. Single
+letter codes will be included in the CPU portion of the test banner to
+indicate which mechanism was used on each of the local (netperf) and
+remote (netserver) systems.
+
+As of this writing those codes are:
+
+@table @code
+@item U
+The CPU utilization measurement mechanism was unknown to netperf or
+netperf/netserver was not compiled to include CPU utilization
+measurements. The code for the null CPU utilization mechanism can be
+found in @file{src/netcpu_none.c}.
+@item I
+An HP-UX-specific CPU utilization mechanism whereby the kernel
+incremented a per-CPU counter by one for each trip through the idle
+loop. This mechanism was only available on specially-compiled HP-UX
+kernels prior to HP-UX 10 and is mentioned here only for the sake of
+historical completeness and perhaps as a suggestion to those who might
+be altering other operating systems. While rather simple, perhaps even
+simplistic, this mechanism was quite robust and was not affected by
+the concerns of statistical methods, or methods attempting to track
+time in each of user, kernel, interrupt and idle modes which require
+quite careful accounting. It can be thought-of as the in-kernel
+version of the looper @code{L} mechanism without the context switch
+overhead. This mechanism required calibration.
+@item P
+An HP-UX-specific CPU utilization mechanism whereby the kernel
+keeps-track of time (in the form of CPU cycles) spent in the kernel
+idle loop (HP-UX 10.0 to 11.31 inclusive), or where the kernel keeps
+track of time spent in idle, user, kernel and interrupt processing
+(HP-UX 11.23 and later). The former requires calibration, the latter
+does not. Values in either case are retrieved via one of the pstat(2)
+family of calls, hence the use of the letter @code{P}. The code for
+these mechanisms is found in @file{src/netcpu_pstat.c} and
+@file{src/netcpu_pstatnew.c} respectively.
+@item K
+A Solaris-specific CPU utilization mechanism whereby the kernel keeps
+track of ticks (eg HZ) spent in the idle loop.
+This method is statistical and is known to be inaccurate when the
+interrupt rate is above epsilon as time spent processing interrupts is
+not subtracted from idle. The value is retrieved via a kstat() call -
+hence the use of the letter @code{K}. Since this mechanism uses units
+of ticks (HZ) the calibration value should invariably match HZ (eg
+100). The code for this mechanism is implemented in
+@file{src/netcpu_kstat.c}.
+@item M
+A Solaris-specific mechanism available on Solaris 10 and later which
+uses the new microstate accounting mechanisms. There are, alas, two
+overlapping mechanisms. The first tracks nanoseconds spent in user,
+kernel, and idle modes. The second mechanism tracks nanoseconds spent
+in interrupt. Since the mechanisms overlap, netperf goes through some
+hand-waving to try to ``fix'' the problem. Since the accuracy of the
+handwaving cannot be completely determined, one must presume that
+while better than the @code{K} mechanism, this mechanism too is not
+without issues. The values are retrieved via kstat() calls, but the
+letter code is set to @code{M} to distinguish this mechanism from the
+even less accurate @code{K} mechanism. The code for this mechanism is
+implemented in @file{src/netcpu_kstat10.c}.
+@item L
+A mechanism based on ``looper'' or ``soaker'' processes which sit in
+tight loops counting as fast as they possibly can. This mechanism
+starts a looper process for each known CPU on the system. The effect
+of processor hyperthreading on the mechanism is not yet known. This
+mechanism definitely requires calibration. The code for the
+``looper'' mechanism can be found in @file{src/netcpu_looper.c}.
+@item N
+A Microsoft Windows-specific mechanism, the code for which can be
+found in @file{src/netcpu_ntperf.c}. This mechanism too is based on
+what appears to be a form of micro-state accounting and requires no
+calibration. On laptops, or other systems which may dynamically alter
+the CPU frequency to minimize power consumption, it has been suggested
+that this mechanism may become slightly confused, in which case using
+BIOS/uEFI settings to disable the power saving would be indicated.
+
+@item S
+This mechanism uses @file{/proc/stat} on Linux to retrieve time
+(ticks) spent in idle mode. It is thought but not known to be
+reasonably accurate. The code for this mechanism can be found in
+@file{src/netcpu_procstat.c}.
+@item C
+A mechanism somewhat similar to @code{S} but using the sysctl() call
+on BSD-like Operating systems (*BSD and MacOS X). The code for this
+mechanism can be found in @file{src/netcpu_sysctl.c}.
+@item Others
+Other mechanisms included in netperf in the past have included using
+the times() and getrusage() calls. These calls are actually rather
+poorly suited to the task of measuring CPU overhead for networking as
+they tend to be process-specific and much network-related processing
+can happen outside the context of a process, in places where it is not
+a given it will be charged to the correct process, or to any process
+at all. They are mentioned here as a warning to anyone seeing those
+mechanisms used in other networking benchmarks. These mechanisms are
+not available in netperf 2.4.0 and later.
+@end table
+
+For many platforms, the configure script will choose the best
+available CPU utilization mechanism. However, some platforms have no
+particularly good mechanisms.
+On those platforms, it is probably best to use the ``LOOPER''
+mechanism which is basically some number of processes (as many as
+there are processors) sitting in tight little loops counting as fast
+as they can. The rate at which the loopers count when the system is
+believed to be idle is compared with the rate when the system is
+running netperf and the ratio is used to compute CPU utilization.
+
+In the past, netperf included some mechanisms that only reported CPU
+time charged to the calling process. Those mechanisms have been
+removed from netperf versions 2.4.0 and later because they are
+hopelessly inaccurate. Networking can and often does result in CPU
+time being spent in places - such as interrupt contexts - that do not
+get charged to the correct process, or to any process at all.
+
+In fact, time spent in the processing of interrupts is a common issue
+for many CPU utilization mechanisms. In particular, the ``PSTAT''
+mechanism was eventually known to have problems accounting for certain
+interrupt time prior to HP-UX 11.11 (11iv1). HP-UX 11iv2 and later
+are known/presumed to be good. The ``KSTAT'' mechanism is known to
+have problems on all versions of Solaris up to and including Solaris
+10. Even the microstate accounting available via kstat in Solaris 10
+has issues, though perhaps not as bad as those of prior versions.
+
+The /proc/stat mechanism under Linux is in what the author would
+consider an ``uncertain'' category as it appears to be statistical,
+which may also have issues with time spent processing interrupts.
+
+In summary, be sure to ``sanity-check'' the CPU utilization figures
+with other mechanisms. However, platform tools such as top, vmstat or
+mpstat are often based on the same mechanisms used by netperf.
+
+@menu
+* CPU Utilization in a Virtual Guest::
+@end menu
+
+@node CPU Utilization in a Virtual Guest, , CPU Utilization, CPU Utilization
+@subsection CPU Utilization in a Virtual Guest
+
+The CPU utilization mechanisms used by netperf are ``inline'' in that
+they are run by the same netperf or netserver process as is running
+the test itself. This works just fine for ``bare iron'' tests but
+runs into a problem when using virtual machines.
+
+The relationship between virtual guest and hypervisor can be thought
+of as being similar to that between a process and kernel in a bare
+iron system. As such, (m)any CPU utilization mechanisms used in the
+virtual guest are similar to ``process-local'' mechanisms in a bare
+iron situation. However, just as with bare iron and process-local
+mechanisms, much networking processing happens outside the context of
+the virtual guest. It takes place in the hypervisor, and is not
+visible to mechanisms running in the guest(s). For this reason, one
+should not really trust CPU utilization figures reported by netperf or
+netserver when running in a virtual guest.
+
+If one is looking to measure the added overhead of a virtualization
+mechanism, rather than rely on CPU utilization, one can rely instead
+on netperf _RR tests - path-lengths and overheads can be a significant
+fraction of the latency, so increases in overhead should appear as
+decreases in transaction rate. Whatever you do, @b{DO NOT} rely on
+the throughput of a _STREAM test. Achieving link-rate can be done via
+a multitude of options that mask overhead rather than eliminate it.
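+
+As a concrete sketch of that approach (the host names here are
+hypothetical), one might run the same request/response test against a
+bare-iron path and a virtualized path and compare the transaction
+rates, attributing a drop in the latter to added overhead rather than
+trusting guest-reported CPU utilization:
+
+@example
+netperf -H bareiron.example.com -t TCP_RR -l 30
+netperf -H guest.example.com -t TCP_RR -l 30
+@end example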
+
+@node Global Command-line Options, Using Netperf to Measure Bulk Data Transfer, The Design of Netperf, Top
+@chapter Global Command-line Options
+
+This section describes each of the global command-line options
+available in the netperf and netserver binaries. Essentially, it is
+an expanded version of the usage information displayed by netperf or
+netserver when invoked with the @option{-h} global command-line
+option.
+
+@menu
+* Command-line Options Syntax::
+* Global Options::
+@end menu
+
+@node Command-line Options Syntax, Global Options, Global Command-line Options, Global Command-line Options
+@comment node-name, next, previous, up
+@section Command-line Options Syntax
+
+Revision 1.8 of netperf introduced enough new functionality to overrun
+the English alphabet for mnemonic command-line option names, and the
+author was not and is not quite ready to switch to the contemporary
+@option{--mumble} style of command-line options. (Call him a Luddite
+if you wish :).
+
+For this reason, the command-line options were split into two parts -
+the first are the global command-line options. They are options that
+affect nearly any and every test type of netperf. The second type are
+the test-specific command-line options. Both are entered on the same
+command line, but they must be separated from one another by a @code{--}
+for correct parsing. Global command-line options come first, followed
+by the @code{--} and then test-specific command-line options. If there
+are no test-specific options to be set, the @code{--} may be omitted. If
+there are no global command-line options to be set, test-specific
+options must still be preceded by a @code{--}. For example:
+@example
+netperf <global> -- <test-specific>
+@end example
+sets both global and test-specific options:
+@example
+netperf <global>
+@end example
+sets just global options and:
+@example
+netperf -- <test-specific>
+@end example
+sets just test-specific options.
+
+@node Global Options, , Command-line Options Syntax, Global Command-line Options
+@comment node-name, next, previous, up
+@section Global Options
+
+@table @code
+@vindex -a, Global
+@item -a <sizespec>
+This option allows you to alter the alignment of the buffers used in
+the sending and receiving calls on the local system. Changing the
+alignment of the buffers can force the system to use different copy
+schemes, which can have a measurable effect on performance. If the
+page size for the system were 4096 bytes, and you want to pass
+page-aligned buffers beginning on page boundaries, you could use
+@samp{-a 4096}. By default the units are bytes, but a suffix of
+``G,'' ``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20
+(MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m'' or ``k''
+will specify units of 10^9, 10^6 or 10^3 bytes respectively.
+[Default: 8 bytes]
+
+@vindex -A, Global
+@item -A <sizespec>
+This option is identical to the @option{-a} option with the difference
+being it affects alignments for the remote system.
+
+@vindex -b, Global
+@item -b <size>
+This option is only present when netperf has been configured with
+--enable-intervals=yes prior to compilation. It sets the size of the
+burst of send calls in a _STREAM test. When used in conjunction with
+the @option{-w} option it can cause the rate at which data is sent to
+be ``paced.''
+
+@vindex -B, Global
+@item -B <string>
+This option will cause @option{<string>} to be appended to the brief
+(see -P) output of netperf.
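+For example (a hypothetical invocation), the following would display
+just the test's single figure of merit with ``first flow'' appended,
+which can be handy for telling apart the output of several concurrent
+netperf instances:
+@example
+netperf -H lag -P 0 -v 0 -B "first flow"
+@end example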
+
+@vindex -c, Global
+@item -c [rate]
+This option will ask that CPU utilization and service demand be
+calculated for the local system. For those CPU utilization mechanisms
+requiring calibration, the optional rate parameter may be specified to
+preclude running another calibration step, saving 40 seconds of time.
+For those CPU utilization mechanisms requiring no calibration, the
+optional rate parameter will be utterly and completely ignored.
+[Default: no CPU measurements]
+
+@vindex -C, Global
+@item -C [rate]
+This option requests CPU utilization and service demand calculations
+for the remote system. It is otherwise identical to the @option{-c}
+option.
+
+@vindex -d, Global
+@item -d
+Each instance of this option will increase the quantity of debugging
+output displayed during a test. If the debugging output level is set
+high enough, it may have a measurable effect on performance.
+Debugging information for the local system is printed to stdout.
+Debugging information for the remote system is sent by default to the
+file @file{/tmp/netperf.debug}. [Default: no debugging output]
+
+@vindex -D, Global
+@item -D [interval,units]
+This option is only available when netperf is configured with
+--enable-demo=yes. When set, it will cause netperf to emit periodic
+reports of performance during the run. [@var{interval},@var{units}]
+follow the semantics of an optionspec. If specified,
+@var{interval} gives the minimum interval in real seconds; it does not
+have to be whole seconds. The @var{units} value can be used for the
+first guess as to how many units of work (bytes or transactions) must
+be done to take at least @var{interval} seconds. If omitted,
+@var{interval} defaults to one second and @var{units} to values
+specific to each test type.
+
+@vindex -f, Global
+@item -f G|M|K|g|m|k|x
+This option can be used to change the reporting units for _STREAM
+tests. Arguments of ``G,'' ``M,'' or ``K'' will set the units to
+2^30, 2^20 or 2^10 bytes/s respectively (eg power-of-two GB, MB or
+KB). Arguments of ``g,'' ``m'' or ``k'' will set the units to 10^9,
+10^6 or 10^3 bits/s respectively. An argument of ``x'' requests the
+units be transactions per second and is only meaningful for a
+request-response test. [Default: ``m'' or 10^6 bits/s]
+
+@vindex -F, Global
+@item -F <fillfile>
+This option specifies the file from which send buffers will be
+pre-filled. While the buffers will contain data from the specified
+file, the file is not fully transferred to the remote system as the
+receiving end of the test will not write the contents of what it
+receives to a file. This can be used to pre-fill the send buffers
+with data having different compressibility and so is useful when
+measuring performance over mechanisms which perform compression.
+
+While previously required for a TCP_SENDFILE test, later versions of
+netperf removed that restriction, creating a temporary file as
+needed. While the author cannot recall exactly when that took place,
+it is known to be unnecessary in version 2.5.0 and later.
+
+@vindex -h, Global
+@item -h
+This option causes netperf to display its ``global'' usage string and
+exit to the exclusion of all else.
+
+@vindex -H, Global
+@item -H <optionspec>
+This option will set the name of the remote system and/or the address
+family used for the control connection. For example:
+@example
+-H linger,4
+@end example
+will set the name of the remote system to ``linger'' and tell netperf
+to use IPv4 addressing only.
+@example
+-H ,6
+@end example
+will leave the name of the remote system at its default, and request
+that only IPv6 addresses be used for the control connection.
+@example
+-H lag
+@end example
+will set the name of the remote system to ``lag'' and leave the
+address family to AF_UNSPEC which means selection of IPv4 vs IPv6 is
+left to the system's address resolution.
+
+A value of ``inet'' can be used in place of ``4'' to request IPv4 only
+addressing. Similarly, a value of ``inet6'' can be used in place of
+``6'' to request IPv6 only addressing. A value of ``0'' can be used
+to request either IPv4 or IPv6 addressing as name resolution dictates.
+
+By default, the options set with the global @option{-H} option are
+inherited by the test for its data connection, unless a test-specific
+@option{-H} option is specified.
+
+If a @option{-H} option follows either the @option{-4} or @option{-6}
+options, the family setting specified with the @option{-H} option will
+override the @option{-4} or @option{-6} options for the remote address
+family. If no address family is specified, settings from a previous
+@option{-4} or @option{-6} option will remain. In a nutshell, the
+last explicit global command-line option wins.
+
+[Default: ``localhost'' for the remote name/IP address and ``0'' (eg
+AF_UNSPEC) for the remote address family.]
+
+@vindex -I, Global
+@item -I <optionspec>
+This option enables the calculation of confidence intervals and sets
+the confidence and width parameters with the first half of the
+optionspec being either 99 or 95 for 99% or 95% confidence
+respectively. The second value of the optionspec specifies the width
+of the desired confidence interval. For example
+@example
+-I 99,5
+@end example
+asks netperf to be 99% confident that the measured mean values for
+throughput and CPU utilization are within +/- 2.5% of the ``real''
+mean values. If the @option{-i} option is specified and the
+@option{-I} option is omitted, the confidence defaults to 99% and the
+width to 5% (giving +/- 2.5%).
+
+If a classic netperf test calculates that the desired confidence
+intervals have not been met, it emits a noticeable warning that cannot
+be suppressed with the @option{-P} or @option{-v} options:
+
+@example
+netperf -H tardy.cup -i 3 -I 99,5
+TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.cup.hp.com (15.244.44.58) port 0 AF_INET : +/-2.5% @ 99% conf.
+!!! WARNING
+!!! Desired confidence was not achieved within the specified iterations.
+!!! This implies that there was variability in the test environment that
+!!! must be investigated before going further.
+!!! Confidence intervals: Throughput : 6.8%
+!!! Local CPU util : 0.0%
+!!! Remote CPU util : 0.0%
+
+Recv Send Send
+Socket Socket Message Elapsed
+Size Size Size Time Throughput
+bytes bytes bytes secs. 10^6bits/sec
+
+ 32768 16384 16384 10.01 40.23
+@end example
+
+In the example above we see that netperf did not meet the desired
+confidence intervals. Instead of being 99% confident it was within
++/- 2.5% of the real mean value of throughput, it is only confident it
+was within +/- 3.4%. In this example, increasing the @option{-i}
+option (described below) and/or increasing the iteration length with
+the @option{-l} option might resolve the situation.
+
+In an explicit ``omni'' test, failure to meet the confidence intervals
+will not result in netperf emitting a warning.
+To verify whether or not the confidence intervals were hit, one will
+need to include them as part of an @ref{Omni Output Selection,output
+selection} in the test-specific @option{-o}, @option{-O} or
+@option{-k} output selection options. The warning about not hitting
+the confidence intervals will remain in a ``migrated'' classic netperf
+test.
+
+@vindex -i, Global
+@item -i <sizespec>
+This option enables the calculation of confidence intervals and sets
+the minimum and maximum number of iterations to run in attempting to
+achieve the desired confidence interval. The first value sets the
+maximum number of iterations to run, the second, the minimum. The
+maximum number of iterations is silently capped at 30 and the minimum
+is silently floored at 3. Netperf repeats the measurement the minimum
+number of iterations and continues until it reaches either the
+desired confidence interval, or the maximum number of iterations,
+whichever comes first. A classic or migrated netperf test will not
+display the actual number of iterations run. An @ref{The Omni
+Tests,omni test} will emit the number of iterations run if the
+@code{CONFIDENCE_ITERATION} output selector is included in the
+@ref{Omni Output Selection,output selection}.
+
+If the @option{-I} option is specified and the @option{-i} option
+omitted, the maximum number of iterations is set to 10 and the minimum
+to three.
+
+Output of a warning upon not hitting the desired confidence intervals
+follows the description provided for the @option{-I} option.
+
+The total test time will be somewhere between the minimum and maximum
+number of iterations multiplied by the test length supplied by the
+@option{-l} option.
+
+@vindex -j, Global
+@item -j
+This option instructs netperf to keep additional timing statistics
+when explicitly running an @ref{The Omni Tests,omni test}. These can
+be output when the test-specific @option{-o}, @option{-O} or
+@option{-k} @ref{Omni Output Selectors,output selectors} include one
+or more of:
+
+@itemize
+@item MIN_LATENCY
+@item MAX_LATENCY
+@item P50_LATENCY
+@item P90_LATENCY
+@item P99_LATENCY
+@item MEAN_LATENCY
+@item STDDEV_LATENCY
+@end itemize
+
+These statistics will be based on an expanded (100 buckets per row
+rather than 10) histogram of times rather than a terribly long list of
+individual times. As such, there will be some slight error thanks to
+the bucketing. However, the reduction in storage and processing
+overheads is well worth it. When running a request/response test, one
+might get some idea of the error by comparing the @ref{Omni Output
+Selectors,@code{MEAN_LATENCY}} calculated from the histogram with the
+@code{RT_LATENCY} calculated from the number of request/response
+transactions and the test run time.
+
+In the case of a request/response test the latencies will be
+transaction latencies. In the case of a receive-only test they will
+be time spent in the receive call. In the case of a send-only test
+they will be time spent in the send call. The units will be
+microseconds. Added in netperf 2.5.0.
+
+@vindex -l, Global
+@item -l testlen
+This option controls the length of any @b{one} iteration of the requested
+test. A positive value for @var{testlen} will run each iteration of
+the test for at least @var{testlen} seconds. A negative value for
+@var{testlen} will run each iteration for the absolute value of
+@var{testlen} transactions for a _RR test or bytes for a _STREAM test.
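+For example (the host name here is hypothetical), the following would
+run each iteration for 10000 transactions rather than for a fixed
+number of seconds:
+@example
+netperf -H remotehost -t TCP_RR -l -10000
+@end example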
+Certain tests, notably those using UDP, can only be timed; they cannot
+be limited by transaction or byte count. This limitation may be
+relaxed in an @ref{The Omni Tests,omni} test.
+
+In some situations, individual iterations of a test may run for longer
+than the number of seconds specified by the @option{-l} option. In
+particular, this may occur for those tests where the socket buffer
+size(s) are significantly larger than the bandwidthXdelay product of
+the link(s) over which the data connection passes, or those tests
+where there may be non-trivial numbers of retransmissions.
+
+If confidence intervals are enabled via either @option{-I} or
+@option{-i} the total length of the netperf test will be somewhere
+between the minimum and maximum iteration count multiplied by
+@var{testlen}.
+
+@vindex -L, Global
+@item -L <optionspec>
+This option is identical to the @option{-H} option with the difference
+being it sets the _local_ hostname/IP and/or address family
+information. This option is generally unnecessary, but can be useful
+when you wish to make sure that the netperf control and data
+connections go via different paths. It can also come in handy if one
+is trying to run netperf through those evil, end-to-end breaking
+things known as firewalls.
+
+[Default: 0.0.0.0 (eg INADDR_ANY) for IPv4 and ::0 for IPv6 for the
+local name. AF_UNSPEC for the local address family.]
+
+@vindex -n, Global
+@item -n numcpus
+This option tells netperf how many CPUs it should ass-u-me are active
+on the system running netperf. In particular, this is used for the
+@ref{CPU Utilization,CPU utilization} and service demand calculations.
+On certain systems, netperf is able to determine the number of CPUs
+automagically. This option will override any number netperf might be
+able to determine on its own.
+
+Note that this option does _not_ set the number of CPUs on the system
+running netserver. When netperf/netserver cannot automagically
+determine the number of CPUs, that can only be set for netserver via a
+netserver @option{-n} command-line option.
+
+As it is almost universally possible for netperf/netserver to
+determine the number of CPUs on the system automagically, 99 times out
+of 10 this option should not be necessary and may be removed in a
+future release of netperf.
+
+@vindex -N, Global
+@item -N
+This option tells netperf to forgo establishing a control
+connection. This makes it possible to run some limited netperf
+tests without a corresponding netserver on the remote system.
+
+With this option set, the test to be run must get all the addressing
+information it needs to establish its data connection from the command
+line or internal defaults. If not otherwise specified by
+test-specific command line options, the data connection for a
+``STREAM'' or ``SENDFILE'' test will be to the ``discard'' port, an
+``RR'' test will be to the ``echo'' port, and a ``MAERTS'' test will
+be to the ``chargen'' port.
+
+The response size of an ``RR'' test will be silently set to be the
+same as the request size. Otherwise the test would hang if the
+response size was larger than the request size, or would report an
+incorrect, inflated transaction rate if the response size was less
+than the request size.
+
+Since there is no control connection when this option is specified, it
+is not possible to set ``remote'' properties such as socket buffer
+size and the like via the netperf command line. Nor is it possible to
+retrieve such interesting remote information as CPU utilization.
+These items will be displayed as values which should make it
+immediately obvious that that was the case.
+
+The only way to change remote characteristics such as socket buffer
+size or to obtain information such as CPU utilization is to employ
+platform-specific methods on the remote system. Frankly, if one has
+access to the remote system to employ those methods, one ought to be
+able to run a netserver there. However, that ability may not be
+present in certain ``support'' situations, hence the addition of this
+option.
+
+Added in netperf 2.4.3.
+
+@vindex -o, Global
+@item -o <sizespec>
+The value(s) passed-in with this option will be used as an offset
+added to the alignment specified with the @option{-a} option. For
+example:
+@example
+-o 3 -a 4096
+@end example
+will cause the buffers passed to the local (netperf) send and receive
+calls to begin three bytes past an address aligned to 4096
+bytes. [Default: 0 bytes]
+
+@vindex -O, Global
+@item -O <sizespec>
+This option behaves just as the @option{-o} option but on the remote
+(netserver) system and in conjunction with the @option{-A}
+option. [Default: 0 bytes]
+
+@vindex -p, Global
+@item -p <optionspec>
+The first value of the optionspec passed-in with this option tells
+netperf the port number at which it should expect the remote netserver
+to be listening for control connections. The second value of the
+optionspec will request netperf to bind to that local port number
+before establishing the control connection. For example
+@example
+-p 12345
+@end example
+tells netperf that the remote netserver is listening on port 12345 and
+leaves selection of the local port number for the control connection
+up to the local TCP/IP stack whereas
+@example
+-p ,32109
+@end example
+leaves the remote netserver port at the default value of 12865 and
+causes netperf to bind to the local port number 32109 before
+connecting to the remote netserver.
+
+In general, setting the local port number is only necessary when one
+is looking to run netperf through those evil, end-to-end breaking
+things known as firewalls.
+
+@vindex -P, Global
+@item -P 0|1
+A value of ``1'' for the @option{-P} option will enable display of
+the test banner. A value of ``0'' will disable display of the test
+banner. One might want to disable display of the test banner when
+running the same basic test type (eg TCP_STREAM) multiple times in
+succession where the test banners would then simply be redundant and
+unnecessarily clutter the output. [Default: 1 - display test banners]
+
+@vindex -s, Global
+@item -s <seconds>
+This option will cause netperf to sleep @samp{<seconds>} before
+actually transferring data over the data connection. This may be
+useful in situations where one wishes to start a great many netperf
+instances and does not want the earlier ones affecting the ability of
+the later ones to get established.
+
+Added somewhere between versions 2.4.3 and 2.5.0.
+
+@vindex -S, Global
+@item -S
+This option will cause an attempt to be made to set SO_KEEPALIVE on
+the data socket of a test using the BSD sockets interface. The
+attempt will be made on the netperf side of all tests, and will be
+made on the netserver side of an @ref{The Omni Tests,omni} or
+@ref{Migrated Tests,migrated} test. No indication of failure is given
+unless debug output is enabled with the global @option{-d} option.
+
+Added in version 2.5.0.
+
+@vindex -t, Global
+@item -t testname
+This option is used to tell netperf which test you wish to run.
+As of this writing, valid values for @var{testname} include:
+@itemize
+@item
+@ref{TCP_STREAM}, @ref{TCP_MAERTS}, @ref{TCP_SENDFILE}, @ref{TCP_RR}, @ref{TCP_CRR}, @ref{TCP_CC}
+@item
+@ref{UDP_STREAM}, @ref{UDP_RR}
+@item
+@ref{XTI_TCP_STREAM}, @ref{XTI_TCP_RR}, @ref{XTI_TCP_CRR}, @ref{XTI_TCP_CC}
+@item
+@ref{XTI_UDP_STREAM}, @ref{XTI_UDP_RR}
+@item
+@ref{SCTP_STREAM}, @ref{SCTP_RR}
+@item
+@ref{DLCO_STREAM}, @ref{DLCO_RR}, @ref{DLCL_STREAM}, @ref{DLCL_RR}
+@item
+@ref{Other Netperf Tests,LOC_CPU}, @ref{Other Netperf Tests,REM_CPU}
+@item
+@ref{The Omni Tests,OMNI}
+@end itemize
+Not all tests are always compiled into netperf. In particular, the
+``XTI,'' ``SCTP,'' ``UNIXDOMAIN,'' and ``DL*'' tests are only included in
+netperf when configured with
+@option{--enable-[xti|sctp|unixdomain|dlpi]=yes}.
+
+Netperf only runs one type of test no matter how many @option{-t}
+options may be present on the command-line. The last @option{-t}
+global command-line option will determine the test to be
+run. [Default: TCP_STREAM]
+
+@vindex -T, Global
+@item -T <optionspec>
+This option controls the CPU, and probably by extension memory,
+affinity of netperf and/or netserver.
+@example
+netperf -T 1
+@end example
+will bind both netperf and netserver to ``CPU 1'' on their respective
+systems.
+@example
+netperf -T 1,
+@end example
+will bind just netperf to ``CPU 1'' and will leave netserver unbound.
+@example
+netperf -T ,2
+@end example
+will leave netperf unbound and will bind netserver to ``CPU 2.''
+@example
+netperf -T 1,2
+@end example
+will bind netperf to ``CPU 1'' and netserver to ``CPU 2.''
+
+This can be particularly useful when investigating performance issues
+involving where processes run relative to where NIC interrupts are
+processed or where NICs allocate their DMA buffers.
+
+@vindex -v, Global
+@item -v verbosity
+This option controls how verbose netperf will be in its output, and is
+often used in conjunction with the @option{-P} option. If the
+verbosity is set to a value of ``0'' then only the test's SFM (Single
+Figure of Merit) is displayed. If local @ref{CPU Utilization,CPU
+utilization} is requested via the @option{-c} option then the SFM is
+the local service demand. Otherwise, if remote CPU utilization is
+requested via the @option{-C} option then the SFM is the remote
+service demand. If neither local nor remote CPU utilization are
+requested the SFM will be the measured throughput or transaction rate
+as implied by the test specified with the @option{-t} option.
+
+If the verbosity level is set to ``1'' then the ``normal'' netperf
+result output for each test is displayed.
+
+If the verbosity level is set to ``2'' then ``extra'' information will
+be displayed. This may include, but is not limited to the number of
+send or recv calls made and the average number of bytes per send or
+recv call, or a histogram of the time spent in each send() call or for
+each transaction if netperf was configured with
+@option{--enable-histogram=yes}. [Default: 1 - normal verbosity]
+
+In an @ref{The Omni Tests,omni} test the verbosity setting is largely
+ignored, save for when asking for the time histogram to be displayed.
+In version 2.5.0 and later there is no @ref{Omni Output Selectors,output
+selector} for the histogram and so it remains displayed only when the
+verbosity level is set to 2.
+
+@vindex -V, Global
+@item -V
+This option displays the netperf version and then exits.
+
+Added in netperf 2.4.4.
+
+@vindex -w, Global
+@item -w time
+If netperf was configured with @option{--enable-intervals=yes} then
+this value will set the inter-burst time to @var{time} milliseconds,
+and the @option{-b} option will set the number of sends per burst.
+The actual inter-burst time may vary depending on the system's timer
+resolution.
+
+@vindex -W, Global
+@item -W <sizespec>
+This option controls the number of buffers in the send (first or only
+value) and/or receive (second or only value) buffer rings. Unlike
+some benchmarks, netperf does not continuously send or receive from a
+single buffer. Instead it rotates through a ring of
+buffers. [Default: One more than the size of the send or receive
+socket buffer sizes (@option{-s} and/or @option{-S} options) divided
+by the send @option{-m} or receive @option{-M} buffer size
+respectively]
+
+@vindex -4, Global
+@item -4
+Specifying this option will set both the local and remote address
+families to AF_INET - that is, use only IPv4 addresses on the control
+connection. This can be overridden by a subsequent @option{-6},
+@option{-H} or @option{-L} option. Basically, the last option
+explicitly specifying an address family wins. Unless overridden by a
+test-specific option, this will be inherited for the data connection
+as well.
+
+@vindex -6, Global
+@item -6
+Specifying this option will set both local and remote address
+families to AF_INET6 - that is, use only IPv6 addresses on the control
+connection. This can be overridden by a subsequent @option{-4},
+@option{-H} or @option{-L} option. Basically, the last address family
+explicitly specified wins. Unless overridden by a test-specific
+option, this will be inherited for the data connection as well.
+
+@end table
+
+
+@node Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Request/Response , Global Command-line Options, Top
+@chapter Using Netperf to Measure Bulk Data Transfer
+
+The most commonly measured aspect of networked system performance is
+that of bulk or unidirectional transfer performance. Everyone wants
+to know how many bits or bytes per second they can push across the
+network. The classic netperf convention for a bulk data transfer test
+name is to tack a ``_STREAM'' suffix to a test name.
+
+@menu
+* Issues in Bulk Transfer::
+* Options common to TCP UDP and SCTP tests::
+@end menu
+
+@node Issues in Bulk Transfer, Options common to TCP UDP and SCTP tests, Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Bulk Data Transfer
+@comment node-name, next, previous, up
+@section Issues in Bulk Transfer
+
+There are any number of things which can affect the performance of a
+bulk transfer test.
+
+Certainly, absent compression, bulk-transfer tests can be limited by
+the speed of the slowest link in the path from the source to the
+destination. If testing over a gigabit link, you will not see more
+than a gigabit :) Such situations can be described as being
+@dfn{network-limited} or @dfn{NIC-limited}.
+
+CPU utilization can also affect the results of a bulk-transfer test.
+If the networking stack requires a certain number of instructions or
+CPU cycles per KB of data transferred, and the CPU is limited in the
+number of instructions or cycles it can provide, then the transfer can
+be described as being @dfn{CPU-bound}.
+
+A bulk-transfer test can be CPU bound even when netperf reports less
+than 100% CPU utilization. This can happen on an MP system where one
+or more of the CPUs saturate at 100% but other CPUs remain idle.
+Typically, a single flow of data, such as that from a single instance
+of a netperf _STREAM test, cannot make use of much more than the power
+of one CPU. Exceptions to this generally occur when netperf and/or
+netserver run on CPU(s) other than the CPU(s) taking interrupts from
+the NIC(s). In that case, one might see as much as two CPUs' worth of
+processing being used to service the flow of data.
+
+Distance and the speed-of-light can affect performance for a
+bulk-transfer; often this can be mitigated by using larger windows.
+One common limit to the performance of a transport using window-based
+flow-control is:
+@example
+Throughput <= WindowSize/RoundTripTime
+@end example
+This is because the sender can only have a window's worth of data
+outstanding on the network at any one time, and the soonest it can
+receive a window update from the receiver is one RoundTripTime (RTT).
+TCP and SCTP are examples of such protocols.
+
+Packet losses and their effects can be particularly bad for
+performance. This is especially true if the packet losses result in
+retransmission timeouts for the protocol(s) involved. By the time a
+retransmission timeout has happened, the flow or connection has sat
+idle for a considerable length of time.
+
+On many platforms, some variant on the @command{netstat} command can
+be used to retrieve statistics about packet loss and
+retransmission. For example:
+@example
+netstat -p tcp
+@end example
+will retrieve TCP statistics on the HP-UX Operating System. On other
+platforms, it may not be possible to retrieve statistics for a
+specific protocol and something like:
+@example
+netstat -s
+@end example
+would be used instead.
+
+Many times, such network statistics are kept from the time the stack
+started, and we are only really interested in statistics from when
+netperf was running. In such situations something along the lines of:
+@example
+netstat -p tcp > before
+netperf -t TCP_mumble...
+netstat -p tcp > after
+@end example
+is indicated. The
+@uref{ftp://ftp.cup.hp.com/dist/networking/tools/,beforeafter} utility
+can be used to subtract the statistics in @file{before} from the
+statistics in @file{after}:
+@example
+beforeafter before after > delta
+@end example
+and then one can look at the statistics in @file{delta}. Beforeafter
+is distributed in source form so one can compile it on the platform(s)
+of interest.
+
+If running a version 2.5.0 or later ``omni'' test under Linux, one can
+include either or both of:
+@itemize
+@item LOCAL_TRANSPORT_RETRANS
+@item REMOTE_TRANSPORT_RETRANS
+@end itemize
+
+in the values provided via a test-specific @option{-o}, @option{-O},
+or @option{-k} output selection option and netperf will report the
+retransmissions experienced on the data connection, as reported via a
+@code{getsockopt(TCP_INFO)} call. If confidence intervals have been
+requested via the global @option{-I} or @option{-i} options, the
+reported value(s) will be for the last iteration. If the test is over
+a protocol other than TCP, or on a platform other than Linux, the
+results are undefined.
+
+While it was written with HP-UX's netstat in mind, the
+@uref{ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_netstat.txt,annotated
+netstat} writeup may be helpful with other platforms as well.
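+
+As a concrete illustration of the LOCAL_TRANSPORT_RETRANS and
+REMOTE_TRANSPORT_RETRANS selectors mentioned above, here is a minimal
+sketch (the hostname is hypothetical, and THROUGHPUT is assumed to be
+among the available omni output selectors, included so the
+retransmission counts have some context):
+@example
+netperf -t omni -H remotehost -- -o THROUGHPUT,LOCAL_TRANSPORT_RETRANS,REMOTE_TRANSPORT_RETRANS
+@end example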
+
+@node Options common to TCP UDP and SCTP tests, , Issues in Bulk Transfer, Using Netperf to Measure Bulk Data Transfer
+@comment node-name, next, previous, up
+@section Options common to TCP UDP and SCTP tests
+
+Many ``test-specific'' options are actually common across the
+different tests. For those tests involving TCP, UDP and SCTP, whether
+using the BSD Sockets or the XTI interface, those common options
+include:
+
+@table @code
+@vindex -h, Test-specific
+@item -h
+Display the test-suite-specific usage string and exit. For a TCP_ or
+UDP_ test this will be the usage string from the source file
+@file{nettest_bsd.c}. For an XTI_ test, this will be the usage string
+from the source file @file{nettest_xti.c}. For an SCTP test, this
+will be the usage string from the source file @file{nettest_sctp.c}.
+
+@vindex -H, Test-specific
+@item -H <optionspec>
+Normally, the remote hostname|IP and address family information is
+inherited from the settings for the control connection (eg global
+command-line @option{-H}, @option{-4} and/or @option{-6} options).
+The test-specific @option{-H} will override those settings for the
+data (aka test) connection only. Settings for the control connection
+are left unchanged.
+
+@vindex -L, Test-specific
+@item -L <optionspec>
+The test-specific @option{-L} option is identical to the test-specific
+@option{-H} option except it affects the local hostname|IP and address
+family information. As with its global command-line counterpart, this
+is generally only useful when measuring through those evil, end-to-end
+breaking things called firewalls.
+
+@vindex -m, Test-specific
+@item -m bytes
+Set the size of the buffer passed-in to the ``send'' calls of a
+_STREAM test. Note that this may have only an indirect effect on the
+size of the packets sent over the network, and certain Layer 4
+protocols do _not_ preserve or enforce message boundaries, so setting
+@option{-m} for the send size does not necessarily mean the receiver
+will receive that many bytes at any one time. By default the units are
+bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units to
+be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,''
+``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-m 32K}
+@end example
+will set the size to 32KB or 32768 bytes. [Default: the local send
+socket buffer size for the connection - either the system's default or
+the value set via the @option{-s} option.]
+
+@vindex -M, Test-specific
+@item -M bytes
+Set the size of the buffer passed-in to the ``recv'' calls of a
+_STREAM test. This will be an upper bound on the number of bytes
+received per receive call. By default the units are bytes, but a
+suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
+(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
+or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes respectively.
+For example:
+@example
+@code{-M 32K}
+@end example
+will set the size to 32KB or 32768 bytes. [Default: the remote receive
+socket buffer size for the data connection - either the system's
+default or the value set via the @option{-S} option.]
+
+@vindex -P, Test-specific
+@item -P <optionspec>
+Set the local and/or remote port numbers for the data connection.
+
+@vindex -s, Test-specific
+@item -s <sizespec>
+This option sets the local (netperf) send and receive socket buffer
+sizes for the data connection to the value(s) specified.
Often, this
+will affect the advertised and/or effective TCP or other window, but
+on some platforms it may not. By default the units are bytes, but a
+suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
+(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
+or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-s 128K}
+@end example
+will request the local send and receive socket buffer sizes to be
+128KB or 131072 bytes.
+
+While the historic expectation is that setting the socket buffer size
+has a direct effect on, say, the TCP window, today that may not hold
+true for all stacks. Further, while the historic expectation is that
+the value specified in a @code{setsockopt()} call will be the value returned
+via a @code{getsockopt()} call, at least one stack is known to deliberately
+ignore history. When running under Windows a value of 0 may be used
+which will be an indication to the stack the user wants to enable a
+form of copy avoidance. [Default: -1 - use the system's default socket
+buffer sizes]
+
+@vindex -S, Test-specific
+@item -S <sizespec>
+This option sets the remote (netserver) send and/or receive socket
+buffer sizes for the data connection to the value(s) specified.
+Often, this will affect the advertised and/or effective TCP or other
+window, but on some platforms it may not. By default the units are
+bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units to
+be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of
+``g,'' ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-S 128K}
+@end example
+will request the remote send and receive socket buffer sizes to be
+128KB or 131072 bytes.
+
+While the historic expectation is that setting the socket buffer size
+has a direct effect on, say, the TCP window, today that may not hold
+true for all stacks. Further, while the historic expectation is that
+the value specified in a @code{setsockopt()} call will be the value returned
+via a @code{getsockopt()} call, at least one stack is known to deliberately
+ignore history. When running under Windows a value of 0 may be used
+which will be an indication to the stack the user wants to enable a
+form of copy avoidance. [Default: -1 - use the system's default socket
+buffer sizes]
+
+@vindex -4, Test-specific
+@item -4
+Set the local and remote address family for the data connection to
+AF_INET - ie use IPv4 addressing only. Just as with their global
+command-line counterparts, the last of the @option{-4}, @option{-6},
+@option{-H} or @option{-L} options wins for their respective address
+families.
+
+@vindex -6, Test-specific
+@item -6
+This option is identical to its @option{-4} cousin, but requests IPv6
+addresses for the local and remote ends of the data connection.
+
+@end table
+
+
+@menu
+* TCP_STREAM::
+* TCP_MAERTS::
+* TCP_SENDFILE::
+* UDP_STREAM::
+* XTI_TCP_STREAM::
+* XTI_UDP_STREAM::
+* SCTP_STREAM::
+* DLCO_STREAM::
+* DLCL_STREAM::
+* STREAM_STREAM::
+* DG_STREAM::
+@end menu
+
+@node TCP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests, Options common to TCP UDP and SCTP tests
+@subsection TCP_STREAM
+
+The TCP_STREAM test is the default test in netperf. It is quite
+simple, transferring some quantity of data from the system running
+netperf to the system running netserver.
While time spent
+establishing the connection is not included in the throughput
+calculation, time spent flushing the last of the data to the remote at
+the end of the test is. This is how netperf knows that all the data
+it sent was received by the remote. In addition to the @ref{Options
+common to TCP UDP and SCTP tests,options common to STREAM tests}, the
+following test-specific options can be included to possibly alter the
+behavior of the test:
+
+@table @code
+@item -C
+This option will set TCP_CORK mode on the data connection on those
+systems where TCP_CORK is defined (typically Linux). A full
+description of TCP_CORK is beyond the scope of this manual, but in a
+nutshell it forces sub-MSS sends to be buffered so every segment sent
+is a full Maximum Segment Size (MSS) unless the application performs
+an explicit flush operation or the connection is closed. At present
+netperf does not perform any explicit flush operations. Setting
+TCP_CORK may improve the bitrate of tests where the ``send size''
+(@option{-m} option) is smaller than the MSS. It should also improve
+(make smaller) the service demand.
+
+The Linux tcp(7) manpage states that TCP_CORK cannot be used in
+conjunction with TCP_NODELAY (set via the @option{-D} option);
+however, netperf does not validate command-line options to enforce
+that.
+
+@item -D
+This option will set TCP_NODELAY on the data connection on those
+systems where TCP_NODELAY is defined. This disables something known
+as the Nagle Algorithm, which is intended to make the segments TCP
+sends as large as reasonably possible. Setting TCP_NODELAY for a
+TCP_STREAM test should have no effect when the send size
+(@option{-m} option) is larger than the MSS, and should decrease
+reported bitrate and increase service demand when the send size is
+smaller than the MSS. This stems from TCP_NODELAY causing each
+sub-MSS send to be its own TCP segment rather than being aggregated
+with other small sends. This means more trips up and down the
+protocol stack per KB of data transferred, which means greater CPU
+utilization.
+
+If setting TCP_NODELAY with @option{-D} affects throughput and/or
+service demand for tests where the send size (@option{-m}) is larger
+than the MSS, it suggests the TCP/IP stack's implementation of the
+Nagle Algorithm _may_ be broken, perhaps interpreting the Nagle
+Algorithm on a segment-by-segment basis rather than the proper
+user-send-by-user-send basis. However, a better test of this can be
+achieved with the @ref{TCP_RR} test.
+
+@end table
+
+Here is an example of a basic TCP_STREAM test, in this case from a
+Debian Linux (2.6 kernel) system to an HP-UX 11iv2 (HP-UX 11.23)
+system:
+
+@example
+$ netperf -H lag
+TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
+Recv   Send    Send
+Socket Socket  Message  Elapsed
+Size   Size    Size     Time     Throughput
+bytes  bytes   bytes    secs.    10^6bits/sec
+
+ 32768  16384  16384    10.00      80.42
+@end example
+
+We see that the default receive socket buffer size for the receiver
+(lag - HP-UX 11.23) is 32768 bytes, and the default socket send buffer
+size for the sender (Debian 2.6 kernel) is 16384 bytes. However, Linux
+does ``auto tuning'' of socket buffer and TCP window sizes, which
+means the send socket buffer size may be different at the end of the
+test than it was at the beginning. This is addressed in the @ref{The
+Omni Tests,omni tests} added in version 2.5.0 and @ref{Omni Output
+Selection,output selection}.
Throughput is expressed as 10^6 (aka
+Mega) bits per second, and the test ran for 10 seconds. IPv4
+addresses (AF_INET) were used.
+
+@node TCP_MAERTS, TCP_SENDFILE, TCP_STREAM, Options common to TCP UDP and SCTP tests
+@comment node-name, next, previous, up
+@subsection TCP_MAERTS
+
+A TCP_MAERTS (MAERTS is STREAM backwards) test is ``just like'' a
+@ref{TCP_STREAM} test except the data flows from the netserver to the
+netperf. The global command-line @option{-F} option is ignored for
+this test type. The test-specific command-line @option{-C} option is
+ignored for this test type.
+
+Here is an example of a TCP_MAERTS test between the same two systems
+as in the example for the @ref{TCP_STREAM} test. This time we request
+larger socket buffers with @option{-s} and @option{-S} options:
+
+@example
+$ netperf -H lag -t TCP_MAERTS -- -s 128K -S 128K
+TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
+Recv   Send    Send
+Socket Socket  Message  Elapsed
+Size   Size    Size     Time     Throughput
+bytes  bytes   bytes    secs.    10^6bits/sec
+
+221184 131072 131072    10.03      81.14
+@end example
+
+Here we see that Linux, unlike HP-UX, may not return the same value
+in a @code{getsockopt()} as was requested in the prior @code{setsockopt()}.
+
+This test is included more for benchmarking convenience than anything
+else.
+
+@node TCP_SENDFILE, UDP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests
+@comment node-name, next, previous, up
+@subsection TCP_SENDFILE
+
+The TCP_SENDFILE test is ``just like'' a @ref{TCP_STREAM} test except
+netperf uses the platform's @code{sendfile()} call instead of calling
+@code{send()}. Often this results in a @dfn{zero-copy} operation
+where data is sent directly from the filesystem buffer cache. This
+_should_ result in lower CPU utilization and possibly higher
+throughput. If it does not, then you may want to contact your
+vendor(s) because they have a problem on their hands.
+
+Zero-copy mechanisms may also alter the characteristics of the packets
+passed to the NIC - their size and the number of buffers per packet.
+In many stacks, when a copy is performed, the stack can ``reserve''
+space at the beginning of the destination buffer for things like TCP,
+IP and Link headers. The packet is then contained in a single buffer,
+which can be easier to DMA to the NIC. When no copy is performed,
+there is no opportunity to reserve space for headers and so a packet
+will be contained in two or more buffers.
+
+As of some time before version 2.5.0, the @ref{Global Options,global
+@option{-F} option} is no longer required for this test. If it is not
+specified, netperf will create a temporary file, which it will delete
+at the end of the test. If the @option{-F} option is specified, it
+must reference a file of at least the size of the send ring
+(@pxref{Global Options,the global @option{-W} option}) multiplied by
+the send size (@pxref{Options common to TCP UDP and SCTP tests,the
+test-specific @option{-m} option}). All other TCP-specific options
+remain available and optional.
+
+In this first example:
+@example
+$ netperf -H lag -F ../src/netperf -t TCP_SENDFILE -- -s 128K -S 128K
+TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
+alloc_sendfile_buf_ring: specified file too small.
+file must be larger than send_width * send_size
+@end example
+
+we see what happens when the file is too small.
Here:
+
+@example
+$ netperf -H lag -F /boot/vmlinuz-2.6.8-1-686 -t TCP_SENDFILE -- -s 128K -S 128K
+TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
+Recv   Send    Send
+Socket Socket  Message  Elapsed
+Size   Size    Size     Time     Throughput
+bytes  bytes   bytes    secs.    10^6bits/sec
+
+131072 221184 221184    10.02      81.83
+@end example
+
+we resolve that issue by selecting a larger file.
+
+
+@node UDP_STREAM, XTI_TCP_STREAM, TCP_SENDFILE, Options common to TCP UDP and SCTP tests
+@subsection UDP_STREAM
+
+A UDP_STREAM test is similar to a @ref{TCP_STREAM} test except UDP is
+used as the transport rather than TCP.
+
+@cindex Limiting Bandwidth
+A UDP_STREAM test has no end-to-end flow control - UDP provides none
+and neither does netperf. However, if you wish, you can configure
+netperf with @code{--enable-intervals=yes} to enable the global
+command-line @option{-b} and @option{-w} options to pace bursts of
+traffic onto the network.
+
+This has a number of implications.
+
+The biggest of these implications is that the data which is sent might
+not be received by the remote. For this reason, the output of a
+UDP_STREAM test shows both the sending and receiving throughput. On
+some platforms, it may be possible for the sending throughput to be
+reported as a value greater than the maximum rate of the link. This
+is common when the CPU(s) are faster than the network and there is no
+@dfn{intra-stack} flow-control.
+
+Here is an example of a UDP_STREAM test between two systems connected
+by a 10 Gigabit Ethernet link:
+@example
+$ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 32768
+UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
+Socket  Message  Elapsed      Messages
+Size    Size     Time         Okay Errors   Throughput
+bytes   bytes    secs            #      #   10^6bits/sec
+
+124928   32768   10.00      105672      0    2770.20
+135168           10.00      104844           2748.50
+
+@end example
+
+The first line of numbers shows statistics from the sending (netperf)
+side. The second line of numbers is from the receiving (netserver)
+side. In this case, 105672 - 104844 or 828 messages did not make it
+all the way to the remote netserver process.
+
+If the value of the @option{-m} option is larger than the local send
+socket buffer size (@option{-s} option), netperf will likely abort with
+an error message about how the send call failed:
+
+@example
+netperf -t UDP_STREAM -H 192.168.2.125
+UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
+udp_send: data send error: Message too long
+@end example
+
+If the value of the @option{-m} option is larger than the remote
+socket receive buffer, the reported receive throughput will likely be
+zero as the remote UDP will discard the messages as being too large to
+fit into the socket buffer.
+
+@example
+$ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 65000 -S 32768
+UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
+Socket  Message  Elapsed      Messages
+Size    Size     Time         Okay Errors   Throughput
+bytes   bytes    secs            #      #   10^6bits/sec
+
+124928   65000   10.00       53595      0    2786.99
+ 65536           10.00           0           0.00
+@end example
+
+The example above was between a pair of systems running a ``Linux''
+kernel. Notice that the remote Linux system returned a value larger
+than that passed-in to the @option{-S} option. In fact, this value
+was larger than the message size set with the @option{-m} option.
+
+That the remote socket buffer size is reported as 65536 bytes would
+suggest to any sane person that a message of 65000 bytes would fit,
+but the socket isn't _really_ 65536 bytes, even though Linux is
+telling us so. Go figure.
+
+@node XTI_TCP_STREAM, XTI_UDP_STREAM, UDP_STREAM, Options common to TCP UDP and SCTP tests
+@subsection XTI_TCP_STREAM
+
+An XTI_TCP_STREAM test is simply a @ref{TCP_STREAM} test using the XTI
+rather than BSD Sockets interface. The test-specific @option{-X
+<devspec>} option can be used to specify the name of the local and/or
+remote XTI device files, which is required by the @code{t_open()} call
+made by netperf XTI tests.
+
+The XTI_TCP_STREAM test is only present if netperf was configured with
+@code{--enable-xti=yes}. The remote netserver must have also been
+configured with @code{--enable-xti=yes}.
+
+@node XTI_UDP_STREAM, SCTP_STREAM, XTI_TCP_STREAM, Options common to TCP UDP and SCTP tests
+@subsection XTI_UDP_STREAM
+
+An XTI_UDP_STREAM test is simply a @ref{UDP_STREAM} test using the XTI
+rather than BSD Sockets interface. The test-specific @option{-X
+<devspec>} option can be used to specify the name of the local and/or
+remote XTI device files, which is required by the @code{t_open()} call
+made by netperf XTI tests.
+
+The XTI_UDP_STREAM test is only present if netperf was configured with
+@code{--enable-xti=yes}. The remote netserver must have also been
+configured with @code{--enable-xti=yes}.
+
+@node SCTP_STREAM, DLCO_STREAM, XTI_UDP_STREAM, Options common to TCP UDP and SCTP tests
+@subsection SCTP_STREAM
+
+An SCTP_STREAM test is essentially a @ref{TCP_STREAM} test using SCTP
+rather than TCP. The @option{-D} option will set SCTP_NODELAY, which
+is much like the TCP_NODELAY option for TCP. The @option{-C} option
+is not applicable to an SCTP test as there is no corresponding
+SCTP_CORK option. The author is still figuring out what the
+test-specific @option{-N} option does :)
+
+The SCTP_STREAM test is only present if netperf was configured with
+@code{--enable-sctp=yes}. The remote netserver must have also been
+configured with @code{--enable-sctp=yes}.
+
+@node DLCO_STREAM, DLCL_STREAM, SCTP_STREAM, Options common to TCP UDP and SCTP tests
+@subsection DLCO_STREAM
+
+A DLPI Connection Oriented Stream (DLCO_STREAM) test is very similar
+in concept to a @ref{TCP_STREAM} test. Both use reliable,
+connection-oriented protocols. The DLPI test differs from the TCP
+test in that its protocol operates only at the link-level and does not
+include TCP-style segmentation and reassembly. This last difference
+means that the value passed-in with the @option{-m} option must be
+less than the interface MTU. Otherwise, the @option{-m} and
+@option{-M} options are just like their TCP/UDP/SCTP counterparts.
+
+Other DLPI-specific options include:
+
+@table @code
+@item -D <devspec>
+This option is used to provide the fully-qualified names for the local
+and/or remote DLPI device files. The syntax is otherwise identical to
+that of a @dfn{sizespec}.
+@item -p <ppaspec>
+This option is used to specify the local and/or remote DLPI PPA(s).
+The PPA is used to identify the interface over which traffic is to be
+sent/received. The syntax of a @dfn{ppaspec} is otherwise the same as
+a @dfn{sizespec}.
+@item -s sap
+This option specifies the 802.2 SAP for the test. A SAP is somewhat
+like either the port field of a TCP or UDP header or the protocol
+field of an IP header.
The specified SAP should not conflict with any
+other active SAPs on the specified PPAs (@option{-p} option).
+@item -w <sizespec>
+This option specifies the local send and receive window sizes in units
+of frames on those platforms which support setting such things.
+@item -W <sizespec>
+This option specifies the remote send and receive window sizes in
+units of frames on those platforms which support setting such things.
+@end table
+
+The DLCO_STREAM test is only present if netperf was configured with
+@code{--enable-dlpi=yes}. The remote netserver must have also been
+configured with @code{--enable-dlpi=yes}.
+
+
+@node DLCL_STREAM, STREAM_STREAM, DLCO_STREAM, Options common to TCP UDP and SCTP tests
+@subsection DLCL_STREAM
+
+A DLPI ConnectionLess Stream (DLCL_STREAM) test is analogous to a
+@ref{UDP_STREAM} test in that both make use of unreliable/best-effort,
+connection-less transports. The DLCL_STREAM test differs from the
+@ref{UDP_STREAM} test in that the message size (@option{-m} option) must
+always be less than the link MTU as there is no IP-like fragmentation
+and reassembly available and netperf does not presume to provide one.
+
+The test-specific command-line options for a DLCL_STREAM test are the
+same as those for a @ref{DLCO_STREAM} test.
+
+The DLCL_STREAM test is only present if netperf was configured with
+@code{--enable-dlpi=yes}. The remote netserver must have also been
+configured with @code{--enable-dlpi=yes}.
+
+@node STREAM_STREAM, DG_STREAM, DLCL_STREAM, Options common to TCP UDP and SCTP tests
+@comment node-name, next, previous, up
+@subsection STREAM_STREAM
+
+A Unix Domain Stream Socket Stream test (STREAM_STREAM) is similar in
+concept to a @ref{TCP_STREAM} test, but using Unix Domain sockets. It is,
+naturally, limited to intra-machine traffic. A STREAM_STREAM test
+shares the @option{-m}, @option{-M}, @option{-s} and @option{-S}
+options of the other _STREAM tests. In a STREAM_STREAM test the
+@option{-p} option sets the directory in which the pipes will be
+created rather than setting a port number. The default is to create
+the pipes in the system default for the @code{tempnam()} call.
+
+The STREAM_STREAM test is only present if netperf was configured with
+@code{--enable-unixdomain=yes}. The remote netserver must have also been
+configured with @code{--enable-unixdomain=yes}.
+
+@node DG_STREAM, , STREAM_STREAM, Options common to TCP UDP and SCTP tests
+@comment node-name, next, previous, up
+@subsection DG_STREAM
+
+A Unix Domain Datagram Socket Stream test (DG_STREAM) is very much
+like a @ref{TCP_STREAM} test except that message boundaries are preserved.
+In this way, it may also be considered similar to certain flavors of
+SCTP test which can also preserve message boundaries.
+
+All the options of a @ref{STREAM_STREAM} test are applicable to a DG_STREAM
+test.
+
+The DG_STREAM test is only present if netperf was configured with
+@code{--enable-unixdomain=yes}. The remote netserver must have also been
+configured with @code{--enable-unixdomain=yes}.
+
+
+@node Using Netperf to Measure Request/Response , Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bulk Data Transfer, Top
+@chapter Using Netperf to Measure Request/Response
+
+Request/response performance is often overlooked, yet it is just as
+important as bulk-transfer performance.
While things like larger
+socket buffers and TCP windows, and stateless offloads like TSO and
+LRO can cover a multitude of latency and even path-length sins, those
+sins cannot easily hide from a request/response test. The convention
+for a request/response test is to have a _RR suffix. There are,
+however, a few ``request/response'' tests that have other suffixes.
+
+A request/response test, particularly a synchronous, one transaction
+at a time test such as those found by default in netperf, is
+particularly sensitive to the path-length of the networking stack.
+An _RR test can also uncover those platforms where the NICs are
+strapped by default with overbearing interrupt avoidance settings in
+an attempt to increase the bulk-transfer performance (or rather,
+decrease the CPU utilization of a bulk-transfer test). This
+sensitivity is most acute for small request and response sizes, such
+as the single-byte default for a netperf _RR test.
+
+While a bulk-transfer test reports its results in units of bits or
+bytes transferred per second, by default a mumble_RR test reports
+transactions per second where a transaction is defined as the
+completed exchange of a request and a response. One can invert the
+transaction rate to arrive at the average round-trip latency. If one
+is confident about the symmetry of the connection, the average one-way
+latency can be taken as one-half the average round-trip latency. As of
+version 2.5.0 (actually slightly before) netperf still does not do the
+latter, but will do the former if one sets the verbosity to 2 for a
+classic netperf test, or includes the appropriate @ref{Omni Output
+Selectors,output selector} in an @ref{The Omni Tests,omni test}. It
+will also allow the user to switch the throughput units from
+transactions per second to bits or bytes per second with the global
+@option{-f} option.
+
+@menu
+* Issues in Request/Response::
+* Options Common to TCP UDP and SCTP _RR tests::
+@end menu
+
+@node Issues in Request/Response, Options Common to TCP UDP and SCTP _RR tests, Using Netperf to Measure Request/Response , Using Netperf to Measure Request/Response
+@comment node-name, next, previous, up
+@section Issues in Request/Response
+
+Most if not all the @ref{Issues in Bulk Transfer} apply to
+request/response. The issue of round-trip latency is even more
+important as netperf generally only has one transaction outstanding at
+a time.
+
+A single instance of a one transaction outstanding _RR test should
+_never_ completely saturate the CPU of a system. If testing between
+otherwise evenly matched systems, the symmetric nature of a _RR test
+with equal request and response sizes should result in equal CPU
+loading on both systems. However, this may not hold true on MP
+systems, particularly if one binds netperf and netserver to different
+CPUs via the global @option{-T} option.
+
+For smaller request and response sizes, packet loss is a bigger issue
+as there is no opportunity for a @dfn{fast retransmit} or
+retransmission prior to a retransmission timer expiring.
+
+Virtualization may considerably increase the effective path length of
+a networking stack. While this may not preclude achieving link-rate
+on a comparatively slow link (eg 1 Gigabit Ethernet) on a _STREAM
+test, it can show up as measurably fewer transactions per second on an
+_RR test. However, this may still be masked by interrupt coalescing
+in the NIC/driver.
+
+Certain NICs have ways to minimize the number of interrupts sent to
+the host.
If these are strapped badly, they can significantly reduce
+the performance of something like a single-byte request/response test.
+Such setups are distinguished by seriously low reported CPU utilization
+and what seems like a low (even if in the thousands) transaction per
+second rate. Also, if you run such an OS/driver combination on faster
+or slower hardware and do not see a corresponding change in the
+transaction rate, chances are good that the driver is strapping the
+NIC with aggressive interrupt avoidance settings. Good for bulk
+throughput, but bad for latency.
+
+Some drivers may try to automagically adjust the interrupt avoidance
+settings. If they are not terribly good at it, you will see
+considerable run-to-run variation in reported transaction rates,
+particularly if you ``mix up'' _STREAM and _RR tests.
+
+
+@node Options Common to TCP UDP and SCTP _RR tests, , Issues in Request/Response, Using Netperf to Measure Request/Response
+@comment node-name, next, previous, up
+@section Options Common to TCP UDP and SCTP _RR tests
+
+Many ``test-specific'' options are actually common across the
+different tests. For those tests involving TCP, UDP and SCTP, whether
+using the BSD Sockets or the XTI interface, those common options
+include:
+
+@table @code
+@vindex -h, Test-specific
+@item -h
+Display the test-suite-specific usage string and exit. For a TCP_ or
+UDP_ test this will be the usage string from the source file
+@file{nettest_bsd.c}. For an XTI_ test, this will be the usage string
+from the source file @file{src/nettest_xti.c}. For an SCTP test, this
+will be the usage string from the source file
+@file{src/nettest_sctp.c}.
+
+@vindex -H, Test-specific
+@item -H <optionspec>
+Normally, the remote hostname|IP and address family information is
+inherited from the settings for the control connection (eg global
+command-line @option{-H}, @option{-4} and/or @option{-6} options).
+The test-specific @option{-H} will override those settings for the
+data (aka test) connection only. Settings for the control connection
+are left unchanged. This might be used to cause the control and data
+connections to take different paths through the network.
+
+@vindex -L, Test-specific
+@item -L <optionspec>
+The test-specific @option{-L} option is identical to the test-specific
+@option{-H} option except it affects the local hostname|IP and address
+family information. As with its global command-line counterpart, this
+is generally only useful when measuring through those evil, end-to-end
+breaking things called firewalls.
+
+@vindex -P, Test-specific
+@item -P <optionspec>
+Set the local and/or remote port numbers for the data connection.
+
+@vindex -r, Test-specific
+@item -r <sizespec>
+This option sets the request (first value) and/or response (second
+value) sizes for an _RR test. By default the units are bytes, but a
+suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
+(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
+or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-r 128,16K}
+@end example
+will set the request size to 128 bytes and the response size to 16 KB
+or 16384 bytes. [Default: 1 - a single-byte request and response]
+
+@vindex -s, Test-specific
+@item -s <sizespec>
+This option sets the local (netperf) send and receive socket buffer
+sizes for the data connection to the value(s) specified.
Often, this
+will affect the advertised and/or effective TCP or other window, but
+on some platforms it may not. By default the units are bytes, but a
+suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
+(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
+or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-s 128K}
+@end example
+will request the local send (netperf) and receive socket buffer sizes
+to be 128KB or 131072 bytes.
+
+While the historic expectation is that setting the socket buffer size
+has a direct effect on, say, the TCP window, today that may not hold
+true for all stacks. When running under Windows a value of 0 may be
+used which will be an indication to the stack the user wants to enable
+a form of copy avoidance. [Default: -1 - use the system's default
+socket buffer sizes]
+
+@vindex -S, Test-specific
+@item -S <sizespec>
+This option sets the remote (netserver) send and/or receive socket
+buffer sizes for the data connection to the value(s) specified.
+Often, this will affect the advertised and/or effective TCP or other
+window, but on some platforms it may not. By default the units are
+bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units
+to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of
+``g,'' ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-S 128K}
+@end example
+will request the remote (netserver) send and receive socket buffer
+sizes to be 128KB or 131072 bytes.
+
+While the historic expectation is that setting the socket buffer size
+has a direct effect on, say, the TCP window, today that may not hold
+true for all stacks. When running under Windows a value of 0 may be
+used which will be an indication to the stack the user wants to enable
+a form of copy avoidance. [Default: -1 - use the system's default
+socket buffer sizes]
+
+@vindex -4, Test-specific
+@item -4
+Set the local and remote address family for the data connection to
+AF_INET - ie use IPv4 addressing only. Just as with their global
+command-line counterparts, the last of the @option{-4}, @option{-6},
+@option{-H} or @option{-L} options wins for their respective address
+families.
+
+@vindex -6, Test-specific
+@item -6
+This option is identical to its @option{-4} cousin, but requests IPv6
+addresses for the local and remote ends of the data connection.
+
+@end table
+
+@menu
+* TCP_RR::
+* TCP_CC::
+* TCP_CRR::
+* UDP_RR::
+* XTI_TCP_RR::
+* XTI_TCP_CC::
+* XTI_TCP_CRR::
+* XTI_UDP_RR::
+* DLCL_RR::
+* DLCO_RR::
+* SCTP_RR::
+@end menu
+
+@node TCP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests, Options Common to TCP UDP and SCTP _RR tests
+@subsection TCP_RR
+@cindex Measuring Latency
+@cindex Latency, Request-Response
+
+A TCP_RR (TCP Request/Response) test is requested by passing a value
+of ``TCP_RR'' to the global @option{-t} command-line option. A TCP_RR
+test can be thought of as a user-space to user-space @code{ping} with
+no think time - it is by default a synchronous, one transaction at a
+time, request/response test.
+
+The transaction rate is the number of complete transactions exchanged
+divided by the length of time it took to perform those transactions.
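+
+As a worked example of the rate-to-latency inversion mentioned
+earlier, using the transaction rate from the example below (the
+arithmetic is illustrative only):
+@example
+Average round-trip latency = 1 / Transaction rate
+                           = 1 / 29150.15 trans/s
+                           ~ 34.3 microseconds
+@end example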
+
+If the two Systems Under Test are otherwise identical, a TCP_RR test
+with the same request and response size should be symmetric - it
+should not matter which way the test is run, and the CPU utilization
+measured should be virtually the same on each system. If not, it
+suggests that the CPU utilization mechanism being used may have some,
+well, issues measuring CPU utilization completely and accurately.
+
+Time to establish the TCP connection is not counted in the result. If
+you want connection setup overheads included, you should consider the
+@ref{TCP_CC,TCP_CC} or @ref{TCP_CRR,TCP_CRR} tests.
+
+If specifying the @option{-D} option to set TCP_NODELAY and disable
+the Nagle Algorithm increases the transaction rate reported by a
+TCP_RR test, it implies the stack(s) over which the TCP_RR test is
+running have a broken implementation of the Nagle Algorithm. Likely
+as not they are interpreting Nagle on a segment-by-segment basis
+rather than a user-send-by-user-send basis. You should contact your
+stack vendor(s) to report the problem to them.
+
+Here is an example of two systems running a basic TCP_RR test over a
+10 Gigabit Ethernet link:
+
+@example
+netperf -t TCP_RR -H 192.168.2.125
+TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
+Local /Remote
+Socket Size   Request  Resp.   Elapsed  Trans.
+Send   Recv   Size     Size    Time     Rate
+bytes  Bytes  bytes    bytes   secs.    per sec
+
+16384  87380  1        1       10.00    29150.15
+16384  87380
+@end example
+
+In this example the request and response sizes were one byte, the
+socket buffers were left at their defaults, and the test ran for all
+of 10 seconds. The transaction per second rate was rather good for
+the time :)
+
+@node TCP_CC, TCP_CRR, TCP_RR, Options Common to TCP UDP and SCTP _RR tests
+@subsection TCP_CC
+@cindex Connection Latency
+@cindex Latency, Connection Establishment
+
+A TCP_CC (TCP Connect/Close) test is requested by passing a value of
+``TCP_CC'' to the global @option{-t} option. A TCP_CC test simply
+measures how fast the pair of systems can open and close connections
+between one another in a synchronous (one at a time) manner. While
+this is considered an _RR test, no request or response is exchanged
+over the connection.
+
+@cindex Port Reuse
+@cindex TIME_WAIT
+The issue of TIME_WAIT reuse is an important one for a TCP_CC test.
+Basically, TIME_WAIT reuse is when a pair of systems churn through
+connections fast enough that they wrap the 16-bit port number space in
+less time than the length of the TIME_WAIT state. While it is indeed
+theoretically possible to ``reuse'' a connection in TIME_WAIT, the
+conditions under which such reuse is possible are rather rare. An
+attempt to reuse a connection in TIME_WAIT can result in a non-trivial
+delay in connection establishment.
+
+Basically, any time the connection churn rate approaches:
+
+Sizeof(clientportspace) / Lengthof(TIME_WAIT)
+
+there is the risk of TIME_WAIT reuse. To minimize the chances of this
+happening, netperf will by default select its own client port numbers
+from the range of 5000 to 65535. On systems with a 60 second
+TIME_WAIT state, this should allow roughly 1000 transactions per
+second. The size of the client port space used by netperf can be
+controlled via the test-specific @option{-p} option, which takes a
+@dfn{sizespec} as a value setting the minimum (first value) and
+maximum (second value) port numbers used by netperf at the client end.
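+
+Plugging those defaults into the churn-rate expression above gives a
+feel for the numbers (the arithmetic is illustrative only):
+@example
+(65535 - 5000) ports / 60 seconds of TIME_WAIT ~ 1008 connections/sec
+@end example
+which is where the ``roughly 1000 transactions per second'' figure
+comes from.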
+
+Since no requests or responses are exchanged during a TCP_CC test,
+only the @option{-H}, @option{-L}, @option{-4} and @option{-6} of the
+``common'' test-specific options are likely to have an effect, if any,
+on the results. The @option{-s} and @option{-S} options _may_ have
+some effect if they alter the number and/or type of options carried in
+the TCP SYNchronize segments, such as Window Scaling or Timestamps.
+The @option{-P} and @option{-r} options are utterly ignored.
+
+Since connection establishment and tear-down for TCP is not symmetric,
+a TCP_CC test is not symmetric in its loading of the two systems under
+test.
+
+@node TCP_CRR, UDP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests
+@subsection TCP_CRR
+@cindex Latency, Connection Establishment
+@cindex Latency, Request-Response
+
+The TCP Connect/Request/Response (TCP_CRR) test is requested by
+passing a value of ``TCP_CRR'' to the global @option{-t} command-line
+option. A TCP_CRR test is like a merger of a @ref{TCP_RR} and
+@ref{TCP_CC} test which measures the performance of establishing a
+connection, exchanging a single request/response transaction, and
+tearing down that connection. This is very much like what happens in
+an HTTP 1.0 or HTTP 1.1 connection when HTTP Keepalives are not used.
+In fact, the TCP_CRR test was added to netperf to simulate just that.
+
+Since a request and response are exchanged, the @option{-r},
+@option{-s} and @option{-S} options can have an effect on the
+performance.
+
+The issue of TIME_WAIT reuse exists for the TCP_CRR test just as it
+does for the TCP_CC test. Similarly, since connection establishment
+and tear-down is not symmetric, a TCP_CRR test is not symmetric even
+when the request and response sizes are the same.
+
+@node UDP_RR, XTI_TCP_RR, TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
+@subsection UDP_RR
+@cindex Latency, Request-Response
+@cindex Packet Loss
+
+A UDP Request/Response (UDP_RR) test is requested by passing a value
+of ``UDP_RR'' to the global @option{-t} option. It is very much the
+same as a TCP_RR test except UDP is used rather than TCP.
+
+UDP does not provide for retransmission of lost UDP datagrams, and
+netperf does not add anything for that either. This means that if
+_any_ request or response is lost, the exchange of requests and
+responses will stop from that point until the test timer expires.
+Netperf will not really ``know'' this has happened - the only symptom
+will be a low transaction per second rate. If @option{--enable-burst}
+was included in the @code{configure} command and a test-specific
+@option{-b} option used, the UDP_RR test will ``survive'' the loss of
+requests and responses until the sum is one more than the value passed
+via the @option{-b} option. It will, though, almost certainly run more
+slowly.
+
+The netperf side of a UDP_RR test will call @code{connect()} on its
+data socket and thenceforth use the @code{send()} and @code{recv()}
+socket calls. The netserver side of a UDP_RR test will not call
+@code{connect()} and will use @code{recvfrom()} and @code{sendto()}
+calls. This means that even if the request and response sizes are the
+same, a UDP_RR test is _not_ symmetric in its loading of the two
+systems under test.
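+
+As a concrete illustration of the burst-mode behavior just described,
+here is a minimal sketch (the hostname is hypothetical):
+@example
+netperf -t UDP_RR -H remotehost -- -b 8
+@end example
+With @option{-b 8} the test keeps additional transactions in flight
+and, as described above, will keep running until the sum of lost
+requests and responses exceeds the @option{-b} value.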
+
+Here is an example of a UDP_RR test between two otherwise
+identical two-CPU systems joined via a 1 Gigabit Ethernet network:
+
+@example
+$ netperf -T 1 -H 192.168.1.213 -t UDP_RR -c -C
+UDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.213 (192.168.1.213) port 0 AF_INET
+Local /Remote
+Socket Size   Request  Resp.   Elapsed  Trans.   CPU    CPU    S.dem   S.dem
+Send   Recv   Size     Size    Time     Rate     local  remote local   remote
+bytes  bytes  bytes    bytes   secs.    per sec  % I    % I    us/Tr   us/Tr
+
+65535  65535  1        1       10.01    15262.48 13.90  16.11  18.221  21.116
+65535  65535
+@end example
+
+This example includes the @option{-c} and @option{-C} options to
+enable CPU utilization reporting and shows the asymmetry in CPU
+loading. The @option{-T} option was used to make sure netperf and
+netserver ran on a given CPU and did not move around during the test.
+
+@node XTI_TCP_RR, XTI_TCP_CC, UDP_RR, Options Common to TCP UDP and SCTP _RR tests
+@subsection XTI_TCP_RR
+@cindex Latency, Request-Response
+
+An XTI_TCP_RR test is essentially the same as a @ref{TCP_RR} test,
+only using the XTI rather than BSD Sockets interface. It is requested
+by passing a value of ``XTI_TCP_RR'' to the @option{-t} global
+command-line option.
+
+The test-specific options for an XTI_TCP_RR test are the same as those
+for a TCP_RR test with the addition of the @option{-X <devspec>} option to
+specify the names of the local and/or remote XTI device file(s).
+
+@node XTI_TCP_CC, XTI_TCP_CRR, XTI_TCP_RR, Options Common to TCP UDP and SCTP _RR tests
+@comment node-name, next, previous, up
+@subsection XTI_TCP_CC
+@cindex Latency, Connection Establishment
+
+An XTI_TCP_CC test is essentially the same as a @ref{TCP_CC,TCP_CC}
+test, only using the XTI rather than BSD Sockets interface.
+
+The test-specific options for an XTI_TCP_CC test are the same as those
+for a TCP_CC test with the addition of the @option{-X <devspec>} option to
+specify the names of the local and/or remote XTI device file(s).
+
+@node XTI_TCP_CRR, XTI_UDP_RR, XTI_TCP_CC, Options Common to TCP UDP and SCTP _RR tests
+@comment node-name, next, previous, up
+@subsection XTI_TCP_CRR
+@cindex Latency, Connection Establishment
+@cindex Latency, Request-Response
+
+The XTI_TCP_CRR test is essentially the same as a
+@ref{TCP_CRR,TCP_CRR} test, only using the XTI rather than BSD Sockets
+interface.
+
+The test-specific options for an XTI_TCP_CRR test are the same as those
+for a TCP_CRR test with the addition of the @option{-X <devspec>} option to
+specify the names of the local and/or remote XTI device file(s).
+
+@node XTI_UDP_RR, DLCL_RR, XTI_TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
+@subsection XTI_UDP_RR
+@cindex Latency, Request-Response
+
+An XTI_UDP_RR test is essentially the same as a UDP_RR test, only
+using the XTI rather than BSD Sockets interface. It is requested by
+passing a value of ``XTI_UDP_RR'' to the @option{-t} global
+command-line option.
+
+The test-specific options for an XTI_UDP_RR test are the same as those
+for a UDP_RR test with the addition of the @option{-X <devspec>}
+option to specify the name of the local and/or remote XTI device
+file(s).
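+
+For illustration, a hedged sketch of specifying the XTI device files
+(the hostname is hypothetical, @file{/dev/udp} is merely a typical
+location for the device file on XTI-capable platforms, and the
+local-then-remote ordering is assumed to follow that of a
+@dfn{sizespec}):
+@example
+netperf -t XTI_UDP_RR -H remotehost -- -X /dev/udp,/dev/udp
+@end example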
+
+@node DLCL_RR, DLCO_RR, XTI_UDP_RR, Options Common to TCP UDP and SCTP _RR tests
+@comment node-name, next, previous, up
+@subsection DLCL_RR
+@cindex Latency, Request-Response
+
+@node DLCO_RR, SCTP_RR, DLCL_RR, Options Common to TCP UDP and SCTP _RR tests
+@comment node-name, next, previous, up
+@subsection DLCO_RR
+@cindex Latency, Request-Response
+
+@node SCTP_RR, , DLCO_RR, Options Common to TCP UDP and SCTP _RR tests
+@comment node-name, next, previous, up
+@subsection SCTP_RR
+@cindex Latency, Request-Response
+
+@node Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Request/Response , Top
+@comment node-name, next, previous, up
+@chapter Using Netperf to Measure Aggregate Performance
+@cindex Aggregate Performance
+@vindex --enable-burst, Configure
+
+Ultimately, @ref{Netperf4,Netperf4} will be the preferred benchmark to
+use when one wants to measure aggregate performance because netperf
+has no support for explicit synchronization of concurrent tests. Until
+netperf4 is ready for prime time, one can make use of the heuristics
+and procedures mentioned here for the 85% solution.
+
+There are a few ways to measure aggregate performance with netperf.
+The first is to run multiple, concurrent netperf tests and can be
+applied to any of the netperf tests. The second is to configure
+netperf with @code{--enable-burst} and is applicable to the TCP_RR
+test. The third is a variation on the first.
+
+@menu
+* Running Concurrent Netperf Tests::
+* Using --enable-burst::
+* Using --enable-demo::
+@end menu
+
+@node Running Concurrent Netperf Tests, Using --enable-burst, Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Aggregate Performance
+@comment node-name, next, previous, up
+@section Running Concurrent Netperf Tests
+
+@ref{Netperf4,Netperf4} is the preferred benchmark to use when one
+wants to measure aggregate performance because netperf has no support
+for explicit synchronization of concurrent tests. This leaves
+netperf2 results vulnerable to @dfn{skew} errors.
+
+However, since there are times when netperf4 is unavailable, it may be
+necessary to run netperf. The skew error can be minimized by making
+use of the confidence interval functionality. Then one simply
+launches multiple tests from the shell using a @code{for} loop or the
+like:
+
+@example
+for i in 1 2 3 4
+do
+netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 &
+done
+@end example
+
+which will run four concurrent @ref{TCP_STREAM,TCP_STREAM} tests from
+the system on which it is executed to tardy.cup.hp.com. Each
+concurrent netperf will iterate 10 times thanks to the @option{-i}
+option and will omit the test banners (option @option{-P}) for
+brevity. The output looks something like this:
+
+@example
+ 87380 16384 16384 10.03 235.15
+ 87380 16384 16384 10.03 235.09
+ 87380 16384 16384 10.03 235.38
+ 87380 16384 16384 10.03 233.96
+@end example
+
+We can take the sum of the results and be reasonably confident that
+the aggregate performance was 940 Mbits/s. This method does not need
+to be limited to one system speaking to one other system. It can be
+extended to one system talking to N other systems. It could be as simple as:
+@example
+for host in foo bar baz bing
+do
+netperf -t TCP_STREAM -H $host -i 10 -P 0 &
+done
+@end example
+A more complicated/sophisticated example can be found in
+@file{doc/examples/runemomniagg2.sh}.
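+
+To total the per-instance results, one can capture the output and sum
+the throughput column. A minimal sketch, assuming the throughput is
+the fifth whitespace-separated field as in the @option{-P 0} output
+above, and that the single-line appends from the concurrent netperfs
+do not interleave:
+@example
+for i in 1 2 3 4
+do
+netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 >> results &
+done
+wait
+awk '@{sum += $5@} END @{print sum@}' results
+@end example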
+
+If you see warnings about netperf not achieving the confidence
+intervals, the best thing to do is to increase the number of
+iterations with @option{-i} and/or increase the run length of each
+iteration with @option{-l}.
+
+You can also enable local (@option{-c}) and/or remote (@option{-C})
+CPU utilization:
+
+@example
+for i in 1 2 3 4
+do
+netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 -c -C &
+done
+
+87380 16384 16384 10.03 235.47 3.67 5.09 10.226 14.180
+87380 16384 16384 10.03 234.73 3.67 5.09 10.260 14.225
+87380 16384 16384 10.03 234.64 3.67 5.10 10.263 14.231
+87380 16384 16384 10.03 234.87 3.67 5.09 10.253 14.215
+@end example
+
+If the CPU utilizations reported for the same system are the same or
+very, very close, you can be reasonably confident that skew error is
+minimized. Presumably one could then omit @option{-i}, but that is
+not advised, particularly when/if the CPU utilization approaches 100
+percent. In the example above we see that the CPU utilization on the
+local system remains the same for all four tests, and is only off by
+0.01 out of 5.09 on the remote system. As the number of CPUs in the
+system increases, and so too the odds of saturating a single CPU, the
+accuracy of similar CPU utilization implying little skew error is
+diminished. This is also the case for those increasingly rare single
+CPU systems if the utilization is reported as 100% or very close to
+it.
+
+@quotation
+@b{NOTE: It is very important to remember that netperf is calculating
+system-wide CPU utilization. When calculating the service demand
+(those last two columns in the output above) each netperf assumes it
+is the only thing running on the system. This means that for
+concurrent tests the service demands reported by netperf will be
+wrong. One has to compute service demands for concurrent tests by
+hand.}
+@end quotation
+
+If you wish, you can add a unique, global @option{-B} option to each
+command line to append the given string to the output:
+
+@example
+for i in 1 2 3 4
+do
+netperf -t TCP_STREAM -H tardy.cup.hp.com -B "this is test $i" -i 10 -P 0 &
+done
+
+87380 16384 16384 10.03 234.90 this is test 4
+87380 16384 16384 10.03 234.41 this is test 2
+87380 16384 16384 10.03 235.26 this is test 1
+87380 16384 16384 10.03 235.09 this is test 3
+@end example
+
+You will notice that the tests completed in an order other than that
+in which they were started from the shell. This underscores why there
+is a threat of skew error and why netperf4 will eventually be the
+preferred tool for aggregate tests. Even if you see the Netperf
+Contributing Editor acting to the contrary!-)
+
+@menu
+* Issues in Running Concurrent Tests::
+@end menu
+
+@node Issues in Running Concurrent Tests, , Running Concurrent Netperf Tests, Running Concurrent Netperf Tests
+@subsection Issues in Running Concurrent Tests
+
+In addition to the aforementioned issue of skew error, there can be
+other issues to consider when running concurrent netperf tests.
+
+For example, when running concurrent tests over multiple interfaces,
+one is not always assured that the traffic one thinks went over a
+given interface actually did so. In particular, the Linux networking
+stack takes a particularly strong stance on its following the
+so-called @samp{weak end system model}. As such, it is willing to
+answer ARP requests for any of its local IP addresses on any of its
+interfaces.
If multiple interfaces are connected to the same
+broadcast domain, then even if they are configured into separate IP
+subnets, there is no a priori way of knowing which interface was
+actually used for which connection(s). This can be addressed by
+setting the @samp{arp_ignore} sysctl before configuring interfaces.
+
+As it is quite important, we will repeat that each concurrent netperf
+instance is calculating system-wide CPU utilization. When calculating
+the service demand each netperf assumes it is the only thing running
+on the system. This means that for concurrent tests the service
+demands reported by netperf @b{will be wrong}. One has to compute
+service demands for concurrent tests by hand.
+
+Running concurrent tests can also become difficult when there is no
+one ``central'' node. Running tests between pairs of systems may be
+more difficult, calling for remote shell commands in the for loop
+rather than netperf commands. This introduces more skew error, which
+the confidence intervals may not be able to sufficiently mitigate.
+One possibility is to actually run three consecutive netperf tests on
+each node - the first being a warm-up, the last being a cool-down.
+The idea then is to ensure that the time it takes to get all the
+netperfs started is less than the length of the first netperf command
+in the sequence of three. Similarly, it assumes that all ``middle''
+netperfs will complete before the first of the ``last'' netperfs
+complete.
+
+@node Using --enable-burst, Using --enable-demo, Running Concurrent Netperf Tests, Using Netperf to Measure Aggregate Performance
+@comment node-name, next, previous, up
+@section Using @code{--enable-burst}
+
+Starting in version 2.5.0 @code{--enable-burst=yes} is the default,
+which means one no longer must run:
+
+@example
+configure --enable-burst
+@end example
+
+to have burst-mode functionality present in netperf. This enables a
+test-specific @option{-b num} option in @ref{TCP_RR,TCP_RR},
+@ref{UDP_RR,UDP_RR} and @ref{The Omni Tests,omni} tests.
+
+Normally, netperf will attempt to ramp up the number of outstanding
+requests to @option{num} plus one transactions in flight at one time.
+The ramp-up is to avoid transactions being smashed together into a
+smaller number of segments when the transport's congestion window (if
+any) is smaller at the time than what netperf wants to have
+outstanding at one time. If, however, the user specifies a negative
+value for @option{num}, this ramp-up is bypassed and the burst of
+sends is made without consideration of the transport's congestion
+window.
+
+This burst-mode is used as an alternative to or even in conjunction
+with multiple-concurrent _RR tests and as a way to implement a
+single-connection, bidirectional bulk-transfer test. When run with
+just a single instance of netperf, increasing the burst size can
+determine the maximum number of transactions per second which can be
+serviced by a single process:
+
+@example
+for b in 0 1 2 4 8 16 32
+do
+ netperf -v 0 -t TCP_RR -B "-b $b" -H hpcpc108 -P 0 -- -b $b
+done
+
+9457.59 -b 0
+9975.37 -b 1
+10000.61 -b 2
+20084.47 -b 4
+29965.31 -b 8
+71929.27 -b 16
+109718.17 -b 32
+@end example
+
+The global @option{-v} and @option{-P} options were used to minimize
+the output to the single figure of merit, which in this case is the
+transaction rate.
The global @code{-B} option was used to more
clearly label the output, and the test-specific @option{-b} option
enabled by @code{--enable-burst} to increase the number of
transactions in flight at one time.

Now, since the test-specific @option{-D} option was not specified to
set TCP_NODELAY, the stack was free to ``bundle'' requests and/or
responses into TCP segments as it saw fit, and since the default
request and response size is one byte, there could have been some
considerable bundling even in the absence of transport congestion
window issues. If one wants to try to achieve a closer one-to-one
correspondence between requests and responses and TCP segments, add
the test-specific @option{-D} option:

@example
for b in 0 1 2 4 8 16 32
do
 netperf -v 0 -t TCP_RR -B "-b $b -D" -H hpcpc108 -P 0 -- -b $b -D
done

 8695.12 -b 0 -D
 19966.48 -b 1 -D
 20691.07 -b 2 -D
 49893.58 -b 4 -D
 62057.31 -b 8 -D
 108416.88 -b 16 -D
 114411.66 -b 32 -D
@end example

You can see that this has a rather large effect on the reported
transaction rate. In this particular instance, the author believes it
relates to interactions between the test and interrupt coalescing
settings in the driver for the NICs used.

@quotation
@b{NOTE: Even if you set the @option{-D} option, that is still not a
guarantee that each transaction is in its own TCP segment. You
should get into the habit of verifying the relationship between the
transaction rate and the packet rate via other means.}
@end quotation

You can also combine @code{--enable-burst} functionality with
concurrent netperf tests. This would then be an ``aggregate of
aggregates'' if you like:

@example

for i in 1 2 3 4
do
 netperf -H hpcpc108 -v 0 -P 0 -i 10 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done

 46668.38 aggregate 4 -b 8 -D
 44890.64 aggregate 2 -b 8 -D
 45702.04 aggregate 1 -b 8 -D
 46352.48 aggregate 3 -b 8 -D

@end example

Since each netperf did hit the confidence intervals, we can be
reasonably certain that the aggregate transaction per second rate was
the sum of all four concurrent tests, or something just shy of 184,000
transactions per second. To get some idea if that was also the packet
per second rate, we could bracket that @code{for} loop with something
to gather statistics and run the results through
@uref{ftp://ftp.cup.hp.com/dist/networking/tools,beforeafter}:

@example
/usr/sbin/ethtool -S eth2 > before
for i in 1 2 3 4
do
 netperf -H 192.168.2.108 -l 60 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done
wait
/usr/sbin/ethtool -S eth2 > after

 52312.62 aggregate 2 -b 8 -D
 50105.65 aggregate 4 -b 8 -D
 50890.82 aggregate 1 -b 8 -D
 50869.20 aggregate 3 -b 8 -D

beforeafter before after > delta

grep packets delta
 rx_packets: 12251544
 tx_packets: 12251550

@end example

This example uses @code{ethtool} because the system being used is
running Linux. Other platforms have other tools - for example HP-UX
has lanadmin:

@example
lanadmin -g mibstats <ppa>
@end example

and of course one could instead use @code{netstat}.

The @code{wait} is important because we are launching concurrent
netperfs in the background. Without it, the second ethtool command
would be run before the tests finished and perhaps even before the
last of them got started!

The sum of the reported transaction rates is 204178 over 60 seconds,
which is a total of 12250680 transactions.
Each transaction is the
exchange of a request and a response, so we multiply that by 2 to
arrive at 24501360.

The sum of the ethtool stats is 24503094 packets, which matches what
netperf was reporting very well.

Had the request or response size differed, we would need to know how
it compared with the @dfn{MSS} for the connection.

Just for grins, here is the exercise repeated, using @code{netstat}
instead of @code{ethtool}:

@example
netstat -s -t > before
for i in 1 2 3 4
do
 netperf -l 60 -H 192.168.2.108 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done
wait
netstat -s -t > after

 51305.88 aggregate 4 -b 8 -D
 51847.73 aggregate 2 -b 8 -D
 50648.19 aggregate 3 -b 8 -D
 53605.86 aggregate 1 -b 8 -D

beforeafter before after > delta

grep segments delta
 12445708 segments received
 12445730 segments send out
 1 segments retransmited
 0 bad segments received.
@end example

The sums are left as an exercise to the reader :)

Things become considerably more complicated if there are non-trivial
packet losses and/or retransmissions.

Of course all this checking is unnecessary if the test is a UDP_RR
test because UDP ``never'' aggregates multiple sends into the same UDP
datagram, and there are no ACKnowledgements in UDP. The loss of a
single request or response will not bring a ``burst'' UDP_RR test to a
screeching halt, but it will reduce the number of transactions
outstanding at any one time. A ``burst'' UDP_RR test @b{will} come to a
halt if the sum of the lost requests and responses reaches the value
specified in the test-specific @option{-b} option.

@node Using --enable-demo, , Using --enable-burst, Using Netperf to Measure Aggregate Performance
@section Using - -enable-demo

One can
@example
configure --enable-demo
@end example
and compile netperf to enable it to emit ``interim results'' at
semi-regular intervals. This enables a global @code{-D} option which
takes a reporting interval as an argument. With that specified, the
output of netperf will then look something like:

@example
$ src/netperf -D 1.25
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain () port 0 AF_INET : demo
Interim result: 25425.52 10^6bits/s over 1.25 seconds ending at 1327962078.405
Interim result: 25486.82 10^6bits/s over 1.25 seconds ending at 1327962079.655
Interim result: 25474.96 10^6bits/s over 1.25 seconds ending at 1327962080.905
Interim result: 25523.49 10^6bits/s over 1.25 seconds ending at 1327962082.155
Interim result: 25053.57 10^6bits/s over 1.27 seconds ending at 1327962083.429
Interim result: 25349.64 10^6bits/s over 1.25 seconds ending at 1327962084.679
Interim result: 25292.84 10^6bits/s over 1.25 seconds ending at 1327962085.932
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

 87380 16384 16384 10.00 25375.66
@end example
The units of the ``Interim result'' lines will follow the units
selected via the global @code{-f} option. If the test-specific
@code{-o} option is specified on the command line, the format will be
CSV:
@example
...
2978.81,MBytes/s,1.25,1327962298.035
...
@end example
If the test-specific @code{-k} option is used the format will be
keyval with each keyval being given an index:
@example
...
NETPERF_INTERIM_RESULT[2]=25.00
NETPERF_UNITS[2]=10^9bits/s
NETPERF_INTERVAL[2]=1.25
NETPERF_ENDING[2]=1327962357.249
...
@end example
The expectation is that it may be easier to utilize the keyvals if
they have indices.

But how does this help with aggregate tests? Well, what one can do is
start the netperfs via a script, giving each a Very Long (tm) run
time. Direct the output to a file per instance. Then, once all the
netperfs have been started, take a timestamp and wait for some desired
test interval. Once that interval expires take another timestamp and
then start terminating the netperfs by sending them a SIGALRM signal
via the likes of the @code{kill} or @code{pkill} command. The
netperfs will terminate and emit the rest of the ``usual'' output, and
you can then bring the files to a central location for post
processing to find the aggregate performance over the ``test interval.''

This method has the advantage that it does not require advance
knowledge of how long it takes to get netperf tests started and/or
stopped. It does though require sufficiently synchronized clocks on
all the test systems.

While calls to get the current time can be inexpensive, that has not
been, nor is it, universally true. For that reason netperf tries to
minimize the number of such ``timestamping'' calls (eg
@code{gettimeofday}) it makes when in demo mode. Rather than take a
timestamp after each @code{send} or @code{recv} call completes,
netperf tries to guess how many units of work will be performed over
the desired interval. Only once that many units of work have been
completed will netperf check the time. If the reporting interval has
passed, netperf will emit an ``interim result.'' If the interval has
not passed, netperf will update its estimate for units and continue.

After a bit of thought one can see that if things ``speed up'' netperf
will still honor the interval. However, if things ``slow down''
netperf may be late with an ``interim result.'' Here is an example of
both of those happening during a test: the interval is honored while
throughput increases, and then about half-way through, when another
netperf (not shown) is started, things slow down and netperf does not
hit the interval as desired.
@example
$ src/netperf -D 2 -H tardy.hpl.hp.com -l 20
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com () port 0 AF_INET : demo
Interim result: 36.46 10^6bits/s over 2.01 seconds ending at 1327963880.565
Interim result: 59.19 10^6bits/s over 2.00 seconds ending at 1327963882.569
Interim result: 73.39 10^6bits/s over 2.01 seconds ending at 1327963884.576
Interim result: 84.01 10^6bits/s over 2.03 seconds ending at 1327963886.603
Interim result: 75.63 10^6bits/s over 2.21 seconds ending at 1327963888.814
Interim result: 55.52 10^6bits/s over 2.72 seconds ending at 1327963891.538
Interim result: 70.94 10^6bits/s over 2.11 seconds ending at 1327963893.650
Interim result: 80.66 10^6bits/s over 2.13 seconds ending at 1327963895.777
Interim result: 86.42 10^6bits/s over 2.12 seconds ending at 1327963897.901
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

 87380 16384 16384 20.34 68.87
@end example
So long as your post-processing mechanism can account for that, there
should be no problem. As time passes there may be changes to try to
improve netperf's honoring of the interval, but one should not
ass-u-me it will always do so.
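
If all you are after is the aggregate throughput over a window, one
way to post-process the interim results is to take, per instance, the
time-weighted mean of the ``interim result'' lines whose ending
timestamps fall within the window. A minimal sketch using
@code{awk}, assuming the CSV form of interim results (test-specific
@option{-o}) was captured to one file per instance; the
@file{instance*.out} names and the shell variables @code{T0} and
@code{T1} (the two timestamps taken around the test interval) are
placeholders:

@example
for f in instance*.out
do
  awk -F, -v t0=$T0 -v t1=$T1 '
    $4 >= t0 && $4 <= t1 @{ work += $1 * $3; secs += $3 @}
    END @{ if (secs > 0) print FILENAME, work / secs @}' $f
done
@end example

Summing the per-instance figures then gives the aggregate over the
window.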
One should also not assume the precision will remain fixed - future
versions may change it - perhaps going beyond tenths of seconds in
reporting the interval length, etc.

@node Using Netperf to Measure Bidirectional Transfer, The Omni Tests, Using Netperf to Measure Aggregate Performance, Top
@comment node-name, next, previous, up
@chapter Using Netperf to Measure Bidirectional Transfer

There are two ways to use netperf to measure the performance of
bidirectional transfer. The first is to run concurrent netperf tests
from the command line. The second is to configure netperf with
@code{--enable-burst} and use a single instance of the
@ref{TCP_RR,TCP_RR} test.

While neither method is more ``correct'' than the other, each
measures bidirectional transfer in a different way, and that has
possible implications. For instance, using the concurrent netperf
test mechanism means that multiple TCP connections and multiple
processes are involved, whereas with a single instance of TCP_RR
there is only one TCP connection and one process on each end. They
may behave differently, especially on an MP system.

@menu
* Bidirectional Transfer with Concurrent Tests::
* Bidirectional Transfer with TCP_RR::
* Implications of Concurrent Tests vs Burst Request/Response::
@end menu

@node Bidirectional Transfer with Concurrent Tests, Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Bidirectional Transfer
@comment node-name, next, previous, up
@section Bidirectional Transfer with Concurrent Tests

If we had two hosts, Fred and Ethel, we could simply run a netperf
@ref{TCP_STREAM,TCP_STREAM} test on Fred pointing at Ethel, and a
concurrent netperf TCP_STREAM test on Ethel pointing at Fred, but
since there are no mechanisms to synchronize netperf tests and we
would be starting tests from two different systems, there is a
considerable risk of skew error.

Far better would be to run simultaneous TCP_STREAM and
@ref{TCP_MAERTS,TCP_MAERTS} tests from just @b{one} system, using the
concepts and procedures outlined in @ref{Running Concurrent Netperf
Tests,Running Concurrent Netperf Tests}. Here then is an example:

@example
for i in 1
do
 netperf -H 192.168.2.108 -t TCP_STREAM -B "outbound" -i 10 -P 0 -v 0 \
   -- -s 256K -S 256K &
 netperf -H 192.168.2.108 -t TCP_MAERTS -B "inbound" -i 10 -P 0 -v 0 \
   -- -s 256K -S 256K &
done

 892.66 outbound
 891.34 inbound
@end example

We have used a @code{for} loop in the shell with just one iteration
because that makes it @b{much} easier to get both tests started at
more or less the same time than doing it by hand. The global
@option{-P} and @option{-v} options are used because we aren't
interested in anything other than the throughput, and the global
@option{-B} option is used to tag each output so we know which was
inbound and which outbound relative to the system on which we were
running netperf. Of course that sense is switched on the system
running netserver :) The use of the global @option{-i} option is
explained in @ref{Running Concurrent Netperf Tests,Running Concurrent
Netperf Tests}.
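
Taken together, the two figures of merit suggest an aggregate,
bidirectional throughput of roughly 892.66 + 891.34 = 1784 10^6bits/s
in this example - which, if the link in question was gigabit
Ethernet, would mean close to link rate in each direction at the same
time.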

Beginning with version 2.5.0 we can accomplish a similar result with
the @ref{The Omni Tests,the omni tests} and @ref{Omni Output
Selectors,output selectors}:

@example
for i in 1
do
 netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
   -d stream -s 256K -S 256K -o throughput,direction &
 netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
   -d maerts -s 256K -S 256K -o throughput,direction &
done

805.26,Receive
828.54,Send
@end example

@node Bidirectional Transfer with TCP_RR, Implications of Concurrent Tests vs Burst Request/Response, Bidirectional Transfer with Concurrent Tests, Using Netperf to Measure Bidirectional Transfer
@comment node-name, next, previous, up
@section Bidirectional Transfer with TCP_RR

Starting with version 2.5.0 the @code{--enable-burst} configure option
defaults to @code{yes}. Starting some time before version 2.5.0 but
after 2.4.0, the global @option{-f} option began to affect the
``throughput'' reported by request/response tests. If one uses the
test-specific @option{-b} option to have several ``transactions'' in
flight at one time and the test-specific @option{-r} option to
increase their size, the test looks less like a simple
request/response test and more like a single-connection,
bidirectional transfer.

So, putting it all together one can do something like:

@example
netperf -f m -t TCP_RR -H 192.168.1.3 -v 2 -- -b 6 -r 32K -S 256K -S 256K
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 (192.168.1.3) port 0 AF_INET : interval : first burst 6
Local /Remote
Socket Size Request Resp. Elapsed
Send Recv Size Size Time Throughput
bytes Bytes bytes bytes secs. 10^6bits/sec

16384 87380 32768 32768 10.00 1821.30
524288 524288
Alignment Offset RoundTrip Trans Throughput
Local Remote Local Remote Latency Rate 10^6bits/s
Send Recv Send Recv usec/Tran per sec Outbound Inbound
 8 0 0 0 2015.402 3473.252 910.492 910.492
@end example

to get a bidirectional bulk-throughput result. As one can see, the
@option{-v 2} output will include a number of interesting, related
values.

@quotation
@b{NOTE: The logic behind @code{--enable-burst} is very simple, and there
are no calls to @code{poll()} or @code{select()} which means we want
to make sure that the @code{send()} calls will never block, or we run
the risk of deadlock with each side stuck trying to call @code{send()}
and neither calling @code{recv()}.}
@end quotation

Fortunately, this is easily accomplished by setting a ``large enough''
socket buffer size with the test-specific @option{-s} and @option{-S}
options. Presently this must be performed by the user. Future
versions of netperf might attempt to do this automagically, but there
are some issues to be worked out.

@node Implications of Concurrent Tests vs Burst Request/Response, , Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer
@section Implications of Concurrent Tests vs Burst Request/Response

There are perhaps subtle but important differences between using
concurrent unidirectional tests and using a burst-mode
request/response test to measure bidirectional performance.

Broadly speaking, a single ``connection'' or ``flow'' of traffic
cannot make use of the services of more than one or two CPUs at either
end. Whether one or two CPUs will be used processing a flow will
depend on the specifics of the stack(s) involved and whether or not
the global @option{-T} option has been used to bind netperf/netserver
to specific CPUs.
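
For reference, a sketch of such binding; the CPU ids here are
arbitrary, and the global @option{-T} option takes the local and
remote CPU ids separated by a comma:

@example
netperf -T 0,1 -t TCP_RR -H 192.168.1.3 -- -b 6 -r 32K
@end example

This would ask that netperf be bound to CPU 0 on the local system and
netserver to CPU 1 on the remote.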

When using concurrent tests there will be two concurrent connections
or flows, which means that upwards of four CPUs will be employed
processing the packets (global @option{-T} used; no more than two if
not). With just a single, bidirectional request/response test,
however, no more than two CPUs will be employed (only one if the
global @option{-T} is not used).

If there is a CPU bottleneck on either system this may lead to rather
different results between the two methods.

Also, with a bidirectional request/response test there is something of
a natural balance or synchronization between inbound and outbound - a
response will not be sent until a request is received, and (once the
burst level is reached) a subsequent request will not be sent until a
response is received. This may mask favoritism in the NIC between
inbound and outbound processing.

With two concurrent unidirectional tests there is no such
synchronization or balance and any favoritism in the NIC may be exposed.

@node The Omni Tests, Other Netperf Tests, Using Netperf to Measure Bidirectional Transfer, Top
@chapter The Omni Tests

Beginning with version 2.5.0, netperf begins a migration to the
@samp{omni} tests or ``Two routines to measure them all.'' The code for
the omni tests can be found in @file{src/nettest_omni.c} and the goal
is to make it easier for netperf to support multiple protocols and
report a great many additional things about the systems under test.
Additionally, a flexible output selection mechanism is present which
allows the user to choose specifically what values she wishes to have
reported and in what format.

The omni tests are included by default in version 2.5.0. To disable
them, one must:
@example
./configure --enable-omni=no ...
@end example

and remake netperf. Remaking netserver is optional because even in
2.5.0 it has ``unmigrated'' netserver side routines for the classic
(eg @file{src/nettest_bsd.c}) tests.

@menu
* Native Omni Tests::
* Migrated Tests::
* Omni Output Selection::
@end menu

@node Native Omni Tests, Migrated Tests, The Omni Tests, The Omni Tests
@section Native Omni Tests

One accesses the omni tests ``natively'' by using a value of ``OMNI''
with the global @option{-t} test-selection option. This will then
cause netperf to use the code in @file{src/nettest_omni.c} and in
particular the test-specific options parser for the omni tests. The
test-specific options for the omni tests are a superset of those for
``classic'' tests. The options added by the omni tests are:

@table @code
@vindex -c, Test-specific
@item -c
This explicitly declares that the test is to include connection
establishment and tear-down as in either a TCP_CRR or TCP_CC test.

@vindex -d, Test-specific
@item -d <direction>
This option sets the direction of the test relative to the netperf
process. As of version 2.5.0 one can use the following in a
case-insensitive manner:

@table @code
@item send, stream, transmit, xmit or 2
Any of which will cause netperf to send to the netserver.
@item recv, receive, maerts or 4
Any of which will cause netserver to send to netperf.
@item rr or 6
Either of which will cause a request/response test.
@end table

Additionally, one can specify two directions separated by a @samp{|}
character and they will be OR'ed together.
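For example, one might
explicitly request a request/response test in this way - a sketch
only, with the quoting there to keep the command interpreter from
treating @samp{|} as a pipe:

@example
netperf -t omni -H 192.168.1.3 -- -d 'send|recv' -o THROUGHPUT,DIRECTION
@end example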
In this way one can use
the ``Send|Recv'' that will be emitted by the @ref{Omni Output
Selectors,DIRECTION} @ref{Omni Output Selection,output selector} when
used with a request/response test.

@vindex -k, Test-specific
@item -k [@ref{Omni Output Selection,output selector}]
This option sets the style of output to ``keyval'' where each line of
output has the form:
@example
key=value
@end example
For example:
@example
$ netperf -t omni -- -d rr -k "THROUGHPUT,THROUGHPUT_UNITS"
OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
THROUGHPUT=59092.65
THROUGHPUT_UNITS=Trans/s
@end example

Using the @option{-k} option will override any previous, test-specific
@option{-o} or @option{-O} option.

@vindex -o, Test-specific
@item -o [@ref{Omni Output Selection,output selector}]
This option sets the style of output to ``CSV'' where there will be
one line of comma-separated values, preceded by one line of column
names unless the global @option{-P} option is used with a value of 0:
@example
$ netperf -t omni -- -d rr -o "THROUGHPUT,THROUGHPUT_UNITS"
OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Throughput,Throughput Units
60999.07,Trans/s
@end example

Using the @option{-o} option will override any previous, test-specific
@option{-k} or @option{-O} option.

@vindex -O, Test-specific
@item -O [@ref{Omni Output Selection,output selector}]
This option sets the style of output to ``human readable'' which will
look quite similar to classic netperf output:
@example
$ netperf -t omni -- -d rr -O "THROUGHPUT,THROUGHPUT_UNITS"
OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Throughput Throughput
 Units


60492.57 Trans/s
@end example

Using the @option{-O} option will override any previous, test-specific
@option{-k} or @option{-o} option.

@vindex -t, Test-specific
@item -t
This option explicitly sets the socket type for the test's data
connection. As of version 2.5.0 the known socket types include
``stream'' and ``dgram'' for SOCK_STREAM and SOCK_DGRAM respectively.

@vindex -T, Test-specific
@item -T <protocol>
This option is used to explicitly set the protocol used for the
test. It is case-insensitive. As of version 2.5.0 the protocols known
to netperf include:
@table @code
@item TCP
Select the Transmission Control Protocol
@item UDP
Select the User Datagram Protocol
@item SDP
Select the Sockets Direct Protocol
@item DCCP
Select the Datagram Congestion Control Protocol
@item SCTP
Select the Stream Control Transmission Protocol
@item udplite
Select UDP Lite
@end table

The default is implicit based on other settings.
@end table

The omni tests also extend the interpretation of some of the classic,
test-specific options for the BSD Sockets tests:

@table @code
@item -m <optionspec>
This can set the send size for either or both of the netperf and
netserver sides of the test:
@example
-m 32K
@end example
sets only the netperf-side send size to 32768 bytes, and or's-in
transmit for the direction. This is effectively the same behaviour as
for the classic tests.
@example
-m ,32K
@end example
sets only the netserver side send size to 32768 bytes and or's-in
receive for the direction.
@example
-m 16K,32K
@end example
sets the netperf side send size to 16384 bytes, the netserver side
send size to 32768 bytes, and the direction will be ``Send|Recv''.
@item -M <optionspec>
This can set the receive size for either or both of the netperf and
netserver sides of the test:
@example
-M 32K
@end example
sets only the netserver side receive size to 32768 bytes and or's-in
send for the test direction.
@example
-M ,32K
@end example
sets only the netperf side receive size to 32768 bytes and or's-in
receive for the test direction.
@example
-M 16K,32K
@end example
sets the netserver side receive size to 16384 bytes, the netperf
side receive size to 32768 bytes, and the direction will be
``Send|Recv''.
@end table

@node Migrated Tests, Omni Output Selection, Native Omni Tests, The Omni Tests
@section Migrated Tests

As of version 2.5.0 several tests have been migrated to use the omni
code in @file{src/nettest_omni.c} for the core of their testing. A
migrated test retains all its previous output code and so should still
``look and feel'' just like a pre-2.5.0 test with one exception - the
first line of the test banners will include the word ``MIGRATED'' at
the beginning as in:

@example
$ netperf
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

 87380 16384 16384 10.00 27175.27
@end example

The tests migrated in version 2.5.0 are:
@itemize
@item TCP_STREAM
@item TCP_MAERTS
@item TCP_RR
@item TCP_CRR
@item UDP_STREAM
@item UDP_RR
@end itemize

It is expected that future releases will have additional tests
migrated to use the ``omni'' functionality.

If one uses ``omni-specific'' test-specific options in conjunction
with a migrated test, instead of using the classic output code, the
new omni output code will be used. For example if one uses the
@option{-k} test-specific option with a value of
``THROUGHPUT,THROUGHPUT_UNITS'' with a migrated TCP_RR test one will
see:

@example
$ netperf -t tcp_rr -- -k THROUGHPUT,THROUGHPUT_UNITS
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
THROUGHPUT=60074.74
THROUGHPUT_UNITS=Trans/s
@end example
rather than:
@example
$ netperf -t tcp_rr
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Local /Remote
Socket Size Request Resp. Elapsed Trans.
Send Recv Size Size Time Rate
bytes Bytes bytes bytes secs. per sec

16384 87380 1 1 10.00 59421.52
16384 87380
@end example

@node Omni Output Selection, , Migrated Tests, The Omni Tests
@section Omni Output Selection

The omni test-specific @option{-k}, @option{-o} and @option{-O}
options take an optional @code{output selector} by which the user can
configure what values are reported. The output selector can take
several forms:

@table @code
@item @file{filename}
The output selections will be read from the named file. Within the
file there can be up to four lines of comma-separated output
selectors. This controls how many multi-line blocks of output are emitted
when the @option{-O} option is used. This output, while not identical to
``classic'' netperf output, is inspired by it. Multiple lines have no
effect for @option{-k} and @option{-o} options. Putting output
selections in a file can be useful when the list of selections is long.
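
As an illustration, a hypothetical @file{selectors.txt} with two
lines, producing two blocks of @option{-O} output:

@example
$ cat selectors.txt
THROUGHPUT,THROUGHPUT_UNITS,ELAPSED_TIME
LSS_SIZE,RSR_SIZE
$ netperf -t omni -- -O selectors.txt
@end example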
+@item comma and/or semi-colon-separated list +The output selections will be parsed from a comma and/or +semi-colon-separated list of output selectors. When the list is given +to a @option{-O} option a semi-colon specifies a new output block +should be started. Semi-colons have the same meaning as commas when +used with the @option{-k} or @option{-o} options. Depending on the +command interpreter being used, the semi-colon may have to be escaped +somehow to keep it from being interpreted by the command interpreter. +This can often be done by enclosing the entire list in quotes. +@item all +If the keyword @b{all} is specified it means that all known output +values should be displayed at the end of the test. This can be a +great deal of output. As of version 2.5.0 there are 157 different +output selectors. +@item ? +If a ``?'' is given as the output selection, the list of all known +output selectors will be displayed and no test actually run. When +passed to the @option{-O} option they will be listed one per +line. Otherwise they will be listed as a comma-separated list. It may +be necessary to protect the ``?'' from the command interpreter by +escaping it or enclosing it in quotes. +@item no selector +If nothing is given to the @option{-k}, @option{-o} or @option{-O} +option then the code selects a default set of output selectors +inspired by classic netperf output. The format will be the @samp{human +readable} format emitted by the test-specific @option{-O} option. +@end table + +The order of evaluation will first check for an output selection. If +none is specified with the @option{-k}, @option{-o} or @option{-O} +option netperf will select a default based on the characteristics of the +test. If there is an output selection, the code will first check for +@samp{?}, then check to see if it is the magic @samp{all} keyword. +After that it will check for either @samp{,} or @samp{;} in the +selection and take that to mean it is a comma and/or +semi-colon-separated list. If none of those checks match, netperf will then +assume the output specification is a filename and attempt to open and +parse the file. + +@menu +* Omni Output Selectors:: +@end menu + +@node Omni Output Selectors, , Omni Output Selection, Omni Output Selection +@subsection Omni Output Selectors + +As of version 2.5.0 the output selectors are: + +@table @code +@item OUTPUT_NONE +This is essentially a null output. For @option{-k} output it will +simply add a line that reads ``OUTPUT_NONE='' to the output. For +@option{-o} it will cause an empty ``column'' to be included. For +@option{-O} output it will cause extra spaces to separate ``real'' output. +@item SOCKET_TYPE +This will cause the socket type (eg SOCK_STREAM, SOCK_DGRAM) for the +data connection to be output. +@item PROTOCOL +This will cause the protocol used for the data connection to be displayed. +@item DIRECTION +This will display the data flow direction relative to the netperf +process. Units: Send or Recv for a unidirectional bulk-transfer test, +or Send|Recv for a request/response test. +@item ELAPSED_TIME +This will display the elapsed time in seconds for the test. +@item THROUGHPUT +This will display the throughput for the test. Units: As requested via +the global @option{-f} option and displayed by the THROUGHPUT_UNITS +output selector. +@item THROUGHPUT_UNITS +This will display the units for what is displayed by the +@code{THROUGHPUT} output selector. 
+@item LSS_SIZE_REQ +This will display the local (netperf) send socket buffer size (aka +SO_SNDBUF) requested via the command line. Units: Bytes. +@item LSS_SIZE +This will display the local (netperf) send socket buffer size +(SO_SNDBUF) immediately after the data connection socket was created. +Peculiarities of different networking stacks may lead to this +differing from the size requested via the command line. Units: Bytes. +@item LSS_SIZE_END +This will display the local (netperf) send socket buffer size +(SO_SNDBUF) immediately before the data connection socket is closed. +Peculiarities of different networking stacks may lead this to differ +from the size requested via the command line and/or the size +immediately after the data connection socket was created. Units: Bytes. +@item LSR_SIZE_REQ +This will display the local (netperf) receive socket buffer size (aka +SO_RCVBUF) requested via the command line. Units: Bytes. +@item LSR_SIZE +This will display the local (netperf) receive socket buffer size +(SO_RCVBUF) immediately after the data connection socket was created. +Peculiarities of different networking stacks may lead to this +differing from the size requested via the command line. Units: Bytes. +@item LSR_SIZE_END +This will display the local (netperf) receive socket buffer size +(SO_RCVBUF) immediately before the data connection socket is closed. +Peculiarities of different networking stacks may lead this to differ +from the size requested via the command line and/or the size +immediately after the data connection socket was created. Units: Bytes. +@item RSS_SIZE_REQ +This will display the remote (netserver) send socket buffer size (aka +SO_SNDBUF) requested via the command line. Units: Bytes. +@item RSS_SIZE +This will display the remote (netserver) send socket buffer size +(SO_SNDBUF) immediately after the data connection socket was created. +Peculiarities of different networking stacks may lead to this +differing from the size requested via the command line. Units: Bytes. +@item RSS_SIZE_END +This will display the remote (netserver) send socket buffer size +(SO_SNDBUF) immediately before the data connection socket is closed. +Peculiarities of different networking stacks may lead this to differ +from the size requested via the command line and/or the size +immediately after the data connection socket was created. Units: Bytes. +@item RSR_SIZE_REQ +This will display the remote (netserver) receive socket buffer size (aka +SO_RCVBUF) requested via the command line. Units: Bytes. +@item RSR_SIZE +This will display the remote (netserver) receive socket buffer size +(SO_RCVBUF) immediately after the data connection socket was created. +Peculiarities of different networking stacks may lead to this +differing from the size requested via the command line. Units: Bytes. +@item RSR_SIZE_END +This will display the remote (netserver) receive socket buffer size +(SO_RCVBUF) immediately before the data connection socket is closed. +Peculiarities of different networking stacks may lead this to differ +from the size requested via the command line and/or the size +immediately after the data connection socket was created. Units: Bytes. +@item LOCAL_SEND_SIZE +This will display the size of the buffers netperf passed in any +``send'' calls it made on the data connection for a +non-request/response test. Units: Bytes. +@item LOCAL_RECV_SIZE +This will display the size of the buffers netperf passed in any +``receive'' calls it made on the data connection for a +non-request/response test. 
Units: Bytes. +@item REMOTE_SEND_SIZE +This will display the size of the buffers netserver passed in any +``send'' calls it made on the data connection for a +non-request/response test. Units: Bytes. +@item REMOTE_RECV_SIZE +This will display the size of the buffers netserver passed in any +``receive'' calls it made on the data connection for a +non-request/response test. Units: Bytes. +@item REQUEST_SIZE +This will display the size of the requests netperf sent in a +request-response test. Units: Bytes. +@item RESPONSE_SIZE +This will display the size of the responses netserver sent in a +request-response test. Units: Bytes. +@item LOCAL_CPU_UTIL +This will display the overall CPU utilization during the test as +measured by netperf. Units: 0 to 100 percent. +@item LOCAL_CPU_PERCENT_USER +This will display the CPU fraction spent in user mode during the test +as measured by netperf. Only supported by netcpu_procstat. Units: 0 to +100 percent. +@item LOCAL_CPU_PERCENT_SYSTEM +This will display the CPU fraction spent in system mode during the test +as measured by netperf. Only supported by netcpu_procstat. Units: 0 to +100 percent. +@item LOCAL_CPU_PERCENT_IOWAIT +This will display the fraction of time waiting for I/O to complete +during the test as measured by netperf. Only supported by +netcpu_procstat. Units: 0 to 100 percent. +@item LOCAL_CPU_PERCENT_IRQ +This will display the fraction of time servicing interrupts during the +test as measured by netperf. Only supported by netcpu_procstat. Units: +0 to 100 percent. +@item LOCAL_CPU_PERCENT_SWINTR +This will display the fraction of time servicing softirqs during the +test as measured by netperf. Only supported by netcpu_procstat. Units: +0 to 100 percent. +@item LOCAL_CPU_METHOD +This will display the method used by netperf to measure CPU +utilization. Units: single character denoting method. +@item LOCAL_SD +This will display the service demand, or units of CPU consumed per +unit of work, as measured by netperf. Units: microseconds of CPU +consumed per either KB (K==1024) of data transferred or request/response +transaction. +@item REMOTE_CPU_UTIL +This will display the overall CPU utilization during the test as +measured by netserver. Units 0 to 100 percent. +@item REMOTE_CPU_PERCENT_USER +This will display the CPU fraction spent in user mode during the test +as measured by netserver. Only supported by netcpu_procstat. Units: 0 to +100 percent. +@item REMOTE_CPU_PERCENT_SYSTEM +This will display the CPU fraction spent in system mode during the test +as measured by netserver. Only supported by netcpu_procstat. Units: 0 to +100 percent. +@item REMOTE_CPU_PERCENT_IOWAIT +This will display the fraction of time waiting for I/O to complete +during the test as measured by netserver. Only supported by +netcpu_procstat. Units: 0 to 100 percent. +@item REMOTE_CPU_PERCENT_IRQ +This will display the fraction of time servicing interrupts during the +test as measured by netserver. Only supported by netcpu_procstat. Units: +0 to 100 percent. +@item REMOTE_CPU_PERCENT_SWINTR +This will display the fraction of time servicing softirqs during the +test as measured by netserver. Only supported by netcpu_procstat. Units: +0 to 100 percent. +@item REMOTE_CPU_METHOD +This will display the method used by netserver to measure CPU +utilization. Units: single character denoting method. +@item REMOTE_SD +This will display the service demand, or units of CPU consumed per +unit of work, as measured by netserver. 
Units: microseconds of CPU
consumed per either KB (K==1024) of data transferred or
request/response transaction.
@item SD_UNITS
This will display the units for LOCAL_SD and REMOTE_SD.
@item CONFIDENCE_LEVEL
This will display the confidence level requested by the user either
explicitly via the global @option{-I} option, or implicitly via the
global @option{-i} option. The value will be either 95 or 99 if
confidence intervals have been requested or 0 if they were not. Units:
Percent.
@item CONFIDENCE_INTERVAL
This will display the width of the confidence interval requested
either explicitly via the global @option{-I} option or implicitly via
the global @option{-i} option. Units: Width in percent of mean value
computed. A value of -1.0 means that confidence intervals were not requested.
@item CONFIDENCE_ITERATION
This will display the number of test iterations netperf undertook,
perhaps while attempting to achieve the requested confidence interval
and level. If confidence intervals were requested via the command line
then the value will be between 3 and 30. If confidence intervals were
not requested the value will be 1. Units: Iterations.
@item THROUGHPUT_CONFID
This will display the width of the confidence interval actually
achieved for @code{THROUGHPUT} during the test. Units: Width of
interval as percentage of reported throughput value.
@item LOCAL_CPU_CONFID
This will display the width of the confidence interval actually
achieved for overall CPU utilization on the system running netperf
(@code{LOCAL_CPU_UTIL}) during the test, if CPU utilization measurement
was enabled. Units: Width of interval as percentage of reported CPU
utilization.
@item REMOTE_CPU_CONFID
This will display the width of the confidence interval actually
achieved for overall CPU utilization on the system running netserver
(@code{REMOTE_CPU_UTIL}) during the test, if CPU utilization
measurement was enabled. Units: Width of interval as percentage of
reported CPU utilization.
@item TRANSACTION_RATE
This will display the transaction rate in transactions per second for
a request/response test even if the user has requested a throughput in
units of bits or bytes per second via the global @option{-f}
option. It is undefined for a non-request/response test. Units:
Transactions per second.
@item RT_LATENCY
This will display the average round-trip latency for a
request/response test, accounting for the number of transactions in
flight at one time. It is undefined for a non-request/response
test. Units: Microseconds per transaction.
@item BURST_SIZE
This will display the ``burst size'' or added transactions in flight
in a request/response test as requested via a test-specific
@option{-b} option. The number of transactions in flight at one time
will be one greater than this value. It is undefined for a
non-request/response test. Units: added Transactions in flight.
@item LOCAL_TRANSPORT_RETRANS
This will display the number of retransmissions experienced on the
data connection during the test as determined by netperf. A value of
-1 means the attempt to determine the number of retransmissions
failed, the concept was not valid for the given protocol, or the
mechanism is not known for the platform. A value of -2 means it was
not attempted. As of version 2.5.0 the meaning of the values is in
flux and subject to change. Units: number of retransmissions.

@item REMOTE_TRANSPORT_RETRANS
This will display the number of retransmissions experienced on the
data connection during the test as determined by netserver. A value
of -1 means the attempt to determine the number of retransmissions
failed, the concept was not valid for the given protocol, or the
mechanism is not known for the platform. A value of -2 means it was
not attempted. As of version 2.5.0 the meaning of the values is in
flux and subject to change. Units: number of retransmissions.
@item TRANSPORT_MSS
This will display the Maximum Segment Size (aka MSS) or its equivalent
for the protocol being used during the test. A value of -1 means
either the concept of an MSS did not apply to the protocol being used,
or there was an error in retrieving it. Units: Bytes.
@item LOCAL_SEND_THROUGHPUT
The throughput as measured by netperf for the successful ``send''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item LOCAL_RECV_THROUGHPUT
The throughput as measured by netperf for the successful ``receive''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item REMOTE_SEND_THROUGHPUT
The throughput as measured by netserver for the successful ``send''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item REMOTE_RECV_THROUGHPUT
The throughput as measured by netserver for the successful ``receive''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item LOCAL_CPU_BIND
The CPU to which netperf was bound, if at all, during the test. A
value of -1 means that netperf was not explicitly bound to a CPU
during the test. Units: CPU ID.
@item LOCAL_CPU_COUNT
The number of CPUs (cores, threads) detected by netperf. Units: CPU count.
@item LOCAL_CPU_PEAK_UTIL
The utilization of the CPU most heavily utilized during the test, as
measured by netperf. This can be used to see if any one CPU of a
multi-CPU system was saturated even though the overall CPU utilization
as reported by @code{LOCAL_CPU_UTIL} was low. Units: 0 to 100%
@item LOCAL_CPU_PEAK_ID
The id of the CPU most heavily utilized during the test as determined
by netperf. Units: CPU ID.
@item LOCAL_CPU_MODEL
Model information for the processor(s) present on the system running
netperf. Assumes all processors in the system (as perceived by
netperf) on which netperf is running are the same model. Units: Text
@item LOCAL_CPU_FREQUENCY
The frequency of the processor(s) on the system running netperf, at
the time netperf made the call. Assumes that all processors present
in the system running netperf are running at the same
frequency. Units: MHz
@item REMOTE_CPU_BIND
The CPU to which netserver was bound, if at all, during the test. A
value of -1 means that netserver was not explicitly bound to a CPU
during the test. Units: CPU ID.
@item REMOTE_CPU_COUNT
The number of CPUs (cores, threads) detected by netserver. Units: CPU
count.
@item REMOTE_CPU_PEAK_UTIL
The utilization of the CPU most heavily utilized during the test, as
measured by netserver.
This can be used to see if any one CPU of a +multi-CPU system was saturated even though the overall CPU utilization +as reported by @code{REMOTE_CPU_UTIL} was low. Units: 0 to 100% +@item REMOTE_CPU_PEAK_ID +The id of the CPU most heavily utilized during the test as determined +by netserver. Units: CPU ID. +@item REMOTE_CPU_MODEL +Model information for the processor(s) present on the system running +netserver. Assumes all processors in the system (as perceived by +netserver) on which netserver is running are the same model. Units: +Text +@item REMOTE_CPU_FREQUENCY +The frequency of the processor(s) on the system running netserver, at +the time netserver made the call. Assumes that all processors present +in the system running netserver are running at the same +frequency. Units: MHz +@item SOURCE_PORT +The port ID/service name to which the data socket created by netperf +was bound. A value of 0 means the data socket was not explicitly +bound to a port number. Units: ASCII text. +@item SOURCE_ADDR +The name/address to which the data socket created by netperf was +bound. A value of 0.0.0.0 means the data socket was not explicitly +bound to an address. Units: ASCII text. +@item SOURCE_FAMILY +The address family to which the data socket created by netperf was +bound. A value of 0 means the data socket was not explicitly bound to +a given address family. Units: ASCII text. +@item DEST_PORT +The port ID to which the data socket created by netserver was bound. A +value of 0 means the data socket was not explicitly bound to a port +number. Units: ASCII text. +@item DEST_ADDR +The name/address of the data socket created by netserver. Units: +ASCII text. +@item DEST_FAMILY +The address family to which the data socket created by netserver was +bound. A value of 0 means the data socket was not explicitly bound to +a given address family. Units: ASCII text. +@item LOCAL_SEND_CALLS +The number of successful ``send'' calls made by netperf against its +data socket. Units: Calls. +@item LOCAL_RECV_CALLS +The number of successful ``receive'' calls made by netperf against its +data socket. Units: Calls. +@item LOCAL_BYTES_PER_RECV +The average number of bytes per ``receive'' call made by netperf +against its data socket. Units: Bytes. +@item LOCAL_BYTES_PER_SEND +The average number of bytes per ``send'' call made by netperf against +its data socket. Units: Bytes. +@item LOCAL_BYTES_SENT +The number of bytes successfully sent by netperf through its data +socket. Units: Bytes. +@item LOCAL_BYTES_RECVD +The number of bytes successfully received by netperf through its data +socket. Units: Bytes. +@item LOCAL_BYTES_XFERD +The sum of bytes sent and received by netperf through its data +socket. Units: Bytes. +@item LOCAL_SEND_OFFSET +The offset from the alignment of the buffers passed by netperf in its +``send'' calls. Specified via the global @option{-o} option and +defaults to 0. Units: Bytes. +@item LOCAL_RECV_OFFSET +The offset from the alignment of the buffers passed by netperf in its +``receive'' calls. Specified via the global @option{-o} option and +defaults to 0. Units: Bytes. +@item LOCAL_SEND_ALIGN +The alignment of the buffers passed by netperf in its ``send'' calls +as specified via the global @option{-a} option. Defaults to 8. Units: +Bytes. +@item LOCAL_RECV_ALIGN +The alignment of the buffers passed by netperf in its ``receive'' +calls as specified via the global @option{-a} option. Defaults to +8. Units: Bytes. 

@item LOCAL_SEND_WIDTH
The ``width'' of the ring of buffers through which netperf cycles as
it makes its ``send'' calls. Defaults to one more than the local send
socket buffer size divided by the send size as determined at the time
the data socket is created. Can be used to make netperf more processor
data cache unfriendly. Units: number of buffers.
@item LOCAL_RECV_WIDTH
The ``width'' of the ring of buffers through which netperf cycles as
it makes its ``receive'' calls. Defaults to one more than the local
receive socket buffer size divided by the receive size as determined
at the time the data socket is created. Can be used to make netperf
more processor data cache unfriendly. Units: number of buffers.
@item LOCAL_SEND_DIRTY_COUNT
The number of bytes to ``dirty'' (write to) before netperf makes a
``send'' call. Specified via the global @option{-k} option, which
requires that --enable-dirty=yes was specified with the configure
command prior to building netperf. Units: Bytes.
@item LOCAL_RECV_DIRTY_COUNT
The number of bytes to ``dirty'' (write to) before netperf makes a
``recv'' call. Specified via the global @option{-k} option which
requires that --enable-dirty was specified with the configure command
prior to building netperf. Units: Bytes.
@item LOCAL_RECV_CLEAN_COUNT
The number of bytes netperf should read ``cleanly'' before making a
``receive'' call. Specified via the global @option{-k} option which
requires that --enable-dirty was specified with the configure command
prior to building netperf. Clean reads start where dirty writes ended.
Units: Bytes.
@item LOCAL_NODELAY
Indicates whether or not setting the test protocol-specific ``no
delay'' (eg TCP_NODELAY) option on the data socket used by netperf was
requested by the test-specific @option{-D} option and was
successful. Units: 0 means no, 1 means yes.
@item LOCAL_CORK
Indicates whether or not TCP_CORK was set on the data socket used by
netperf as requested via the test-specific @option{-C} option. 1 means
yes, 0 means no/not applicable.
@item REMOTE_SEND_CALLS
@item REMOTE_RECV_CALLS
@item REMOTE_BYTES_PER_RECV
@item REMOTE_BYTES_PER_SEND
@item REMOTE_BYTES_SENT
@item REMOTE_BYTES_RECVD
@item REMOTE_BYTES_XFERD
@item REMOTE_SEND_OFFSET
@item REMOTE_RECV_OFFSET
@item REMOTE_SEND_ALIGN
@item REMOTE_RECV_ALIGN
@item REMOTE_SEND_WIDTH
@item REMOTE_RECV_WIDTH
@item REMOTE_SEND_DIRTY_COUNT
@item REMOTE_RECV_DIRTY_COUNT
@item REMOTE_RECV_CLEAN_COUNT
@item REMOTE_NODELAY
@item REMOTE_CORK
These are all like their ``LOCAL_'' counterparts only for the
netserver rather than netperf.
@item LOCAL_SYSNAME
The name of the OS (eg ``Linux'') running on the system on which
netperf was running. Units: ASCII Text.
@item LOCAL_SYSTEM_MODEL
The model name of the system on which netperf was running. Units:
ASCII Text.
@item LOCAL_RELEASE
The release name/number of the OS running on the system on which
netperf was running. Units: ASCII Text.
@item LOCAL_VERSION
The version number of the OS running on the system on which netperf
was running. Units: ASCII Text.
@item LOCAL_MACHINE
The machine architecture of the machine on which netperf was
running. Units: ASCII Text.
@item REMOTE_SYSNAME
@item REMOTE_SYSTEM_MODEL
@item REMOTE_RELEASE
@item REMOTE_VERSION
@item REMOTE_MACHINE
These are all like their ``LOCAL_'' counterparts only for the
netserver rather than netperf.
+@item LOCAL_INTERFACE_NAME +The name of the probable egress interface through which the data +connection went on the system running netperf. Example: eth0. Units: +ASCII Text. +@item LOCAL_INTERFACE_VENDOR +The vendor ID of the probable egress interface through which traffic +on the data connection went on the system running netperf. Units: +Hexadecimal IDs as might be found in a @file{pci.ids} file or at +@uref{http://pciids.sourceforge.net/,the PCI ID Repository}. +@item LOCAL_INTERFACE_DEVICE +The device ID of the probable egress interface through which traffic +on the data connection went on the system running netperf. Units: +Hexadecimal IDs as might be found in a @file{pci.ids} file or at +@uref{http://pciids.sourceforge.net/,the PCI ID Repository}. +@item LOCAL_INTERFACE_SUBVENDOR +The sub-vendor ID of the probable egress interface through which +traffic on the data connection went on the system running +netperf. Units: Hexadecimal IDs as might be found in a @file{pci.ids} +file or at @uref{http://pciids.sourceforge.net/,the PCI ID +Repository}. +@item LOCAL_INTERFACE_SUBDEVICE +The sub-device ID of the probable egress interface through which +traffic on the data connection went on the system running +netperf. Units: Hexadecimal IDs as might be found in a @file{pci.ids} +file or at @uref{http://pciids.sourceforge.net/,the PCI ID +Repository}. +@item LOCAL_DRIVER_NAME +The name of the driver used for the probable egress interface through +which traffic on the data connection went on the system running +netperf. Units: ASCII Text. +@item LOCAL_DRIVER_VERSION +The version string for the driver used for the probable egress +interface through which traffic on the data connection went on the +system running netperf. Units: ASCII Text. +@item LOCAL_DRIVER_FIRMWARE +The firmware version for the driver used for the probable egress +interface through which traffic on the data connection went on the +system running netperf. Units: ASCII Text. +@item LOCAL_DRIVER_BUS +The bus address of the probable egress interface through which traffic +on the data connection went on the system running netperf. Units: +ASCII Text. +@item LOCAL_INTERFACE_SLOT +The slot ID of the probable egress interface through which traffic +on the data connection went on the system running netperf. Units: +ASCII Text. +@item REMOTE_INTERFACE_NAME +@item REMOTE_INTERFACE_VENDOR +@item REMOTE_INTERFACE_DEVICE +@item REMOTE_INTERFACE_SUBVENDOR +@item REMOTE_INTERFACE_SUBDEVICE +@item REMOTE_DRIVER_NAME +@item REMOTE_DRIVER_VERSION +@item REMOTE_DRIVER_FIRMWARE +@item REMOTE_DRIVER_BUS +@item REMOTE_INTERFACE_SLOT +These are all like their ``LOCAL_'' counterparts only for the +netserver rather than netperf. +@item LOCAL_INTERVAL_USECS +The interval at which bursts of operations (sends, receives, +transactions) were attempted by netperf. Specified by the +global @option{-w} option which requires --enable-intervals to have +been specified with the configure command prior to building +netperf. Units: Microseconds (though specified by default in +milliseconds on the command line) +@item LOCAL_INTERVAL_BURST +The number of operations (sends, receives, transactions depending on +the test) which were attempted by netperf each LOCAL_INTERVAL_USECS +units of time. Specified by the global @option{-b} option which +requires --enable-intervals to have been specified with the configure +command prior to building netperf. Units: number of operations per burst. 

@item REMOTE_INTERVAL_USECS
The interval at which bursts of operations (sends, receives,
transactions) were attempted by netserver. Specified by the
global @option{-w} option which requires --enable-intervals to have
been specified with the configure command prior to building
netperf. Units: Microseconds (though specified by default in
milliseconds on the command line).
@item REMOTE_INTERVAL_BURST
The number of operations (sends, receives, transactions depending on
the test) which were attempted by netserver each REMOTE_INTERVAL_USECS
units of time. Specified by the global @option{-b} option which
requires --enable-intervals to have been specified with the configure
command prior to building netperf. Units: number of operations per burst.
@item LOCAL_SECURITY_TYPE_ID
@item LOCAL_SECURITY_TYPE
@item LOCAL_SECURITY_ENABLED_NUM
@item LOCAL_SECURITY_ENABLED
@item LOCAL_SECURITY_SPECIFIC
@item REMOTE_SECURITY_TYPE_ID
@item REMOTE_SECURITY_TYPE
@item REMOTE_SECURITY_ENABLED_NUM
@item REMOTE_SECURITY_ENABLED
@item REMOTE_SECURITY_SPECIFIC
A bunch of stuff related to what sort of security mechanisms (eg
SELINUX) were enabled on the systems during the test.
@item RESULT_BRAND
The string specified by the user with the global @option{-B}
option. Units: ASCII Text.
@item UUID
The universally unique identifier associated with this test, either
generated automagically by netperf, or passed to netperf via an omni
test-specific @option{-u} option. Note: Future versions may make this
a global command-line option. Units: ASCII Text.
@item MIN_LATENCY
The minimum ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item MAX_LATENCY
The maximum ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item P50_LATENCY
The 50th percentile value of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item P90_LATENCY
The 90th percentile value of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item P99_LATENCY
The 99th percentile value of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item MEAN_LATENCY
The average ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item STDDEV_LATENCY
The standard deviation of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item COMMAND_LINE
The full command line used when invoking netperf. Units: ASCII Text.
@item OUTPUT_END
While emitted with the list of output selectors, it is ignored when
specified as an output selector.
+
+@node Other Netperf Tests, Address Resolution, The Omni Tests, Top
+@chapter Other Netperf Tests
+
+Apart from the typical performance tests, netperf contains some tests
+which can be used to streamline measurements and reporting. These
+include CPU rate calibration (present) and host identification (future
+enhancement).
+
+@menu
+* CPU rate calibration::
+* UUID Generation::
+@end menu
+
+@node CPU rate calibration, UUID Generation, Other Netperf Tests, Other Netperf Tests
+@section CPU rate calibration
+
+Some of the CPU utilization measurement mechanisms of netperf work by
+comparing the rate at which some counter increments when the system is
+idle with the rate at which that same counter increments when the
+system is running a netperf test. The ratio of those rates is used to
+arrive at a CPU utilization percentage.
+
+This means that netperf must know the rate at which the counter
+increments when the system is presumed to be ``idle.'' If it does not
+know the rate, netperf will measure it before starting a data transfer
+test. This calibration step takes 40 seconds for each of the local and
+remote systems, and if it were repeated for every netperf test it
+would make taking repeated measurements rather slow.
+
+Thus, the netperf CPU utilization options @option{-c} and
+@option{-C} can take an optional calibration value. This value is
+used as the ``idle rate'' and the calibration step is not
+performed. To determine the idle rate, netperf can be used to run
+special tests which only report the calibration value: the LOC_CPU
+and REM_CPU tests. These return the calibration value for the local
+and remote system respectively. A common way to use these tests is to
+store their results in environment variables and use those in
+subsequent netperf commands:
+
+@example
+LOC_RATE=`netperf -t LOC_CPU`
+REM_RATE=`netperf -H <remote> -t REM_CPU`
+netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
+...
+netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
+@end example
+
+If you are going to use netperf to measure aggregate results, it is
+important to use the LOC_CPU and REM_CPU tests to get the calibration
+values first; otherwise, some of the concurrent netperf tests may be
+transferring data while others are calibrating their presumably
+``idle'' systems, resulting in bogus calibration values. When running
+aggregate tests, it is very important to remember that any one
+instance of netperf does not know about the other instances of
+netperf. It will report global CPU utilization and will calculate
+service demand believing it was the only thing causing that CPU
+utilization. So, you can use the CPU utilization reported by netperf
+in an aggregate test, but you have to calculate service demands by
+hand.
+
+@node UUID Generation, , CPU rate calibration, Other Netperf Tests
+@section UUID Generation
+
+Beginning with version 2.5.0 netperf can generate Universally Unique
+IDentifiers (UUIDs). This can be done explicitly via the ``UUID''
+test:
+@example
+$ netperf -t UUID
+2c8561ae-9ebd-11e0-a297-0f5bfa0349d0
+@end example
+
+In and of itself, this is not terribly useful, but used in conjunction
+with the test-specific @option{-u} option of an ``omni'' test to set
+the UUID emitted by the @ref{Omni Output Selectors,UUID} output
+selector, it can be used to tie together the separate instances of an
+aggregate netperf test, say, for instance, when the results are
+inserted into a database of some sort.
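+
+For example, one might generate a single UUID up front and pass it to
+each concurrent instance of an aggregate test so their results can be
+matched up later. A sketch, in which the hostnames and the selection
+of output values are hypothetical:
+
+@example
+MY_UUID=`netperf -t UUID`
+netperf -H hosta -t omni -- -u $MY_UUID -o UUID,THROUGHPUT &
+netperf -H hostb -t omni -- -u $MY_UUID -o UUID,THROUGHPUT &
+wait
+@end example
+
+Each instance would then emit the same UUID in its CSV output, giving
+the database a common key for the rows of the aggregate run.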
+
+@node Address Resolution, Enhancing Netperf, Other Netperf Tests, Top
+@comment node-name, next, previous, up
+@chapter Address Resolution
+
+Netperf versions 2.4.0 and later have merged IPv4 and IPv6 tests so
+the functionality of the tests in @file{src/nettest_ipv6.c} has been
+subsumed into the tests in @file{src/nettest_bsd.c}. This has been
+accomplished in part by switching from @code{gethostbyname()} to
+@code{getaddrinfo()} exclusively. While it was theoretically possible
+to get multiple results for a hostname from @code{gethostbyname()} it
+was generally unlikely and netperf's ignoring of the second and later
+results was not much of an issue.
+
+Now with @code{getaddrinfo()} and particularly with @code{AF_UNSPEC}
+it is increasingly likely that a given hostname will have multiple
+associated addresses. The @code{establish_control()} routine of
+@file{src/netlib.c} will indeed attempt to choose from among all the
+matching IP addresses when establishing the control connection.
+Netperf does not @emph{really} care if the control connection is IPv4
+or IPv6 or even mixed on either end.
+
+However, the individual tests still ass-u-me that the first result in
+the address list is the one to be used. Whether or not this will
+turn out to be an issue has yet to be determined.
+
+If you do run into problems with this, the easiest workaround is to
+specify IP addresses for the data connection explicitly in the
+test-specific @option{-H} and @option{-L} options. At some point, the
+netperf tests @emph{may} try to be more sophisticated in their parsing
+of returns from @code{getaddrinfo()} - straw-man patches to
+@email{netperf-feedback@@netperf.org} would of course be most welcome
+:)
+
+Netperf has leveraged code from other open-source projects with
+amenable licensing to provide a replacement @code{getaddrinfo()} call
+on those platforms where the @command{configure} script believes there
+is no native @code{getaddrinfo()} call. As of this writing, the
+replacement @code{getaddrinfo()} has been tested on HP-UX 11.0 and
+then presumed to run elsewhere.
+
+@node Enhancing Netperf, Netperf4, Address Resolution, Top
+@comment node-name, next, previous, up
+@chapter Enhancing Netperf
+
+Netperf is constantly evolving. If you find you want to make
+enhancements to netperf, by all means do so. If you wish to add a new
+``suite'' of tests to netperf the general idea is to:
+
+@enumerate
+@item
+Add files @file{src/nettest_mumble.c} and @file{src/nettest_mumble.h}
+where mumble is replaced with something meaningful for the test-suite.
+@item
+Add support for an appropriate @option{--enable-mumble} option in
+@file{configure.ac}.
+@item
+Edit @file{src/netperf.c}, @file{src/netsh.c}, and
+@file{src/netserver.c} as required, using @code{#ifdef WANT_MUMBLE}.
+@item
+Compile and test.
+@end enumerate
+
+However, with the addition of the ``omni'' tests in version 2.5.0 it
+is preferred that one attempt to make the necessary changes to
+@file{src/nettest_omni.c} rather than adding new source files, unless
+this would make the omni tests entirely too complicated.
+
+If you wish to submit your changes for possible inclusion into the
+mainline sources, please try to base your changes on the latest
+available sources (@pxref{Getting Netperf Bits}) and then send email
+describing the changes at a high level to
+@email{netperf-feedback@@netperf.org} or perhaps
+@email{netperf-talk@@netperf.org}. If the consensus is positive, then
+sending context @command{diff} results to
+@email{netperf-feedback@@netperf.org} is the next step.
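+
+By way of illustration, one hypothetical way to produce context
+@command{diff} results, assuming a pristine copy of the sources was
+kept alongside the modified ones:
+
+@example
+diff -rc netperf-2.6.0.orig netperf-2.6.0 > nettest_mumble.diff
+@end example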
+
+From that point, it is a matter of pestering the Netperf Contributing
+Editor until he gets the changes incorporated :)
+
+@node Netperf4, Concept Index, Enhancing Netperf, Top
+@comment node-name, next, previous, up
+@chapter Netperf4
+
+Netperf4 is the shorthand name given to version 4.X.X of netperf.
+This is really more a separate benchmark than a newer version of
+netperf, but as it is a descendant of netperf the netperf name is
+kept. The facetious way to describe netperf4 is to say it is the
+egg-laying-woolly-milk-pig version of netperf :) The more respectful
+way to describe it is to say it is the version of netperf with support
+for synchronized, multiple-thread, multiple-test, multiple-system,
+network-oriented benchmarking.
+
+Netperf4 is still undergoing evolution. Those wishing to work with or
+on netperf4 are encouraged to join the
+@uref{http://www.netperf.org/cgi-bin/mailman/listinfo/netperf-dev,netperf-dev}
+mailing list and/or peruse the
+@uref{http://www.netperf.org/svn/netperf4/trunk,current sources}.
+
+@node Concept Index, Option Index, Netperf4, Top
+@unnumbered Concept Index
+
+@printindex cp
+
+@node Option Index, , Concept Index, Top
+@comment node-name, next, previous, up
+@unnumbered Option Index
+
+@printindex vr
+@bye
+
+@c LocalWords: texinfo setfilename settitle titlepage vskip pt filll ifnottex
+@c LocalWords: insertcopying cindex dfn uref printindex cp