Diffstat (limited to 'doc/netperf.texi')
-rw-r--r-- | doc/netperf.texi | 4150
1 files changed, 4150 insertions, 0 deletions
diff --git a/doc/netperf.texi b/doc/netperf.texi
new file mode 100644
index 0000000..6b32e60
--- /dev/null
+++ b/doc/netperf.texi
@@ -0,0 +1,4150 @@
+\input texinfo @c -*-texinfo-*-
+@c %**start of header
+@setfilename netperf.info
+@settitle Care and Feeding of Netperf 2.6.X
+@c %**end of header
+
+@copying
+This is Rick Jones' feeble attempt at a Texinfo-based manual for the
+netperf benchmark.
+
+Copyright @copyright{} 2005-2012 Hewlett-Packard Company
+@quotation
+Permission is granted to copy, distribute and/or modify this document
+per the terms of the netperf source license, a copy of which can be
+found in the file @file{COPYING} of the basic netperf distribution.
+@end quotation
+@end copying
+
+@titlepage
+@title Care and Feeding of Netperf
+@subtitle Versions 2.6.0 and Later
+@author Rick Jones @email{rick.jones2@@hp.com}
+@c this is here to start the copyright page
+@page
+@vskip 0pt plus 1filll
+@insertcopying
+@end titlepage
+
+@c begin with a table of contents
+@contents
+
+@ifnottex
+@node Top, Introduction, (dir), (dir)
+@top Netperf Manual
+
+@insertcopying
+@end ifnottex
+
+@menu
+* Introduction:: An introduction to netperf - what it is and what it is not.
+* Installing Netperf:: How to go about installing netperf.
+* The Design of Netperf::
+* Global Command-line Options::
+* Using Netperf to Measure Bulk Data Transfer::
+* Using Netperf to Measure Request/Response ::
+* Using Netperf to Measure Aggregate Performance::
+* Using Netperf to Measure Bidirectional Transfer::
+* The Omni Tests::
+* Other Netperf Tests::
+* Address Resolution::
+* Enhancing Netperf::
+* Netperf4::
+* Concept Index::
+* Option Index::
+@end menu
+
+@node Introduction, Installing Netperf, Top, Top
+@chapter Introduction
+
+@cindex Introduction
+
+Netperf is a benchmark that can be used to measure various aspects of
+networking performance. The primary foci are bulk (aka
+unidirectional) data transfer and request/response performance using
+either TCP or UDP and the Berkeley Sockets interface. As of this
+writing, the tests available either unconditionally or conditionally
+include:
+
+@itemize @bullet
+@item
+TCP and UDP unidirectional transfer and request/response over IPv4 and
+IPv6 using the Sockets interface.
+@item
+TCP and UDP unidirectional transfer and request/response over IPv4
+using the XTI interface.
+@item
+Link-level unidirectional transfer and request/response using the DLPI
+interface.
+@item
+Unix domain sockets
+@item
+SCTP unidirectional transfer and request/response over IPv4 and IPv6
+using the sockets interface.
+@end itemize
+
+While not every revision of netperf will work on every platform
+listed, the intention is that at least some version of netperf will
+work on the following platforms:
+
+@itemize @bullet
+@item
+Unix - at least all the major variants.
+@item
+Linux
+@item
+Windows
+@item
+Others
+@end itemize
+
+Netperf is maintained and informally supported primarily by Rick
+Jones, who can perhaps be best described as Netperf Contributing
+Editor. Non-trivial and very appreciated assistance comes from others
+in the network performance community, who are too numerous to mention
+here. While it is often used by them, netperf is NOT supported via any
+of the formal Hewlett-Packard support channels. You should feel free
+to make enhancements and modifications to netperf to suit your
+nefarious porpoises, so long as you stay within the guidelines of the
+netperf copyright.
+If you feel so inclined, you can send your changes to
+@email{netperf-feedback@@netperf.org,netperf-feedback} for possible
+inclusion into subsequent versions of netperf.
+
+It is the Contributing Editor's belief that the netperf license walks
+like open source and talks like open source. However, the license was
+never submitted for ``certification'' as an open source license. If
+you would prefer to make contributions to a networking benchmark using
+a certified open source license, please consider netperf4, which is
+distributed under the terms of the GPLv2.
+
+The @email{netperf-talk@@netperf.org,netperf-talk} mailing list is
+available to discuss the care and feeding of netperf with others who
+share your interest in network performance benchmarking. The
+netperf-talk mailing list is a closed list (to deal with spam) and you
+must first subscribe by sending email to
+@email{netperf-talk-request@@netperf.org,netperf-talk-request}.
+
+
+@menu
+* Conventions::
+@end menu
+
+@node Conventions, , Introduction, Introduction
+@section Conventions
+
+A @dfn{sizespec} is a one or two item, comma-separated list used as an
+argument to a command-line option that can set one or two related
+netperf parameters. If you wish to set both parameters to separate
+values, items should be separated by a comma:
+
+@example
+parameter1,parameter2
+@end example
+
+If you wish to set the first parameter without altering the value of
+the second from its default, you should follow the first item with a
+comma:
+
+@example
+parameter1,
+@end example
+
+
+Likewise, precede the item with a comma if you wish to set only the
+second parameter:
+
+@example
+,parameter2
+@end example
+
+An item with no commas:
+
+@example
+parameter1and2
+@end example
+
+will set both parameters to the same value. This last mode is one of
+the most frequently used.
+
+There is another variant of the comma-separated, two-item list called
+an @dfn{optionspec} which is like a sizespec with the exception that a
+single item with no comma:
+
+@example
+parameter1
+@end example
+
+will only set the value of the first parameter and will leave the
+second parameter at its default value.
+
+Netperf has two types of command-line options. The first are global
+command-line options. They are essentially any option not tied to a
+particular test or group of tests. An example of a global
+command-line option is the one which sets the test type - @option{-t}.
+
+The second type of options are test-specific options. These are
+options which are only applicable to a particular test or set of
+tests. An example of a test-specific option would be the send socket
+buffer size for a TCP_STREAM test.
+
+Global command-line options are specified first with test-specific
+options following after a @code{--} as in:
+
+@example
+netperf <global> -- <test-specific>
+@end example
+
+
+@node Installing Netperf, The Design of Netperf, Introduction, Top
+@chapter Installing Netperf
+
+@cindex Installation
+
+Netperf's primary form of distribution is source code. This allows
+installation on systems other than those to which the authors have
+ready access and thus the ability to create binaries. There are two
+styles of netperf installation. The first runs the netperf server
+program - netserver - as a child of inetd. This requires the
+installer to have sufficient privileges to edit the files
+@file{/etc/services} and @file{/etc/inetd.conf} or their
+platform-specific equivalents.
+
+The second style is to run netserver as a standalone daemon.
+This second method does not require edit privileges on
+@file{/etc/services} and @file{/etc/inetd.conf} but does mean you
+must remember to run the netserver program explicitly after every
+system reboot.
+
+This manual assumes that those wishing to measure networking
+performance already know how to use anonymous FTP and/or a web
+browser. It is also expected that you have at least a passing
+familiarity with the networking protocols and interfaces involved. In
+all honesty, if you do not have such familiarity, likely as not you
+have some experience to gain before attempting network performance
+measurements. The excellent texts by authors such as Stevens, Fenner
+and Rudoff and/or Stallings would be good starting points. There are
+likely other excellent sources out there as well.
+
+@menu
+* Getting Netperf Bits::
+* Installing Netperf Bits::
+* Verifying Installation::
+@end menu
+
+@node Getting Netperf Bits, Installing Netperf Bits, Installing Netperf, Installing Netperf
+@section Getting Netperf Bits
+
+Gzipped tar files of netperf sources can be retrieved via
+@uref{ftp://ftp.netperf.org/netperf,anonymous FTP}
+for ``released'' versions of the bits. Pre-release versions of the
+bits can be retrieved via anonymous FTP from the
+@uref{ftp://ftp.netperf.org/netperf/experimental,experimental} subdirectory.
+
+For convenience and ease of remembering, a link to the download site
+is provided via the
+@uref{http://www.netperf.org/, NetperfPage}.
+
+The bits corresponding to each discrete release of netperf are
+@uref{http://www.netperf.org/svn/netperf2/tags,tagged} for retrieval
+via subversion. For example, there is a tag for the first version
+corresponding to this version of the manual -
+@uref{http://www.netperf.org/svn/netperf2/tags/netperf-2.6.0,netperf
+2.6.0}. Those wishing to be on the bleeding edge of netperf
+development can use subversion to grab the
+@uref{http://www.netperf.org/svn/netperf2/trunk,top of trunk}. When
+fixing bugs or making enhancements, patches against the top-of-trunk
+are preferred.
+
+There are likely other places around the Internet from which one can
+download netperf bits. These may be simple mirrors of the main
+netperf site, or they may be local variants on netperf. As with
+anything one downloads from the Internet, take care to make sure it is
+what you really wanted and isn't some malicious Trojan or whatnot.
+Caveat downloader.
+
+As a general rule, binaries of netperf and netserver are not
+distributed from ftp.netperf.org. From time to time a kind soul or
+souls has packaged netperf as a Debian package available via the
+apt-get mechanism or as an RPM. I would be most interested in
+learning how to enhance the makefiles to make that easier for people.
+
+@node Installing Netperf Bits, Verifying Installation, Getting Netperf Bits, Installing Netperf
+@section Installing Netperf
+
+Once you have downloaded the tar file of netperf sources onto your
+system(s), it is necessary to unpack the tar file, cd to the netperf
+directory, run configure and then make. Most of the time it should be
+sufficient to just:
+
+@example
+gzcat netperf-<version>.tar.gz | tar xf -
+cd netperf-<version>
+./configure
+make
+make install
+@end example
+
+Most of the ``usual'' configure script options should be present
+dealing with where to install binaries and whatnot.
+@example
+./configure --help
+@end example
+should list all of those and more.
+You may find the @code{--prefix} option helpful in deciding where the
+binaries and such will be put during the @code{make install}.
+
+@vindex --enable-cpuutil, Configure
+If the netperf configure script does not know how to automagically
+detect which CPU utilization mechanism to use on your platform, you
+may want to add a @code{--enable-cpuutil=mumble} option to the
+configure command. If you have knowledge and/or experience to
+contribute to that area, feel free to contact
+@email{netperf-feedback@@netperf.org}.
+
+@vindex --enable-xti, Configure
+@vindex --enable-unixdomain, Configure
+@vindex --enable-dlpi, Configure
+@vindex --enable-sctp, Configure
+Similarly, if you want tests using the XTI interface, Unix Domain
+Sockets, DLPI or SCTP it will be necessary to add one or more
+@code{--enable-[xti|unixdomain|dlpi|sctp]=yes} options to the configure
+command. As of this writing, the configure script will not include
+those tests automagically.
+
+@vindex --enable-omni, Configure
+Starting with version 2.5.0, netperf began migrating most of the
+``classic'' netperf tests found in @file{src/nettest_bsd.c} to the
+so-called ``omni'' tests (aka ``two routines to run them all'') found
+in @file{src/nettest_omni.c}. This migration enables a number of new
+features such as greater control over what output is included, and new
+things to output. The ``omni'' test is enabled by default in 2.5.0
+and a number of the classic tests are migrated - you can tell if a
+test has been migrated from the presence of @code{MIGRATED} in the
+test banner. If you encounter problems with either the omni or
+migrated tests, please first attempt to obtain resolution via
+@email{netperf-talk@@netperf.org} or
+@email{netperf-feedback@@netperf.org}. If that is unsuccessful, you
+can add a @code{--enable-omni=no} to the configure command and the
+omni tests will not be compiled-in and the classic tests will not be
+migrated.
+
+Starting with version 2.5.0, netperf includes the ``burst mode''
+functionality in a default compilation of the bits. If you encounter
+problems with this, please first attempt to obtain help via
+@email{netperf-talk@@netperf.org} or
+@email{netperf-feedback@@netperf.org}. If that is unsuccessful, you
+can add a @code{--enable-burst=no} to the configure command and the
+burst mode functionality will not be compiled-in.
+
+On some platforms, it may be necessary to precede the configure
+command with a CFLAGS and/or LIBS variable as the netperf configure
+script is not yet smart enough to set them itself. Whenever possible,
+these requirements will be found in @file{README.@var{platform}}
+files. Expertise and assistance in making that more automagic in the
+configure script would be most welcome.
+
+@cindex Limiting Bandwidth
+@cindex Bandwidth Limitation
+@vindex --enable-intervals, Configure
+@vindex --enable-histogram, Configure
+Other optional configure-time settings include
+@code{--enable-intervals=yes} to give netperf the ability to ``pace''
+its _STREAM tests and @code{--enable-histogram=yes} to have netperf
+keep a histogram of interesting times. Each of these will have some
+effect on the measured result. If your system supports
+@code{gethrtime()} the effect of the histogram measurement should be
+minimized but probably still measurable.
+For example, the histogram of a netperf TCP_RR test will be of the
+individual transaction times:
+@example
+netperf -t TCP_RR -H lag -v 2
+TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET : histogram
+Local /Remote
+Socket Size Request Resp. Elapsed Trans.
+Send Recv Size Size Time Rate
+bytes Bytes bytes bytes secs. per sec
+
+16384 87380 1 1 10.00 3538.82
+32768 32768
+Alignment Offset
+Local Remote Local Remote
+Send Recv Send Recv
+ 8 0 0 0
+Histogram of request/response times
+UNIT_USEC : 0: 0: 0: 0: 0: 0: 0: 0: 0: 0
+TEN_USEC : 0: 0: 0: 0: 0: 0: 0: 0: 0: 0
+HUNDRED_USEC : 0: 34480: 111: 13: 12: 6: 9: 3: 4: 7
+UNIT_MSEC : 0: 60: 50: 51: 44: 44: 72: 119: 100: 101
+TEN_MSEC : 0: 105: 0: 0: 0: 0: 0: 0: 0: 0
+HUNDRED_MSEC : 0: 0: 0: 0: 0: 0: 0: 0: 0: 0
+UNIT_SEC : 0: 0: 0: 0: 0: 0: 0: 0: 0: 0
+TEN_SEC : 0: 0: 0: 0: 0: 0: 0: 0: 0: 0
+>100_SECS: 0
+HIST_TOTAL: 35391
+@end example
+
+The histogram you see above is basically a base-10 log histogram where
+we can see that most of the transaction times were on the order of one
+hundred to one hundred ninety-nine microseconds, but they were
+occasionally as long as ten to nineteen milliseconds.
+
+The @option{--enable-demo=yes} configure option will cause code to be
+included to report interim results during a test run. The rate at
+which interim results are reported can then be controlled via the
+global @option{-D} option. Here is an example of @option{-D} output:
+
+@example
+$ src/netperf -D 1.35 -H tardy.hpl.hp.com -f M
+MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com (15.9.116.144) port 0 AF_INET : demo
+Interim result: 5.41 MBytes/s over 1.35 seconds ending at 1308789765.848
+Interim result: 11.07 MBytes/s over 1.36 seconds ending at 1308789767.206
+Interim result: 16.00 MBytes/s over 1.36 seconds ending at 1308789768.566
+Interim result: 20.66 MBytes/s over 1.36 seconds ending at 1308789769.922
+Interim result: 22.74 MBytes/s over 1.36 seconds ending at 1308789771.285
+Interim result: 23.07 MBytes/s over 1.36 seconds ending at 1308789772.647
+Interim result: 23.77 MBytes/s over 1.37 seconds ending at 1308789774.016
+Recv Send Send
+Socket Socket Message Elapsed
+Size Size Size Time Throughput
+bytes bytes bytes secs. MBytes/sec
+
+ 87380 16384 16384 10.06 17.81
+@end example
+
+Notice how the units of the interim results track those requested by
+the @option{-f} option. Also notice that sometimes the interval will
+be longer than the value specified in the @option{-D} option. This is
+normal and stems from how demo mode is implemented not by relying on
+interval timers or frequent calls to get the current time, but by
+calculating how many units of work must be performed to take at least
+the desired interval.
+
+Those familiar with this option in earlier versions of netperf will
+note the addition of the ``ending at'' text. This is the time as
+reported by a @code{gettimeofday()} call (or its emulation) with a
+@code{NULL} timezone pointer. This addition is intended to make it
+easier to insert interim results into an
+@uref{http://oss.oetiker.ch/rrdtool/doc/rrdtool.en.html,rrdtool}
+Round-Robin Database (RRD). A likely bug-riddled example of doing so
+can be found in @file{doc/examples/netperf_interim_to_rrd.sh}. The
+time is reported out to milliseconds rather than microseconds because
+that is the most rrdtool understands as of the time of this writing.
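+
+As a compressed, likely oversimplified sketch of that idea (not the
+distributed script - the RRD name, its prior creation with a suitable
+step, and an rrdtool recent enough to accept fractional timestamps
+are all assumptions here), one could pipe the interim results into
+rrdtool's stdin-reading mode:
+
+@example
+netperf -D 1 -H remotehost | \
+  awk '/Interim result:/ @{ print "update netperf.rrd " $NF ":" $3; fflush() @}' | \
+  rrdtool -
+@end example
+
+Here @code{$NF} is the ``ending at'' timestamp and @code{$3} is the
+interim throughput; the @code{fflush()} is there so the updates are
+not held hostage by awk's pipe buffering.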
+
+As of this writing, a @code{make install} will not actually update the
+files @file{/etc/services} and/or @file{/etc/inetd.conf} or their
+platform-specific equivalents. It remains necessary to perform that
+bit of installation magic by hand. Patches to the makefile sources to
+effect an automagic editing of the necessary files to have netperf
+installed as a child of inetd would be most welcome.
+
+Starting the netserver as a standalone daemon should be as easy as:
+@example
+$ netserver
+Starting netserver at port 12865
+Starting netserver at hostname 0.0.0.0 port 12865 and family 0
+@end example
+
+Over time the specifics of the messages netserver prints to the screen
+may change but the gist will remain the same.
+
+If the compilation of netperf or netserver happens to fail, feel free
+to contact @email{netperf-feedback@@netperf.org} or join and ask in
+@email{netperf-talk@@netperf.org}. However, it is quite important
+that you include the actual compilation errors and perhaps even the
+configure log in your email. Otherwise, it will be that much more
+difficult for someone to assist you.
+
+@node Verifying Installation, , Installing Netperf Bits, Installing Netperf
+@section Verifying Installation
+
+Basically, once netperf is installed and netserver is configured as a
+child of inetd, or launched as a standalone daemon, simply typing:
+@example
+netperf
+@end example
+should result in output similar to the following:
+@example
+$ netperf
+TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET
+Recv Send Send
+Socket Socket Message Elapsed
+Size Size Size Time Throughput
+bytes bytes bytes secs. 10^6bits/sec
+
+ 87380 16384 16384 10.00 2997.84
+@end example
+
+
+@node The Design of Netperf, Global Command-line Options, Installing Netperf, Top
+@chapter The Design of Netperf
+
+@cindex Design of Netperf
+
+Netperf is designed around a basic client-server model. There are
+two executables - netperf and netserver. Generally you will only
+execute the netperf program, with the netserver program being invoked
+by the remote system's inetd or having been previously started as its
+own standalone daemon.
+
+When you execute netperf it will establish a ``control connection'' to
+the remote system. This connection will be used to pass test
+configuration information and results to and from the remote system.
+Regardless of the type of test to be run, the control connection will
+be a TCP connection using BSD sockets. The control connection can use
+either IPv4 or IPv6.
+
+Once the control connection is up and the configuration information
+has been passed, a separate ``data'' connection will be opened for the
+measurement itself using the APIs and protocols appropriate for the
+specified test. When the test is completed, the data connection will
+be torn-down and results from the netserver will be passed-back via the
+control connection and combined with netperf's result for display to
+the user.
+
+Netperf places no traffic on the control connection while a test is in
+progress. Certain TCP options, such as SO_KEEPALIVE, if set as your
+systems' default, may put packets out on the control connection while
+a test is in progress. Generally speaking this will have no effect on
+the results.
+
+@menu
+* CPU Utilization::
+@end menu
+
+@node CPU Utilization, , The Design of Netperf, The Design of Netperf
+@section CPU Utilization
+@cindex CPU Utilization
+
+CPU utilization is an important, and alas all-too-infrequently
+reported, component of networking performance. Unfortunately, it can
+be one of the most difficult metrics to measure accurately and
+portably. Netperf will do its level best to report accurate
+CPU utilization figures, but some combinations of processor, OS and
+configuration may make that difficult.
+
+CPU utilization in netperf is reported as a value between 0 and 100%
+regardless of the number of CPUs involved. In addition to CPU
+utilization, netperf will report a metric called a @dfn{service
+demand}. The service demand is the normalization of CPU utilization
+and work performed. For a _STREAM test it is the microseconds of CPU
+time consumed to transfer one KB (K == 1024) of data. For a _RR test
+it is the microseconds of CPU time consumed processing a single
+transaction. For both CPU utilization and service demand, lower is
+better.
+
+Service demand can be particularly useful when trying to gauge the
+effect of a performance change. It is essentially a measure of
+efficiency, with smaller values being more efficient and thus
+``better.''
+
+Netperf is coded to be able to use one of several, generally
+platform-specific CPU utilization measurement mechanisms. Single
+letter codes will be included in the CPU portion of the test banner to
+indicate which mechanism was used on each of the local (netperf) and
+remote (netserver) systems.
+
+As of this writing those codes are:
+
+@table @code
+@item U
+The CPU utilization measurement mechanism was unknown to netperf or
+netperf/netserver was not compiled to include CPU utilization
+measurements. The code for the null CPU utilization mechanism can be
+found in @file{src/netcpu_none.c}.
+@item I
+An HP-UX-specific CPU utilization mechanism whereby the kernel
+incremented a per-CPU counter by one for each trip through the idle
+loop. This mechanism was only available on specially-compiled HP-UX
+kernels prior to HP-UX 10 and is mentioned here only for the sake of
+historical completeness and perhaps as a suggestion to those who might
+be altering other operating systems. While rather simple, perhaps even
+simplistic, this mechanism was quite robust and was not affected by
+the concerns of statistical methods, or methods attempting to track
+time in each of user, kernel, interrupt and idle modes which require
+quite careful accounting. It can be thought-of as the in-kernel
+version of the looper @code{L} mechanism without the context switch
+overhead. This mechanism required calibration.
+@item P
+An HP-UX-specific CPU utilization mechanism whereby the kernel
+keeps-track of time (in the form of CPU cycles) spent in the kernel
+idle loop (HP-UX 10.0 to 11.31 inclusive), or where the kernel keeps
+track of time spent in idle, user, kernel and interrupt processing
+(HP-UX 11.23 and later). The former requires calibration, the latter
+does not. Values in either case are retrieved via one of the pstat(2)
+family of calls, hence the use of the letter @code{P}. The code for
+these mechanisms is found in @file{src/netcpu_pstat.c} and
+@file{src/netcpu_pstatnew.c} respectively.
+@item K
+A Solaris-specific CPU utilization mechanism whereby the kernel keeps
+track of ticks (eg HZ) spent in the idle loop.
+This method is statistical and is known to be inaccurate when the
+interrupt rate is above epsilon as time spent processing interrupts is
+not subtracted from idle. The value is retrieved via a kstat() call -
+hence the use of the letter @code{K}. Since this mechanism uses units
+of ticks (HZ) the calibration value should invariably match HZ (eg
+100). The code for this mechanism is implemented in
+@file{src/netcpu_kstat.c}.
+@item M
+A Solaris-specific mechanism available on Solaris 10 and later which
+uses the new microstate accounting mechanisms. There are, alas, two
+overlapping mechanisms. The first tracks nanoseconds spent in user,
+kernel, and idle modes. The second mechanism tracks nanoseconds spent
+in interrupt. Since the mechanisms overlap, netperf goes through some
+hand-waving to try to ``fix'' the problem. Since the accuracy of the
+handwaving cannot be completely determined, one must presume that
+while better than the @code{K} mechanism, this mechanism too is not
+without issues. The values are retrieved via kstat() calls, but the
+letter code is set to @code{M} to distinguish this mechanism from the
+even less accurate @code{K} mechanism. The code for this mechanism is
+implemented in @file{src/netcpu_kstat10.c}.
+@item L
+A mechanism based on ``looper'' or ``soaker'' processes which sit in
+tight loops counting as fast as they possibly can. This mechanism
+starts a looper process for each known CPU on the system. The effect
+of processor hyperthreading on the mechanism is not yet known. This
+mechanism definitely requires calibration. The code for the
+``looper'' mechanism can be found in @file{src/netcpu_looper.c}.
+@item N
+A Microsoft Windows-specific mechanism, the code for which can be
+found in @file{src/netcpu_ntperf.c}. This mechanism too is based on
+what appears to be a form of micro-state accounting and requires no
+calibration. On laptops, or other systems which may dynamically alter
+the CPU frequency to minimize power consumption, it has been suggested
+that this mechanism may become slightly confused, in which case using
+BIOS/uEFI settings to disable the power saving would be indicated.
+
+@item S
+This mechanism uses @file{/proc/stat} on Linux to retrieve time
+(ticks) spent in idle mode. It is thought but not known to be
+reasonably accurate. The code for this mechanism can be found in
+@file{src/netcpu_procstat.c}.
+@item C
+A mechanism somewhat similar to @code{S} but using the sysctl() call
+on BSD-like Operating systems (*BSD and MacOS X). The code for this
+mechanism can be found in @file{src/netcpu_sysctl.c}.
+@item Others
+Other mechanisms included in netperf in the past have included using
+the times() and getrusage() calls. These calls are actually rather
+poorly suited to the task of measuring CPU overhead for networking as
+they tend to be process-specific and much network-related processing
+can happen outside the context of a process, in places where it is not
+a given it will be charged to the correct process, or to any process
+at all. They are mentioned here as a warning to anyone seeing those
+mechanisms used in other networking benchmarks. These mechanisms are
+not available in netperf 2.4.0 and later.
+@end table
+
+For many platforms, the configure script will choose the best
+available CPU utilization mechanism. However, some platforms have no
+particularly good mechanisms.
+On those platforms, it is probably best to use the ``LOOPER''
+mechanism which is basically some number of processes (as many as
+there are processors) sitting in tight little loops counting as fast
+as they can. The rate at which the loopers count when the system is
+believed to be idle is compared with the rate when the system is
+running netperf and the ratio is used to compute CPU utilization.
+
+In the past, netperf included some mechanisms that only reported CPU
+time charged to the calling process. Those mechanisms have been
+removed from netperf versions 2.4.0 and later because they are
+hopelessly inaccurate. Networking can and often does result in CPU
+time being spent in places - such as interrupt contexts - that do not
+get charged to the correct process, or to any process at all.
+
+In fact, time spent in the processing of interrupts is a common issue
+for many CPU utilization mechanisms. In particular, the ``PSTAT''
+mechanism was eventually known to have problems accounting for certain
+interrupt time prior to HP-UX 11.11 (11iv1). HP-UX 11iv2 and later
+are known/presumed to be good. The ``KSTAT'' mechanism is known to
+have problems on all versions of Solaris up to and including Solaris
+10. Even the microstate accounting available via kstat in Solaris 10
+has issues, though perhaps not as bad as those of prior versions.
+
+The /proc/stat mechanism under Linux is in what the author would
+consider an ``uncertain'' category as it appears to be statistical,
+which may also have issues with time spent processing interrupts.
+
+In summary, be sure to ``sanity-check'' the CPU utilization figures
+with other mechanisms. However, platform tools such as top, vmstat or
+mpstat are often based on the same mechanisms used by netperf.
+
+@menu
+* CPU Utilization in a Virtual Guest::
+@end menu
+
+@node CPU Utilization in a Virtual Guest, , CPU Utilization, CPU Utilization
+@subsection CPU Utilization in a Virtual Guest
+
+The CPU utilization mechanisms used by netperf are ``inline'' in that
+they are run by the same netperf or netserver process as is running
+the test itself. This works just fine for ``bare iron'' tests but
+runs into a problem when using virtual machines.
+
+The relationship between virtual guest and hypervisor can be thought
+of as being similar to that between a process and kernel in a bare
+iron system. As such, (m)any CPU utilization mechanisms used in the
+virtual guest are similar to ``process-local'' mechanisms in a bare
+iron situation. However, just as with bare iron and process-local
+mechanisms, much networking processing happens outside the context of
+the virtual guest. It takes place in the hypervisor, and is not
+visible to mechanisms running in the guest(s). For this reason, one
+should not really trust CPU utilization figures reported by netperf or
+netserver when running in a virtual guest.
+
+If one is looking to measure the added overhead of a virtualization
+mechanism, rather than rely on CPU utilization, one can rely instead
+on netperf _RR tests - path-lengths and overheads can be a significant
+fraction of the latency, so increases in overhead should appear as
+decreases in transaction rate. Whatever you do, @b{DO NOT} rely on
+the throughput of a _STREAM test. Achieving link-rate can be done via
+a multitude of options that mask overhead rather than eliminate it.
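+
+As a concrete sketch of that approach (the host names here are
+hypothetical), one might run the same request/response test against a
+bare-iron path and a virtualized path and compare the transaction
+rates, attributing a drop in the latter to added overhead rather than
+trusting guest-reported CPU utilization:
+
+@example
+netperf -H bareiron.example.com -t TCP_RR -l 30
+netperf -H guest.example.com -t TCP_RR -l 30
+@end example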
+
+@node Global Command-line Options, Using Netperf to Measure Bulk Data Transfer, The Design of Netperf, Top
+@chapter Global Command-line Options
+
+This section describes each of the global command-line options
+available in the netperf and netserver binaries. Essentially, it is
+an expanded version of the usage information displayed by netperf or
+netserver when invoked with the @option{-h} global command-line
+option.
+
+@menu
+* Command-line Options Syntax::
+* Global Options::
+@end menu
+
+@node Command-line Options Syntax, Global Options, Global Command-line Options, Global Command-line Options
+@comment node-name, next, previous, up
+@section Command-line Options Syntax
+
+Revision 1.8 of netperf introduced enough new functionality to overrun
+the English alphabet for mnemonic command-line option names, and the
+author was not and is not quite ready to switch to the contemporary
+@option{--mumble} style of command-line options. (Call him a Luddite
+if you wish :).
+
+For this reason, the command-line options were split into two parts -
+the first are the global command-line options. They are options that
+affect nearly any and every test type of netperf. The second type are
+the test-specific command-line options. Both are entered on the same
+command line, but they must be separated from one another by a @code{--}
+for correct parsing. Global command-line options come first, followed
+by the @code{--} and then test-specific command-line options. If there
+are no test-specific options to be set, the @code{--} may be omitted. If
+there are no global command-line options to be set, test-specific
+options must still be preceded by a @code{--}. For example:
+@example
+netperf <global> -- <test-specific>
+@end example
+sets both global and test-specific options:
+@example
+netperf <global>
+@end example
+sets just global options and:
+@example
+netperf -- <test-specific>
+@end example
+sets just test-specific options.
+
+@node Global Options, , Command-line Options Syntax, Global Command-line Options
+@comment node-name, next, previous, up
+@section Global Options
+
+@table @code
+@vindex -a, Global
+@item -a <sizespec>
+This option allows you to alter the alignment of the buffers used in
+the sending and receiving calls on the local system. Changing the
+alignment of the buffers can force the system to use different copy
+schemes, which can have a measurable effect on performance. If the
+page size for the system were 4096 bytes, and you want to pass
+page-aligned buffers beginning on page boundaries, you could use
+@samp{-a 4096}. By default the units are bytes, but a suffix of
+``G,'' ``M,'' or ``K'' will specify the units to be 2^30 (GB), 2^20
+(MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m'' or ``k''
+will specify units of 10^9, 10^6 or 10^3 bytes respectively.
+[Default: 8 bytes]
+
+@vindex -A, Global
+@item -A <sizespec>
+This option is identical to the @option{-a} option with the difference
+being it affects alignments for the remote system.
+
+@vindex -b, Global
+@item -b <size>
+This option is only present when netperf has been configured with
+--enable-intervals=yes prior to compilation. It sets the size of the
+burst of send calls in a _STREAM test. When used in conjunction with
+the @option{-w} option it can cause the rate at which data is sent to
+be ``paced.''
+
+@vindex -B, Global
+@item -B <string>
+This option will cause @option{<string>} to be appended to the brief
+(see -P) output of netperf.
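+For example (a hypothetical invocation), the following would display
+just the test's single figure of merit with ``first flow'' appended,
+which can be handy for telling apart the output of several concurrent
+netperf instances:
+@example
+netperf -H lag -P 0 -v 0 -B "first flow"
+@end example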
+
+@vindex -c, Global
+@item -c [rate]
+This option will ask that CPU utilization and service demand be
+calculated for the local system. For those CPU utilization mechanisms
+requiring calibration, the optional rate parameter may be specified to
+preclude running another calibration step, saving 40 seconds of time.
+For those CPU utilization mechanisms requiring no calibration, the
+optional rate parameter will be utterly and completely ignored.
+[Default: no CPU measurements]
+
+@vindex -C, Global
+@item -C [rate]
+This option requests CPU utilization and service demand calculations
+for the remote system. It is otherwise identical to the @option{-c}
+option.
+
+@vindex -d, Global
+@item -d
+Each instance of this option will increase the quantity of debugging
+output displayed during a test. If the debugging output level is set
+high enough, it may have a measurable effect on performance.
+Debugging information for the local system is printed to stdout.
+Debugging information for the remote system is sent by default to the
+file @file{/tmp/netperf.debug}. [Default: no debugging output]
+
+@vindex -D, Global
+@item -D [interval,units]
+This option is only available when netperf is configured with
+--enable-demo=yes. When set, it will cause netperf to emit periodic
+reports of performance during the run. [@var{interval},@var{units}]
+follow the semantics of an optionspec. If specified,
+@var{interval} gives the minimum interval in real seconds; it does not
+have to be whole seconds. The @var{units} value can be used for the
+first guess as to how many units of work (bytes or transactions) must
+be done to take at least @var{interval} seconds. If omitted,
+@var{interval} defaults to one second and @var{units} to values
+specific to each test type.
+
+@vindex -f, Global
+@item -f G|M|K|g|m|k|x
+This option can be used to change the reporting units for _STREAM
+tests. Arguments of ``G,'' ``M,'' or ``K'' will set the units to
+2^30, 2^20 or 2^10 bytes/s respectively (eg power-of-two GB, MB or
+KB). Arguments of ``g,'' ``m'' or ``k'' will set the units to 10^9,
+10^6 or 10^3 bits/s respectively. An argument of ``x'' requests the
+units be transactions per second and is only meaningful for a
+request-response test. [Default: ``m'' or 10^6 bits/s]
+
+@vindex -F, Global
+@item -F <fillfile>
+This option specifies the file from which send buffers will be
+pre-filled. While the buffers will contain data from the specified
+file, the file is not fully transferred to the remote system as the
+receiving end of the test will not write the contents of what it
+receives to a file. This can be used to pre-fill the send buffers
+with data having different compressibility and so is useful when
+measuring performance over mechanisms which perform compression.
+
+While previously required for a TCP_SENDFILE test, later versions of
+netperf removed that restriction, creating a temporary file as
+needed. While the author cannot recall exactly when that took place,
+it is known to be unnecessary in version 2.5.0 and later.
+
+@vindex -h, Global
+@item -h
+This option causes netperf to display its ``global'' usage string and
+exit to the exclusion of all else.
+
+@vindex -H, Global
+@item -H <optionspec>
+This option will set the name of the remote system and/or the address
+family used for the control connection. For example:
+@example
+-H linger,4
+@end example
+will set the name of the remote system to ``linger'' and tell netperf
+to use IPv4 addressing only.
+@example
+-H ,6
+@end example
+will leave the name of the remote system at its default, and request
+that only IPv6 addresses be used for the control connection.
+@example
+-H lag
+@end example
+will set the name of the remote system to ``lag'' and leave the
+address family to AF_UNSPEC which means selection of IPv4 vs IPv6 is
+left to the system's address resolution.
+
+A value of ``inet'' can be used in place of ``4'' to request IPv4 only
+addressing. Similarly, a value of ``inet6'' can be used in place of
+``6'' to request IPv6 only addressing. A value of ``0'' can be used
+to request either IPv4 or IPv6 addressing as name resolution dictates.
+
+By default, the options set with the global @option{-H} option are
+inherited by the test for its data connection, unless a test-specific
+@option{-H} option is specified.
+
+If a @option{-H} option follows either the @option{-4} or @option{-6}
+options, the family setting specified with the @option{-H} option will
+override the @option{-4} or @option{-6} options for the remote address
+family. If no address family is specified, settings from a previous
+@option{-4} or @option{-6} option will remain. In a nutshell, the
+last explicit global command-line option wins.
+
+[Default: ``localhost'' for the remote name/IP address and ``0'' (eg
+AF_UNSPEC) for the remote address family.]
+
+@vindex -I, Global
+@item -I <optionspec>
+This option enables the calculation of confidence intervals and sets
+the confidence and width parameters with the first half of the
+optionspec being either 99 or 95 for 99% or 95% confidence
+respectively. The second value of the optionspec specifies the width
+of the desired confidence interval. For example
+@example
+-I 99,5
+@end example
+asks netperf to be 99% confident that the measured mean values for
+throughput and CPU utilization are within +/- 2.5% of the ``real''
+mean values. If the @option{-i} option is specified and the
+@option{-I} option is omitted, the confidence defaults to 99% and the
+width to 5% (giving +/- 2.5%).
+
+If a classic netperf test calculates that the desired confidence
+intervals have not been met, it emits a noticeable warning that cannot
+be suppressed with the @option{-P} or @option{-v} options:
+
+@example
+netperf -H tardy.cup -i 3 -I 99,5
+TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.cup.hp.com (15.244.44.58) port 0 AF_INET : +/-2.5% @ 99% conf.
+!!! WARNING
+!!! Desired confidence was not achieved within the specified iterations.
+!!! This implies that there was variability in the test environment that
+!!! must be investigated before going further.
+!!! Confidence intervals: Throughput : 6.8%
+!!! Local CPU util : 0.0%
+!!! Remote CPU util : 0.0%
+
+Recv Send Send
+Socket Socket Message Elapsed
+Size Size Size Time Throughput
+bytes bytes bytes secs. 10^6bits/sec
+
+ 32768 16384 16384 10.01 40.23
+@end example
+
+In the example above we see that netperf did not meet the desired
+confidence intervals. Instead of being 99% confident it was within
++/- 2.5% of the real mean value of throughput, it is only confident it
+was within +/- 3.4%. In this example, increasing the @option{-i}
+option (described below) and/or increasing the iteration length with
+the @option{-l} option might resolve the situation.
+
+In an explicit ``omni'' test, failure to meet the confidence intervals
+will not result in netperf emitting a warning.
+To verify whether or not the confidence intervals were hit, one will
+need to include them as part of an @ref{Omni Output Selection,output
+selection} in the test-specific @option{-o}, @option{-O} or
+@option{-k} output selection options. The warning about not hitting
+the confidence intervals will remain in a ``migrated'' classic netperf
+test.
+
+@vindex -i, Global
+@item -i <sizespec>
+This option enables the calculation of confidence intervals and sets
+the minimum and maximum number of iterations to run in attempting to
+achieve the desired confidence interval. The first value sets the
+maximum number of iterations to run, the second, the minimum. The
+maximum number of iterations is silently capped at 30 and the minimum
+is silently floored at 3. Netperf repeats the measurement the minimum
+number of iterations and continues until it reaches either the
+desired confidence interval, or the maximum number of iterations,
+whichever comes first. A classic or migrated netperf test will not
+display the actual number of iterations run. An @ref{The Omni
+Tests,omni test} will emit the number of iterations run if the
+@code{CONFIDENCE_ITERATION} output selector is included in the
+@ref{Omni Output Selection,output selection}.
+
+If the @option{-I} option is specified and the @option{-i} option
+omitted, the maximum number of iterations is set to 10 and the minimum
+to three.
+
+Output of a warning upon not hitting the desired confidence intervals
+follows the description provided for the @option{-I} option.
+
+The total test time will be somewhere between the minimum and maximum
+number of iterations multiplied by the test length supplied by the
+@option{-l} option.
+
+@vindex -j, Global
+@item -j
+This option instructs netperf to keep additional timing statistics
+when explicitly running an @ref{The Omni Tests,omni test}. These can
+be output when the test-specific @option{-o}, @option{-O} or
+@option{-k} @ref{Omni Output Selectors,output selectors} include one
+or more of:
+
+@itemize
+@item MIN_LATENCY
+@item MAX_LATENCY
+@item P50_LATENCY
+@item P90_LATENCY
+@item P99_LATENCY
+@item MEAN_LATENCY
+@item STDDEV_LATENCY
+@end itemize
+
+These statistics will be based on an expanded (100 buckets per row
+rather than 10) histogram of times rather than a terribly long list of
+individual times. As such, there will be some slight error thanks to
+the bucketing. However, the reduction in storage and processing
+overheads is well worth it. When running a request/response test, one
+might get some idea of the error by comparing the @ref{Omni Output
+Selectors,@code{MEAN_LATENCY}} calculated from the histogram with the
+@code{RT_LATENCY} calculated from the number of request/response
+transactions and the test run time.
+
+In the case of a request/response test the latencies will be
+transaction latencies. In the case of a receive-only test they will
+be time spent in the receive call. In the case of a send-only test
+they will be time spent in the send call. The units will be
+microseconds. Added in netperf 2.5.0.
+
+@vindex -l, Global
+@item -l testlen
+This option controls the length of any @b{one} iteration of the requested
+test. A positive value for @var{testlen} will run each iteration of
+the test for at least @var{testlen} seconds. A negative value for
+@var{testlen} will run each iteration for the absolute value of
+@var{testlen} transactions for a _RR test or bytes for a _STREAM test.
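+For example (the host name here is hypothetical), the following would
+run each iteration for 10000 transactions rather than for a fixed
+number of seconds:
+@example
+netperf -H remotehost -t TCP_RR -l -10000
+@end example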
+Certain tests, notably those using UDP, can only be timed; they cannot
+be limited by transaction or byte count. This limitation may be
+relaxed in an @ref{The Omni Tests,omni} test.
+
+In some situations, individual iterations of a test may run for longer
+than the number of seconds specified by the @option{-l} option. In
+particular, this may occur for those tests where the socket buffer
+size(s) are significantly larger than the bandwidthXdelay product of
+the link(s) over which the data connection passes, or those tests
+where there may be non-trivial numbers of retransmissions.
+
+If confidence intervals are enabled via either @option{-I} or
+@option{-i} the total length of the netperf test will be somewhere
+between the minimum and maximum iteration count multiplied by
+@var{testlen}.
+
+@vindex -L, Global
+@item -L <optionspec>
+This option is identical to the @option{-H} option with the difference
+being it sets the _local_ hostname/IP and/or address family
+information. This option is generally unnecessary, but can be useful
+when you wish to make sure that the netperf control and data
+connections go via different paths. It can also come in handy if one
+is trying to run netperf through those evil, end-to-end breaking
+things known as firewalls.
+
+[Default: 0.0.0.0 (eg INADDR_ANY) for IPv4 and ::0 for IPv6 for the
+local name. AF_UNSPEC for the local address family.]
+
+@vindex -n, Global
+@item -n numcpus
+This option tells netperf how many CPUs it should ass-u-me are active
+on the system running netperf. In particular, this is used for the
+@ref{CPU Utilization,CPU utilization} and service demand calculations.
+On certain systems, netperf is able to determine the number of CPUs
+automagically. This option will override any number netperf might be
+able to determine on its own.
+
+Note that this option does _not_ set the number of CPUs on the system
+running netserver. When netperf/netserver cannot automagically
+determine the number of CPUs, that can only be set for netserver via a
+netserver @option{-n} command-line option.
+
+As it is almost universally possible for netperf/netserver to
+determine the number of CPUs on the system automagically, 99 times out
+of 10 this option should not be necessary and may be removed in a
+future release of netperf.
+
+@vindex -N, Global
+@item -N
+This option tells netperf to forgo establishing a control
+connection. This makes it possible to run some limited netperf
+tests without a corresponding netserver on the remote system.
+
+With this option set, the test to be run must get all the addressing
+information it needs to establish its data connection from the command
+line or internal defaults. If not otherwise specified by
+test-specific command line options, the data connection for a
+``STREAM'' or ``SENDFILE'' test will be to the ``discard'' port, an
+``RR'' test will be to the ``echo'' port, and a ``MAERTS'' test will
+be to the ``chargen'' port.
+
+The response size of an ``RR'' test will be silently set to be the
+same as the request size. Otherwise the test would hang if the
+response size was larger than the request size, or would report an
+incorrect, inflated transaction rate if the response size was less
+than the request size.
+
+Since there is no control connection when this option is specified, it
+is not possible to set ``remote'' properties such as socket buffer
+size and the like via the netperf command line. Nor is it possible to
+retrieve such interesting remote information as CPU utilization.
+These items will be displayed as values which should make it
+immediately obvious that that was the case.
+
+The only way to change remote characteristics such as socket buffer
+size or to obtain information such as CPU utilization is to employ
+platform-specific methods on the remote system. Frankly, if one has
+access to the remote system to employ those methods, one ought to be
+able to run a netserver there. However, that ability may not be
+present in certain ``support'' situations, hence the addition of this
+option.
+
+Added in netperf 2.4.3.
+
+@vindex -o, Global
+@item -o <sizespec>
+The value(s) passed-in with this option will be used as an offset
+added to the alignment specified with the @option{-a} option. For
+example:
+@example
+-o 3 -a 4096
+@end example
+will cause the buffers passed to the local (netperf) send and receive
+calls to begin three bytes past an address aligned to 4096
+bytes. [Default: 0 bytes]
+
+@vindex -O, Global
+@item -O <sizespec>
+This option behaves just as the @option{-o} option but on the remote
+(netserver) system and in conjunction with the @option{-A}
+option. [Default: 0 bytes]
+
+@vindex -p, Global
+@item -p <optionspec>
+The first value of the optionspec passed-in with this option tells
+netperf the port number at which it should expect the remote netserver
+to be listening for control connections. The second value of the
+optionspec will request netperf to bind to that local port number
+before establishing the control connection. For example
+@example
+-p 12345
+@end example
+tells netperf that the remote netserver is listening on port 12345 and
+leaves selection of the local port number for the control connection
+up to the local TCP/IP stack whereas
+@example
+-p ,32109
+@end example
+leaves the remote netserver port at the default value of 12865 and
+causes netperf to bind to the local port number 32109 before
+connecting to the remote netserver.
+
+In general, setting the local port number is only necessary when one
+is looking to run netperf through those evil, end-to-end breaking
+things known as firewalls.
+
+@vindex -P, Global
+@item -P 0|1
+A value of ``1'' for the @option{-P} option will enable display of
+the test banner. A value of ``0'' will disable display of the test
+banner. One might want to disable display of the test banner when
+running the same basic test type (eg TCP_STREAM) multiple times in
+succession where the test banners would then simply be redundant and
+unnecessarily clutter the output. [Default: 1 - display test banners]
+
+@vindex -s, Global
+@item -s <seconds>
+This option will cause netperf to sleep @samp{<seconds>} before
+actually transferring data over the data connection. This may be
+useful in situations where one wishes to start a great many netperf
+instances and does not want the earlier ones affecting the ability of
+the later ones to get established.
+
+Added somewhere between versions 2.4.3 and 2.5.0.
+
+@vindex -S, Global
+@item -S
+This option will cause an attempt to be made to set SO_KEEPALIVE on
+the data socket of a test using the BSD sockets interface. The
+attempt will be made on the netperf side of all tests, and will be
+made on the netserver side of an @ref{The Omni Tests,omni} or
+@ref{Migrated Tests,migrated} test. No indication of failure is given
+unless debug output is enabled with the global @option{-d} option.
+
+Added in version 2.5.0.
+
+@vindex -t, Global
+@item -t testname
+This option is used to tell netperf which test you wish to run.
+As of this writing, valid values for @var{testname} include:
+@itemize
+@item
+@ref{TCP_STREAM}, @ref{TCP_MAERTS}, @ref{TCP_SENDFILE}, @ref{TCP_RR}, @ref{TCP_CRR}, @ref{TCP_CC}
+@item
+@ref{UDP_STREAM}, @ref{UDP_RR}
+@item
+@ref{XTI_TCP_STREAM}, @ref{XTI_TCP_RR}, @ref{XTI_TCP_CRR}, @ref{XTI_TCP_CC}
+@item
+@ref{XTI_UDP_STREAM}, @ref{XTI_UDP_RR}
+@item
+@ref{SCTP_STREAM}, @ref{SCTP_RR}
+@item
+@ref{DLCO_STREAM}, @ref{DLCO_RR}, @ref{DLCL_STREAM}, @ref{DLCL_RR}
+@item
+@ref{Other Netperf Tests,LOC_CPU}, @ref{Other Netperf Tests,REM_CPU}
+@item
+@ref{The Omni Tests,OMNI}
+@end itemize
+Not all tests are always compiled into netperf. In particular, the
+``XTI,'' ``SCTP,'' ``UNIXDOMAIN,'' and ``DL*'' tests are only included in
+netperf when configured with
+@option{--enable-[xti|sctp|unixdomain|dlpi]=yes}.
+
+Netperf only runs one type of test no matter how many @option{-t}
+options may be present on the command-line. The last @option{-t}
+global command-line option will determine the test to be
+run. [Default: TCP_STREAM]
+
+@vindex -T, Global
+@item -T <optionspec>
+This option controls the CPU, and probably by extension memory,
+affinity of netperf and/or netserver.
+@example
+netperf -T 1
+@end example
+will bind both netperf and netserver to ``CPU 1'' on their respective
+systems.
+@example
+netperf -T 1,
+@end example
+will bind just netperf to ``CPU 1'' and will leave netserver unbound.
+@example
+netperf -T ,2
+@end example
+will leave netperf unbound and will bind netserver to ``CPU 2.''
+@example
+netperf -T 1,2
+@end example
+will bind netperf to ``CPU 1'' and netserver to ``CPU 2.''
+
+This can be particularly useful when investigating performance issues
+involving where processes run relative to where NIC interrupts are
+processed or where NICs allocate their DMA buffers.
+
+@vindex -v, Global
+@item -v verbosity
+This option controls how verbose netperf will be in its output, and is
+often used in conjunction with the @option{-P} option. If the
+verbosity is set to a value of ``0'' then only the test's SFM (Single
+Figure of Merit) is displayed. If local @ref{CPU Utilization,CPU
+utilization} is requested via the @option{-c} option then the SFM is
+the local service demand. Otherwise, if remote CPU utilization is
+requested via the @option{-C} option then the SFM is the remote
+service demand. If neither local nor remote CPU utilization are
+requested the SFM will be the measured throughput or transaction rate
+as implied by the test specified with the @option{-t} option.
+
+If the verbosity level is set to ``1'' then the ``normal'' netperf
+result output for each test is displayed.
+
+If the verbosity level is set to ``2'' then ``extra'' information will
+be displayed. This may include, but is not limited to the number of
+send or recv calls made and the average number of bytes per send or
+recv call, or a histogram of the time spent in each send() call or for
+each transaction if netperf was configured with
+@option{--enable-histogram=yes}. [Default: 1 - normal verbosity]
+
+In an @ref{The Omni Tests,omni} test the verbosity setting is largely
+ignored, save for when asking for the time histogram to be displayed.
+In version 2.5.0 and later there is no @ref{Omni Output Selectors,output
+selector} for the histogram and so it remains displayed only when the
+verbosity level is set to 2.
+
+@vindex -V, Global
+@item -V
+This option displays the netperf version and then exits.
+
+Added in netperf 2.4.4.
+
+@vindex -w, Global
+@item -w time
+If netperf was configured with @option{--enable-intervals=yes} then
+this value will set the inter-burst time to @var{time} milliseconds,
+and the @option{-b} option will set the number of sends per burst.
+The actual inter-burst time may vary depending on the system's timer
+resolution.
+
+@vindex -W, Global
+@item -W <sizespec>
+This option controls the number of buffers in the send (first or only
+value) and/or receive (second or only value) buffer rings. Unlike
+some benchmarks, netperf does not continuously send or receive from a
+single buffer. Instead it rotates through a ring of
+buffers. [Default: One more than the size of the send or receive
+socket buffer sizes (@option{-s} and/or @option{-S} options) divided
+by the send @option{-m} or receive @option{-M} buffer size
+respectively]
+
+@vindex -4, Global
+@item -4
+Specifying this option will set both the local and remote address
+families to AF_INET - that is, use only IPv4 addresses on the control
+connection. This can be overridden by a subsequent @option{-6},
+@option{-H} or @option{-L} option. Basically, the last option
+explicitly specifying an address family wins. Unless overridden by a
+test-specific option, this will be inherited for the data connection
+as well.
+
+@vindex -6, Global
+@item -6
+Specifying this option will set both local and remote address
+families to AF_INET6 - that is, use only IPv6 addresses on the control
+connection. This can be overridden by a subsequent @option{-4},
+@option{-H} or @option{-L} option. Basically, the last address family
+explicitly specified wins. Unless overridden by a test-specific
+option, this will be inherited for the data connection as well.
+
+@end table
+
+
+@node Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Request/Response , Global Command-line Options, Top
+@chapter Using Netperf to Measure Bulk Data Transfer
+
+The most commonly measured aspect of networked system performance is
+that of bulk or unidirectional transfer performance. Everyone wants
+to know how many bits or bytes per second they can push across the
+network. The classic netperf convention for a bulk data transfer test
+name is to tack a ``_STREAM'' suffix to a test name.
+
+@menu
+* Issues in Bulk Transfer::
+* Options common to TCP UDP and SCTP tests::
+@end menu
+
+@node Issues in Bulk Transfer, Options common to TCP UDP and SCTP tests, Using Netperf to Measure Bulk Data Transfer, Using Netperf to Measure Bulk Data Transfer
+@comment node-name, next, previous, up
+@section Issues in Bulk Transfer
+
+There are any number of things which can affect the performance of a
+bulk transfer test.
+
+Certainly, absent compression, bulk-transfer tests can be limited by
+the speed of the slowest link in the path from the source to the
+destination. If testing over a gigabit link, you will not see more
+than a gigabit :) Such situations can be described as being
+@dfn{network-limited} or @dfn{NIC-limited}.
+
+CPU utilization can also affect the results of a bulk-transfer test.
+If the networking stack requires a certain number of instructions or
+CPU cycles per KB of data transferred, and the CPU is limited in the
+number of instructions or cycles it can provide, then the transfer can
+be described as being @dfn{CPU-bound}.
+
+A bulk-transfer test can be CPU bound even when netperf reports less
+than 100% CPU utilization. This can happen on an MP system where one
+or more of the CPUs saturate at 100% but other CPUs remain idle.
+Typically, a single flow of data, such as that from a single instance
+of a netperf _STREAM test, cannot make use of much more than the power
+of one CPU. Exceptions to this generally occur when netperf and/or
+netserver run on CPU(s) other than the CPU(s) taking interrupts from
+the NIC(s). In that case, one might see as much as two CPUs' worth of
+processing being used to service the flow of data.
+
+Distance and the speed-of-light can affect performance for a
+bulk-transfer; often this can be mitigated by using larger windows.
+One common limit to the performance of a transport using window-based
+flow-control is:
+@example
+Throughput <= WindowSize/RoundTripTime
+@end example
+This is because the sender can only have a window's worth of data
+outstanding on the network at any one time, and the soonest it can
+receive a window update from the receiver is one RoundTripTime (RTT).
+TCP and SCTP are examples of such protocols.
+
+Packet losses and their effects can be particularly bad for
+performance. This is especially true if the packet losses result in
+retransmission timeouts for the protocol(s) involved. By the time a
+retransmission timeout has happened, the flow or connection has sat
+idle for a considerable length of time.
+
+On many platforms, some variant on the @command{netstat} command can
+be used to retrieve statistics about packet loss and
+retransmission. For example:
+@example
+netstat -p tcp
+@end example
+will retrieve TCP statistics on the HP-UX Operating System. On other
+platforms, it may not be possible to retrieve statistics for a
+specific protocol and something like:
+@example
+netstat -s
+@end example
+would be used instead.
+
+Many times, such network statistics are kept from the time the stack
+started, and we are only really interested in statistics from when
+netperf was running. In such situations something along the lines of:
+@example
+netstat -p tcp > before
+netperf -t TCP_mumble...
+netstat -p tcp > after
+@end example
+is indicated. The
+@uref{ftp://ftp.cup.hp.com/dist/networking/tools/,beforeafter} utility
+can be used to subtract the statistics in @file{before} from the
+statistics in @file{after}:
+@example
+beforeafter before after > delta
+@end example
+and then one can look at the statistics in @file{delta}. Beforeafter
+is distributed in source form so one can compile it on the platform(s)
+of interest.
+
+If running a version 2.5.0 or later ``omni'' test under Linux, one can
+include either or both of:
+@itemize
+@item LOCAL_TRANSPORT_RETRANS
+@item REMOTE_TRANSPORT_RETRANS
+@end itemize
+
+in the values provided via a test-specific @option{-o}, @option{-O},
+or @option{-k} output selection option and netperf will report the
+retransmissions experienced on the data connection, as reported via a
+@code{getsockopt(TCP_INFO)} call. If confidence intervals have been
+requested via the global @option{-I} or @option{-i} options, the
+reported value(s) will be for the last iteration. If the test is over
+a protocol other than TCP, or on a platform other than Linux, the
+results are undefined.
+
+While it was written with HP-UX's netstat in mind, the
+@uref{ftp://ftp.cup.hp.com/dist/networking/briefs/annotated_netstat.txt,annotated
+netstat} writeup may be helpful with other platforms as well.
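+
+As a concrete illustration of the LOCAL_TRANSPORT_RETRANS and
+REMOTE_TRANSPORT_RETRANS selectors mentioned above, here is a minimal
+sketch (the hostname is hypothetical, and THROUGHPUT is assumed to be
+among the available omni output selectors, included so the
+retransmission counts have some context):
+@example
+netperf -t omni -H remotehost -- -o THROUGHPUT,LOCAL_TRANSPORT_RETRANS,REMOTE_TRANSPORT_RETRANS
+@end example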
+
+@node Options common to TCP UDP and SCTP tests, , Issues in Bulk Transfer, Using Netperf to Measure Bulk Data Transfer
+@comment node-name, next, previous, up
+@section Options common to TCP UDP and SCTP tests
+
+Many ``test-specific'' options are actually common across the
+different tests. For those tests involving TCP, UDP and SCTP, whether
+using the BSD Sockets or the XTI interface, those common options
+include:
+
+@table @code
+@vindex -h, Test-specific
+@item -h
+Display the test-suite-specific usage string and exit. For a TCP_ or
+UDP_ test this will be the usage string from the source file
+@file{nettest_bsd.c}. For an XTI_ test, this will be the usage string
+from the source file @file{nettest_xti.c}. For an SCTP test, this
+will be the usage string from the source file @file{nettest_sctp.c}.
+
+@vindex -H, Test-specific
+@item -H <optionspec>
+Normally, the remote hostname|IP and address family information is
+inherited from the settings for the control connection (eg global
+command-line @option{-H}, @option{-4} and/or @option{-6} options).
+The test-specific @option{-H} will override those settings for the
+data (aka test) connection only. Settings for the control connection
+are left unchanged.
+
+@vindex -L, Test-specific
+@item -L <optionspec>
+The test-specific @option{-L} option is identical to the test-specific
+@option{-H} option except it affects the local hostname|IP and address
+family information. As with its global command-line counterpart, this
+is generally only useful when measuring through those evil, end-to-end
+breaking things called firewalls.
+
+@vindex -m, Test-specific
+@item -m bytes
+Set the size of the buffer passed-in to the ``send'' calls of a
+_STREAM test. Note that this may have only an indirect effect on the
+size of the packets sent over the network, and certain Layer 4
+protocols do _not_ preserve or enforce message boundaries, so setting
+@option{-m} for the send size does not necessarily mean the receiver
+will receive that many bytes at any one time. By default the units are
+bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units to
+be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,''
+``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-m 32K}
+@end example
+will set the size to 32KB or 32768 bytes. [Default: the local send
+socket buffer size for the connection - either the system's default or
+the value set via the @option{-s} option.]
+
+@vindex -M, Test-specific
+@item -M bytes
+Set the size of the buffer passed-in to the ``recv'' calls of a
+_STREAM test. This will be an upper bound on the number of bytes
+received per receive call. By default the units are bytes, but a
+suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
+(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
+or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes respectively.
+For example:
+@example
+@code{-M 32K}
+@end example
+will set the size to 32KB or 32768 bytes. [Default: the remote receive
+socket buffer size for the data connection - either the system's
+default or the value set via the @option{-S} option.]
+
+@vindex -P, Test-specific
+@item -P <optionspec>
+Set the local and/or remote port numbers for the data connection.
+
+@vindex -s, Test-specific
+@item -s <sizespec>
+This option sets the local (netperf) send and receive socket buffer
+sizes for the data connection to the value(s) specified.
Often, this
+will affect the advertised and/or effective TCP or other window, but
+on some platforms it may not. By default the units are bytes, but a
+suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
+(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
+or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-s 128K}
+@end example
+will request the local send and receive socket buffer sizes to be
+128KB or 131072 bytes.
+
+While the historic expectation is that setting the socket buffer size
+has a direct effect on, say, the TCP window, today that may not hold
+true for all stacks. Further, while the historic expectation is that
+the value specified in a @code{setsockopt()} call will be the value returned
+via a @code{getsockopt()} call, at least one stack is known to deliberately
+ignore history. When running under Windows a value of 0 may be used
+which will be an indication to the stack the user wants to enable a
+form of copy avoidance. [Default: -1 - use the system's default socket
+buffer sizes]
+
+@vindex -S, Test-specific
+@item -S <sizespec>
+This option sets the remote (netserver) send and/or receive socket
+buffer sizes for the data connection to the value(s) specified.
+Often, this will affect the advertised and/or effective TCP or other
+window, but on some platforms it may not. By default the units are
+bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units to
+be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of
+``g,'' ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-S 128K}
+@end example
+will request the remote send and receive socket buffer sizes to be
+128KB or 131072 bytes.
+
+While the historic expectation is that setting the socket buffer size
+has a direct effect on, say, the TCP window, today that may not hold
+true for all stacks. Further, while the historic expectation is that
+the value specified in a @code{setsockopt()} call will be the value returned
+via a @code{getsockopt()} call, at least one stack is known to deliberately
+ignore history. When running under Windows a value of 0 may be used
+which will be an indication to the stack the user wants to enable a
+form of copy avoidance. [Default: -1 - use the system's default socket
+buffer sizes]
+
+@vindex -4, Test-specific
+@item -4
+Set the local and remote address family for the data connection to
+AF_INET - ie use IPv4 addressing only. Just as with their global
+command-line counterparts, the last of the @option{-4}, @option{-6},
+@option{-H} or @option{-L} options wins for their respective address
+families.
+
+@vindex -6, Test-specific
+@item -6
+This option is identical to its @option{-4} cousin, but requests IPv6
+addresses for the local and remote ends of the data connection.
+
+@end table
+
+
+@menu
+* TCP_STREAM::
+* TCP_MAERTS::
+* TCP_SENDFILE::
+* UDP_STREAM::
+* XTI_TCP_STREAM::
+* XTI_UDP_STREAM::
+* SCTP_STREAM::
+* DLCO_STREAM::
+* DLCL_STREAM::
+* STREAM_STREAM::
+* DG_STREAM::
+@end menu
+
+@node TCP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests, Options common to TCP UDP and SCTP tests
+@subsection TCP_STREAM
+
+The TCP_STREAM test is the default test in netperf. It is quite
+simple, transferring some quantity of data from the system running
+netperf to the system running netserver.
While time spent
+establishing the connection is not included in the throughput
+calculation, time spent flushing the last of the data to the remote at
+the end of the test is. This is how netperf knows that all the data
+it sent was received by the remote. In addition to the @ref{Options
+common to TCP UDP and SCTP tests,options common to STREAM tests}, the
+following test-specific options can be included to possibly alter the
+behavior of the test:
+
+@table @code
+@item -C
+This option will set TCP_CORK mode on the data connection on those
+systems where TCP_CORK is defined (typically Linux). A full
+description of TCP_CORK is beyond the scope of this manual, but in a
+nutshell it forces sub-MSS sends to be buffered so every segment sent
+is a full Maximum Segment Size (MSS) unless the application performs
+an explicit flush operation or the connection is closed. At present
+netperf does not perform any explicit flush operations. Setting
+TCP_CORK may improve the bitrate of tests where the ``send size''
+(@option{-m} option) is smaller than the MSS. It should also improve
+(make smaller) the service demand.
+
+The Linux tcp(7) manpage states that TCP_CORK cannot be used in
+conjunction with TCP_NODELAY (set via the @option{-D} option);
+however, netperf does not validate command-line options to enforce
+that.
+
+@item -D
+This option will set TCP_NODELAY on the data connection on those
+systems where TCP_NODELAY is defined. This disables something known
+as the Nagle Algorithm, which is intended to make the segments TCP
+sends as large as reasonably possible. Setting TCP_NODELAY for a
+TCP_STREAM test should have no effect when the send size
+(@option{-m} option) is larger than the MSS, and should decrease
+reported bitrate and increase service demand when the send size is
+smaller than the MSS. This stems from TCP_NODELAY causing each
+sub-MSS send to be its own TCP segment rather than being aggregated
+with other small sends. This means more trips up and down the
+protocol stack per KB of data transferred, which means greater CPU
+utilization.
+
+If setting TCP_NODELAY with @option{-D} affects throughput and/or
+service demand for tests where the send size (@option{-m}) is larger
+than the MSS, it suggests the TCP/IP stack's implementation of the
+Nagle Algorithm _may_ be broken, perhaps interpreting the Nagle
+Algorithm on a segment-by-segment basis rather than the proper
+user-send-by-user-send basis. However, a better test of this can be
+achieved with the @ref{TCP_RR} test.
+
+@end table
+
+Here is an example of a basic TCP_STREAM test, in this case from a
+Debian Linux (2.6 kernel) system to an HP-UX 11iv2 (HP-UX 11.23)
+system:
+
+@example
+$ netperf -H lag
+TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
+Recv   Send    Send
+Socket Socket  Message  Elapsed
+Size   Size    Size     Time     Throughput
+bytes  bytes   bytes    secs.    10^6bits/sec
+
+ 32768  16384  16384    10.00      80.42
+@end example
+
+We see that the default receive socket buffer size for the receiver
+(lag - HP-UX 11.23) is 32768 bytes, and the default socket send buffer
+size for the sender (Debian 2.6 kernel) is 16384 bytes. However, Linux
+does ``auto tuning'' of socket buffer and TCP window sizes, which
+means the send socket buffer size may be different at the end of the
+test than it was at the beginning. This is addressed in the @ref{The
+Omni Tests,omni tests} added in version 2.5.0 and @ref{Omni Output
+Selection,output selection}.
Throughput is expressed as 10^6 (aka
+Mega) bits per second, and the test ran for 10 seconds. IPv4
+addresses (AF_INET) were used.
+
+@node TCP_MAERTS, TCP_SENDFILE, TCP_STREAM, Options common to TCP UDP and SCTP tests
+@comment node-name, next, previous, up
+@subsection TCP_MAERTS
+
+A TCP_MAERTS (MAERTS is STREAM backwards) test is ``just like'' a
+@ref{TCP_STREAM} test except the data flows from the netserver to the
+netperf. The global command-line @option{-F} option is ignored for
+this test type. The test-specific command-line @option{-C} option is
+ignored for this test type.
+
+Here is an example of a TCP_MAERTS test between the same two systems
+as in the example for the @ref{TCP_STREAM} test. This time we request
+larger socket buffers with @option{-s} and @option{-S} options:
+
+@example
+$ netperf -H lag -t TCP_MAERTS -- -s 128K -S 128K
+TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
+Recv   Send    Send
+Socket Socket  Message  Elapsed
+Size   Size    Size     Time     Throughput
+bytes  bytes   bytes    secs.    10^6bits/sec
+
+221184 131072 131072    10.03      81.14
+@end example
+
+Here we see that Linux, unlike HP-UX, may not return the same value
+in a @code{getsockopt()} as was requested in the prior @code{setsockopt()}.
+
+This test is included more for benchmarking convenience than anything
+else.
+
+@node TCP_SENDFILE, UDP_STREAM, TCP_MAERTS, Options common to TCP UDP and SCTP tests
+@comment node-name, next, previous, up
+@subsection TCP_SENDFILE
+
+The TCP_SENDFILE test is ``just like'' a @ref{TCP_STREAM} test except
+netperf uses the platform's @code{sendfile()} call instead of calling
+@code{send()}. Often this results in a @dfn{zero-copy} operation
+where data is sent directly from the filesystem buffer cache. This
+_should_ result in lower CPU utilization and possibly higher
+throughput. If it does not, then you may want to contact your
+vendor(s) because they have a problem on their hands.
+
+Zero-copy mechanisms may also alter the characteristics of the packets
+passed to the NIC - their size and the number of buffers per packet.
+In many stacks, when a copy is performed, the stack can ``reserve''
+space at the beginning of the destination buffer for things like TCP,
+IP and Link headers. The packet is then contained in a single buffer,
+which can be easier to DMA to the NIC. When no copy is performed,
+there is no opportunity to reserve space for headers and so a packet
+will be contained in two or more buffers.
+
+As of some time before version 2.5.0, the @ref{Global Options,global
+@option{-F} option} is no longer required for this test. If it is not
+specified, netperf will create a temporary file, which it will delete
+at the end of the test. If the @option{-F} option is specified, it
+must reference a file of at least the size of the send ring
+(@pxref{Global Options,the global @option{-W} option}) multiplied by
+the send size (@pxref{Options common to TCP UDP and SCTP tests,the
+test-specific @option{-m} option}). All other TCP-specific options
+remain available and optional.
+
+In this first example:
+@example
+$ netperf -H lag -F ../src/netperf -t TCP_SENDFILE -- -s 128K -S 128K
+TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
+alloc_sendfile_buf_ring: specified file too small.
+file must be larger than send_width * send_size
+@end example
+
+we see what happens when the file is too small.
Here:
+
+@example
+$ netperf -H lag -F /boot/vmlinuz-2.6.8-1-686 -t TCP_SENDFILE -- -s 128K -S 128K
+TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lag.hpl.hp.com (15.4.89.214) port 0 AF_INET
+Recv   Send    Send
+Socket Socket  Message  Elapsed
+Size   Size    Size     Time     Throughput
+bytes  bytes   bytes    secs.    10^6bits/sec
+
+131072 221184 221184    10.02      81.83
+@end example
+
+we resolve that issue by selecting a larger file.
+
+
+@node UDP_STREAM, XTI_TCP_STREAM, TCP_SENDFILE, Options common to TCP UDP and SCTP tests
+@subsection UDP_STREAM
+
+A UDP_STREAM test is similar to a @ref{TCP_STREAM} test except UDP is
+used as the transport rather than TCP.
+
+@cindex Limiting Bandwidth
+A UDP_STREAM test has no end-to-end flow control - UDP provides none
+and neither does netperf. However, if you wish, you can configure
+netperf with @code{--enable-intervals=yes} to enable the global
+command-line @option{-b} and @option{-w} options to pace bursts of
+traffic onto the network.
+
+This has a number of implications.
+
+The biggest of these implications is that the data which is sent might
+not be received by the remote. For this reason, the output of a
+UDP_STREAM test shows both the sending and receiving throughput. On
+some platforms, it may be possible for the sending throughput to be
+reported as a value greater than the maximum rate of the link. This
+is common when the CPU(s) are faster than the network and there is no
+@dfn{intra-stack} flow-control.
+
+Here is an example of a UDP_STREAM test between two systems connected
+by a 10 Gigabit Ethernet link:
+@example
+$ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 32768
+UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
+Socket  Message  Elapsed      Messages
+Size    Size     Time         Okay Errors   Throughput
+bytes   bytes    secs            #      #   10^6bits/sec
+
+124928   32768   10.00      105672      0    2770.20
+135168           10.00      104844           2748.50
+
+@end example
+
+The first line of numbers shows statistics from the sending (netperf)
+side. The second line of numbers is from the receiving (netserver)
+side. In this case, 105672 - 104844 or 828 messages did not make it
+all the way to the remote netserver process.
+
+If the value of the @option{-m} option is larger than the local send
+socket buffer size (@option{-s} option), netperf will likely abort with
+an error message about how the send call failed:
+
+@example
+netperf -t UDP_STREAM -H 192.168.2.125
+UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
+udp_send: data send error: Message too long
+@end example
+
+If the value of the @option{-m} option is larger than the remote
+socket receive buffer, the reported receive throughput will likely be
+zero as the remote UDP will discard the messages as being too large to
+fit into the socket buffer.
+
+@example
+$ netperf -t UDP_STREAM -H 192.168.2.125 -- -m 65000 -S 32768
+UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
+Socket  Message  Elapsed      Messages
+Size    Size     Time         Okay Errors   Throughput
+bytes   bytes    secs            #      #   10^6bits/sec
+
+124928   65000   10.00       53595      0    2786.99
+ 65536           10.00           0           0.00
+@end example
+
+The example above was between a pair of systems running a ``Linux''
+kernel. Notice that the remote Linux system returned a value larger
+than that passed-in to the @option{-S} option. In fact, this value
+was larger than the message size set with the @option{-m} option.
+
+That the remote socket buffer size is reported as 65536 bytes would
+suggest to any sane person that a message of 65000 bytes would fit,
+but the socket isn't _really_ 65536 bytes, even though Linux is
+telling us so. Go figure.
+
+@node XTI_TCP_STREAM, XTI_UDP_STREAM, UDP_STREAM, Options common to TCP UDP and SCTP tests
+@subsection XTI_TCP_STREAM
+
+An XTI_TCP_STREAM test is simply a @ref{TCP_STREAM} test using the XTI
+rather than BSD Sockets interface. The test-specific @option{-X
+<devspec>} option can be used to specify the name of the local and/or
+remote XTI device files, which is required by the @code{t_open()} call
+made by netperf XTI tests.
+
+The XTI_TCP_STREAM test is only present if netperf was configured with
+@code{--enable-xti=yes}. The remote netserver must have also been
+configured with @code{--enable-xti=yes}.
+
+@node XTI_UDP_STREAM, SCTP_STREAM, XTI_TCP_STREAM, Options common to TCP UDP and SCTP tests
+@subsection XTI_UDP_STREAM
+
+An XTI_UDP_STREAM test is simply a @ref{UDP_STREAM} test using the XTI
+rather than BSD Sockets interface. The test-specific @option{-X
+<devspec>} option can be used to specify the name of the local and/or
+remote XTI device files, which is required by the @code{t_open()} call
+made by netperf XTI tests.
+
+The XTI_UDP_STREAM test is only present if netperf was configured with
+@code{--enable-xti=yes}. The remote netserver must have also been
+configured with @code{--enable-xti=yes}.
+
+@node SCTP_STREAM, DLCO_STREAM, XTI_UDP_STREAM, Options common to TCP UDP and SCTP tests
+@subsection SCTP_STREAM
+
+An SCTP_STREAM test is essentially a @ref{TCP_STREAM} test using SCTP
+rather than TCP. The @option{-D} option will set SCTP_NODELAY, which
+is much like the TCP_NODELAY option for TCP. The @option{-C} option
+is not applicable to an SCTP test as there is no corresponding
+SCTP_CORK option. The author is still figuring out what the
+test-specific @option{-N} option does :)
+
+The SCTP_STREAM test is only present if netperf was configured with
+@code{--enable-sctp=yes}. The remote netserver must have also been
+configured with @code{--enable-sctp=yes}.
+
+@node DLCO_STREAM, DLCL_STREAM, SCTP_STREAM, Options common to TCP UDP and SCTP tests
+@subsection DLCO_STREAM
+
+A DLPI Connection Oriented Stream (DLCO_STREAM) test is very similar
+in concept to a @ref{TCP_STREAM} test. Both use reliable,
+connection-oriented protocols. The DLPI test differs from the TCP
+test in that its protocol operates only at the link-level and does not
+include TCP-style segmentation and reassembly. This last difference
+means that the value passed-in with the @option{-m} option must be
+less than the interface MTU. Otherwise, the @option{-m} and
+@option{-M} options are just like their TCP/UDP/SCTP counterparts.
+
+Other DLPI-specific options include:
+
+@table @code
+@item -D <devspec>
+This option is used to provide the fully-qualified names for the local
+and/or remote DLPI device files. The syntax is otherwise identical to
+that of a @dfn{sizespec}.
+@item -p <ppaspec>
+This option is used to specify the local and/or remote DLPI PPA(s).
+The PPA is used to identify the interface over which traffic is to be
+sent/received. The syntax of a @dfn{ppaspec} is otherwise the same as
+a @dfn{sizespec}.
+@item -s sap
+This option specifies the 802.2 SAP for the test. A SAP is somewhat
+like either the port field of a TCP or UDP header or the protocol
+field of an IP header.
The specified SAP should not conflict with any
+other active SAPs on the specified PPAs (@option{-p} option).
+@item -w <sizespec>
+This option specifies the local send and receive window sizes in units
+of frames on those platforms which support setting such things.
+@item -W <sizespec>
+This option specifies the remote send and receive window sizes in
+units of frames on those platforms which support setting such things.
+@end table
+
+The DLCO_STREAM test is only present if netperf was configured with
+@code{--enable-dlpi=yes}. The remote netserver must have also been
+configured with @code{--enable-dlpi=yes}.
+
+
+@node DLCL_STREAM, STREAM_STREAM, DLCO_STREAM, Options common to TCP UDP and SCTP tests
+@subsection DLCL_STREAM
+
+A DLPI ConnectionLess Stream (DLCL_STREAM) test is analogous to a
+@ref{UDP_STREAM} test in that both make use of unreliable/best-effort,
+connection-less transports. The DLCL_STREAM test differs from the
+@ref{UDP_STREAM} test in that the message size (@option{-m} option) must
+always be less than the link MTU as there is no IP-like fragmentation
+and reassembly available and netperf does not presume to provide one.
+
+The test-specific command-line options for a DLCL_STREAM test are the
+same as those for a @ref{DLCO_STREAM} test.
+
+The DLCL_STREAM test is only present if netperf was configured with
+@code{--enable-dlpi=yes}. The remote netserver must have also been
+configured with @code{--enable-dlpi=yes}.
+
+@node STREAM_STREAM, DG_STREAM, DLCL_STREAM, Options common to TCP UDP and SCTP tests
+@comment node-name, next, previous, up
+@subsection STREAM_STREAM
+
+A Unix Domain Stream Socket Stream test (STREAM_STREAM) is similar in
+concept to a @ref{TCP_STREAM} test, but using Unix Domain sockets. It is,
+naturally, limited to intra-machine traffic. A STREAM_STREAM test
+shares the @option{-m}, @option{-M}, @option{-s} and @option{-S}
+options of the other _STREAM tests. In a STREAM_STREAM test the
+@option{-p} option sets the directory in which the pipes will be
+created rather than setting a port number. The default is to create
+the pipes in the system default for the @code{tempnam()} call.
+
+The STREAM_STREAM test is only present if netperf was configured with
+@code{--enable-unixdomain=yes}. The remote netserver must have also been
+configured with @code{--enable-unixdomain=yes}.
+
+@node DG_STREAM, , STREAM_STREAM, Options common to TCP UDP and SCTP tests
+@comment node-name, next, previous, up
+@subsection DG_STREAM
+
+A Unix Domain Datagram Socket Stream test (DG_STREAM) is very much
+like a @ref{TCP_STREAM} test except that message boundaries are preserved.
+In this way, it may also be considered similar to certain flavors of
+SCTP test which can also preserve message boundaries.
+
+All the options of a @ref{STREAM_STREAM} test are applicable to a DG_STREAM
+test.
+
+The DG_STREAM test is only present if netperf was configured with
+@code{--enable-unixdomain=yes}. The remote netserver must have also been
+configured with @code{--enable-unixdomain=yes}.
+
+
+@node Using Netperf to Measure Request/Response , Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bulk Data Transfer, Top
+@chapter Using Netperf to Measure Request/Response
+
+Request/response performance is often overlooked, yet it is just as
+important as bulk-transfer performance.
While things like larger
+socket buffers and TCP windows, and stateless offloads like TSO and
+LRO can cover a multitude of latency and even path-length sins, those
+sins cannot easily hide from a request/response test. The convention
+for a request/response test is to have a _RR suffix. There are,
+however, a few ``request/response'' tests that have other suffixes.
+
+A request/response test, particularly a synchronous, one transaction
+at a time test such as those found by default in netperf, is
+particularly sensitive to the path-length of the networking stack.
+An _RR test can also uncover those platforms where the NICs are
+strapped by default with overbearing interrupt avoidance settings in
+an attempt to increase the bulk-transfer performance (or rather,
+decrease the CPU utilization of a bulk-transfer test). This
+sensitivity is most acute for small request and response sizes, such
+as the single-byte default for a netperf _RR test.
+
+While a bulk-transfer test reports its results in units of bits or
+bytes transferred per second, by default a mumble_RR test reports
+transactions per second where a transaction is defined as the
+completed exchange of a request and a response. One can invert the
+transaction rate to arrive at the average round-trip latency. If one
+is confident about the symmetry of the connection, the average one-way
+latency can be taken as one-half the average round-trip latency. As of
+version 2.5.0 (actually slightly before) netperf still does not do the
+latter, but will do the former if one sets the verbosity to 2 for a
+classic netperf test, or includes the appropriate @ref{Omni Output
+Selectors,output selector} in an @ref{The Omni Tests,omni test}. It
+will also allow the user to switch the throughput units from
+transactions per second to bits or bytes per second with the global
+@option{-f} option.
+
+@menu
+* Issues in Request/Response::
+* Options Common to TCP UDP and SCTP _RR tests::
+@end menu
+
+@node Issues in Request/Response, Options Common to TCP UDP and SCTP _RR tests, Using Netperf to Measure Request/Response , Using Netperf to Measure Request/Response
+@comment node-name, next, previous, up
+@section Issues in Request/Response
+
+Most if not all the @ref{Issues in Bulk Transfer} apply to
+request/response. The issue of round-trip latency is even more
+important as netperf generally only has one transaction outstanding at
+a time.
+
+A single instance of a one transaction outstanding _RR test should
+_never_ completely saturate the CPU of a system. If testing between
+otherwise evenly matched systems, the symmetric nature of a _RR test
+with equal request and response sizes should result in equal CPU
+loading on both systems. However, this may not hold true on MP
+systems, particularly if one binds netperf and netserver to different
+CPUs via the global @option{-T} option.
+
+For smaller request and response sizes, packet loss is a bigger issue
+as there is no opportunity for a @dfn{fast retransmit} or
+retransmission prior to a retransmission timer expiring.
+
+Virtualization may considerably increase the effective path length of
+a networking stack. While this may not preclude achieving link-rate
+on a comparatively slow link (eg 1 Gigabit Ethernet) on a _STREAM
+test, it can show up as measurably fewer transactions per second on an
+_RR test. However, this may still be masked by interrupt coalescing
+in the NIC/driver.
+
+Certain NICs have ways to minimize the number of interrupts sent to
+the host.
If these are strapped badly, they can significantly reduce
+the performance of something like a single-byte request/response test.
+Such setups are distinguished by seriously low reported CPU utilization
+and what seems like a low (even if in the thousands) transaction per
+second rate. Also, if you run such an OS/driver combination on faster
+or slower hardware and do not see a corresponding change in the
+transaction rate, chances are good that the driver is strapping the
+NIC with aggressive interrupt avoidance settings. Good for bulk
+throughput, but bad for latency.
+
+Some drivers may try to automagically adjust the interrupt avoidance
+settings. If they are not terribly good at it, you will see
+considerable run-to-run variation in reported transaction rates,
+particularly if you ``mix up'' _STREAM and _RR tests.
+
+
+@node Options Common to TCP UDP and SCTP _RR tests, , Issues in Request/Response, Using Netperf to Measure Request/Response
+@comment node-name, next, previous, up
+@section Options Common to TCP UDP and SCTP _RR tests
+
+Many ``test-specific'' options are actually common across the
+different tests. For those tests involving TCP, UDP and SCTP, whether
+using the BSD Sockets or the XTI interface, those common options
+include:
+
+@table @code
+@vindex -h, Test-specific
+@item -h
+Display the test-suite-specific usage string and exit. For a TCP_ or
+UDP_ test this will be the usage string from the source file
+@file{nettest_bsd.c}. For an XTI_ test, this will be the usage string
+from the source file @file{src/nettest_xti.c}. For an SCTP test, this
+will be the usage string from the source file
+@file{src/nettest_sctp.c}.
+
+@vindex -H, Test-specific
+@item -H <optionspec>
+Normally, the remote hostname|IP and address family information is
+inherited from the settings for the control connection (eg global
+command-line @option{-H}, @option{-4} and/or @option{-6} options).
+The test-specific @option{-H} will override those settings for the
+data (aka test) connection only. Settings for the control connection
+are left unchanged. This might be used to cause the control and data
+connections to take different paths through the network.
+
+@vindex -L, Test-specific
+@item -L <optionspec>
+The test-specific @option{-L} option is identical to the test-specific
+@option{-H} option except it affects the local hostname|IP and address
+family information. As with its global command-line counterpart, this
+is generally only useful when measuring through those evil, end-to-end
+breaking things called firewalls.
+
+@vindex -P, Test-specific
+@item -P <optionspec>
+Set the local and/or remote port numbers for the data connection.
+
+@vindex -r, Test-specific
+@item -r <sizespec>
+This option sets the request (first value) and/or response (second
+value) sizes for an _RR test. By default the units are bytes, but a
+suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
+(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
+or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-r 128,16K}
+@end example
+will set the request size to 128 bytes and the response size to 16 KB
+or 16384 bytes. [Default: 1 - a single-byte request and response]
+
+@vindex -s, Test-specific
+@item -s <sizespec>
+This option sets the local (netperf) send and receive socket buffer
+sizes for the data connection to the value(s) specified.
Often, this
+will affect the advertised and/or effective TCP or other window, but
+on some platforms it may not. By default the units are bytes, but a
+suffix of ``G,'' ``M,'' or ``K'' will specify the units to be 2^30
+(GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of ``g,'' ``m''
+or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-s 128K}
+@end example
+will request the local send (netperf) and receive socket buffer sizes
+to be 128KB or 131072 bytes.
+
+While the historic expectation is that setting the socket buffer size
+has a direct effect on, say, the TCP window, today that may not hold
+true for all stacks. When running under Windows a value of 0 may be
+used which will be an indication to the stack the user wants to enable
+a form of copy avoidance. [Default: -1 - use the system's default
+socket buffer sizes]
+
+@vindex -S, Test-specific
+@item -S <sizespec>
+This option sets the remote (netserver) send and/or receive socket
+buffer sizes for the data connection to the value(s) specified.
+Often, this will affect the advertised and/or effective TCP or other
+window, but on some platforms it may not. By default the units are
+bytes, but a suffix of ``G,'' ``M,'' or ``K'' will specify the units
+to be 2^30 (GB), 2^20 (MB) or 2^10 (KB) respectively. A suffix of
+``g,'' ``m'' or ``k'' will specify units of 10^9, 10^6 or 10^3 bytes
+respectively. For example:
+@example
+@code{-S 128K}
+@end example
+will request the remote (netserver) send and receive socket buffer
+sizes to be 128KB or 131072 bytes.
+
+While the historic expectation is that setting the socket buffer size
+has a direct effect on, say, the TCP window, today that may not hold
+true for all stacks. When running under Windows a value of 0 may be
+used which will be an indication to the stack the user wants to enable
+a form of copy avoidance. [Default: -1 - use the system's default
+socket buffer sizes]
+
+@vindex -4, Test-specific
+@item -4
+Set the local and remote address family for the data connection to
+AF_INET - ie use IPv4 addressing only. Just as with their global
+command-line counterparts, the last of the @option{-4}, @option{-6},
+@option{-H} or @option{-L} options wins for their respective address
+families.
+
+@vindex -6, Test-specific
+@item -6
+This option is identical to its @option{-4} cousin, but requests IPv6
+addresses for the local and remote ends of the data connection.
+
+@end table
+
+@menu
+* TCP_RR::
+* TCP_CC::
+* TCP_CRR::
+* UDP_RR::
+* XTI_TCP_RR::
+* XTI_TCP_CC::
+* XTI_TCP_CRR::
+* XTI_UDP_RR::
+* DLCL_RR::
+* DLCO_RR::
+* SCTP_RR::
+@end menu
+
+@node TCP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests, Options Common to TCP UDP and SCTP _RR tests
+@subsection TCP_RR
+@cindex Measuring Latency
+@cindex Latency, Request-Response
+
+A TCP_RR (TCP Request/Response) test is requested by passing a value
+of ``TCP_RR'' to the global @option{-t} command-line option. A TCP_RR
+test can be thought of as a user-space to user-space @code{ping} with
+no think time - it is by default a synchronous, one transaction at a
+time, request/response test.
+
+The transaction rate is the number of complete transactions exchanged
+divided by the length of time it took to perform those transactions.
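+
+As a worked example of the rate-to-latency inversion mentioned
+earlier, using the transaction rate from the example below (the
+arithmetic is illustrative only):
+@example
+Average round-trip latency = 1 / Transaction rate
+                           = 1 / 29150.15 trans/s
+                           ~ 34.3 microseconds
+@end example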
+
+If the two Systems Under Test are otherwise identical, a TCP_RR test
+with the same request and response size should be symmetric - it
+should not matter which way the test is run, and the CPU utilization
+measured should be virtually the same on each system. If not, it
+suggests that the CPU utilization mechanism being used may have some,
+well, issues measuring CPU utilization completely and accurately.
+
+Time to establish the TCP connection is not counted in the result. If
+you want connection setup overheads included, you should consider the
+@ref{TCP_CC,TCP_CC} or @ref{TCP_CRR,TCP_CRR} tests.
+
+If specifying the @option{-D} option to set TCP_NODELAY and disable
+the Nagle Algorithm increases the transaction rate reported by a
+TCP_RR test, it implies the stack(s) over which the TCP_RR test is
+running have a broken implementation of the Nagle Algorithm. Likely
+as not they are interpreting Nagle on a segment-by-segment basis
+rather than a user-send-by-user-send basis. You should contact your
+stack vendor(s) to report the problem to them.
+
+Here is an example of two systems running a basic TCP_RR test over a
+10 Gigabit Ethernet link:
+
+@example
+netperf -t TCP_RR -H 192.168.2.125
+TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.125 (192.168.2.125) port 0 AF_INET
+Local /Remote
+Socket Size   Request  Resp.   Elapsed  Trans.
+Send   Recv   Size     Size    Time     Rate
+bytes  Bytes  bytes    bytes   secs.    per sec
+
+16384  87380  1        1       10.00    29150.15
+16384  87380
+@end example
+
+In this example the request and response sizes were one byte, the
+socket buffers were left at their defaults, and the test ran for all
+of 10 seconds. The transaction per second rate was rather good for
+the time :)
+
+@node TCP_CC, TCP_CRR, TCP_RR, Options Common to TCP UDP and SCTP _RR tests
+@subsection TCP_CC
+@cindex Connection Latency
+@cindex Latency, Connection Establishment
+
+A TCP_CC (TCP Connect/Close) test is requested by passing a value of
+``TCP_CC'' to the global @option{-t} option. A TCP_CC test simply
+measures how fast the pair of systems can open and close connections
+between one another in a synchronous (one at a time) manner. While
+this is considered an _RR test, no request or response is exchanged
+over the connection.
+
+@cindex Port Reuse
+@cindex TIME_WAIT
+The issue of TIME_WAIT reuse is an important one for a TCP_CC test.
+Basically, TIME_WAIT reuse is when a pair of systems churn through
+connections fast enough that they wrap the 16-bit port number space in
+less time than the length of the TIME_WAIT state. While it is indeed
+theoretically possible to ``reuse'' a connection in TIME_WAIT, the
+conditions under which such reuse is possible are rather rare. An
+attempt to reuse a connection in TIME_WAIT can result in a non-trivial
+delay in connection establishment.
+
+Basically, any time the connection churn rate approaches:
+
+Sizeof(clientportspace) / Lengthof(TIME_WAIT)
+
+there is the risk of TIME_WAIT reuse. To minimize the chances of this
+happening, netperf will by default select its own client port numbers
+from the range of 5000 to 65535. On systems with a 60 second
+TIME_WAIT state, this should allow roughly 1000 transactions per
+second. The size of the client port space used by netperf can be
+controlled via the test-specific @option{-p} option, which takes a
+@dfn{sizespec} as a value setting the minimum (first value) and
+maximum (second value) port numbers used by netperf at the client end.
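+
+Plugging those defaults into the churn-rate expression above gives a
+feel for the numbers (the arithmetic is illustrative only):
+@example
+(65535 - 5000) ports / 60 seconds of TIME_WAIT ~ 1008 connections/sec
+@end example
+which is where the ``roughly 1000 transactions per second'' figure
+comes from.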
+
+Since no requests or responses are exchanged during a TCP_CC test,
+only the @option{-H}, @option{-L}, @option{-4} and @option{-6} of the
+``common'' test-specific options are likely to have an effect, if any,
+on the results. The @option{-s} and @option{-S} options _may_ have
+some effect if they alter the number and/or type of options carried in
+the TCP SYNchronize segments, such as Window Scaling or Timestamps.
+The @option{-P} and @option{-r} options are utterly ignored.
+
+Since connection establishment and tear-down for TCP is not symmetric,
+a TCP_CC test is not symmetric in its loading of the two systems under
+test.
+
+@node TCP_CRR, UDP_RR, TCP_CC, Options Common to TCP UDP and SCTP _RR tests
+@subsection TCP_CRR
+@cindex Latency, Connection Establishment
+@cindex Latency, Request-Response
+
+The TCP Connect/Request/Response (TCP_CRR) test is requested by
+passing a value of ``TCP_CRR'' to the global @option{-t} command-line
+option. A TCP_CRR test is like a merger of a @ref{TCP_RR} and
+@ref{TCP_CC} test which measures the performance of establishing a
+connection, exchanging a single request/response transaction, and
+tearing down that connection. This is very much like what happens in
+an HTTP 1.0 or HTTP 1.1 connection when HTTP Keepalives are not used.
+In fact, the TCP_CRR test was added to netperf to simulate just that.
+
+Since a request and response are exchanged, the @option{-r},
+@option{-s} and @option{-S} options can have an effect on the
+performance.
+
+The issue of TIME_WAIT reuse exists for the TCP_CRR test just as it
+does for the TCP_CC test. Similarly, since connection establishment
+and tear-down is not symmetric, a TCP_CRR test is not symmetric even
+when the request and response sizes are the same.
+
+@node UDP_RR, XTI_TCP_RR, TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
+@subsection UDP_RR
+@cindex Latency, Request-Response
+@cindex Packet Loss
+
+A UDP Request/Response (UDP_RR) test is requested by passing a value
+of ``UDP_RR'' to the global @option{-t} option. It is very much the
+same as a TCP_RR test except UDP is used rather than TCP.
+
+UDP does not provide for retransmission of lost UDP datagrams, and
+netperf does not add anything for that either. This means that if
+_any_ request or response is lost, the exchange of requests and
+responses will stop from that point until the test timer expires.
+Netperf will not really ``know'' this has happened - the only symptom
+will be a low transaction per second rate. If @option{--enable-burst}
+was included in the @code{configure} command and a test-specific
+@option{-b} option used, the UDP_RR test will ``survive'' the loss of
+requests and responses until the sum is one more than the value passed
+via the @option{-b} option. It will, though, almost certainly run more
+slowly.
+
+The netperf side of a UDP_RR test will call @code{connect()} on its
+data socket and thenceforth use the @code{send()} and @code{recv()}
+socket calls. The netserver side of a UDP_RR test will not call
+@code{connect()} and will use @code{recvfrom()} and @code{sendto()}
+calls. This means that even if the request and response sizes are the
+same, a UDP_RR test is _not_ symmetric in its loading of the two
+systems under test.
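+
+As a concrete illustration of the burst-mode behavior just described,
+here is a minimal sketch (the hostname is hypothetical):
+@example
+netperf -t UDP_RR -H remotehost -- -b 8
+@end example
+With @option{-b 8} the test keeps additional transactions in flight
+and, as described above, will keep running until the sum of lost
+requests and responses exceeds the @option{-b} value.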
+
+Here is an example of a UDP_RR test between two otherwise
+identical two-CPU systems joined via a 1 Gigabit Ethernet network:
+
+@example
+$ netperf -T 1 -H 192.168.1.213 -t UDP_RR -c -C
+UDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.213 (192.168.1.213) port 0 AF_INET
+Local /Remote
+Socket Size   Request  Resp.   Elapsed  Trans.   CPU    CPU    S.dem   S.dem
+Send   Recv   Size     Size    Time     Rate     local  remote local   remote
+bytes  bytes  bytes    bytes   secs.    per sec  % I    % I    us/Tr   us/Tr
+
+65535  65535  1        1       10.01    15262.48 13.90  16.11  18.221  21.116
+65535  65535
+@end example
+
+This example includes the @option{-c} and @option{-C} options to
+enable CPU utilization reporting and shows the asymmetry in CPU
+loading. The @option{-T} option was used to make sure netperf and
+netserver ran on a given CPU and did not move around during the test.
+
+@node XTI_TCP_RR, XTI_TCP_CC, UDP_RR, Options Common to TCP UDP and SCTP _RR tests
+@subsection XTI_TCP_RR
+@cindex Latency, Request-Response
+
+An XTI_TCP_RR test is essentially the same as a @ref{TCP_RR} test,
+only using the XTI rather than BSD Sockets interface. It is requested
+by passing a value of ``XTI_TCP_RR'' to the @option{-t} global
+command-line option.
+
+The test-specific options for an XTI_TCP_RR test are the same as those
+for a TCP_RR test with the addition of the @option{-X <devspec>} option to
+specify the names of the local and/or remote XTI device file(s).
+
+@node XTI_TCP_CC, XTI_TCP_CRR, XTI_TCP_RR, Options Common to TCP UDP and SCTP _RR tests
+@comment node-name, next, previous, up
+@subsection XTI_TCP_CC
+@cindex Latency, Connection Establishment
+
+An XTI_TCP_CC test is essentially the same as a @ref{TCP_CC,TCP_CC}
+test, only using the XTI rather than BSD Sockets interface.
+
+The test-specific options for an XTI_TCP_CC test are the same as those
+for a TCP_CC test with the addition of the @option{-X <devspec>} option to
+specify the names of the local and/or remote XTI device file(s).
+
+@node XTI_TCP_CRR, XTI_UDP_RR, XTI_TCP_CC, Options Common to TCP UDP and SCTP _RR tests
+@comment node-name, next, previous, up
+@subsection XTI_TCP_CRR
+@cindex Latency, Connection Establishment
+@cindex Latency, Request-Response
+
+The XTI_TCP_CRR test is essentially the same as a
+@ref{TCP_CRR,TCP_CRR} test, only using the XTI rather than BSD Sockets
+interface.
+
+The test-specific options for an XTI_TCP_CRR test are the same as those
+for a TCP_CRR test with the addition of the @option{-X <devspec>} option to
+specify the names of the local and/or remote XTI device file(s).
+
+@node XTI_UDP_RR, DLCL_RR, XTI_TCP_CRR, Options Common to TCP UDP and SCTP _RR tests
+@subsection XTI_UDP_RR
+@cindex Latency, Request-Response
+
+An XTI_UDP_RR test is essentially the same as a UDP_RR test, only
+using the XTI rather than BSD Sockets interface. It is requested by
+passing a value of ``XTI_UDP_RR'' to the @option{-t} global
+command-line option.
+
+The test-specific options for an XTI_UDP_RR test are the same as those
+for a UDP_RR test with the addition of the @option{-X <devspec>}
+option to specify the name of the local and/or remote XTI device
+file(s).
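+
+For illustration, a hedged sketch of specifying the XTI device files
+(the hostname is hypothetical, @file{/dev/udp} is merely a typical
+location for the device file on XTI-capable platforms, and the
+local-then-remote ordering is assumed to follow that of a
+@dfn{sizespec}):
+@example
+netperf -t XTI_UDP_RR -H remotehost -- -X /dev/udp,/dev/udp
+@end example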
+
+@node DLCL_RR, DLCO_RR, XTI_UDP_RR, Options Common to TCP UDP and SCTP _RR tests
+@comment node-name, next, previous, up
+@subsection DLCL_RR
+@cindex Latency, Request-Response
+
+@node DLCO_RR, SCTP_RR, DLCL_RR, Options Common to TCP UDP and SCTP _RR tests
+@comment node-name, next, previous, up
+@subsection DLCO_RR
+@cindex Latency, Request-Response
+
+@node SCTP_RR, , DLCO_RR, Options Common to TCP UDP and SCTP _RR tests
+@comment node-name, next, previous, up
+@subsection SCTP_RR
+@cindex Latency, Request-Response
+
+@node Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Request/Response , Top
+@comment node-name, next, previous, up
+@chapter Using Netperf to Measure Aggregate Performance
+@cindex Aggregate Performance
+@vindex --enable-burst, Configure
+
+Ultimately, @ref{Netperf4,Netperf4} will be the preferred benchmark to
+use when one wants to measure aggregate performance because netperf
+has no support for explicit synchronization of concurrent tests. Until
+netperf4 is ready for prime time, one can make use of the heuristics
+and procedures mentioned here for the 85% solution.
+
+There are a few ways to measure aggregate performance with netperf.
+The first is to run multiple, concurrent netperf tests and can be
+applied to any of the netperf tests. The second is to configure
+netperf with @code{--enable-burst} and is applicable to the TCP_RR
+test. The third is a variation on the first.
+
+@menu
+* Running Concurrent Netperf Tests::
+* Using --enable-burst::
+* Using --enable-demo::
+@end menu
+
+@node Running Concurrent Netperf Tests, Using --enable-burst, Using Netperf to Measure Aggregate Performance, Using Netperf to Measure Aggregate Performance
+@comment node-name, next, previous, up
+@section Running Concurrent Netperf Tests
+
+@ref{Netperf4,Netperf4} is the preferred benchmark to use when one
+wants to measure aggregate performance because netperf has no support
+for explicit synchronization of concurrent tests. This leaves
+netperf2 results vulnerable to @dfn{skew} errors.
+
+However, since there are times when netperf4 is unavailable, it may be
+necessary to run netperf. The skew error can be minimized by making
+use of the confidence interval functionality. Then one simply
+launches multiple tests from the shell using a @code{for} loop or the
+like:
+
+@example
+for i in 1 2 3 4
+do
+netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 &
+done
+@end example
+
+which will run four concurrent @ref{TCP_STREAM,TCP_STREAM} tests from
+the system on which it is executed to tardy.cup.hp.com. Each
+concurrent netperf will iterate 10 times thanks to the @option{-i}
+option and will omit the test banners (option @option{-P}) for
+brevity. The output looks something like this:
+
+@example
+ 87380 16384 16384 10.03 235.15
+ 87380 16384 16384 10.03 235.09
+ 87380 16384 16384 10.03 235.38
+ 87380 16384 16384 10.03 233.96
+@end example
+
+We can take the sum of the results and be reasonably confident that
+the aggregate performance was 940 Mbits/s. This method does not need
+to be limited to one system speaking to one other system. It can be
+extended to one system talking to N other systems. It could be as simple as:
+@example
+for host in foo bar baz bing
+do
+netperf -t TCP_STREAM -H $host -i 10 -P 0 &
+done
+@end example
+A more complicated/sophisticated example can be found in
+@file{doc/examples/runemomniagg2.sh}.
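+
+To total the per-instance results, one can capture the output and sum
+the throughput column. A minimal sketch, assuming the throughput is
+the fifth whitespace-separated field as in the @option{-P 0} output
+above, and that the single-line appends from the concurrent netperfs
+do not interleave:
+@example
+for i in 1 2 3 4
+do
+netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 >> results &
+done
+wait
+awk '@{sum += $5@} END @{print sum@}' results
+@end example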
+
+If you see warnings about netperf not achieving the confidence
+intervals, the best thing to do is to increase the number of
+iterations with @option{-i} and/or increase the run length of each
+iteration with @option{-l}.
+
+You can also enable local (@option{-c}) and/or remote (@option{-C})
+CPU utilization:
+
+@example
+for i in 1 2 3 4
+do
+netperf -t TCP_STREAM -H tardy.cup.hp.com -i 10 -P 0 -c -C &
+done
+
+87380 16384 16384 10.03 235.47 3.67 5.09 10.226 14.180
+87380 16384 16384 10.03 234.73 3.67 5.09 10.260 14.225
+87380 16384 16384 10.03 234.64 3.67 5.10 10.263 14.231
+87380 16384 16384 10.03 234.87 3.67 5.09 10.253 14.215
+@end example
+
+If the CPU utilizations reported for the same system are the same or
+very, very close, you can be reasonably confident that skew error is
+minimized. Presumably one could then omit @option{-i}, but that is
+not advised, particularly when/if the CPU utilization approaches 100
+percent. In the example above we see that the CPU utilization on the
+local system remains the same for all four tests, and is only off by
+0.01 out of 5.09 on the remote system. As the number of CPUs in the
+system increases, and so too the odds of saturating a single CPU, the
+accuracy of similar CPU utilization implying little skew error is
+diminished. This is also the case for those increasingly rare single
+CPU systems if the utilization is reported as 100% or very close to
+it.
+
+@quotation
+@b{NOTE: It is very important to remember that netperf is calculating
+system-wide CPU utilization. When calculating the service demand
+(those last two columns in the output above) each netperf assumes it
+is the only thing running on the system. This means that for
+concurrent tests the service demands reported by netperf will be
+wrong. One has to compute service demands for concurrent tests by
+hand.}
+@end quotation
+
+If you wish, you can add a unique, global @option{-B} option to each
+command line to append the given string to the output:
+
+@example
+for i in 1 2 3 4
+do
+netperf -t TCP_STREAM -H tardy.cup.hp.com -B "this is test $i" -i 10 -P 0 &
+done
+
+87380 16384 16384 10.03 234.90 this is test 4
+87380 16384 16384 10.03 234.41 this is test 2
+87380 16384 16384 10.03 235.26 this is test 1
+87380 16384 16384 10.03 235.09 this is test 3
+@end example
+
+You will notice that the tests completed in an order other than that
+in which they were started from the shell. This underscores why there
+is a threat of skew error and why netperf4 will eventually be the
+preferred tool for aggregate tests. Even if you see the Netperf
+Contributing Editor acting to the contrary!-)
+
+@menu
+* Issues in Running Concurrent Tests::
+@end menu
+
+@node Issues in Running Concurrent Tests, , Running Concurrent Netperf Tests, Running Concurrent Netperf Tests
+@subsection Issues in Running Concurrent Tests
+
+In addition to the aforementioned issue of skew error, there can be
+other issues to consider when running concurrent netperf tests.
+
+For example, when running concurrent tests over multiple interfaces,
+one is not always assured that the traffic one thinks went over a
+given interface actually did so. In particular, the Linux networking
+stack takes a particularly strong stance on its following the
+so-called @samp{weak end system model}. As such, it is willing to
+answer ARP requests for any of its local IP addresses on any of its
+interfaces.
If multiple interfaces are connected to the same
+broadcast domain, then even if they are configured into separate IP
+subnets, there is no a priori way of knowing which interface was
+actually used for which connection(s). This can be addressed by
+setting the @samp{arp_ignore} sysctl before configuring interfaces.
+
+As it is quite important, we will repeat that each concurrent netperf
+instance is calculating system-wide CPU utilization. When calculating
+the service demand each netperf assumes it is the only thing running
+on the system. This means that for concurrent tests the service
+demands reported by netperf @b{will be wrong}. One has to compute
+service demands for concurrent tests by hand.
+
+Running concurrent tests can also become difficult when there is no
+one ``central'' node. Running tests between pairs of systems may be
+more difficult, calling for remote shell commands in the for loop
+rather than netperf commands. This introduces more skew error, which
+the confidence intervals may not be able to sufficiently mitigate.
+One possibility is to actually run three consecutive netperf tests on
+each node - the first being a warm-up, the last being a cool-down.
+The idea then is to ensure that the time it takes to get all the
+netperfs started is less than the length of the first netperf command
+in the sequence of three. Similarly, it assumes that all ``middle''
+netperfs will complete before the first of the ``last'' netperfs
+complete.
+
+@node Using --enable-burst, Using --enable-demo, Running Concurrent Netperf Tests, Using Netperf to Measure Aggregate Performance
+@comment node-name, next, previous, up
+@section Using @code{--enable-burst}
+
+Starting in version 2.5.0 @code{--enable-burst=yes} is the default,
+which means one no longer must run:
+
+@example
+configure --enable-burst
+@end example
+
+to have burst-mode functionality present in netperf. This enables a
+test-specific @option{-b num} option in @ref{TCP_RR,TCP_RR},
+@ref{UDP_RR,UDP_RR} and @ref{The Omni Tests,omni} tests.
+
+Normally, netperf will attempt to ramp up the number of outstanding
+requests to @option{num} plus one transactions in flight at one time.
+The ramp-up is to avoid transactions being smashed together into a
+smaller number of segments when the transport's congestion window (if
+any) is smaller at the time than what netperf wants to have
+outstanding at one time. If, however, the user specifies a negative
+value for @option{num}, this ramp-up is bypassed and the burst of
+sends is made without consideration of the transport's congestion
+window.
+
+This burst-mode is used as an alternative to or even in conjunction
+with multiple-concurrent _RR tests and as a way to implement a
+single-connection, bidirectional bulk-transfer test. When run with
+just a single instance of netperf, increasing the burst size can
+determine the maximum number of transactions per second which can be
+serviced by a single process:
+
+@example
+for b in 0 1 2 4 8 16 32
+do
+ netperf -v 0 -t TCP_RR -B "-b $b" -H hpcpc108 -P 0 -- -b $b
+done
+
+9457.59 -b 0
+9975.37 -b 1
+10000.61 -b 2
+20084.47 -b 4
+29965.31 -b 8
+71929.27 -b 16
+109718.17 -b 32
+@end example
+
+The global @option{-v} and @option{-P} options were used to minimize
+the output to the single figure of merit, which in this case is the
+transaction rate.
The global @code{-B} option was used to more
clearly label the output, and the test-specific @option{-b} option
enabled by @code{--enable-burst} to increase the number of
transactions in flight at one time.

Now, since the test-specific @option{-D} option was not specified to
set TCP_NODELAY, the stack was free to ``bundle'' requests and/or
responses into TCP segments as it saw fit, and since the default
request and response size is one byte, there could have been some
considerable bundling even in the absence of transport congestion
window issues. If one wants to try to achieve a closer one-to-one
correspondence between requests and responses and TCP segments, add
the test-specific @option{-D} option:

@example
for b in 0 1 2 4 8 16 32
do
 netperf -v 0 -t TCP_RR -B "-b $b -D" -H hpcpc108 -P 0 -- -b $b -D
done

 8695.12 -b 0 -D
 19966.48 -b 1 -D
 20691.07 -b 2 -D
 49893.58 -b 4 -D
 62057.31 -b 8 -D
 108416.88 -b 16 -D
 114411.66 -b 32 -D
@end example

You can see that this has a rather large effect on the reported
transaction rate. In this particular instance, the author believes it
relates to interactions between the test and interrupt coalescing
settings in the driver for the NICs used.

@quotation
@b{NOTE: Even if you set the @option{-D} option, that is still not a
guarantee that each transaction is in its own TCP segment. You
should get into the habit of verifying the relationship between the
transaction rate and the packet rate via other means.}
@end quotation

You can also combine @code{--enable-burst} functionality with
concurrent netperf tests. This would then be an ``aggregate of
aggregates'' if you like:

@example

for i in 1 2 3 4
do
 netperf -H hpcpc108 -v 0 -P 0 -i 10 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done

 46668.38 aggregate 4 -b 8 -D
 44890.64 aggregate 2 -b 8 -D
 45702.04 aggregate 1 -b 8 -D
 46352.48 aggregate 3 -b 8 -D

@end example

Since each netperf did hit the confidence intervals, we can be
reasonably certain that the aggregate transaction per second rate was
the sum of all four concurrent tests, or something just shy of 184,000
transactions per second. To get some idea if that was also the packet
per second rate, we could bracket that @code{for} loop with something
to gather statistics and run the results through
@uref{ftp://ftp.cup.hp.com/dist/networking/tools,beforeafter}:

@example
/usr/sbin/ethtool -S eth2 > before
for i in 1 2 3 4
do
 netperf -H 192.168.2.108 -l 60 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done
wait
/usr/sbin/ethtool -S eth2 > after

 52312.62 aggregate 2 -b 8 -D
 50105.65 aggregate 4 -b 8 -D
 50890.82 aggregate 1 -b 8 -D
 50869.20 aggregate 3 -b 8 -D

beforeafter before after > delta

grep packets delta
 rx_packets: 12251544
 tx_packets: 12251550

@end example

This example uses @code{ethtool} because the system being used is
running Linux. Other platforms have other tools - for example HP-UX
has lanadmin:

@example
lanadmin -g mibstats <ppa>
@end example

and of course one could instead use @code{netstat}.

The @code{wait} is important because we are launching concurrent
netperfs in the background. Without it, the second ethtool command
would be run before the tests finished and perhaps even before the
last of them got started!

The sum of the reported transaction rates is 204178 over 60 seconds,
which is a total of 12250680 transactions.
Each transaction is the
exchange of a request and a response, so we multiply that by 2 to
arrive at 24501360.

The sum of the ethtool stats is 24503094 packets, which matches what
netperf was reporting very well.

Had the request or response size differed, we would need to know how
it compared with the @dfn{MSS} for the connection.

Just for grins, here is the exercise repeated, using @code{netstat}
instead of @code{ethtool}:

@example
netstat -s -t > before
for i in 1 2 3 4
do
 netperf -l 60 -H 192.168.2.108 -v 0 -P 0 -B "aggregate $i -b 8 -D" -t TCP_RR -- -b 8 -D &
done
wait
netstat -s -t > after

 51305.88 aggregate 4 -b 8 -D
 51847.73 aggregate 2 -b 8 -D
 50648.19 aggregate 3 -b 8 -D
 53605.86 aggregate 1 -b 8 -D

beforeafter before after > delta

grep segments delta
 12445708 segments received
 12445730 segments send out
 1 segments retransmited
 0 bad segments received.
@end example

The sums are left as an exercise to the reader :)

Things become considerably more complicated if there are non-trivial
packet losses and/or retransmissions.

Of course all this checking is unnecessary if the test is a UDP_RR
test because UDP ``never'' aggregates multiple sends into the same UDP
datagram, and there are no ACKnowledgements in UDP. The loss of a
single request or response will not bring a ``burst'' UDP_RR test to a
screeching halt, but it will reduce the number of transactions
outstanding at any one time. A ``burst'' UDP_RR test @b{will} come to a
halt if the sum of the lost requests and responses reaches the value
specified in the test-specific @option{-b} option.

@node Using --enable-demo, , Using --enable-burst, Using Netperf to Measure Aggregate Performance
@section Using - -enable-demo

One can
@example
configure --enable-demo
@end example
and compile netperf to enable it to emit ``interim results'' at
semi-regular intervals. This enables a global @code{-D} option which
takes a reporting interval as an argument. With that specified, the
output of netperf will then look something like:

@example
$ src/netperf -D 1.25
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain () port 0 AF_INET : demo
Interim result: 25425.52 10^6bits/s over 1.25 seconds ending at 1327962078.405
Interim result: 25486.82 10^6bits/s over 1.25 seconds ending at 1327962079.655
Interim result: 25474.96 10^6bits/s over 1.25 seconds ending at 1327962080.905
Interim result: 25523.49 10^6bits/s over 1.25 seconds ending at 1327962082.155
Interim result: 25053.57 10^6bits/s over 1.27 seconds ending at 1327962083.429
Interim result: 25349.64 10^6bits/s over 1.25 seconds ending at 1327962084.679
Interim result: 25292.84 10^6bits/s over 1.25 seconds ending at 1327962085.932
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

 87380 16384 16384 10.00 25375.66
@end example
The units of the ``Interim result'' lines will follow the units
selected via the global @code{-f} option. If the test-specific
@code{-o} option is specified on the command line, the format will be
CSV:
@example
...
2978.81,MBytes/s,1.25,1327962298.035
...
@end example
If the test-specific @code{-k} option is used the format will be
keyval with each keyval being given an index:
@example
...
NETPERF_INTERIM_RESULT[2]=25.00
NETPERF_UNITS[2]=10^9bits/s
NETPERF_INTERVAL[2]=1.25
NETPERF_ENDING[2]=1327962357.249
...
@end example
The expectation is that it may be easier to utilize the keyvals if
they have indices.

But how does this help with aggregate tests? Well, what one can do is
start the netperfs via a script, giving each a Very Long (tm) run
time. Direct the output to a file per instance. Then, once all the
netperfs have been started, take a timestamp and wait for some desired
test interval. Once that interval expires take another timestamp and
then start terminating the netperfs by sending them a SIGALRM signal
via the likes of the @code{kill} or @code{pkill} command. The
netperfs will terminate and emit the rest of the ``usual'' output, and
you can then bring the files to a central location for post
processing to find the aggregate performance over the ``test interval.''

This method has the advantage that it does not require advance
knowledge of how long it takes to get netperf tests started and/or
stopped. It does though require sufficiently synchronized clocks on
all the test systems.

While calls to get the current time can be inexpensive, that has not
been, nor is it, universally true. For that reason netperf tries to
minimize the number of such ``timestamping'' calls (eg
@code{gettimeofday}) it makes when in demo mode. Rather than take a
timestamp after each @code{send} or @code{recv} call completes,
netperf tries to guess how many units of work will be performed over
the desired interval. Only once that many units of work have been
completed will netperf check the time. If the reporting interval has
passed, netperf will emit an ``interim result.'' If the interval has
not passed, netperf will update its estimate for units and continue.

After a bit of thought one can see that if things ``speed up'' netperf
will still honor the interval. However, if things ``slow down''
netperf may be late with an ``interim result.'' Here is an example of
both of those happening during a test: the interval is honored while
throughput increases, and then about half-way through, when another
netperf (not shown) is started, things slow down and netperf does not
hit the interval as desired.
@example
$ src/netperf -D 2 -H tardy.hpl.hp.com -l 20
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tardy.hpl.hp.com () port 0 AF_INET : demo
Interim result: 36.46 10^6bits/s over 2.01 seconds ending at 1327963880.565
Interim result: 59.19 10^6bits/s over 2.00 seconds ending at 1327963882.569
Interim result: 73.39 10^6bits/s over 2.01 seconds ending at 1327963884.576
Interim result: 84.01 10^6bits/s over 2.03 seconds ending at 1327963886.603
Interim result: 75.63 10^6bits/s over 2.21 seconds ending at 1327963888.814
Interim result: 55.52 10^6bits/s over 2.72 seconds ending at 1327963891.538
Interim result: 70.94 10^6bits/s over 2.11 seconds ending at 1327963893.650
Interim result: 80.66 10^6bits/s over 2.13 seconds ending at 1327963895.777
Interim result: 86.42 10^6bits/s over 2.12 seconds ending at 1327963897.901
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

 87380 16384 16384 20.34 68.87
@end example
So long as your post-processing mechanism can account for that, there
should be no problem. As time passes there may be changes to try to
improve netperf's honoring of the interval, but one should not
ass-u-me it will always do so.
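
If all you are after is the aggregate throughput over a window, one
way to post-process the interim results is to take, per instance, the
time-weighted mean of the ``interim result'' lines whose ending
timestamps fall within the window. A minimal sketch using
@code{awk}, assuming the CSV form of interim results (test-specific
@option{-o}) was captured to one file per instance; the
@file{instance*.out} names and the shell variables @code{T0} and
@code{T1} (the two timestamps taken around the test interval) are
placeholders:

@example
for f in instance*.out
do
  awk -F, -v t0=$T0 -v t1=$T1 '
    $4 >= t0 && $4 <= t1 @{ work += $1 * $3; secs += $3 @}
    END @{ if (secs > 0) print FILENAME, work / secs @}' $f
done
@end example

Summing the per-instance figures then gives the aggregate over the
window.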
One should also not assume the precision will remain fixed - future
versions may change it - perhaps going beyond tenths of seconds in
reporting the interval length, etc.

@node Using Netperf to Measure Bidirectional Transfer, The Omni Tests, Using Netperf to Measure Aggregate Performance, Top
@comment node-name, next, previous, up
@chapter Using Netperf to Measure Bidirectional Transfer

There are two ways to use netperf to measure the performance of
bidirectional transfer. The first is to run concurrent netperf tests
from the command line. The second is to configure netperf with
@code{--enable-burst} and use a single instance of the
@ref{TCP_RR,TCP_RR} test.

While neither method is more ``correct'' than the other, each
measures bidirectional transfer in a different way, and that has
possible implications. For instance, using the concurrent netperf
test mechanism means that multiple TCP connections and multiple
processes are involved, whereas with a single instance of TCP_RR
there is only one TCP connection and one process on each end. They
may behave differently, especially on an MP system.

@menu
* Bidirectional Transfer with Concurrent Tests::
* Bidirectional Transfer with TCP_RR::
* Implications of Concurrent Tests vs Burst Request/Response::
@end menu

@node Bidirectional Transfer with Concurrent Tests, Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer, Using Netperf to Measure Bidirectional Transfer
@comment node-name, next, previous, up
@section Bidirectional Transfer with Concurrent Tests

If we had two hosts, Fred and Ethel, we could simply run a netperf
@ref{TCP_STREAM,TCP_STREAM} test on Fred pointing at Ethel, and a
concurrent netperf TCP_STREAM test on Ethel pointing at Fred, but
since there are no mechanisms to synchronize netperf tests and we
would be starting tests from two different systems, there is a
considerable risk of skew error.

Far better would be to run simultaneous TCP_STREAM and
@ref{TCP_MAERTS,TCP_MAERTS} tests from just @b{one} system, using the
concepts and procedures outlined in @ref{Running Concurrent Netperf
Tests,Running Concurrent Netperf Tests}. Here then is an example:

@example
for i in 1
do
 netperf -H 192.168.2.108 -t TCP_STREAM -B "outbound" -i 10 -P 0 -v 0 \
   -- -s 256K -S 256K &
 netperf -H 192.168.2.108 -t TCP_MAERTS -B "inbound" -i 10 -P 0 -v 0 \
   -- -s 256K -S 256K &
done

 892.66 outbound
 891.34 inbound
@end example

We have used a @code{for} loop in the shell with just one iteration
because that makes it @b{much} easier to get both tests started at
more or less the same time than doing it by hand. The global
@option{-P} and @option{-v} options are used because we aren't
interested in anything other than the throughput, and the global
@option{-B} option is used to tag each output so we know which was
inbound and which outbound relative to the system on which we were
running netperf. Of course that sense is switched on the system
running netserver :) The use of the global @option{-i} option is
explained in @ref{Running Concurrent Netperf Tests,Running Concurrent
Netperf Tests}.
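
Taken together, the two figures of merit suggest an aggregate,
bidirectional throughput of roughly 892.66 + 891.34 = 1784 10^6bits/s
in this example - which, if the link in question was gigabit
Ethernet, would mean close to link rate in each direction at the same
time.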

Beginning with version 2.5.0 we can accomplish a similar result with
the @ref{The Omni Tests,the omni tests} and @ref{Omni Output
Selectors,output selectors}:

@example
for i in 1
do
 netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
   -d stream -s 256K -S 256K -o throughput,direction &
 netperf -H 192.168.1.3 -t omni -l 10 -P 0 -- \
   -d maerts -s 256K -S 256K -o throughput,direction &
done

805.26,Receive
828.54,Send
@end example

@node Bidirectional Transfer with TCP_RR, Implications of Concurrent Tests vs Burst Request/Response, Bidirectional Transfer with Concurrent Tests, Using Netperf to Measure Bidirectional Transfer
@comment node-name, next, previous, up
@section Bidirectional Transfer with TCP_RR

Starting with version 2.5.0 the @code{--enable-burst} configure option
defaults to @code{yes}. Starting some time before version 2.5.0 but
after 2.4.0, the global @option{-f} option began to affect the
``throughput'' reported by request/response tests. If one uses the
test-specific @option{-b} option to have several ``transactions'' in
flight at one time and the test-specific @option{-r} option to
increase their size, the test looks less like a simple
request/response test and more like a single-connection,
bidirectional transfer.

So, putting it all together one can do something like:

@example
netperf -f m -t TCP_RR -H 192.168.1.3 -v 2 -- -b 6 -r 32K -S 256K -S 256K
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.3 (192.168.1.3) port 0 AF_INET : interval : first burst 6
Local /Remote
Socket Size Request Resp. Elapsed
Send Recv Size Size Time Throughput
bytes Bytes bytes bytes secs. 10^6bits/sec

16384 87380 32768 32768 10.00 1821.30
524288 524288
Alignment Offset RoundTrip Trans Throughput
Local Remote Local Remote Latency Rate 10^6bits/s
Send Recv Send Recv usec/Tran per sec Outbound Inbound
 8 0 0 0 2015.402 3473.252 910.492 910.492
@end example

to get a bidirectional bulk-throughput result. As one can see, the
@option{-v 2} output will include a number of interesting, related
values.

@quotation
@b{NOTE: The logic behind @code{--enable-burst} is very simple, and there
are no calls to @code{poll()} or @code{select()} which means we want
to make sure that the @code{send()} calls will never block, or we run
the risk of deadlock with each side stuck trying to call @code{send()}
and neither calling @code{recv()}.}
@end quotation

Fortunately, this is easily accomplished by setting a ``large enough''
socket buffer size with the test-specific @option{-s} and @option{-S}
options. Presently this must be performed by the user. Future
versions of netperf might attempt to do this automagically, but there
are some issues to be worked out.

@node Implications of Concurrent Tests vs Burst Request/Response, , Bidirectional Transfer with TCP_RR, Using Netperf to Measure Bidirectional Transfer
@section Implications of Concurrent Tests vs Burst Request/Response

There are perhaps subtle but important differences between using
concurrent unidirectional tests and using a burst-mode
request/response test to measure bidirectional performance.

Broadly speaking, a single ``connection'' or ``flow'' of traffic
cannot make use of the services of more than one or two CPUs at either
end. Whether one or two CPUs will be used processing a flow will
depend on the specifics of the stack(s) involved and whether or not
the global @option{-T} option has been used to bind netperf/netserver
to specific CPUs.
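
For reference, a sketch of such binding; the CPU ids here are
arbitrary, and the global @option{-T} option takes the local and
remote CPU ids separated by a comma:

@example
netperf -T 0,1 -t TCP_RR -H 192.168.1.3 -- -b 6 -r 32K
@end example

This would ask that netperf be bound to CPU 0 on the local system and
netserver to CPU 1 on the remote.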

When using concurrent tests there will be two concurrent connections
or flows, which means that upwards of four CPUs will be employed
processing the packets (global @option{-T} used; no more than two if
not). With just a single, bidirectional request/response test,
however, no more than two CPUs will be employed (only one if the
global @option{-T} is not used).

If there is a CPU bottleneck on either system this may lead to rather
different results between the two methods.

Also, with a bidirectional request/response test there is something of
a natural balance or synchronization between inbound and outbound - a
response will not be sent until a request is received, and (once the
burst level is reached) a subsequent request will not be sent until a
response is received. This may mask favoritism in the NIC between
inbound and outbound processing.

With two concurrent unidirectional tests there is no such
synchronization or balance and any favoritism in the NIC may be exposed.

@node The Omni Tests, Other Netperf Tests, Using Netperf to Measure Bidirectional Transfer, Top
@chapter The Omni Tests

Beginning with version 2.5.0, netperf begins a migration to the
@samp{omni} tests or ``Two routines to measure them all.'' The code for
the omni tests can be found in @file{src/nettest_omni.c} and the goal
is to make it easier for netperf to support multiple protocols and
report a great many additional things about the systems under test.
Additionally, a flexible output selection mechanism is present which
allows the user to choose specifically what values she wishes to have
reported and in what format.

The omni tests are included by default in version 2.5.0. To disable
them, one must:
@example
./configure --enable-omni=no ...
@end example

and remake netperf. Remaking netserver is optional because even in
2.5.0 it has ``unmigrated'' netserver side routines for the classic
(eg @file{src/nettest_bsd.c}) tests.

@menu
* Native Omni Tests::
* Migrated Tests::
* Omni Output Selection::
@end menu

@node Native Omni Tests, Migrated Tests, The Omni Tests, The Omni Tests
@section Native Omni Tests

One accesses the omni tests ``natively'' by using a value of ``OMNI''
with the global @option{-t} test-selection option. This will then
cause netperf to use the code in @file{src/nettest_omni.c} and in
particular the test-specific options parser for the omni tests. The
test-specific options for the omni tests are a superset of those for
``classic'' tests. The options added by the omni tests are:

@table @code
@vindex -c, Test-specific
@item -c
This explicitly declares that the test is to include connection
establishment and tear-down as in either a TCP_CRR or TCP_CC test.

@vindex -d, Test-specific
@item -d <direction>
This option sets the direction of the test relative to the netperf
process. As of version 2.5.0 one can use the following in a
case-insensitive manner:

@table @code
@item send, stream, transmit, xmit or 2
Any of which will cause netperf to send to the netserver.
@item recv, receive, maerts or 4
Any of which will cause netserver to send to netperf.
@item rr or 6
Either of which will cause a request/response test.
@end table

Additionally, one can specify two directions separated by a @samp{|}
character and they will be OR'ed together.
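For example, one might
explicitly request a request/response test in this way - a sketch
only, with the quoting there to keep the command interpreter from
treating @samp{|} as a pipe:

@example
netperf -t omni -H 192.168.1.3 -- -d 'send|recv' -o THROUGHPUT,DIRECTION
@end example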
In this way one can use
the ``Send|Recv'' that will be emitted by the @ref{Omni Output
Selectors,DIRECTION} @ref{Omni Output Selection,output selector} when
used with a request/response test.

@vindex -k, Test-specific
@item -k [@ref{Omni Output Selection,output selector}]
This option sets the style of output to ``keyval'' where each line of
output has the form:
@example
key=value
@end example
For example:
@example
$ netperf -t omni -- -d rr -k "THROUGHPUT,THROUGHPUT_UNITS"
OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
THROUGHPUT=59092.65
THROUGHPUT_UNITS=Trans/s
@end example

Using the @option{-k} option will override any previous, test-specific
@option{-o} or @option{-O} option.

@vindex -o, Test-specific
@item -o [@ref{Omni Output Selection,output selector}]
This option sets the style of output to ``CSV'' where there will be
one line of comma-separated values, preceded by one line of column
names unless the global @option{-P} option is used with a value of 0:
@example
$ netperf -t omni -- -d rr -o "THROUGHPUT,THROUGHPUT_UNITS"
OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Throughput,Throughput Units
60999.07,Trans/s
@end example

Using the @option{-o} option will override any previous, test-specific
@option{-k} or @option{-O} option.

@vindex -O, Test-specific
@item -O [@ref{Omni Output Selection,output selector}]
This option sets the style of output to ``human readable'' which will
look quite similar to classic netperf output:
@example
$ netperf -t omni -- -d rr -O "THROUGHPUT,THROUGHPUT_UNITS"
OMNI TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Throughput Throughput
 Units


60492.57 Trans/s
@end example

Using the @option{-O} option will override any previous, test-specific
@option{-k} or @option{-o} option.

@vindex -t, Test-specific
@item -t
This option explicitly sets the socket type for the test's data
connection. As of version 2.5.0 the known socket types include
``stream'' and ``dgram'' for SOCK_STREAM and SOCK_DGRAM respectively.

@vindex -T, Test-specific
@item -T <protocol>
This option is used to explicitly set the protocol used for the
test. It is case-insensitive. As of version 2.5.0 the protocols known
to netperf include:
@table @code
@item TCP
Select the Transmission Control Protocol
@item UDP
Select the User Datagram Protocol
@item SDP
Select the Sockets Direct Protocol
@item DCCP
Select the Datagram Congestion Control Protocol
@item SCTP
Select the Stream Control Transmission Protocol
@item udplite
Select UDP Lite
@end table

The default is implicit based on other settings.
@end table

The omni tests also extend the interpretation of some of the classic,
test-specific options for the BSD Sockets tests:

@table @code
@item -m <optionspec>
This can set the send size for either or both of the netperf and
netserver sides of the test:
@example
-m 32K
@end example
sets only the netperf-side send size to 32768 bytes, and or's-in
transmit for the direction. This is effectively the same behaviour as
for the classic tests.
@example
-m ,32K
@end example
sets only the netserver side send size to 32768 bytes and or's-in
receive for the direction.
@example
-m 16K,32K
@end example
sets the netperf side send size to 16384 bytes, the netserver side
send size to 32768 bytes, and the direction will be ``Send|Recv''.
@item -M <optionspec>
This can set the receive size for either or both of the netperf and
netserver sides of the test:
@example
-M 32K
@end example
sets only the netserver side receive size to 32768 bytes and or's-in
send for the test direction.
@example
-M ,32K
@end example
sets only the netperf side receive size to 32768 bytes and or's-in
receive for the test direction.
@example
-M 16K,32K
@end example
sets the netserver side receive size to 16384 bytes, the netperf
side receive size to 32768 bytes, and the direction will be
``Send|Recv''.
@end table

@node Migrated Tests, Omni Output Selection, Native Omni Tests, The Omni Tests
@section Migrated Tests

As of version 2.5.0 several tests have been migrated to use the omni
code in @file{src/nettest_omni.c} for the core of their testing. A
migrated test retains all its previous output code and so should still
``look and feel'' just like a pre-2.5.0 test with one exception - the
first line of the test banners will include the word ``MIGRATED'' at
the beginning as in:

@example
$ netperf
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

 87380 16384 16384 10.00 27175.27
@end example

The tests migrated in version 2.5.0 are:
@itemize
@item TCP_STREAM
@item TCP_MAERTS
@item TCP_RR
@item TCP_CRR
@item UDP_STREAM
@item UDP_RR
@end itemize

It is expected that future releases will have additional tests
migrated to use the ``omni'' functionality.

If one uses ``omni-specific'' test-specific options in conjunction
with a migrated test, instead of using the classic output code, the
new omni output code will be used. For example if one uses the
@option{-k} test-specific option with a value of
``THROUGHPUT,THROUGHPUT_UNITS'' with a migrated TCP_RR test one will
see:

@example
$ netperf -t tcp_rr -- -k THROUGHPUT,THROUGHPUT_UNITS
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
THROUGHPUT=60074.74
THROUGHPUT_UNITS=Trans/s
@end example
rather than:
@example
$ netperf -t tcp_rr
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain (127.0.0.1) port 0 AF_INET : demo
Local /Remote
Socket Size Request Resp. Elapsed Trans.
Send Recv Size Size Time Rate
bytes Bytes bytes bytes secs. per sec

16384 87380 1 1 10.00 59421.52
16384 87380
@end example

@node Omni Output Selection, , Migrated Tests, The Omni Tests
@section Omni Output Selection

The omni test-specific @option{-k}, @option{-o} and @option{-O}
options take an optional @code{output selector} by which the user can
configure what values are reported. The output selector can take
several forms:

@table @code
@item @file{filename}
The output selections will be read from the named file. Within the
file there can be up to four lines of comma-separated output
selectors. This controls how many multi-line blocks of output are emitted
when the @option{-O} option is used. This output, while not identical to
``classic'' netperf output, is inspired by it. Multiple lines have no
effect for @option{-k} and @option{-o} options. Putting output
selections in a file can be useful when the list of selections is long.
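
As an illustration, a hypothetical @file{selectors.txt} with two
lines, producing two blocks of @option{-O} output:

@example
$ cat selectors.txt
THROUGHPUT,THROUGHPUT_UNITS,ELAPSED_TIME
LSS_SIZE,RSR_SIZE
$ netperf -t omni -- -O selectors.txt
@end example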
+@item comma and/or semi-colon-separated list +The output selections will be parsed from a comma and/or +semi-colon-separated list of output selectors. When the list is given +to a @option{-O} option a semi-colon specifies a new output block +should be started. Semi-colons have the same meaning as commas when +used with the @option{-k} or @option{-o} options. Depending on the +command interpreter being used, the semi-colon may have to be escaped +somehow to keep it from being interpreted by the command interpreter. +This can often be done by enclosing the entire list in quotes. +@item all +If the keyword @b{all} is specified it means that all known output +values should be displayed at the end of the test. This can be a +great deal of output. As of version 2.5.0 there are 157 different +output selectors. +@item ? +If a ``?'' is given as the output selection, the list of all known +output selectors will be displayed and no test actually run. When +passed to the @option{-O} option they will be listed one per +line. Otherwise they will be listed as a comma-separated list. It may +be necessary to protect the ``?'' from the command interpreter by +escaping it or enclosing it in quotes. +@item no selector +If nothing is given to the @option{-k}, @option{-o} or @option{-O} +option then the code selects a default set of output selectors +inspired by classic netperf output. The format will be the @samp{human +readable} format emitted by the test-specific @option{-O} option. +@end table + +The order of evaluation will first check for an output selection. If +none is specified with the @option{-k}, @option{-o} or @option{-O} +option netperf will select a default based on the characteristics of the +test. If there is an output selection, the code will first check for +@samp{?}, then check to see if it is the magic @samp{all} keyword. +After that it will check for either @samp{,} or @samp{;} in the +selection and take that to mean it is a comma and/or +semi-colon-separated list. If none of those checks match, netperf will then +assume the output specification is a filename and attempt to open and +parse the file. + +@menu +* Omni Output Selectors:: +@end menu + +@node Omni Output Selectors, , Omni Output Selection, Omni Output Selection +@subsection Omni Output Selectors + +As of version 2.5.0 the output selectors are: + +@table @code +@item OUTPUT_NONE +This is essentially a null output. For @option{-k} output it will +simply add a line that reads ``OUTPUT_NONE='' to the output. For +@option{-o} it will cause an empty ``column'' to be included. For +@option{-O} output it will cause extra spaces to separate ``real'' output. +@item SOCKET_TYPE +This will cause the socket type (eg SOCK_STREAM, SOCK_DGRAM) for the +data connection to be output. +@item PROTOCOL +This will cause the protocol used for the data connection to be displayed. +@item DIRECTION +This will display the data flow direction relative to the netperf +process. Units: Send or Recv for a unidirectional bulk-transfer test, +or Send|Recv for a request/response test. +@item ELAPSED_TIME +This will display the elapsed time in seconds for the test. +@item THROUGHPUT +This will display the throughput for the test. Units: As requested via +the global @option{-f} option and displayed by the THROUGHPUT_UNITS +output selector. +@item THROUGHPUT_UNITS +This will display the units for what is displayed by the +@code{THROUGHPUT} output selector. 
+@item LSS_SIZE_REQ +This will display the local (netperf) send socket buffer size (aka +SO_SNDBUF) requested via the command line. Units: Bytes. +@item LSS_SIZE +This will display the local (netperf) send socket buffer size +(SO_SNDBUF) immediately after the data connection socket was created. +Peculiarities of different networking stacks may lead to this +differing from the size requested via the command line. Units: Bytes. +@item LSS_SIZE_END +This will display the local (netperf) send socket buffer size +(SO_SNDBUF) immediately before the data connection socket is closed. +Peculiarities of different networking stacks may lead this to differ +from the size requested via the command line and/or the size +immediately after the data connection socket was created. Units: Bytes. +@item LSR_SIZE_REQ +This will display the local (netperf) receive socket buffer size (aka +SO_RCVBUF) requested via the command line. Units: Bytes. +@item LSR_SIZE +This will display the local (netperf) receive socket buffer size +(SO_RCVBUF) immediately after the data connection socket was created. +Peculiarities of different networking stacks may lead to this +differing from the size requested via the command line. Units: Bytes. +@item LSR_SIZE_END +This will display the local (netperf) receive socket buffer size +(SO_RCVBUF) immediately before the data connection socket is closed. +Peculiarities of different networking stacks may lead this to differ +from the size requested via the command line and/or the size +immediately after the data connection socket was created. Units: Bytes. +@item RSS_SIZE_REQ +This will display the remote (netserver) send socket buffer size (aka +SO_SNDBUF) requested via the command line. Units: Bytes. +@item RSS_SIZE +This will display the remote (netserver) send socket buffer size +(SO_SNDBUF) immediately after the data connection socket was created. +Peculiarities of different networking stacks may lead to this +differing from the size requested via the command line. Units: Bytes. +@item RSS_SIZE_END +This will display the remote (netserver) send socket buffer size +(SO_SNDBUF) immediately before the data connection socket is closed. +Peculiarities of different networking stacks may lead this to differ +from the size requested via the command line and/or the size +immediately after the data connection socket was created. Units: Bytes. +@item RSR_SIZE_REQ +This will display the remote (netserver) receive socket buffer size (aka +SO_RCVBUF) requested via the command line. Units: Bytes. +@item RSR_SIZE +This will display the remote (netserver) receive socket buffer size +(SO_RCVBUF) immediately after the data connection socket was created. +Peculiarities of different networking stacks may lead to this +differing from the size requested via the command line. Units: Bytes. +@item RSR_SIZE_END +This will display the remote (netserver) receive socket buffer size +(SO_RCVBUF) immediately before the data connection socket is closed. +Peculiarities of different networking stacks may lead this to differ +from the size requested via the command line and/or the size +immediately after the data connection socket was created. Units: Bytes. +@item LOCAL_SEND_SIZE +This will display the size of the buffers netperf passed in any +``send'' calls it made on the data connection for a +non-request/response test. Units: Bytes. +@item LOCAL_RECV_SIZE +This will display the size of the buffers netperf passed in any +``receive'' calls it made on the data connection for a +non-request/response test. 
Units: Bytes. +@item REMOTE_SEND_SIZE +This will display the size of the buffers netserver passed in any +``send'' calls it made on the data connection for a +non-request/response test. Units: Bytes. +@item REMOTE_RECV_SIZE +This will display the size of the buffers netserver passed in any +``receive'' calls it made on the data connection for a +non-request/response test. Units: Bytes. +@item REQUEST_SIZE +This will display the size of the requests netperf sent in a +request-response test. Units: Bytes. +@item RESPONSE_SIZE +This will display the size of the responses netserver sent in a +request-response test. Units: Bytes. +@item LOCAL_CPU_UTIL +This will display the overall CPU utilization during the test as +measured by netperf. Units: 0 to 100 percent. +@item LOCAL_CPU_PERCENT_USER +This will display the CPU fraction spent in user mode during the test +as measured by netperf. Only supported by netcpu_procstat. Units: 0 to +100 percent. +@item LOCAL_CPU_PERCENT_SYSTEM +This will display the CPU fraction spent in system mode during the test +as measured by netperf. Only supported by netcpu_procstat. Units: 0 to +100 percent. +@item LOCAL_CPU_PERCENT_IOWAIT +This will display the fraction of time waiting for I/O to complete +during the test as measured by netperf. Only supported by +netcpu_procstat. Units: 0 to 100 percent. +@item LOCAL_CPU_PERCENT_IRQ +This will display the fraction of time servicing interrupts during the +test as measured by netperf. Only supported by netcpu_procstat. Units: +0 to 100 percent. +@item LOCAL_CPU_PERCENT_SWINTR +This will display the fraction of time servicing softirqs during the +test as measured by netperf. Only supported by netcpu_procstat. Units: +0 to 100 percent. +@item LOCAL_CPU_METHOD +This will display the method used by netperf to measure CPU +utilization. Units: single character denoting method. +@item LOCAL_SD +This will display the service demand, or units of CPU consumed per +unit of work, as measured by netperf. Units: microseconds of CPU +consumed per either KB (K==1024) of data transferred or request/response +transaction. +@item REMOTE_CPU_UTIL +This will display the overall CPU utilization during the test as +measured by netserver. Units 0 to 100 percent. +@item REMOTE_CPU_PERCENT_USER +This will display the CPU fraction spent in user mode during the test +as measured by netserver. Only supported by netcpu_procstat. Units: 0 to +100 percent. +@item REMOTE_CPU_PERCENT_SYSTEM +This will display the CPU fraction spent in system mode during the test +as measured by netserver. Only supported by netcpu_procstat. Units: 0 to +100 percent. +@item REMOTE_CPU_PERCENT_IOWAIT +This will display the fraction of time waiting for I/O to complete +during the test as measured by netserver. Only supported by +netcpu_procstat. Units: 0 to 100 percent. +@item REMOTE_CPU_PERCENT_IRQ +This will display the fraction of time servicing interrupts during the +test as measured by netserver. Only supported by netcpu_procstat. Units: +0 to 100 percent. +@item REMOTE_CPU_PERCENT_SWINTR +This will display the fraction of time servicing softirqs during the +test as measured by netserver. Only supported by netcpu_procstat. Units: +0 to 100 percent. +@item REMOTE_CPU_METHOD +This will display the method used by netserver to measure CPU +utilization. Units: single character denoting method. +@item REMOTE_SD +This will display the service demand, or units of CPU consumed per +unit of work, as measured by netserver. 
Units: microseconds of CPU
consumed per either KB (K==1024) of data transferred or
request/response transaction.
@item SD_UNITS
This will display the units for LOCAL_SD and REMOTE_SD.
@item CONFIDENCE_LEVEL
This will display the confidence level requested by the user either
explicitly via the global @option{-I} option, or implicitly via the
global @option{-i} option. The value will be either 95 or 99 if
confidence intervals have been requested or 0 if they were not. Units:
Percent.
@item CONFIDENCE_INTERVAL
This will display the width of the confidence interval requested
either explicitly via the global @option{-I} option or implicitly via
the global @option{-i} option. Units: Width in percent of mean value
computed. A value of -1.0 means that confidence intervals were not requested.
@item CONFIDENCE_ITERATION
This will display the number of test iterations netperf undertook,
perhaps while attempting to achieve the requested confidence interval
and level. If confidence intervals were requested via the command line
then the value will be between 3 and 30. If confidence intervals were
not requested the value will be 1. Units: Iterations.
@item THROUGHPUT_CONFID
This will display the width of the confidence interval actually
achieved for @code{THROUGHPUT} during the test. Units: Width of
interval as percentage of reported throughput value.
@item LOCAL_CPU_CONFID
This will display the width of the confidence interval actually
achieved for overall CPU utilization on the system running netperf
(@code{LOCAL_CPU_UTIL}) during the test, if CPU utilization measurement
was enabled. Units: Width of interval as percentage of reported CPU
utilization.
@item REMOTE_CPU_CONFID
This will display the width of the confidence interval actually
achieved for overall CPU utilization on the system running netserver
(@code{REMOTE_CPU_UTIL}) during the test, if CPU utilization
measurement was enabled. Units: Width of interval as percentage of
reported CPU utilization.
@item TRANSACTION_RATE
This will display the transaction rate in transactions per second for
a request/response test even if the user has requested a throughput in
units of bits or bytes per second via the global @option{-f}
option. It is undefined for a non-request/response test. Units:
Transactions per second.
@item RT_LATENCY
This will display the average round-trip latency for a
request/response test, accounting for the number of transactions in
flight at one time. It is undefined for a non-request/response
test. Units: Microseconds per transaction.
@item BURST_SIZE
This will display the ``burst size'' or added transactions in flight
in a request/response test as requested via a test-specific
@option{-b} option. The number of transactions in flight at one time
will be one greater than this value. It is undefined for a
non-request/response test. Units: added Transactions in flight.
@item LOCAL_TRANSPORT_RETRANS
This will display the number of retransmissions experienced on the
data connection during the test as determined by netperf. A value of
-1 means the attempt to determine the number of retransmissions
failed, the concept was not valid for the given protocol, or the
mechanism is not known for the platform. A value of -2 means it was
not attempted. As of version 2.5.0 the meaning of the values is in
flux and subject to change. Units: number of retransmissions.

@item REMOTE_TRANSPORT_RETRANS
This will display the number of retransmissions experienced on the
data connection during the test as determined by netserver. A value
of -1 means the attempt to determine the number of retransmissions
failed, the concept was not valid for the given protocol, or the
mechanism is not known for the platform. A value of -2 means it was
not attempted. As of version 2.5.0 the meaning of the values is in
flux and subject to change. Units: number of retransmissions.
@item TRANSPORT_MSS
This will display the Maximum Segment Size (aka MSS) or its equivalent
for the protocol being used during the test. A value of -1 means
either the concept of an MSS did not apply to the protocol being used,
or there was an error in retrieving it. Units: Bytes.
@item LOCAL_SEND_THROUGHPUT
The throughput as measured by netperf for the successful ``send''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item LOCAL_RECV_THROUGHPUT
The throughput as measured by netperf for the successful ``receive''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item REMOTE_SEND_THROUGHPUT
The throughput as measured by netserver for the successful ``send''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item REMOTE_RECV_THROUGHPUT
The throughput as measured by netserver for the successful ``receive''
calls it made on the data connection. Units: as requested via the
global @option{-f} option and displayed via the @code{THROUGHPUT_UNITS}
output selector.
@item LOCAL_CPU_BIND
The CPU to which netperf was bound, if at all, during the test. A
value of -1 means that netperf was not explicitly bound to a CPU
during the test. Units: CPU ID.
@item LOCAL_CPU_COUNT
The number of CPUs (cores, threads) detected by netperf. Units: CPU count.
@item LOCAL_CPU_PEAK_UTIL
The utilization of the CPU most heavily utilized during the test, as
measured by netperf. This can be used to see if any one CPU of a
multi-CPU system was saturated even though the overall CPU utilization
as reported by @code{LOCAL_CPU_UTIL} was low. Units: 0 to 100%
@item LOCAL_CPU_PEAK_ID
The id of the CPU most heavily utilized during the test as determined
by netperf. Units: CPU ID.
@item LOCAL_CPU_MODEL
Model information for the processor(s) present on the system running
netperf. Assumes all processors in the system (as perceived by
netperf) on which netperf is running are the same model. Units: Text
@item LOCAL_CPU_FREQUENCY
The frequency of the processor(s) on the system running netperf, at
the time netperf made the call. Assumes that all processors present
in the system running netperf are running at the same
frequency. Units: MHz
@item REMOTE_CPU_BIND
The CPU to which netserver was bound, if at all, during the test. A
value of -1 means that netserver was not explicitly bound to a CPU
during the test. Units: CPU ID.
@item REMOTE_CPU_COUNT
The number of CPUs (cores, threads) detected by netserver. Units: CPU
count.
@item REMOTE_CPU_PEAK_UTIL
The utilization of the CPU most heavily utilized during the test, as
measured by netserver.
This can be used to see if any one CPU of a +multi-CPU system was saturated even though the overall CPU utilization +as reported by @code{REMOTE_CPU_UTIL} was low. Units: 0 to 100% +@item REMOTE_CPU_PEAK_ID +The id of the CPU most heavily utilized during the test as determined +by netserver. Units: CPU ID. +@item REMOTE_CPU_MODEL +Model information for the processor(s) present on the system running +netserver. Assumes all processors in the system (as perceived by +netserver) on which netserver is running are the same model. Units: +Text +@item REMOTE_CPU_FREQUENCY +The frequency of the processor(s) on the system running netserver, at +the time netserver made the call. Assumes that all processors present +in the system running netserver are running at the same +frequency. Units: MHz +@item SOURCE_PORT +The port ID/service name to which the data socket created by netperf +was bound. A value of 0 means the data socket was not explicitly +bound to a port number. Units: ASCII text. +@item SOURCE_ADDR +The name/address to which the data socket created by netperf was +bound. A value of 0.0.0.0 means the data socket was not explicitly +bound to an address. Units: ASCII text. +@item SOURCE_FAMILY +The address family to which the data socket created by netperf was +bound. A value of 0 means the data socket was not explicitly bound to +a given address family. Units: ASCII text. +@item DEST_PORT +The port ID to which the data socket created by netserver was bound. A +value of 0 means the data socket was not explicitly bound to a port +number. Units: ASCII text. +@item DEST_ADDR +The name/address of the data socket created by netserver. Units: +ASCII text. +@item DEST_FAMILY +The address family to which the data socket created by netserver was +bound. A value of 0 means the data socket was not explicitly bound to +a given address family. Units: ASCII text. +@item LOCAL_SEND_CALLS +The number of successful ``send'' calls made by netperf against its +data socket. Units: Calls. +@item LOCAL_RECV_CALLS +The number of successful ``receive'' calls made by netperf against its +data socket. Units: Calls. +@item LOCAL_BYTES_PER_RECV +The average number of bytes per ``receive'' call made by netperf +against its data socket. Units: Bytes. +@item LOCAL_BYTES_PER_SEND +The average number of bytes per ``send'' call made by netperf against +its data socket. Units: Bytes. +@item LOCAL_BYTES_SENT +The number of bytes successfully sent by netperf through its data +socket. Units: Bytes. +@item LOCAL_BYTES_RECVD +The number of bytes successfully received by netperf through its data +socket. Units: Bytes. +@item LOCAL_BYTES_XFERD +The sum of bytes sent and received by netperf through its data +socket. Units: Bytes. +@item LOCAL_SEND_OFFSET +The offset from the alignment of the buffers passed by netperf in its +``send'' calls. Specified via the global @option{-o} option and +defaults to 0. Units: Bytes. +@item LOCAL_RECV_OFFSET +The offset from the alignment of the buffers passed by netperf in its +``receive'' calls. Specified via the global @option{-o} option and +defaults to 0. Units: Bytes. +@item LOCAL_SEND_ALIGN +The alignment of the buffers passed by netperf in its ``send'' calls +as specified via the global @option{-a} option. Defaults to 8. Units: +Bytes. +@item LOCAL_RECV_ALIGN +The alignment of the buffers passed by netperf in its ``receive'' +calls as specified via the global @option{-a} option. Defaults to +8. Units: Bytes. 

@item LOCAL_SEND_WIDTH
The ``width'' of the ring of buffers through which netperf cycles as
it makes its ``send'' calls. Defaults to one more than the local send
socket buffer size divided by the send size as determined at the time
the data socket is created. Can be used to make netperf more processor
data cache unfriendly. Units: number of buffers.
@item LOCAL_RECV_WIDTH
The ``width'' of the ring of buffers through which netperf cycles as
it makes its ``receive'' calls. Defaults to one more than the local
receive socket buffer size divided by the receive size as determined
at the time the data socket is created. Can be used to make netperf
more processor data cache unfriendly. Units: number of buffers.
@item LOCAL_SEND_DIRTY_COUNT
The number of bytes to ``dirty'' (write to) before netperf makes a
``send'' call. Specified via the global @option{-k} option, which
requires that --enable-dirty=yes was specified with the configure
command prior to building netperf. Units: Bytes.
@item LOCAL_RECV_DIRTY_COUNT
The number of bytes to ``dirty'' (write to) before netperf makes a
``recv'' call. Specified via the global @option{-k} option which
requires that --enable-dirty was specified with the configure command
prior to building netperf. Units: Bytes.
@item LOCAL_RECV_CLEAN_COUNT
The number of bytes netperf should read ``cleanly'' before making a
``receive'' call. Specified via the global @option{-k} option which
requires that --enable-dirty was specified with the configure command
prior to building netperf. Clean reads start where dirty writes ended.
Units: Bytes.
@item LOCAL_NODELAY
Indicates whether or not setting the test protocol-specific ``no
delay'' (eg TCP_NODELAY) option on the data socket used by netperf was
requested by the test-specific @option{-D} option and was
successful. Units: 0 means no, 1 means yes.
@item LOCAL_CORK
Indicates whether or not TCP_CORK was set on the data socket used by
netperf as requested via the test-specific @option{-C} option. 1 means
yes, 0 means no/not applicable.
@item REMOTE_SEND_CALLS
@item REMOTE_RECV_CALLS
@item REMOTE_BYTES_PER_RECV
@item REMOTE_BYTES_PER_SEND
@item REMOTE_BYTES_SENT
@item REMOTE_BYTES_RECVD
@item REMOTE_BYTES_XFERD
@item REMOTE_SEND_OFFSET
@item REMOTE_RECV_OFFSET
@item REMOTE_SEND_ALIGN
@item REMOTE_RECV_ALIGN
@item REMOTE_SEND_WIDTH
@item REMOTE_RECV_WIDTH
@item REMOTE_SEND_DIRTY_COUNT
@item REMOTE_RECV_DIRTY_COUNT
@item REMOTE_RECV_CLEAN_COUNT
@item REMOTE_NODELAY
@item REMOTE_CORK
These are all like their ``LOCAL_'' counterparts only for the
netserver rather than netperf.
@item LOCAL_SYSNAME
The name of the OS (eg ``Linux'') running on the system on which
netperf was running. Units: ASCII Text.
@item LOCAL_SYSTEM_MODEL
The model name of the system on which netperf was running. Units:
ASCII Text.
@item LOCAL_RELEASE
The release name/number of the OS running on the system on which
netperf was running. Units: ASCII Text.
@item LOCAL_VERSION
The version number of the OS running on the system on which netperf
was running. Units: ASCII Text.
@item LOCAL_MACHINE
The machine architecture of the machine on which netperf was
running. Units: ASCII Text.
@item REMOTE_SYSNAME
@item REMOTE_SYSTEM_MODEL
@item REMOTE_RELEASE
@item REMOTE_VERSION
@item REMOTE_MACHINE
These are all like their ``LOCAL_'' counterparts only for the
netserver rather than netperf.
+@item LOCAL_INTERFACE_NAME +The name of the probable egress interface through which the data +connection went on the system running netperf. Example: eth0. Units: +ASCII Text. +@item LOCAL_INTERFACE_VENDOR +The vendor ID of the probable egress interface through which traffic +on the data connection went on the system running netperf. Units: +Hexadecimal IDs as might be found in a @file{pci.ids} file or at +@uref{http://pciids.sourceforge.net/,the PCI ID Repository}. +@item LOCAL_INTERFACE_DEVICE +The device ID of the probable egress interface through which traffic +on the data connection went on the system running netperf. Units: +Hexadecimal IDs as might be found in a @file{pci.ids} file or at +@uref{http://pciids.sourceforge.net/,the PCI ID Repository}. +@item LOCAL_INTERFACE_SUBVENDOR +The sub-vendor ID of the probable egress interface through which +traffic on the data connection went on the system running +netperf. Units: Hexadecimal IDs as might be found in a @file{pci.ids} +file or at @uref{http://pciids.sourceforge.net/,the PCI ID +Repository}. +@item LOCAL_INTERFACE_SUBDEVICE +The sub-device ID of the probable egress interface through which +traffic on the data connection went on the system running +netperf. Units: Hexadecimal IDs as might be found in a @file{pci.ids} +file or at @uref{http://pciids.sourceforge.net/,the PCI ID +Repository}. +@item LOCAL_DRIVER_NAME +The name of the driver used for the probable egress interface through +which traffic on the data connection went on the system running +netperf. Units: ASCII Text. +@item LOCAL_DRIVER_VERSION +The version string for the driver used for the probable egress +interface through which traffic on the data connection went on the +system running netperf. Units: ASCII Text. +@item LOCAL_DRIVER_FIRMWARE +The firmware version for the driver used for the probable egress +interface through which traffic on the data connection went on the +system running netperf. Units: ASCII Text. +@item LOCAL_DRIVER_BUS +The bus address of the probable egress interface through which traffic +on the data connection went on the system running netperf. Units: +ASCII Text. +@item LOCAL_INTERFACE_SLOT +The slot ID of the probable egress interface through which traffic +on the data connection went on the system running netperf. Units: +ASCII Text. +@item REMOTE_INTERFACE_NAME +@item REMOTE_INTERFACE_VENDOR +@item REMOTE_INTERFACE_DEVICE +@item REMOTE_INTERFACE_SUBVENDOR +@item REMOTE_INTERFACE_SUBDEVICE +@item REMOTE_DRIVER_NAME +@item REMOTE_DRIVER_VERSION +@item REMOTE_DRIVER_FIRMWARE +@item REMOTE_DRIVER_BUS +@item REMOTE_INTERFACE_SLOT +These are all like their ``LOCAL_'' counterparts only for the +netserver rather than netperf. +@item LOCAL_INTERVAL_USECS +The interval at which bursts of operations (sends, receives, +transactions) were attempted by netperf. Specified by the +global @option{-w} option which requires --enable-intervals to have +been specified with the configure command prior to building +netperf. Units: Microseconds (though specified by default in +milliseconds on the command line) +@item LOCAL_INTERVAL_BURST +The number of operations (sends, receives, transactions depending on +the test) which were attempted by netperf each LOCAL_INTERVAL_USECS +units of time. Specified by the global @option{-b} option which +requires --enable-intervals to have been specified with the configure +command prior to building netperf. Units: number of operations per burst. 

@item REMOTE_INTERVAL_USECS
The interval at which bursts of operations (sends, receives,
transactions) were attempted by netserver. Specified by the
global @option{-w} option which requires --enable-intervals to have
been specified with the configure command prior to building
netperf. Units: Microseconds (though specified by default in
milliseconds on the command line).
@item REMOTE_INTERVAL_BURST
The number of operations (sends, receives, transactions depending on
the test) which were attempted by netserver each REMOTE_INTERVAL_USECS
units of time. Specified by the global @option{-b} option which
requires --enable-intervals to have been specified with the configure
command prior to building netperf. Units: number of operations per burst.
@item LOCAL_SECURITY_TYPE_ID
@item LOCAL_SECURITY_TYPE
@item LOCAL_SECURITY_ENABLED_NUM
@item LOCAL_SECURITY_ENABLED
@item LOCAL_SECURITY_SPECIFIC
@item REMOTE_SECURITY_TYPE_ID
@item REMOTE_SECURITY_TYPE
@item REMOTE_SECURITY_ENABLED_NUM
@item REMOTE_SECURITY_ENABLED
@item REMOTE_SECURITY_SPECIFIC
A bunch of stuff related to what sort of security mechanisms (eg
SELINUX) were enabled on the systems during the test.
@item RESULT_BRAND
The string specified by the user with the global @option{-B}
option. Units: ASCII Text.
@item UUID
The universally unique identifier associated with this test, either
generated automagically by netperf, or passed to netperf via an omni
test-specific @option{-u} option. Note: Future versions may make this
a global command-line option. Units: ASCII Text.
@item MIN_LATENCY
The minimum ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item MAX_LATENCY
The maximum ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item P50_LATENCY
The 50th percentile value of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item P90_LATENCY
The 90th percentile value of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item P99_LATENCY
The 99th percentile value of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item MEAN_LATENCY
The average ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item STDDEV_LATENCY
The standard deviation of ``latency'' or operation time (send, receive or
request/response exchange depending on the test) as measured on the
netperf side when the global @option{-j} option was specified. Units:
Microseconds.
@item COMMAND_LINE
The full command line used when invoking netperf. Units: ASCII Text.
@item OUTPUT_END
While emitted with the list of output selectors, it is ignored when
specified as an output selector.
+
+@node Other Netperf Tests, Address Resolution, The Omni Tests, Top
+@chapter Other Netperf Tests
+
+Apart from the typical performance tests, netperf contains some tests
+which can be used to streamline measurements and reporting. These
+include CPU rate calibration (present) and host identification (future
+enhancement).
+
+@menu
+* CPU rate calibration::
+* UUID Generation::
+@end menu
+
+@node CPU rate calibration, UUID Generation, Other Netperf Tests, Other Netperf Tests
+@section CPU rate calibration
+
+Some of the CPU utilization measurement mechanisms of netperf work by
+comparing the rate at which some counter increments when the system is
+idle with the rate at which that same counter increments when the
+system is running a netperf test. The ratio of those rates is used to
+arrive at a CPU utilization percentage.
+
+This means that netperf must know the rate at which the counter
+increments when the system is presumed to be ``idle.'' If it does not
+know the rate, netperf will measure it before starting a data transfer
+test. This calibration step takes 40 seconds for each of the local and
+remote systems, and if it were repeated for every netperf test it
+would make taking repeated measurements rather slow.
+
+Thus, the netperf CPU utilization options @option{-c} and
+@option{-C} can take an optional calibration value. This value is
+used as the ``idle rate'' and the calibration step is not
+performed. To determine the idle rate, netperf can be used to run
+special tests which only report the calibration value: the LOC_CPU
+and REM_CPU tests. These return the calibration value for the local
+and remote system respectively. A common way to use these tests is to
+store their results in environment variables and use those in
+subsequent netperf commands:
+
+@example
+LOC_RATE=`netperf -t LOC_CPU`
+REM_RATE=`netperf -H <remote> -t REM_CPU`
+netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
+...
+netperf -H <remote> -c $LOC_RATE -C $REM_RATE ... -- ...
+@end example
+
+If you are going to use netperf to measure aggregate results, it is
+important to use the LOC_CPU and REM_CPU tests to get the calibration
+values first; otherwise, some of the concurrent netperf tests may be
+transferring data while others are calibrating their presumably
+``idle'' systems, resulting in bogus calibration values. When running
+aggregate tests, it is very important to remember that any one
+instance of netperf does not know about the other instances of
+netperf. It will report global CPU utilization and will calculate
+service demand believing it was the only thing causing that CPU
+utilization. So, you can use the CPU utilization reported by netperf
+in an aggregate test, but you have to calculate service demands by
+hand.
+
+@node UUID Generation, , CPU rate calibration, Other Netperf Tests
+@section UUID Generation
+
+Beginning with version 2.5.0 netperf can generate Universally Unique
+IDentifiers (UUIDs). This can be done explicitly via the ``UUID''
+test:
+@example
+$ netperf -t UUID
+2c8561ae-9ebd-11e0-a297-0f5bfa0349d0
+@end example
+
+In and of itself, this is not terribly useful, but used in conjunction
+with the test-specific @option{-u} option of an ``omni'' test to set
+the UUID emitted by the @ref{Omni Output Selectors,UUID} output
+selector, it can be used to tie together the separate instances of an
+aggregate netperf test, say, for instance, when the results are
+inserted into a database of some sort.
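+
+For example, one might generate a single UUID up front and pass it to
+each concurrent instance of an aggregate test so their results can be
+matched up later. A sketch, in which the hostnames and the selection
+of output values are hypothetical:
+
+@example
+MY_UUID=`netperf -t UUID`
+netperf -H hosta -t omni -- -u $MY_UUID -o UUID,THROUGHPUT &
+netperf -H hostb -t omni -- -u $MY_UUID -o UUID,THROUGHPUT &
+wait
+@end example
+
+Each instance would then emit the same UUID in its CSV output, giving
+the database a common key for the rows of the aggregate run.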
+
+@node Address Resolution, Enhancing Netperf, Other Netperf Tests, Top
+@comment node-name, next, previous, up
+@chapter Address Resolution
+
+Netperf versions 2.4.0 and later have merged IPv4 and IPv6 tests so
+the functionality of the tests in @file{src/nettest_ipv6.c} has been
+subsumed into the tests in @file{src/nettest_bsd.c}. This has been
+accomplished in part by switching from @code{gethostbyname()} to
+@code{getaddrinfo()} exclusively. While it was theoretically possible
+to get multiple results for a hostname from @code{gethostbyname()} it
+was generally unlikely and netperf's ignoring of the second and later
+results was not much of an issue.
+
+Now with @code{getaddrinfo()} and particularly with @code{AF_UNSPEC}
+it is increasingly likely that a given hostname will have multiple
+associated addresses. The @code{establish_control()} routine of
+@file{src/netlib.c} will indeed attempt to choose from among all the
+matching IP addresses when establishing the control connection.
+Netperf does not @emph{really} care if the control connection is IPv4
+or IPv6 or even mixed on either end.
+
+However, the individual tests still ass-u-me that the first result in
+the address list is the one to be used. Whether or not this will
+turn out to be an issue has yet to be determined.
+
+If you do run into problems with this, the easiest workaround is to
+specify IP addresses for the data connection explicitly in the
+test-specific @option{-H} and @option{-L} options. At some point, the
+netperf tests @emph{may} try to be more sophisticated in their parsing
+of returns from @code{getaddrinfo()} - straw-man patches to
+@email{netperf-feedback@@netperf.org} would of course be most welcome
+:)
+
+Netperf has leveraged code from other open-source projects with
+amenable licensing to provide a replacement @code{getaddrinfo()} call
+on those platforms where the @command{configure} script believes there
+is no native @code{getaddrinfo()} call. As of this writing, the
+replacement @code{getaddrinfo()} has been tested on HP-UX 11.0 and
+then presumed to run elsewhere.
+
+@node Enhancing Netperf, Netperf4, Address Resolution, Top
+@comment node-name, next, previous, up
+@chapter Enhancing Netperf
+
+Netperf is constantly evolving. If you find you want to make
+enhancements to netperf, by all means do so. If you wish to add a new
+``suite'' of tests to netperf the general idea is to:
+
+@enumerate
+@item
+Add files @file{src/nettest_mumble.c} and @file{src/nettest_mumble.h}
+where mumble is replaced with something meaningful for the test-suite.
+@item
+Add support for an appropriate @option{--enable-mumble} option in
+@file{configure.ac}.
+@item
+Edit @file{src/netperf.c}, @file{src/netsh.c}, and
+@file{src/netserver.c} as required, using @code{#ifdef WANT_MUMBLE}.
+@item
+Compile and test.
+@end enumerate
+
+However, with the addition of the ``omni'' tests in version 2.5.0 it
+is preferred that one attempt to make the necessary changes to
+@file{src/nettest_omni.c} rather than adding new source files, unless
+this would make the omni tests entirely too complicated.
+
+If you wish to submit your changes for possible inclusion into the
+mainline sources, please try to base your changes on the latest
+available sources (@pxref{Getting Netperf Bits}) and then send email
+describing the changes at a high level to
+@email{netperf-feedback@@netperf.org} or perhaps
+@email{netperf-talk@@netperf.org}. If the consensus is positive, then
+sending context @command{diff} results to
+@email{netperf-feedback@@netperf.org} is the next step.
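+
+By way of illustration, one hypothetical way to produce context
+@command{diff} results, assuming a pristine copy of the sources was
+kept alongside the modified ones:
+
+@example
+diff -rc netperf-2.6.0.orig netperf-2.6.0 > nettest_mumble.diff
+@end example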
+
+From that point, it is a matter of pestering the Netperf Contributing
+Editor until he gets the changes incorporated :)
+
+@node Netperf4, Concept Index, Enhancing Netperf, Top
+@comment node-name, next, previous, up
+@chapter Netperf4
+
+Netperf4 is the shorthand name given to version 4.X.X of netperf.
+This is really more a separate benchmark than a newer version of
+netperf, but as it is a descendant of netperf the netperf name is
+kept. The facetious way to describe netperf4 is to say it is the
+egg-laying-woolly-milk-pig version of netperf :) The more respectful
+way to describe it is to say it is the version of netperf with support
+for synchronized, multiple-thread, multiple-test, multiple-system,
+network-oriented benchmarking.
+
+Netperf4 is still undergoing evolution. Those wishing to work with or
+on netperf4 are encouraged to join the
+@uref{http://www.netperf.org/cgi-bin/mailman/listinfo/netperf-dev,netperf-dev}
+mailing list and/or peruse the
+@uref{http://www.netperf.org/svn/netperf4/trunk,current sources}.
+
+@node Concept Index, Option Index, Netperf4, Top
+@unnumbered Concept Index
+
+@printindex cp
+
+@node Option Index, , Concept Index, Top
+@comment node-name, next, previous, up
+@unnumbered Option Index
+
+@printindex vr
+@bye
+
+@c LocalWords: texinfo setfilename settitle titlepage vskip pt filll ifnottex
+@c LocalWords: insertcopying cindex dfn uref printindex cp