author hayati ayguen <h_ayguen@web.de> 2019-12-27 12:33:21 +0100
committer hayati ayguen <h_ayguen@web.de> 2019-12-27 12:33:21 +0100
commit 969eecbea792cf74ccdfeab9a1b643ea067160bd
tree 15d7600a31a99164284e97125a1fad1704d9eb48 /README.md
parent b9804a2e356ab7e65215cca21add0002e7d00939
benchmark results in separate repository: https://github.com/hayguen/pffft_benchmarks
Signed-off-by: hayati ayguen <h_ayguen@web.de>
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 352
1 files changed, 4 insertions, 348 deletions
diff --git a/README.md b/README.md
index ecf1851..c36aa54 100644
--- a/README.md
+++ b/README.md
@@ -100,354 +100,10 @@ provides "unordered" FFTs , and a fourier domain convolution
operation.
-## Benchmark results (cpu tested: core i7 2600, core 2 quad, core 1 duo, atom N270, cortex-A9, cortex-A15, A8X)
+## Benchmark results
-The benchmark shows the performance of various fft implementations measured in
-MFlops, with the number of floating point operations being defined as 5Nlog2(N)
-for a length N complex fft, and 2.5*Nlog2(N) for a real fft.
-See [here](http://www.fftw.org/speed/method.html) for an explanation of these formulas.
+The benchmark results are stored in a separate git-repository:
+See [https://github.com/hayguen/pffft_benchmarks](https://github.com/hayguen/pffft_benchmarks).
-### MacOS Lion, gcc 4.2, 64-bit, fftw 3.3 on a 3.4 GHz core i7 2600
-
-Built with:
-
-```
-gcc-4.2 -o test_pffft -arch x86_64 -O3 -Wall -W pffft.c test_pffft.c fftpack.c -L/usr/local/lib -I/usr/local/include/ -DHAVE_VECLIB -framework veclib -DHAVE_FFTW -lfftw3f
-```
-
-| input len |real FFTPack| real vDSP | real FFTW | real PFFFT |cplx FFTPack| cplx vDSP | cplx FFTW | cplx PFFFT |
-|----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|
-| 64 | 2816 | 8596 | 7329 | 8187 | 2887 | 14898 | 14668 | 11108 |
-| 96 | 3298 | n/a | 8378 | 7727 | 3953 | n/a | 15680 | 10878 |
-| 128 | 3507 | 11575 | 9266 | 10108 | 4233 | 17598 | 16427 | 12000 |
-| 160 | 3391 | n/a | 9838 | 10711 | 4220 | n/a | 16653 | 11187 |
-| 192 | 3919 | n/a | 9868 | 10956 | 4297 | n/a | 15770 | 12540 |
-| 256 | 4283 | 13179 | 10694 | 13128 | 4545 | 19550 | 16350 | 13822 |
-| 384 | 3136 | n/a | 10810 | 12061 | 3600 | n/a | 16103 | 13240 |
-| 480 | 3477 | n/a | 10632 | 12074 | 3536 | n/a | 11630 | 12522 |
-| 512 | 3783 | 15141 | 11267 | 13838 | 3649 | 20002 | 16560 | 13580 |
-| 640 | 3639 | n/a | 11164 | 13946 | 3695 | n/a | 15416 | 13890 |
-| 768 | 3800 | n/a | 11245 | 13495 | 3590 | n/a | 15802 | 14552 |
-| 800 | 3440 | n/a | 10499 | 13301 | 3659 | n/a | 12056 | 13268 |
-| 1024 | 3924 | 15605 | 11450 | 15339 | 3769 | 20963 | 13941 | 15467 |
-| 2048 | 4518 | 16195 | 11551 | 15532 | 4258 | 20413 | 13723 | 15042 |
-| 2400 | 4294 | n/a | 10685 | 13078 | 4093 | n/a | 12777 | 13119 |
-| 4096 | 4750 | 16596 | 11672 | 15817 | 4157 | 19662 | 14316 | 14336 |
-| 8192 | 3820 | 16227 | 11084 | 12555 | 3691 | 18132 | 12102 | 13813 |
-| 9216 | 3864 | n/a | 10254 | 12870 | 3586 | n/a | 12119 | 13994 |
-| 16384 | 3822 | 15123 | 10454 | 12822 | 3613 | 16874 | 12370 | 13881 |
-| 32768 | 4175 | 14512 | 10662 | 11095 | 3881 | 14702 | 11619 | 11524 |
-| 262144 | 3317 | 11429 | 6269 | 9517 | 2810 | 11729 | 7757 | 10179 |
-| 1048576 | 2913 | 10551 | 4730 | 5867 | 2661 | 7881 | 3520 | 5350 |
-
-
-### Debian 6, gcc 4.4.5, 64-bit, fftw 3.3.1 on a 3.4 GHz core i7 2600
-
-Built with:
-
-```
-gcc -o test_pffft -DHAVE_FFTW -msse2 -O3 -Wall -W pffft.c test_pffft.c fftpack.c -L$HOME/local/lib -I$HOME/local/include/ -lfftw3f -lm
-```
-
-| N (input length) | real FFTPack | real FFTW | real PFFFT | cplx FFTPack | cplx FFTW | cplx PFFFT |
-|-----------------:|-------------:|-------------:|-------------:|-------------:|-------------:|-------------:|
-| 64 | 3840 | 7680 | 8777 | 4389 | 20480 | 11171 |
-| 96 | 4214 | 9633 | 8429 | 4816 | 22477 | 11238 |
-| 128 | 3584 | 10240 | 10240 | 5120 | 23893 | 11947 |
-| 192 | 4854 | 11095 | 12945 | 4854 | 22191 | 14121 |
-| 256 | 4096 | 11703 | 16384 | 5120 | 23406 | 13653 |
-| 384 | 4395 | 14651 | 12558 | 4884 | 19535 | 14651 |
-| 512 | 5760 | 13166 | 15360 | 4608 | 23040 | 15360 |
-| 768 | 4907 | 14020 | 16357 | 4461 | 19628 | 14020 |
-| 1024 | 5120 | 14629 | 14629 | 5120 | 20480 | 15754 |
-| 2048 | 5632 | 14080 | 18773 | 4693 | 12516 | 16091 |
-| 4096 | 5120 | 13653 | 17554 | 4726 | 7680 | 14456 |
-| 8192 | 4160 | 7396 | 13312 | 4437 | 14791 | 13312 |
-| 9216 | 4210 | 6124 | 13473 | 4491 | 7282 | 14970 |
-| 16384 | 3976 | 11010 | 14313 | 4210 | 11450 | 13631 |
-| 32768 | 4260 | 10224 | 10954 | 4260 | 6816 | 11797 |
-| 262144 | 3736 | 6896 | 9961 | 2359 | 8965 | 9437 |
-| 1048576 | 2796 | 4534 | 6453 | 1864 | 3078 | 5592 |
-
-
-
-### MacOS Snow Leopard, gcc 4.0, 32-bit, fftw 3.3 on a 1.83 GHz core 1 duo
-
-Built with:
-
-```
-gcc -o test_pffft -DHAVE_FFTW -DHAVE_VECLIB -O3 -Wall -W pffft.c test_pffft.c fftpack.c -L/usr/local/lib -I/usr/local/include/ -lfftw3f -framework veclib
-```
-
-| input len |real FFTPack| real vDSP | real FFTW | real PFFFT |cplx FFTPack| cplx vDSP | cplx FFTW | cplx PFFFT |
-|----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|
-| 64 | 745 | 2145 | 1706 | 2028 | 961 | 3356 | 3313 | 2300 |
-| 96 | 877 | n/a | 1976 | 1978 | 1059 | n/a | 3333 | 2233 |
-| 128 | 951 | 2808 | 2213 | 2279 | 1202 | 3803 | 3739 | 2494 |
-| 192 | 1002 | n/a | 2456 | 2429 | 1186 | n/a | 3701 | 2508 |
-| 256 | 1065 | 3205 | 2641 | 2793 | 1302 | 4013 | 3912 | 2663 |
-| 384 | 845 | n/a | 2759 | 2499 | 948 | n/a | 3729 | 2504 |
-| 512 | 900 | 3476 | 2956 | 2759 | 974 | 4057 | 3954 | 2645 |
-| 768 | 910 | n/a | 2912 | 2737 | 975 | n/a | 3837 | 2614 |
-| 1024 | 936 | 3583 | 3107 | 3009 | 1006 | 4124 | 3821 | 2697 |
-| 2048 | 1057 | 3585 | 3091 | 2837 | 1089 | 3889 | 3701 | 2513 |
-| 4096 | 1083 | 3524 | 3092 | 2733 | 1039 | 3617 | 3462 | 2364 |
-| 8192 | 874 | 3252 | 2967 | 2363 | 911 | 3106 | 2789 | 2302 |
-| 9216 | 898 | n/a | 2420 | 2290 | 865 | n/a | 2676 | 2204 |
-| 16384 | 903 | 2892 | 2506 | 2421 | 899 | 3026 | 2797 | 2289 |
-| 32768 | 965 | 2837 | 2550 | 2358 | 920 | 2922 | 2763 | 2240 |
-| 262144 | 738 | 2422 | 1589 | 1708 | 610 | 2038 | 1436 | 1091 |
-| 1048576 | 528 | 1207 | 845 | 880 | 606 | 1020 | 669 | 1036 |
-
-
-
-### Ubuntu 11.04, gcc 4.5, 32-bit, fftw 3.2 on a 2.66 core 2 quad
-
-Built with:
-
-```
-gcc -o test_pffft -DHAVE_FFTW -msse -mfpmath=sse -O3 -Wall -W pffft.c test_pffft.c fftpack.c -L/usr/local/lib -I/usr/local/include/ -lfftw3f -lm
-```
-
-| input len |real FFTPack| real FFTW | real PFFFT |cplx FFTPack| cplx FFTW | cplx PFFFT |
-|----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|
-| 64 | 1920 | 3614 | 5120 | 2194 | 7680 | 6467 |
-| 96 | 1873 | 3549 | 5187 | 2107 | 8429 | 5863 |
-| 128 | 2240 | 3773 | 5514 | 2560 | 7964 | 6827 |
-| 192 | 1765 | 4569 | 7767 | 2284 | 9137 | 7061 |
-| 256 | 2048 | 5461 | 7447 | 2731 | 9638 | 7802 |
-| 384 | 1998 | 5861 | 6762 | 2313 | 9253 | 7644 |
-| 512 | 2095 | 6144 | 7680 | 2194 | 10240 | 7089 |
-| 768 | 2230 | 5773 | 7549 | 2045 | 10331 | 7010 |
-| 1024 | 2133 | 6400 | 8533 | 2133 | 10779 | 7877 |
-| 2048 | 2011 | 7040 | 8665 | 1942 | 10240 | 7768 |
-| 4096 | 2194 | 6827 | 8777 | 1755 | 9452 | 6827 |
-| 8192 | 1849 | 6656 | 6656 | 1752 | 7831 | 6827 |
-| 9216 | 1871 | 5858 | 6416 | 1643 | 6909 | 6266 |
-| 16384 | 1883 | 6223 | 6506 | 1664 | 7340 | 6982 |
-| 32768 | 1826 | 6390 | 6667 | 1631 | 7481 | 6971 |
-| 262144 | 1546 | 4075 | 5977 | 1299 | 3415 | 3551 |
-| 1048576 | 1104 | 2071 | 1730 | 1104 | 1149 | 1834 |
-
-
-
-### Ubuntu 11.04, gcc 4.5, 32-bit, fftw 3.3 on a 1.6 GHz Atom N270
-
-Built with:
-
-```
-gcc -o test_pffft -DHAVE_FFTW -msse -mfpmath=sse -O3 -Wall -W pffft.c test_pffft.c fftpack.c -L/usr/local/lib -I/usr/local/include/ -lfftw3f -lm
-```
-
-| N (input length) | real FFTPack | real FFTW | real PFFFT | cplx FFTPack | cplx FFTW | cplx PFFFT |
-|-----------------:|-------------:|-------------:|-------------:|-------------:|-------------:|-------------:|
-| 64 | 452 | 1041 | 1336 | 549 | 2318 | 1781 |
-| 96 | 444 | 1297 | 1297 | 503 | 2408 | 1686 |
-| 128 | 527 | 1525 | 1707 | 543 | 2655 | 1886 |
-| 192 | 498 | 1653 | 1849 | 539 | 2678 | 1942 |
-| 256 | 585 | 1862 | 2156 | 594 | 2777 | 2244 |
-| 384 | 499 | 1870 | 1998 | 511 | 2586 | 1890 |
-| 512 | 562 | 2095 | 2194 | 542 | 2973 | 2194 |
-| 768 | 545 | 2045 | 2133 | 545 | 2365 | 2133 |
-| 1024 | 595 | 2133 | 2438 | 569 | 2695 | 2179 |
-| 2048 | 587 | 2125 | 2347 | 521 | 2230 | 1707 |
-| 4096 | 495 | 1890 | 1834 | 492 | 1876 | 1672 |
-| 8192 | 469 | 1548 | 1729 | 438 | 1740 | 1664 |
-| 9216 | 468 | 1663 | 1663 | 446 | 1585 | 1531 |
-| 16384 | 453 | 1608 | 1767 | 398 | 1476 | 1664 |
-| 32768 | 456 | 1420 | 1503 | 387 | 1388 | 1345 |
-| 262144 | 309 | 385 | 726 | 262 | 415 | 840 |
-| 1048576 | 280 | 351 | 739 | 261 | 313 | 797 |
-
-
-
-### Windows 7, visual c++ 2010 on a 1.6 GHz Atom N270
-
-Built with:
-
-```
-cl /Ox -D_USE_MATH_DEFINES /arch:SSE test_pffft.c pffft.c fftpack.c
-```
-
-(visual c++ is definitively not very good with SSE intrinsics...)
-
-| N (input length) | real FFTPack | real PFFFT | cplx FFTPack | cplx PFFFT |
-|-----------------:|-------------:|-------------:|-------------:|-------------:|
-| 64 | 173 | 1009 | 174 | 1159 |
-| 96 | 169 | 1029 | 188 | 1201 |
-| 128 | 195 | 1242 | 191 | 1275 |
-| 192 | 178 | 1312 | 184 | 1276 |
-| 256 | 196 | 1591 | 186 | 1281 |
-| 384 | 172 | 1409 | 181 | 1281 |
-| 512 | 187 | 1640 | 181 | 1313 |
-| 768 | 171 | 1614 | 176 | 1258 |
-| 1024 | 186 | 1812 | 178 | 1223 |
-| 2048 | 190 | 1707 | 186 | 1099 |
-| 4096 | 182 | 1446 | 177 | 975 |
-| 8192 | 175 | 1345 | 169 | 1034 |
-| 9216 | 165 | 1271 | 168 | 1023 |
-| 16384 | 166 | 1396 | 165 | 949 |
-| 32768 | 172 | 1311 | 161 | 881 |
-| 262144 | 136 | 632 | 134 | 629 |
-| 1048576 | 134 | 698 | 127 | 623 |
-
-
-
-### Ubuntu 12.04, gcc-4.7.3, 32-bit, with fftw 3.3.3 (built with --enable-neon), on a 1.2GHz ARM Cortex A9 (Tegra 3)
-
-Built with:
-
-```
-gcc-4.7 -O3 -DHAVE_FFTW -march=armv7-a -mtune=cortex-a9 -mfloat-abi=hard -mfpu=neon -ffast-math test_pffft.c pffft.c -o test_pffft_arm fftpack.c -lm -I/usr/local/include/ -L/usr/local/lib/ -lfftw3f
-```
-
-| input len |real FFTPack| real FFTW | real PFFFT |cplx FFTPack| cplx FFTW | cplx PFFFT |
-|----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|
-| 64 | 549 | 452 | 731 | 512 | 602 | 640 |
-| 96 | 421 | 272 | 702 | 496 | 571 | 602 |
-| 128 | 498 | 512 | 815 | 597 | 618 | 652 |
-| 160 | 521 | 536 | 815 | 586 | 669 | 625 |
-| 192 | 539 | 571 | 883 | 485 | 597 | 626 |
-| 256 | 640 | 539 | 975 | 569 | 611 | 671 |
-| 384 | 499 | 610 | 879 | 499 | 602 | 637 |
-| 480 | 518 | 507 | 877 | 496 | 661 | 616 |
-| 512 | 524 | 591 | 1002 | 549 | 678 | 668 |
-| 640 | 542 | 612 | 955 | 568 | 663 | 645 |
-| 768 | 557 | 613 | 981 | 491 | 663 | 598 |
-| 800 | 514 | 353 | 882 | 514 | 360 | 574 |
-| 1024 | 640 | 640 | 1067 | 492 | 683 | 602 |
-| 2048 | 587 | 640 | 908 | 486 | 640 | 552 |
-| 2400 | 479 | 368 | 777 | 422 | 376 | 518 |
-| 4096 | 511 | 614 | 853 | 426 | 640 | 534 |
-| 8192 | 415 | 584 | 708 | 386 | 622 | 516 |
-| 9216 | 419 | 571 | 687 | 364 | 586 | 506 |
-| 16384 | 426 | 577 | 716 | 398 | 606 | 530 |
-| 32768 | 417 | 572 | 673 | 399 | 572 | 468 |
-| 262144 | 219 | 380 | 293 | 255 | 431 | 343 |
-| 1048576 | 202 | 274 | 237 | 265 | 282 | 355 |
-
-Same platform as above, but this time pffft and fftpack are built with clang 3.2:
-
-```
-clang -O3 -DHAVE_FFTW -march=armv7-a -mtune=cortex-a9 -mfloat-abi=hard -mfpu=neon -ffast-math test_pffft.c pffft.c -o test_pffft_arm fftpack.c -lm -I/usr/local/include/ -L/usr/local/lib/ -lfftw3f
-```
-
-| input len |real FFTPack| real FFTW | real PFFFT |cplx FFTPack| cplx FFTW | cplx PFFFT |
-|----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|
-| 64 | 427 | 452 | 853 | 427 | 602 | 1024 |
-| 96 | 351 | 276 | 843 | 337 | 571 | 963 |
-| 128 | 373 | 512 | 996 | 390 | 618 | 1054 |
-| 160 | 426 | 536 | 987 | 375 | 669 | 914 |
-| 192 | 404 | 571 | 1079 | 388 | 588 | 1079 |
-| 256 | 465 | 539 | 1205 | 445 | 602 | 1170 |
-| 384 | 366 | 610 | 1099 | 343 | 594 | 1099 |
-| 480 | 356 | 507 | 1140 | 335 | 651 | 931 |
-| 512 | 411 | 591 | 1213 | 384 | 649 | 1124 |
-| 640 | 398 | 612 | 1193 | 373 | 654 | 901 |
-| 768 | 409 | 613 | 1227 | 383 | 663 | 1044 |
-| 800 | 411 | 348 | 1073 | 353 | 358 | 809 |
-| 1024 | 427 | 640 | 1280 | 413 | 692 | 1004 |
-| 2048 | 414 | 626 | 1126 | 371 | 640 | 853 |
-| 2400 | 399 | 373 | 898 | 319 | 368 | 653 |
-| 4096 | 404 | 602 | 1059 | 357 | 633 | 778 |
-| 8192 | 332 | 584 | 792 | 308 | 616 | 716 |
-| 9216 | 322 | 561 | 783 | 299 | 586 | 687 |
-| 16384 | 344 | 568 | 778 | 314 | 617 | 745 |
-| 32768 | 342 | 564 | 737 | 314 | 552 | 629 |
-| 262144 | 201 | 383 | 313 | 227 | 435 | 413 |
-| 1048576 | 187 | 262 | 251 | 228 | 281 | 409 |
-
-So it looks like, on ARM, gcc 4.7 is the best at scalar floating point
-(the fftpack performance numbers are better with gcc), while clang is
-the best with neon intrinsics (see how pffft perf has improved with
-clang 3.2).
-
-
-### NVIDIA Jetson TK1 board, gcc-4.8.2. The cpu is a 2.3GHz cortex A15 (Tegra K1).
-
-Built with:
-
-```
-gcc -O3 -march=armv7-a -mtune=native -mfloat-abi=hard -mfpu=neon -ffast-math test_pffft.c pffft.c -o test_pffft_arm fftpack.c -lm
-```
-
-| input len |real FFTPack| real PFFFT |cplx FFTPack| cplx PFFFT |
-|----------:|-----------:|-----------:|-----------:|-----------:|
-| 64 | 1735 | 3308 | 1994 | 3744 |
-| 96 | 1596 | 3448 | 1987 | 3572 |
-| 128 | 1807 | 4076 | 2255 | 3960 |
-| 160 | 1769 | 4083 | 2071 | 3845 |
-| 192 | 1990 | 4233 | 2017 | 3939 |
-| 256 | 2191 | 4882 | 2254 | 4346 |
-| 384 | 1878 | 4492 | 2073 | 4012 |
-| 480 | 1748 | 4398 | 1923 | 3951 |
-| 512 | 2030 | 5064 | 2267 | 4195 |
-| 640 | 1918 | 4756 | 2094 | 4184 |
-| 768 | 2099 | 4907 | 2048 | 4297 |
-| 800 | 1822 | 4555 | 1880 | 4063 |
-| 1024 | 2232 | 5355 | 2187 | 4420 |
-| 2048 | 2176 | 4983 | 2027 | 3602 |
-| 2400 | 1741 | 4256 | 1710 | 3344 |
-| 4096 | 1816 | 3914 | 1851 | 3349 |
-| 8192 | 1716 | 3481 | 1700 | 3255 |
-| 9216 | 1735 | 3589 | 1653 | 3094 |
-| 16384 | 1567 | 3483 | 1637 | 3244 |
-| 32768 | 1624 | 3240 | 1655 | 3156 |
-| 262144 | 1012 | 1898 | 983 | 1503 |
-| 1048576 | 876 | 1154 | 868 | 1341 |
-
-The performance on the tegra K1 is pretty impressive. I'm not
-including the FFTW numbers as they as slightly below the scalar
-fftpack numbers, so something must be wrong (however it seems to be
-correctly configured and is using neon simd instructions).
-
-When using clang 3.4 the pffft version is even a bit faster, reaching
-5.7 GFlops for real ffts of size 1024.
-
-
-### iPad Air 2 with iOS9, xcode 8.0, arm64. The cpu is an Apple A8X, supposedly running at 1.5GHz.
-
-| input len |real FFTPack| real vDSP | real PFFFT |cplx FFTPack| cplx vDSP | cplx PFFFT |
-|----------:|-----------:|-----------:|-----------:|-----------:|-----------:|-----------:|
-| 64 | 2517 | 7995 | 6086 | 2725 | 13006 | 8495 |
-| 96 | 2442 | n/a | 6691 | 2256 | n/a | 7991 |
-| 128 | 2664 | 10186 | 7877 | 2575 | 15115 | 9115 |
-| 160 | 2638 | n/a | 8283 | 2682 | n/a | 8806 |
-| 192 | 2903 | n/a | 9083 | 2634 | n/a | 8980 |
-| 256 | 3184 | 11452 | 10039 | 3026 | 15410 | 10199 |
-| 384 | 2665 | n/a | 10100 | 2275 | n/a | 9247 |
-| 480 | 2546 | n/a | 9863 | 2341 | n/a | 8892 |
-| 512 | 2832 | 12197 | 10989 | 2547 | 16768 | 10154 |
-| 640 | 2755 | n/a | 10461 | 2569 | n/a | 9666 |
-| 768 | 2998 | n/a | 11355 | 2585 | n/a | 9813 |
-| 800 | 2516 | n/a | 10332 | 2433 | n/a | 9164 |
-| 1024 | 3109 | 12965 | 12114 | 2869 | 16448 | 10519 |
-| 2048 | 3027 | 12996 | 12023 | 2648 | 17304 | 10307 |
-| 2400 | 2515 | n/a | 10372 | 2355 | n/a | 8443 |
-| 4096 | 3204 | 13603 | 12359 | 2814 | 16570 | 9780 |
-| 8192 | 2759 | 13422 | 10824 | 2153 | 15652 | 7884 |
-| 9216 | 2700 | n/a | 9938 | 2241 | n/a | 7900 |
-| 16384 | 2280 | 13057 | 7976 | 593 | 4272 | 2534 |
-| 32768 | 768 | 4269 | 2882 | 606 | 4405 | 2604 |
-| 262144 | 724 | 3527 | 2630 | 534 | 2418 | 2157 |
-| 1048576 | 674 | 1467 | 2135 | 530 | 1621 | 2055 |
-
-I double-checked to make sure I did not make a mistake in the time
-measurements, as the numbers are much higher than what I initially
-expected. They are in fact higher than the number I get on the 2.8GHz
-Xeon of my 2008 mac pro.. (except for FFT lengths >= 32768 where
-having a big cache is useful). A good surprise is also that the perf
-is not too far from apple's vDSP (at least for the real FFT).
-
-
-## Benchmarking with GnuPlot figures
-
-On Linux systems, you can execute the shell script `bench_all.sh` to
-generate all benchmarks. If possible it also generates the gnuplot
-figures and creates the archive `pffft_bench_${HOSTNAME}.tar.gz` including
-the generated .csv tables and also the .png plot files.
-
-`bench_all.sh` accepts command line options: first optional argument
-is `ON` or `OFF` to switch usage of FFTW. Other arguments are forwarded
-into a text file, which is added to the archive file.
+This is to keep the sources small.