A real-life single-thread benchmark for SPARC versus Intel

First of all, special thanks to the whole Oracle team (Oracle Turkey, Martin and Michele); we worked on this benchmark together.

Comments about Benchmarks
Benchmarks are good for giving a basic idea of the performance or capabilities of a system or application. For example, if an application suffers from a network bottleneck, you can first test the network with the iperf benchmark; if performance is good there, you can go on to diagnose the application itself, where perhaps a parameter prevents it from using the network well. So benchmarks are useful, but a benchmark is not reality: your reality is always your application.

Don't use GCC for compiling code on SPARC CPU

We observed a real-life example of this. We used GCC to compile the benchmark code. Oracle then showed us that GCC did not use the SPARC CPU's hardware support for the modulo (%) operation and instead called its own software routine, __umoddi3. Using the software routine instead of the hardware one gave much worse performance. So, when compiling code on SPARC, use Developer Studio instead of GCC.
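To make the modulo issue concrete, here is a minimal sketch of the kind of loop where it can matter. It assumes the benchmark takes the modulo of 64-bit unsigned values; the real code is in the attachment, and the function name mod_sum is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the attached benchmark:
 * sums i % m over 64-bit unsigned values. Depending on the target
 * and compiler flags, the % below may become a hardware divide
 * sequence or a call to a runtime helper such as __umoddi3. */
uint64_t mod_sum(uint64_t n, uint64_t m)
{
    assert(m != 0);              /* modulo by zero is undefined */
    uint64_t acc = 0;
    for (uint64_t i = 0; i < n; i++) {
        acc += i % m;            /* the operation under discussion */
    }
    return acc;
}
```

One way to check which path a compiler took is to generate assembly (for example with gcc -O2 -S) and search the output for a call to __umoddi3.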

Applied single-thread benchmark - Quick Sort

The source code is attached; we compiled it with Developer Studio.
For details, please see the attachment
single-thread-benchmark-QUICK-SORT.zip.
Only the results are shown below.

         SPARC M7, 4133 MHz         Intel E5-2690, 2.90 GHz
         Solaris 11.3 SRU 15.4      RHEL 6.8

RUN 1    1064 ms                    410 ms
RUN 2    496 ms                     384 ms
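For readers without the attachment, the following is a rough sketch of how a single-thread Quick Sort timing like this could be set up. It is not the attached source: the element type, the names quick_sort, is_sorted and time_sort_ms, the input data and the use of clock() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Assumption: float elements, as in a floating-point run.
 * Change to int to mimic an integer run. */
typedef float elem_t;

/* Plain recursive quicksort, partitioning around the middle element. */
void quick_sort(elem_t *v, long lo, long hi)
{
    if (lo >= hi) return;
    elem_t pivot = v[lo + (hi - lo) / 2];
    long i = lo, j = hi;
    while (i <= j) {
        while (v[i] < pivot) i++;
        while (v[j] > pivot) j--;
        if (i <= j) {
            elem_t t = v[i]; v[i] = v[j]; v[j] = t;
            i++; j--;
        }
    }
    quick_sort(v, lo, j);
    quick_sort(v, i, hi);
}

/* Returns 1 if v[0..n-1] is in non-decreasing order, else 0. */
int is_sorted(const elem_t *v, size_t n)
{
    for (size_t k = 1; k < n; k++)
        if (v[k - 1] > v[k]) return 0;
    return 1;
}

/* Fills n pseudo-random elements, sorts them on one thread and
 * returns the CPU time of the sort in milliseconds
 * (-1.0 if the allocation fails). */
double time_sort_ms(size_t n, unsigned seed)
{
    assert(n > 0);
    elem_t *v = malloc(n * sizeof *v);
    if (v == NULL) return -1.0;
    srand(seed);
    for (size_t k = 0; k < n; k++) v[k] = (elem_t)(rand() % 100000);
    clock_t t0 = clock();
    quick_sort(v, 0, (long)n - 1);
    clock_t t1 = clock();
    free(v);
    return 1000.0 * (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling time_sort_ms with the same size and seed on both machines, once with elem_t as float and once as int, would give a comparison in the spirit of RUN 1 and RUN 2.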

 

Lessons Learned

-       Don't use GCC for compiling code on SPARC; use Developer Studio instead.

-       Results differ a lot between the two runs, so small parameter changes and some tuning can make a big difference.

-       One of the major differences between the two runs is that the code in RUN-1 uses FLOAT definitions, while the code in RUN-2 uses INTEGER definitions instead. The SPARC CPU performs better with integer operations, as Oracle had told us before. Please see the document "When and How to use SPARC CPU".

-       Oracle had also told us that a SPARC CPU can be at most 30% worse than an Intel CPU on single-thread performance (due to pipeline design), and our integer results confirmed this (RUN-2: 496 ms versus 384 ms, about 29% slower). We ran this benchmark because we hit an issue with a single-threaded application after a platform change, and we wanted to separate the CPU's effect from the application's effect in the performance results.

-       It is already known, but confirmed again, that CPU clock speed is not the decisive factor for performance; pipelines matter more. Oracle explained that Intel was about twice as fast in RUN-1 because RUN-1 used floating point, and Intel’s floating-point pipeline is twice that of Oracle's SPARC FP pipeline.
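The ratios behind these lessons can be checked with a little arithmetic on the measured times. The sketch below uses two hypothetical helper names, time_ratio and extra_time_pct, chosen only for this illustration.

```c
#include <assert.h>

/* How many times longer slow_ms is than fast_ms. */
double time_ratio(double slow_ms, double fast_ms)
{
    assert(fast_ms > 0.0);
    return slow_ms / fast_ms;
}

/* Extra time taken by the slower system, as a percentage
 * of the faster system's time. */
double extra_time_pct(double slow_ms, double fast_ms)
{
    return (time_ratio(slow_ms, fast_ms) - 1.0) * 100.0;
}
```

With the measured values, time_ratio(1064, 410) is about 2.6 for RUN 1 (floating point), and extra_time_pct(496, 384) is about 29 for RUN 2 (integer), which sits just under the 30% figure Oracle quoted.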

 

Please feel free to contact me at bulent.yucesoy@gmail.com