We took an Intel Core i7-7820X for a spin and measured its speed-up for scientific computations against an Intel Core i5-4670. The table below shows results that are typical across a wide range of scientific algorithms. The test run is from the "Efficient multithreading" example in the MtxVec demo. The code computes a DFT using vectorized sin, cos, add, multiply and vector sum operations.
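To make the workload concrete, here is a minimal sketch of a direct DFT built from the same primitives the benchmark names: sin, cos, multiply and sum over a vector. It is plain Python for illustration only, not the MtxVec demo code; the function names `dft_bin` and `dft` are hypothetical, and each list comprehension stands in for what a vectorized library would do as a single vector operation.

```python
import math

def dft_bin(signal, k):
    """Return (real, imag) of DFT bin k for a real-valued signal."""
    n = len(signal)
    # Phase angle for every sample index (one "vector" of angles).
    angles = [2.0 * math.pi * k * t / n for t in range(n)]
    # Element-wise cos and sin of the angle vector.
    cos_vals = [math.cos(a) for a in angles]
    sin_vals = [math.sin(a) for a in angles]
    # Element-wise multiply with the signal, then sum the products.
    re = sum(x * c for x, c in zip(signal, cos_vals))
    im = -sum(x * s for x, s in zip(signal, sin_vals))
    return re, im

def dft(signal):
    """Direct O(n^2) DFT: one (real, imag) pair per frequency bin."""
    return [dft_bin(signal, k) for k in range(len(signal))]

# Example: one full cosine cycle over 8 samples puts its energy in bins 1 and 7.
samples = [math.cos(2.0 * math.pi * t / 8) for t in range(8)]
spectrum = dft(samples)
```

Each bin is independent of the others, which is why this kind of computation splits naturally across CPU cores as well as across SIMD lanes.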
| ||i5-4670, 3.40 GHz||i7-7820X, 3.60 GHz|
|Pure pascal, one CPU core (not vectorized)||40.24s||34.59s|
|MtxVec, one CPU core (vectorized)||7.12s||5.86s|
|MtxVec with blocks, one CPU core||6.80s||4.67s|
|MtxVec with hand-written blocks||5.75s||4.25s|
|MtxVec threaded (naive)||9.12s||7.22s|
|MtxVec (threaded, with blocks)||1.77s||1.22s|
|MtxVec (threaded, blocks, DoForLoop, Anonymous method)||1.78s||1.18s|
|MtxVec (threaded, hand written blocks, DoForLoop)||1.54s||1.11s|
|MtxVec (threaded, blocks, TParallel.For)||2.93s||2.27s|
The code executed with MtxVec takes full advantage of all available instruction set features, including AVX-512 on the i7-7820X. Both CPUs used only 4 cores. Typical improvement across all variants is around 30%. Note that the two CPUs have different "turbo" frequencies, and that a CPU running AVX code will not "turbo boost" up to its highest frequency. The i7-7820X was mostly boosting up to 4.0 GHz, while the i5-4670 remained at 3.4 GHz. The test was run with the "default" optimized motherboard configuration and without overclocking.
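The "around 30%" figure can be checked against the table with a few lines of arithmetic. The sketch below is our own illustration: the timings are copied from the table, while the short row labels and the helper names `speedup` and `time_reduction_pct` are made up for this example. It reports both common readings of "improvement": the speed-up ratio (old time divided by new time) and the percentage of run time saved.

```python
# (i5-4670 seconds, i7-7820X seconds) for each row of the table.
timings = {
    "pure pascal, one core": (40.24, 34.59),
    "MtxVec, one core": (7.12, 5.86),
    "MtxVec blocks, one core": (6.80, 4.67),
    "MtxVec hand-written blocks": (5.75, 4.25),
    "MtxVec threaded (naive)": (9.12, 7.22),
    "MtxVec threaded, blocks": (1.77, 1.22),
    "MtxVec threaded, DoForLoop, anonymous": (1.78, 1.18),
    "MtxVec threaded, hand-written, DoForLoop": (1.54, 1.11),
    "MtxVec threaded, TParallel.For": (2.93, 2.27),
}

def speedup(t_old, t_new):
    """How many times faster the new CPU finished the run."""
    return t_old / t_new

def time_reduction_pct(t_old, t_new):
    """Percentage of run time saved on the new CPU."""
    return (t_old - t_new) / t_old * 100.0

for name, (t_i5, t_i7) in timings.items():
    print(f"{name:42s} x{speedup(t_i5, t_i7):.2f}  "
          f"-{time_reduction_pct(t_i5, t_i7):.1f}%")

mean_speedup = sum(speedup(a, b) for a, b in timings.values()) / len(timings)
mean_reduction = sum(time_reduction_pct(a, b) for a, b in timings.values()) / len(timings)
print(f"mean speed-up x{mean_speedup:.2f}, mean time saved {mean_reduction:.1f}%")
```

Averaged over all nine rows, the speed-up ratio comes out at roughly 1.34 and the time saved at roughly 25%, which brackets the quoted figure. The non-vectorized Pascal row gains the least and the threaded block variants gain the most.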