UPDATED ON: 2022-08-13
▒ PREFACE WORDS
As you may remember, my previous daily driver was a Ryzen 9590x 16c/32t. After having some fun with it, I decided to drop it in favor of retro stuff.
An Intel Xeon based on the Broadwell architecture, the E5-2696v4 22c/44t, became the chip of choice. It is the OEM version of the top-of-the-line E5-2699v4. It is marginally slower, around 3-4%, but costs significantly less than the 2699v4. Date of production: end of 2016.
▒ 14C/28T E5-2680v4 vs 22C/44T E5-2696v4
Testbed mobo: Huananzhi X99 F8
RAM: Samsung 64GB ECC REG 2133MHz, quad-channel configuration
Windows 7 score is the same for 2680v4 and 2696v4: 7.8 pts
My first build came with the 14c/28t E5-2680v4.
So, let’s compare it with the chosen 22c/44t E5-2696v4.
A modern i9-12900K is included just for comparison.
► MEMORY/CPU/FPU PERFORMANCE
|AIDA64 v6.70||E5-2680v4 14c/28t||E5-2696v4 22c/44t||i9-12900K 16c/24t|
|MEMORY READ||50527 MB/s||50088 MB/s||73393 MB/s|
|MEMORY WRITE||52992 MB/s||52710 MB/s||60842 MB/s|
|MEMORY COPY||47545 MB/s||49126 MB/s||61763 MB/s|
|MEMORY LATENCY||83.4 ns||83.9 ns||77.2 ns|
|CPU PHOTOWORXX||23570 Mpix/s||23001 Mpix/s||39906 Mpix/s|
|CPU ZLIB||926 MB/s||1395 MB/s||1593 MB/s|
|CPU AES||46110 MB/s||69975 MB/s||207691 MB/s|
|CPU SHA3||3780 MB/s||5736 MB/s||6040 MB/s|
|FP32 RAY-TRACE||13856 KRay/s||20344 KRay/s||28059 KRay/s|
|FP64 RAY-TRACE||7529 KRay/s||11080 KRay/s||15375 KRay/s|
I’m quite amazed that in the CPU Queen and FPU SinJulia tests the mighty E5-2696v4 managed to beat the modern’n’efficient i9-12900K.
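To put the multi-threaded rows of the table above in perspective, here is a small sketch that compares the measured speedup of the 2696v4 over the 2680v4 with the ideal 22/14 core-count ratio. The numbers are copied from the table; the helper names `speedup` and `scaling_efficiency` are made up for this illustration.

```python
# Multi-threaded AIDA64 results copied from the table above:
# (E5-2680v4 14c/28t, E5-2696v4 22c/44t)
RESULTS = {
    "CPU ZLIB (MB/s)": (926, 1395),
    "CPU AES (MB/s)": (46110, 69975),
    "CPU SHA3 (MB/s)": (3780, 5736),
    "FP32 RAY-TRACE (KRay/s)": (13856, 20344),
    "FP64 RAY-TRACE (KRay/s)": (7529, 11080),
}


def speedup(small, big):
    """How many times faster the bigger chip is."""
    return big / small


def scaling_efficiency(small, big, cores_small=14, cores_big=22):
    """Measured speedup divided by the core-count ratio (1.0 = perfect scaling)."""
    return speedup(small, big) / (cores_big / cores_small)


if __name__ == "__main__":
    print(f"ideal speedup from core count alone: {22 / 14:.2f}x")
    for name, (small, big) in RESULTS.items():
        print(f"{name:24s} {speedup(small, big):.2f}x, "
              f"efficiency {scaling_efficiency(small, big):.0%}")
```

The measured speedups land between roughly 1.47x and 1.52x against an ideal 1.57x, so the extra 8 cores are used quite efficiently in these tests.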
► FPU INTEGER AND DOUBLE PERFORMANCE
Mid-range video card results are provided for comparison.
|AIDA64 v6.70||E5-2680v4 14c/28t||E5-2696v4 22c/44t||Geforce 1650 Ti|
|Single-Precision FLOPS||1254 GFLOPS||1830 GFLOPS||3349 GFLOPS|
|Double-Precision FLOPS||627 GFLOPS||915 GFLOPS||104 GFLOPS|
|24-bit Integer IOPS||314 GIOPS||458 GIOPS||3343 GIOPS|
|32-bit Integer IOPS||314 GIOPS||458 GIOPS||3343 GIOPS|
|64-bit Integer IOPS||81 GIOPS||123 GIOPS||815 GIOPS|
|AES-256||46106 MB/s||69971 MB/s||8868 MB/s|
|SHA-1 Hash||11025 MB/s||16728 MB/s||21954 MB/s|
|Single-Precision Julia||337 FPS||492 FPS||895 FPS|
|Double-Precision Mandel||179 FPS||261 FPS||25 FPS|
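The double-precision rows are where the Xeons pull ahead of the video card. A quick sketch of the FP64-to-FP32 ratio, using only the GFLOPS figures from the table above (the dictionary layout and the `fp64_ratio` helper are just for illustration):

```python
# GFLOPS figures copied from the table above.
GPGPU = {
    "E5-2680v4": {"fp32": 1254, "fp64": 627},
    "E5-2696v4": {"fp32": 1830, "fp64": 915},
    "Geforce 1650 Ti": {"fp32": 3349, "fp64": 104},
}


def fp64_ratio(entry):
    """Fraction of single-precision throughput that survives in double precision."""
    return entry["fp64"] / entry["fp32"]


if __name__ == "__main__":
    for name, entry in GPGPU.items():
        ratio = fp64_ratio(entry)
        print(f"{name:16s} FP64/FP32 = {ratio:.3f} (about 1/{1 / ratio:.0f})")
```

Both Xeons keep exactly half of their single-precision rate in double precision, while the video card drops to about 1/32, which matches its poor Double-Precision Mandel result.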
► CACHE PERFORMANCE
How fast can the CPU access frequently used data?
|AIDA64 v6.70||E5-2680v4 14c/28t||E5-2696v4 22c/44t|
|L1 Cache READ||2492 GB/s||3779 GB/s|
|L1 Cache WRITE||1268 GB/s||1923 GB/s|
|L1 Cache COPY||2534 GB/s||3844 GB/s|
|L1 Cache LATENCY||1.2 ns||1.1 ns|
|L2 Cache READ||845 GB/s||1287 GB/s|
|L2 Cache WRITE||395 GB/s||587 GB/s|
|L2 Cache COPY||574 GB/s||823 GB/s|
|L2 Cache LATENCY||3.7 ns||3.3 ns|
|L3 Cache READ||345 GB/s||368 GB/s|
|L3 Cache WRITE||197 GB/s||223 GB/s|
|L3 Cache COPY||260 GB/s||293 GB/s|
|L3 Cache LATENCY||19.6 ns||20.6 ns|
► NVME PERFORMANCE
Tested NVMe drive: Samsung 980 PRO 1TB [the drive from my Ryzen build; obviously PCIe 3.0 will bottleneck such a device]
|CRYSTAL DISK v8.0.4||E5-2680v4 14c/28t||E5-2696v4 22c/44t|
|READ SEQ1M/Q8T1||2832 MB/s||2790 MB/s|
|READ SEQ128K/Q32T1||3441 MB/s||2764 MB/s|
|READ RND4K/Q32T16||1440 MB/s||1580 MB/s|
|READ RND4K/Q1T1||80 MB/s||92 MB/s|
|WRITE SEQ1M/Q8T1||2913 MB/s||2220 MB/s|
|WRITE SEQ128K/Q32T1||3282 MB/s||3141 MB/s|
|WRITE RND4K/Q32T16||2116 MB/s||1953 MB/s|
|WRITE RND4K/Q1T1||261 MB/s||264 MB/s|
Weird results. It looks like the CPU with fewer cores dominates overall, but in random reads of small 4K blocks the high-core-count CPU prevails by a small margin.
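The PCIe 3.0 bottleneck mentioned above can be estimated with simple arithmetic. This is a rough sketch, assuming the drive sits on a 4-lane link (the usual width for an M.2 NVMe slot; I have not verified how the slot is wired on this board) and ignoring protocol overhead beyond the 128b/130b line encoding. The function name is hypothetical.

```python
def pcie_bandwidth_mb_s(gt_per_s, lanes, encoding=128 / 130):
    """Raw link bandwidth in MB/s (10^6 bytes), before protocol overhead.

    gt_per_s: transfer rate per lane (8 for PCIe 3.0, 16 for PCIe 4.0).
    encoding: 128b/130b line encoding used by PCIe 3.0 and 4.0.
    """
    bits_per_s = gt_per_s * 1e9 * encoding * lanes
    return bits_per_s / 8 / 1e6


if __name__ == "__main__":
    gen3 = pcie_bandwidth_mb_s(8, 4)   # assumed x4 link on the X99 board
    gen4 = pcie_bandwidth_mb_s(16, 4)  # what the drive is designed for
    best_read = 3441                   # MB/s, best sequential read in the table above
    print(f"PCIe 3.0 x4 ceiling: {gen3:.0f} MB/s")
    print(f"PCIe 4.0 x4 ceiling: {gen4:.0f} MB/s")
    print(f"best measured read uses {best_read / gen3:.0%} of the PCIe 3.0 x4 ceiling")
```

Under these assumptions the best sequential read in the table already sits at around 87% of what a PCIe 3.0 x4 link can carry, so the link is the limit here, not the drive.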
In 3D rendering the Xeon E5-2696v4 sits right above the AMD Threadripper 1950X.
As we can observe, the 2696v4 provides a significant boost to multicore performance.
So, I’m pretty happy that I switched CPUs, and I think it is the top bang-per-buck option in terms of price versus multicore performance.
Stuffed with a fast NVMe drive and a plethora of 2400MHz DDR4 ECC memory, a high-end server platform from 2018 is still relevant in 2022 for serious workloads that require a high core count and do not depend on single-core performance.
In terms of single-core performance it of course cannot compete with the modern Ryzen architecture, but from a multi-threaded perspective the E5-26xx chips are very competitive, especially considering their recent dirt-cheap prices on Aliexpress and the wide availability of decent Chinese clones of X99 motherboards. For my current build I’ve stuck with the Huananzhi X99 F8; it supports high-TDP CPUs above 140 W. The E5-2696v4 consumes 150 W under peak load.
I’m very satisfied with Renoise performance.
Vegas Pro 18 can exploit all 44 threads without any issues.
Raster editing is super fast. Modern internet browsing and office workloads are a breeze.
Not long ago I purchased a second-hand Asrock X99 Extreme4 board as a spare mobo in case of a short circuit or something.
Asrock has far superior power circuitry compared to Chinese motherboards.
Will try it out later on.
TRIED AND NOT IMPRESSED: PERFORMANCE-WISE, HUANANZHI MATCHES ASROCK.
▒ 64C/128T THREADRIPPER 3990X
It is the most powerful CPU that Windows 7 can detect correctly.
128 threads in task manager. How cool is that?
BY THE WAY, WINDOWS 7 FULLY SUPPORTS UP TO 256 THREADS.
WHO SAID THAT WINDOWS 7 IS OBSOLETE?
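One detail behind the 256-thread figure: 64-bit Windows 7 splits logical processors into processor groups of at most 64 each, and software that is not group-aware normally runs inside a single group. A tiny sketch of how many groups each configuration would need (the `processor_groups` helper is made up for illustration and does not query Windows):

```python
import math

GROUP_SIZE = 64  # maximum logical processors per Windows processor group


def processor_groups(logical_processors, group_size=GROUP_SIZE):
    """Minimum number of processor groups needed for a given thread count."""
    return math.ceil(logical_processors / group_size)


if __name__ == "__main__":
    configs = [
        ("E5-2696v4", 44),
        ("Threadripper 3990X", 128),
        ("Windows 7 maximum", 256),
    ]
    for name, threads in configs:
        print(f"{name:20s} {threads:3d} threads -> {processor_groups(threads)} group(s)")
```

So the 44-thread Xeon fits in one group, while a 3990X would need two, which means an application has to be group-aware to load all 128 threads at once.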
As far as I know, later models like the Threadripper PRO 3995WX with eight-channel memory on socket sWRX80 are not compatible with Windows 7. Anything more recent is out of the question because of the big changes in how modern chipsets work.
When CPU lithography reaches 1nm or even goes fully quantum and prices drop dramatically, I’ll build a 3990X configuration.
So, be prepared for the most maxed-out Windows 7 workstation build ever!
▒ FUTURE TESTS
- 2133MHz vs 2400MHz DDR4 ECC REG memory benchmark, probability 100% [will drop a link to the appropriate article]
- Asrock X99 Extreme4 vs Huananzhi X99 F8 speed benchmark, probability 100% [same performance as Huananzhi]
- AMD Threadripper 3990X 64C/128T under Windows 7, probability 88% [will drop a link to the appropriate article]
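Ahead of the planned memory test, the theoretical ceiling for both speeds can be estimated. This is only a back-of-the-envelope sketch: it treats the "2133MHz" and "2400MHz" labels as 2133 and 2400 MT/s, assumes four populated channels of 64 bits each, and the function name is hypothetical.

```python
BYTES_PER_TRANSFER = 8  # one DDR4 channel is 64 bits wide


def peak_bandwidth_mb_s(mt_per_s, channels=4):
    """Theoretical peak memory bandwidth in MB/s for a given transfer rate."""
    return mt_per_s * BYTES_PER_TRANSFER * channels


if __name__ == "__main__":
    peak_2133 = peak_bandwidth_mb_s(2133)
    peak_2400 = peak_bandwidth_mb_s(2400)
    measured_read = 50527  # MB/s, AIDA64 MEMORY READ on the E5-2680v4 above
    print(f"quad-channel 2133: {peak_2133} MB/s peak")
    print(f"quad-channel 2400: {peak_2400} MB/s peak")
    print(f"measured read is {measured_read / peak_2133:.0%} of the 2133 peak")
    print(f"best-case gain from 2400: {peak_2400 / peak_2133 - 1:.1%}")
```

By this estimate the current AIDA64 read result reaches about 74% of the theoretical peak, and faster modules can add 12.5% at best.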