Browsing through a manufacturer’s website can offer a startling view of
the product lineup. Such was the case when I trawled through
Gigabyte’s range, only to find that they offer server line products,
including dual processor motherboards. These are typically sold in a
B2B environment (to system builders and integrators) rather than to the
public, but after a couple of emails they were happy to send over their
GA-7PESH1 model and a couple of Xeon CPUs for testing. Coming from a
background where we used dual processor systems for some serious CPU
workstation throughput, it was interesting to see how the Sandy Bridge-E
Xeons compared to consumer grade hardware for getting the job done.
In my recent academic career as a computational chemist, we developed
our own code to solve issues of diffusion and migration. This started
with implicit grid solvers – everyone in the research group (coming from
chemistry backgrounds rather than computer science backgrounds), as
part of their training, wrote their own grid and solver classes in C++
which would be the backbone of the results obtained in their doctorate
degree. Due to the idiosyncratic nature of coders and of learning how to
code, some of the students naturally wrote classes that were easily
multi-threaded at a high level, whereas others used a large amount of
localized cache, which made multithreading impractical. Nevertheless,
single threaded performance was a major factor in being able to obtain the
results of the simulations, which could last from seconds to weeks. As
part of my role in the group, I introduced the chemists to OpenMP, which
sped up some of their simulations and, as a result, shifted the writing
of this code towards multithreading. I orchestrated the
purchasing of dual processor (DP) Nehalem workstations from Dell (the
preferred source of IT equipment for the academic institution, despite
my openness to building custom hardware in-house) in order to speed up the
newly multithreaded code (with ECC memory for safety). I then embarked
on my own research, which looked first at off-the-shelf FEM solvers and
then at explicit calculations to parallelize the code at a low level. That
took me to GPUs, and resulted in nine first author research papers overall
in those three years.
In a lot of the simulations written during that period by the multiple
researchers, one element was consistent – trying to use as much
processor power as possible. When one of us needed more horsepower for a
larger number of simulations, we used each other’s machines to get the
job done quicker. Thus when it came to purchasing those DP machines, I
explored the EVGA SR-2 route and the possibility of self-building the
machines, but this was quickly shot down by the IT department, which
preferred pre-built machines with a warranty. In the end we purchased
three dual E5520 systems, to give each machine 8 cores / 16 threads of
processing power, as well as some ECC memory (thankfully the nature of
the simulations required no more than a few megabytes each), to fit into
the budget. When I left that position, these machines were still going
strong, with one colleague using all three to correlate the theoretical
predictions with experimental results.
Since leaving that position and working for AnandTech, I still explore
other avenues my research could go down, albeit in my spare time and
without funding. Thankfully, moving to a single overclocked Sandy
Bridge-E processor lets me keep the high level CPU code running at a speed
comparable to what I had in the research group, even if I don’t have the
ECC memory. The GPU
code is also faster, moving from a GTX480 during research to 580/680s
now. One of the benchmarks in my motherboard reviews is derived from
one of my research papers. Regular readers of our motherboard reviews
will recognize the 3DPM benchmark, which also appears in today’s review,
just to see how far computation has come. Being a chemist rather
than a computer scientist, I wrote the code for this benchmark in a way
that is probably comparable to what similar non-CompSci trained
individuals would produce. From a complexity point of view it is very
basic, slightly optimized to perform faster calculations (FMA) but not
the best it could be in terms of full blown SSE/SSE2/AVX extensions et al.
With the vast number of possible uses for high performance systems, it
would be impossible for me to cover them all. Johan de Gelas, our
server reviewer, lives and breathes this type of technology, and hence
his benchmark suite deals more with virtualization, VMs and database
access. As my perspective is usually from performance and utility,
the review of this motherboard will be based around my history and
perspective. As I mentioned previously, this product is primarily B2B
(business to business) rather than B2C (business to consumer). However,
from a home build standpoint, it offers an alternative to the two main
Sandy Bridge-E based Xeon home-build workstation products in the market –
the ASUS Z9PE-D8 WS and the EVGA SR-X. Hopefully we will get these
other products in as comparison points for you.