I first tried Acumem SlowSpotter on some highly optimized code. ...
No other tool would
have managed to pinpoint [the remaining] problem in such an obvious and intuitive
manner!
Jeffrey M. Birnbaum
Chief Architect, Merrill Lynch
At Pantor we use SlowSpotter to quickly gain a thorough understanding
of code performance qualities. We also use SlowSpotter on a regular
basis to alert us of harmful code and to quickly catch performance
regressions. It improves performance work productivity by offering
unique insight into these problems.
Any tool that lowers the barrier to taking advantage of these [multi-core] processors is going to be critical. It's clear that this focus on memory bandwidth is going to go far beyond HPC. To me that's huge.
Josh Simons
Distinguished engineer at Sun Microsystems
Acumem is a leading provider of intelligent software technology
that analyzes and optimizes computing performance in single- and
multi-core environments.
Our goal is to contribute to our customers' success in maximizing
the benefits of multi-core technology and achieving the full potential of
their systems.
Our performance experts ... use [Acumem SlowSpotter] to find and fix multicore
performance bottlenecks in a wide range of data-intensive applications.
Bjorn Andersson
Director, HPC and Integrated Systems, SUN Microsystems
... HP's Multicore Toolkit used in
conjunction with Acumem SlowSpotter, offers customers a complete
multicore hardware and software solution that maximizes application
performance ...
Ed Turkel
Manager of HPC Product Marketing for HP's Scalable Computing and Infrastructure Organization
The continuous quest for increased performance has already forced the silicon vendors to introduce multi-core processors on a large scale. The paradigm shift is happening now, it is happening quickly, and it is happening all over the industry. The problem is that no one asked the software vendors!
When talking to partners and customers, we see great interest in multi-core and the vast opportunity it brings, but we also see the concerns and the hurdles that need to be overcome. For most existing applications, the performance improvement gained by moving to multi-core will be disappointing at best. Hence, basically all performance-intensive software applications need to be optimized to reap the benefits of multi-core technology. Millions of programmers will require new algorithms, new runtime systems and new tools. Acumem's technology has unique solutions for all three areas; it will make the transition easier, limit the concerns and lower the hurdles. The four points below summarize what is on our customers' and partners' minds regarding multi-core:
Our customers are concerned about multi-core investments and want to use the full potential of their systems
Our customers are looking for enhanced productivity in optimizing software
Our customers are wrestling with multi-core optimization and parallelization
Our customers would like to understand to what degree their applications are multi-core ready
Our customers are concerned about multi-core investments and want to use the full potential of their systems
For most existing applications, the performance improvement gained by moving from single-core to multi-core systems will be disappointing at best. Moving beyond dual-core, we even see performance degrade in many applications, which turns good multi-core hardware investments into bad performance investments. Hence, basically all performance-intensive software applications need to be optimized to reap the benefits of multi-core technology and to make full use of the new multi-core systems.
An investment in a new multi-core system is not complete until the relevant applications have been optimized for the new architecture. When making a hardware investment, make sure you also plan for the resources that software optimization will require.
Acumem enables you to get the full potential out of your multi-core systems!
Our customers are looking for enhanced productivity in optimizing software
As we all know, optimizing code for performance is a time-consuming and tedious task that is in many cases handed over to experts. The multi-core era means that a huge number of applications need to be optimized. Existing experts will be too few and the time too short. The solution is to give all programmers the tools they need to make adequate optimizations to their existing code, and the knowledge to get it right from the beginning in new code. As experts are still needed, we need to make them more productive by enabling them to focus on solving the difficult problems, and by exposing vital code properties that previously could not be measured in practice.
Acumem SlowSpotter and Acumem ThreadSpotter give hands-on advice, pinpointing SlowSpots and the relevant lines of code to change, which enables non-experts as well to solve complex performance issues. For experts, Acumem ThreadSpotter gives more flexibility in data capture, allows an unlimited number of threads to be analyzed, and also provides advice on more complex code changes.
Our customers are wrestling with multi-core optimization and parallelization
For most programmers, terms like parallelization, loop fusion and cache misses are new, or at least not something they worry about every day. With the move to multi-core this will change, and these terms need to move from the passive to the active vocabulary. The example below shows what the impact can be of changing one data structure and performing one loop fusion in a multi-core environment. It also highlights the performance issues you run into when the existing code base is not optimized at all.
Case study of SPEC2006 benchmark 462.libquantum
The original version shows little improvement in total throughput when several instances are deployed on a multi-core system. After analyzing this application with Acumem SlowSpotter or Acumem ThreadSpotter and addressing the issues found (poor spatial locality and a few data re-use opportunities that call for loop fusion), we get a dramatic increase in scalability and raw performance. All measurements are relative to the throughput of one instance of the original code running on one core.
Acumem SlowSpotter and Acumem ThreadSpotter pinpoint the slowspots and give the advice programmers need to optimize their code without being performance experts or spending days looking for opportunities they do not even know exist. In addition, Acumem SlowSpotter includes an educational, context-sensitive manual that allows the user to learn more and more about multi-core and optimization for multi-core as he or she makes the appropriate changes.
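To make the two kinds of change concrete, here is a minimal sketch in C. It is not the 462.libquantum source and not output from any Acumem tool; the `Node` record, its field names and both functions are hypothetical, chosen only to illustrate the general techniques. The first function walks an array of large records twice, touching one small field per record, so most of each fetched cache line is wasted and the data is streamed from memory twice. The second function assumes the hot field has been moved into its own dense array (better spatial locality) and merges the two loops into one (loop fusion), so each value is loaded once and re-used while it is still in cache.

```c
#include <stddef.h>

/* Hypothetical record: one frequently used field stored next to
 * rarely used data, so each record fills a 64-byte block. */
typedef struct {
    double amplitude;     /* hot: read and written in every pass */
    char   metadata[56];  /* cold: not touched by the loops below */
} Node;

/* Before: two separate passes over an array of large records.
 * Only 8 of every 64 bytes fetched are used, and the whole array
 * is traversed twice. */
double two_passes(Node *nodes, size_t n, double scale)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        nodes[i].amplitude *= scale;
    for (size_t i = 0; i < n; i++)
        sum += nodes[i].amplitude * nodes[i].amplitude;
    return sum;
}

/* After: the hot field lives in its own dense array and the two
 * loops are fused. Fusion is valid here because iteration i of the
 * second loop depends only on element i from the first loop. */
double fused_pass(double *amplitude, size_t n, double scale)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        amplitude[i] *= scale;
        sum += amplitude[i] * amplitude[i];
    }
    return sum;
}
```

Both functions compute the same result for the same input values; only the memory traffic differs. How much this matters depends on the data size relative to the cache and on how many instances share memory bandwidth, so any speedup should be measured rather than assumed.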
Our customers would like to understand to what degree their applications are multi-core ready
It is easy to understand that the introduction of multi-core processors is a significant change, but how can I find out whether my application is ready for multi-core, or whether there are areas for improvement? And if there are indeed slowspots, how do I know whether they are worth optimizing? When is the right time to go back to my development group or ISV and ask them to change the code for my new multi-core system? Questions like these are important and not easy to answer, but Acumem technology can assist here as well. Acumem SpotLite is an entry-level product that identifies slowspots and explains their characteristics. Try it on your applications to get a first idea of the potential improvement areas when you add cores to your system.