That's right: multi-core CPUs, while great for desktops and laptops that have to keep many background processes running, are bad for scientific simulations. I will present data to support this using NAMD 2.6.
In the old days, CPU builders focused on increasing single-core speed. Most systems had only one core, and only high-end servers had two, which lived in two sockets. Many research applications (like NAMD, or the better-known Gromacs, which is used in Folding@Home) were already built to run on multiple CPUs using MPI. MPI allowed researchers to tie together the power of many smaller systems to reach supercomputer performance, better than that available on purpose-built supercomputers. Now even such purpose-built machines use commodity processors and MPI to reach new performance numbers.
To MPI, a multi-core CPU looks like N CPUs, where N is the number of cores. Thus, with modern quad-core CPUs (8 cores total in a two-socket box), users can run ‘mpirun -np 8 namd2’ (run NAMD on 8 CPUs/cores). So what is bad about having many cores? CPU builders are rapidly increasing the total performance available in the same 1U (1.75 inch) systems by adding cores per socket. It is great that a box with 8 total cores has a lot of performance, but in the past that extra performance came from single-core improvements. Thus serial codes (those that cannot use multiple cores) benefited, and MPI codes benefited as well.
With multi-core, CPU builders have been lowering the performance of individual cores. That means serial applications, or applications with serial portions, will now run slower. Look at the plot of NAMD running on a cluster of AMD CPUs.

The CPU types are dual-core Opteron 2218s at 2.6 GHz and quad-core Opteron 2356 “Barcelonas” at 2.4 GHz. The Barcelona is AMD’s current (as of 12/2008) CPU, and the data shows that on 1 core the 2218 is faster. So if NAMD were a serial code, the older 2218 would be the better choice. In the case of NAMD, which scales fine to 8 cores, we see that going quad-core (if that is the only way to get more performance in a box) is OK. Remember that 4 cores total of 2218s cost about the same as 8 cores of 2356s, because the first is dual-core and the second quad-core per socket. This does not include the cost of power, rack, network, etc. Thus for the same cost the 2356 is better: on 8 cores NAMD reaches 0.45 days/nanosecond, while the 4 cores of the 2218 reach only 0.70 days/nanosecond. In cost/performance terms, at these small core counts, the 2356 is a great deal for small labs running parallel codes up to a few tens of cores.
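The comparison above can be checked with a few lines of arithmetic. This is a small Python sketch, not part of the original benchmark. It uses only the two days/nanosecond figures quoted above and takes the "same cost" claim as given. The variable and function names are made up for illustration.

```python
# Figures quoted in the text: wall-clock days per simulated nanosecond.
# Lower is better.
DAYS_PER_NS_2218_4_CORES = 0.70  # 4 cores of Opteron 2218 (dual-core)
DAYS_PER_NS_2356_8_CORES = 0.45  # 8 cores of Opteron 2356 (quad-core)


def ns_per_day(days_per_ns):
    """Convert days/nanosecond into nanoseconds simulated per day."""
    return 1.0 / days_per_ns


# How much faster the 8-core 2356 run is than the 4-core 2218 run.
# The text assumes both configurations cost about the same.
speedup = DAYS_PER_NS_2218_4_CORES / DAYS_PER_NS_2356_8_CORES

print(f"2218, 4 cores: {ns_per_day(DAYS_PER_NS_2218_4_CORES):.2f} ns/day")  # 1.43
print(f"2356, 8 cores: {ns_per_day(DAYS_PER_NS_2356_8_CORES):.2f} ns/day")  # 2.22
print(f"Speedup at roughly equal cost: {speedup:.2f}x")                     # 1.56x
```

Note that twice the cores buys only about 1.56 times the throughput here, which is the trade-off the rest of this post is about.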
That is fine for many labs, and they will benefit greatly. The problem, and the reason multi-core is bad for science, is at the margin. Somewhere, some researcher is trying to run NAMD not at 32 CPUs as in the plot, but at 2048 or 4096. NAMD will have a hard time reaching this limit. Many codes will see no speed improvement from 32 cores and up. Scaling often comes down to network performance and memory bandwidth. Some applications cannot be made parallel at all! Thus if the user above reached a given performance on 2218s at 2048 cores, he will need many more 2356 (the newer and better CPU) cores to reach the performance he had before. That may not be possible.
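To make "many more cores, and maybe not possible" concrete, here is a rough sketch using Amdahl's law. That is a simplified model I am swapping in for illustration, not something measured with NAMD, and it ignores network and memory bandwidth entirely. The serial fractions and the per-core speed ratios below are hypothetical values, not numbers from the plot.

```python
import math


def run_time(serial_fraction, cores, core_speed=1.0):
    """Relative run time under Amdahl's law, scaled by per-core speed."""
    parallel_fraction = 1.0 - serial_fraction
    return (serial_fraction + parallel_fraction / cores) / core_speed


def cores_needed(serial_fraction, old_cores, new_core_speed):
    """Smallest number of slower cores that matches the old run time.

    new_core_speed is the new core's speed relative to the old core
    (for example 0.9 means 10% slower). Returns None when no number of
    cores can match the old run time.
    """
    target = run_time(serial_fraction, old_cores)
    # Time budget left for the parallel part once the serial part,
    # now running on a slower core, has been paid for.
    budget = new_core_speed * target - serial_fraction
    if budget <= 0:
        return None
    return math.ceil((1.0 - serial_fraction) / budget)


# Hypothetical inputs: 2048 old cores, new cores 10% or 40% slower.
for serial_fraction, speed in [(0.0, 0.9), (0.001, 0.9), (0.001, 0.6)]:
    n = cores_needed(serial_fraction, 2048, speed)
    print(f"serial={serial_fraction}, core speed={speed}: cores needed = {n}")
```

In this model a perfectly parallel code just needs proportionally more cores, but once even a small serial portion is present, the slower core eats into a fixed time budget, and past a certain point the function returns None: no core count gets the old performance back.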
As CPU builders add more and more cores and individual core speed drops, many researchers will find themselves looking for new ways to make applications that worked great at smaller core counts scale further, just to maintain performance on new hardware.
There are limits to scaling. For example, the simulation above was ~29,000 atoms. Atoms in the simulation are spread across cores, so my upper limit is 29,000 cores. Not good. We need individual core speed to increase and to be easily accessible to the programmer. More on that later.
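A quick way to see how fast the work per core shrinks is to divide the atom count by the core count. This is only a back-of-the-envelope Python sketch: it assumes atoms are spread evenly across cores, which is a simplification, and the function name is made up for illustration.

```python
ATOMS = 29_000  # approximate size of the simulation discussed above


def atoms_per_core(atoms, cores):
    """Average number of atoms each core owns, assuming an even spread."""
    return atoms / cores


for cores in (8, 32, 2048, 4096, 29_000):
    share = atoms_per_core(ATOMS, cores)
    print(f"{cores:>6} cores -> {share:8.1f} atoms per core")
```

At 2048 cores each core is already down to about 14 atoms, so there is very little work left to hand out long before the hard limit of one atom per core.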