Back in the late 70s, my dad ordered a Heathkit H-89 computer. It had a 2 MHz 8-bit Z-80 processor, took months to arrive, cost $1600 (in 1980 dollars!), and we had to put it together ourselves. Now, you can go to the nearest Best Buy and walk out with a ~3 GHz system for a few hundred dollars. While that’s a staggering increase in price/performance, you may not have noticed: we haven’t seen anything much faster than 3–4 GHz for a few years.
That’s because we’ve “hit the wall” for single-processor CPU performance: we’re at the practical limits of CMOS process scaling, circuit speed, and instruction-level parallelism (ILP). New processors from Intel and AMD are “spreading sideways” instead, implementing multiple CPU cores (2, 4, or even 8 cores per chip). Future processors will have even more cores, and you can “rent” as many additional processors as you need, in the cloud, on the fly.
This is a fundamental change in CPU architecture, and it’s forcing software developers to think differently. For decades, you could speed up your software just by waiting for the next (faster) CPU. That’s no longer the case.
This leaves us with many large, complicated legacy code bases (e.g. database engines, solid modeling kernels, etc.) that need to be completely redesigned to take full advantage of multiple cores. That, in turn, will create new opportunities — someone will step in to build multi-core scalable implementations of this stuff.
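To make the shift concrete, here is a minimal sketch (in Python, my choice for illustration; none of this code comes from the post) of what "redesigning for multiple cores" means at its simplest: the work has to be explicitly partitioned and handed to a pool of workers, rather than running faster for free on a quicker clock. Note one hedge in the comments: for CPU-bound pure-Python code you would swap in `ProcessPoolExecutor` to sidestep the GIL and actually occupy separate cores; the partitioning structure is the point.

```python
# Sketch: splitting one big computation into per-worker chunks.
# The function names (sum_chunk, parallel_sum) are illustrative, not
# from any particular library or from the original post.
from concurrent.futures import ThreadPoolExecutor

def sum_chunk(bounds):
    """Sum the integers in [start, stop) -- one worker's share."""
    start, stop = bounds
    return sum(range(start, stop))

def parallel_sum(n, workers=4):
    """Partition [0, n) into `workers` chunks and sum them in a pool.

    In CPython, CPU-bound threads serialize on the GIL, so for a real
    speedup you'd use ProcessPoolExecutor (same map-over-chunks shape)
    or a language whose threads run on separate cores natively.
    """
    step = max(1, n // workers)
    chunks = [(i * step, (i + 1) * step) for i in range(workers - 1)]
    chunks.append(((workers - 1) * step, n))  # last chunk takes the remainder
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_chunk, chunks))
```

The hard part in real legacy code bases isn't this mechanical split; it's that shared mutable state (caches, global data structures, lock-free assumptions baked in decades ago) makes the partitioning step anything but mechanical.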