Meanwhile...
While the RISC philosophy was coming into its own, new ideas about how to dramatically increase CPU performance were starting to gel.
In the early 1980s it was thought that existing designs were approaching their theoretical limits. Future improvements in speed would come primarily through improved "process", that is, smaller features on the chip. The complexity of the chip would remain largely the same, but the smaller size would allow it to run at higher clock rates. A considerable amount of effort was put into designing chips for parallel computing, with built-in communications links. Instead of making faster chips, a large number of chips would be used, dividing up problems among them. History has shown, however, that these fears were unfounded: a number of ideas dramatically improved performance in the late 1980s.
One idea was to include a pipeline, which breaks instructions down into steps and works on one step of several different instructions at the same time. A normal processor might read an instruction, decode it, fetch the memory the instruction asked for, perform the operation, and then write the results back out. The key to pipelining is that the processor can start reading the next instruction as soon as it has finished reading the last, meaning that there are now two instructions being worked on (one is being decoded, the next is being read), and after another cycle there will be three. While no single instruction completes any faster, the next instruction completes right after the one before it. The illusion is of a much faster system.
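The overlap described above can be sketched in a few lines of Python. This is only an illustrative model, not a description of any particular processor: it assumes the five steps named in the text, assumes every step takes exactly one clock cycle, and ignores stalls entirely. The names `STAGES`, `unpipelined_cycles`, `pipelined_cycles` and `pipeline_schedule` are invented for the example.

```python
# Idealized pipeline model (assumption: one cycle per stage, no stalls).
# Stage names follow the five steps listed in the text.
STAGES = ["read", "decode", "fetch operands", "execute", "write back"]


def unpipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Each instruction runs all of its stages before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """The first instruction takes n_stages cycles; each later one finishes
    one cycle after its predecessor."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def pipeline_schedule(n_instructions):
    """Map each cycle number to the (instruction index, stage) pairs active
    in that cycle. Instruction i enters the pipeline at cycle i."""
    schedule = {}
    for i in range(n_instructions):
        for s, name in enumerate(STAGES):
            schedule.setdefault(i + s, []).append((i, name))
    return schedule


if __name__ == "__main__":
    for cycle, active in sorted(pipeline_schedule(4).items()):
        print(cycle, active)
    print(unpipelined_cycles(4), pipelined_cycles(4))
```

Under these assumptions, four instructions need 20 cycles without a pipeline and 8 with one, even though each individual instruction still spends five cycles in flight.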
Yet another solution was to use several processing elements inside the processor and run them in parallel. Instead of working on one instruction to add two numbers, these superscalar processors would look at the next instruction in the pipeline and attempt to run it at the same time in an identical unit. This is not easy to do, however, as many instructions depend on the results of an earlier instruction.
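The dependency problem can be illustrated with a toy model of a two-wide, in-order superscalar issue stage. The sketch below is a simplification under stated assumptions: instructions are reduced to one destination register and a tuple of source registers, two adjacent instructions may issue together only if the second neither reads nor overwrites the first's result, and memory, branches and execution latency are ignored. `Instr`, `can_dual_issue` and `issue_cycles` are hypothetical names used only here.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    """Toy instruction: one destination register and its source registers."""
    dest: str
    srcs: Tuple[str, ...]


def can_dual_issue(first, second):
    """True if `second` can issue in the same cycle as `first`.

    Simplifying assumption: only register dependencies on the first
    instruction's result are checked.
    """
    reads_result = first.dest in second.srcs      # read-after-write
    overwrites_result = first.dest == second.dest  # write-after-write
    return not (reads_result or overwrites_result)


def issue_cycles(program):
    """Count issue cycles for an in-order machine that issues at most two
    adjacent instructions per cycle."""
    cycles = 0
    i = 0
    while i < len(program):
        if i + 1 < len(program) and can_dual_issue(program[i], program[i + 1]):
            i += 2  # both instructions issue together
        else:
            i += 1  # the second must wait for the first
        cycles += 1
    return cycles


if __name__ == "__main__":
    independent = [Instr("r1", ("r2", "r3")), Instr("r4", ("r5", "r6"))]
    dependent = [Instr("r1", ("r2", "r3")), Instr("r4", ("r1", "r5"))]
    print(issue_cycles(independent), issue_cycles(dependent))
```

In this model the independent pair issues in a single cycle, while the dependent pair needs two because the second instruction reads `r1`, which the first has not yet produced.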
Both of these techniques relied on increasing speed by adding complexity to the basic layout of the CPU, as opposed to the instructions running on it. With chip space being a finite quantity, something else would have to be removed to make room for these features. RISC was tailor-made to take advantage of these techniques, because the core logic of the CPU was considerably simpler than in CISC designs. Although the first RISC designs had marginal performance, they were able to quickly add these new design features, and by the late 1980s they were completely outperforming their CISC counterparts. In time the gap would close as process improved to the point where all of this could be added to a CISC design and still fit on a single chip, but this took most of the late 1980s and early 1990s.
The long and short of it is that for any given level of general performance, a RISC chip will typically have far fewer transistors dedicated to the core logic. This allows the designers considerable flexibility; they can, for instance:
- increase the size of the register set
- implement measures to increase internal parallelism
- add huge caches
- add other functionality, like I/O and timers for microcontrollers
- add vector (SIMD) processors like AltiVec
- build the chips on older lines, which would otherwise go unused
- do nothing; offer the chip for low-power or size-limited applications
Features which are generally found in RISC designs are uniform instruction encoding (e.g. the op-code is always in the same bit positions in each instruction, which is always one word long), which allows faster decoding; a homogeneous register set, allowing any register to be used in any context and simplifying compiler design (although there are almost always separate