ARCHIVED: What is superscalar architecture?

This content has been archived, and is no longer maintained by Indiana University. Information here may no longer be accurate, and links may no longer be available or reliable.

Superscalar architecture is a method of parallel computing used in many processors. In a superscalar computer, the central processing unit (CPU) manages multiple instruction pipelines to execute several instructions concurrently during a clock cycle. This is achieved by feeding the different pipelines through a number of execution units within the processor. To successfully implement a superscalar architecture, the CPU's instruction fetching mechanism must intelligently retrieve and delegate instructions. Otherwise, pipeline stalls may occur, resulting in execution units that are often idle.

To visualize how this works, consider a hospital surgical unit that consists of areas for admittance, surgery, and recovery. Patients can move in only one direction, from admittance to recovery, and it takes the same amount of time to go through each of the areas. Assume the admitting area can handle three patients at a time and there are three surgical teams, each of which can work on a single patient. Also assume the recovery area has an indeterminate number of beds, but can accommodate only one person per bed. When the unit is working correctly, the admitting area processes three patients at a time, sends one to each of the teams, and immediately processes another three patients. Even though the surgical teams can handle only one patient at a time, because there are three of them, they will have passed their charges on by the time the new ones arrive. The paths the three patients take are analogous to instructions flowing through three pipelines in a CPU clock cycle. The admitting area is like a fetching mechanism, the surgery teams are like execution units, and the recovery room is like the registers or cache to which the units write their results.

To illustrate the kind of problems that can occur in superscalar architectures, consider what would happen if the staff of the admitting area in the example were not very competent. For example, if they passed a patient in need of a kidney transplant to a surgical team before the donor kidney was available, the team wouldn't be able to go to work. Suddenly, there would be a bottleneck at the admitting area because only two surgical teams would be available for new patients. Another bottleneck could occur if a surgical team tried to assign a patient to an already occupied bed in the recovery area. Again, a bottleneck would appear because the team would not be available until the bed was emptied and the team could move the current patient into it. Stalls like this happen in processors when an execution unit tries to perform a task that is dependent on the results of as yet uncalculated instructions. This is why it is important that CPUs carefully manage the order in which they process instructions.