Next: Symbolic Analysis of Concurrent Systems Up: Research Subfields Previous: Design of VLSI Architectures for Low power Index: Contents Page

Software Pipelining for Super-scalar and VLSI processors

Software pipelining is a widespread family of techniques for finding an instruction-level parallel schedule of a loop. These techniques are useful for processors that can issue more than one instruction at a time (such as super-scalar processors) or execute multiple operations simultaneously by using a ``long instruction'' to define them (such as Very Long Instruction Word processors, commonly denoted VLIW). The objective of these techniques is to execute the loop with as much parallelism as possible by making the best use of the machine resources. This means executing each iteration of the loop as fast as possible with the resources available in the machine, which are limited by its architecture. This problem is NP-hard.

During the last five years we have been working on software pipelining with resource constraints, attempting to reduce the execution time of a loop for a given set of functional units. Our best heuristic techniques produce optimal results in most cases. By using an integer linear programming approach, we can also obtain optimal results in all cases, or measure how close the results of the heuristic techniques are to the optimum. Our heuristic techniques take much less time than our integer linear programming techniques and obtain very similar results.

Software pipelining a loop considerably increases the ``register pressure''. We are currently developing techniques to reduce the register pressure of parallel schedules without decreasing their execution throughput. This is an important problem in current machines that has traditionally been neglected. If there are not enough registers available to execute the loop, some variables must be ``spilled out'' to memory, or the number of cycles of the schedule (the initiation interval) must be increased. In both cases the schedule throughput decreases. Our current efforts aim to optimize both the throughput and the register pressure of the schedule for the set of functional units available in the architecture.
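The trade-off between functional units, initiation interval and register pressure can be illustrated with a small sketch. It is not the heuristic or the integer linear programming formulation mentioned above; it only computes two standard quantities under simplifying assumptions: a lower bound on the initiation interval imposed by the functional units (each operation is assumed to occupy one unit of its class for one cycle), and the peak number of simultaneously live values in the steady state of a given schedule. The function names, the operation names, the unit classes and the lifetimes are all hypothetical.

```python
from collections import Counter
from math import ceil


def resource_min_ii(op_units, unit_counts):
    """Lower bound on the initiation interval imposed by the functional units.

    op_units maps each operation of the loop body to the unit class it needs;
    unit_counts maps each unit class to the number of units available.
    Assumes every operation occupies its unit for exactly one cycle.
    """
    demand = Counter(op_units.values())
    return max(ceil(n / unit_counts[unit]) for unit, n in demand.items())


def max_live(lifetimes, ii):
    """Peak number of live values per cycle in the steady state.

    lifetimes holds (start, end) cycles of each value within one iteration,
    with end exclusive. Iterations overlap every ii cycles, so each live
    cycle is folded onto its slot modulo ii.
    """
    live = [0] * ii
    for start, end in lifetimes:
        for cycle in range(start, end):
            live[cycle % ii] += 1
    return max(live)


# Hypothetical loop body: two loads, a multiply, an add and a store.
ops = {"load_a": "mem", "load_b": "mem", "mul": "mul", "add": "alu", "store": "mem"}

ii_one_port = resource_min_ii(ops, {"mem": 1, "mul": 1, "alu": 1})
ii_two_ports = resource_min_ii(ops, {"mem": 2, "mul": 1, "alu": 1})

# Hypothetical value lifetimes (in cycles) for one iteration of the loop.
lifetimes = [(0, 4), (1, 4), (4, 6), (6, 7)]
pressure_ii3 = max_live(lifetimes, 3)
pressure_ii4 = max_live(lifetimes, 4)

print(ii_one_port, ii_two_ports, pressure_ii3, pressure_ii4)
```

With these made-up inputs, adding a second memory unit lowers the resource bound on the initiation interval, and lengthening the initiation interval for the same lifetimes lowers the peak number of live values. This mirrors the point made above: relieving register pressure by increasing the initiation interval costs throughput.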