Heterogeneous Technology Mapping:

Scott Chin, Clarence Lee, Ernie Lin, Steve Wilton, University of British Columbia


On-chip storage has become an essential component of high-density FPGAs. The large systems that will be implemented on these FPGAs often require storage; implementing this storage on-chip results in faster clock frequencies and lower system costs. Two implementations of on-chip memory in FPGAs have emerged: fine-grained and coarse-grained. In FPGAs employing fine-grained on-chip storage, such as the Xilinx 4000 FPGAs, each lookup table can be configured as a small RAM, and these RAMs can be combined to implement larger user memories. FPGAs employing the coarse-grained approach, on the other hand, contain large embedded arrays which are used to implement the storage parts of circuits. Examples of such devices are the Altera 10K devices, the Actel 3200DX and SPGA parts, and the Lattice ispLSI 6192 FPGAs.

The coarse-grained approach results in significantly denser memory implementations, since the per-bit overhead is much smaller. Unfortunately, it also requires the FPGA vendor to partition the chip into memory and logic regions when the FPGA is designed. Since circuits have widely varying memory requirements, this "average-case" partitioning may result in poor device utilization for logic-intensive or memory-intensive circuits. In particular, if a circuit does not use all the available memory arrays to implement storage, the chip area devoted to the unused arrays is wasted.

This chip area need not be wasted, however, if the unused memory arrays are used to implement logic. Configuring the arrays as ROMs results in large multi-output lookup tables that can very efficiently implement some logic circuits. An algorithm to do this automatically is the focus of this research. We have developed an algorithm, called SMAP, that packs as much circuit information as possible into the available memory arrays, and maps the rest of the circuit into four-input lookup tables. We have shown that this technique results in extremely dense logic implementations for many circuits; not only is the chip area of the unused arrays not wasted, but it is used more efficiently than if the arrays were replaced by logic blocks. Thus, even customers that do not require storage can benefit from embedded memory arrays.
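To make the idea of a memory array acting as a multi-output lookup table concrete, the following is a minimal sketch in Python. It is not the SMAP algorithm; it only shows how the contents of a ROM could be filled in so that the address lines act as logic inputs and the data bits act as logic outputs. The helper names (build_rom, read_rom) and the full-adder example are assumptions chosen purely for illustration.

```python
def build_rom(num_inputs, outputs):
    """Return ROM contents: one word per address, one data bit per output function.

    Hypothetical helper for illustration only; not part of SMAP.
    """
    rom = []
    for address in range(2 ** num_inputs):
        # Address bit i carries the value of logic input i.
        bits = [(address >> i) & 1 for i in range(num_inputs)]
        word = 0
        for position, func in enumerate(outputs):
            # Data bit 'position' stores the value of output function 'position'.
            word |= (func(*bits) & 1) << position
        rom.append(word)
    return rom


def read_rom(rom, *bits):
    """Evaluate the logic by a single memory read at the address formed by the inputs."""
    address = sum(bit << i for i, bit in enumerate(bits))
    return rom[address]


# Example: a full adder (3 inputs, 2 outputs) stored in an 8-word, 2-bit-wide ROM.
def full_adder_sum(a, b, cin):
    return a ^ b ^ cin


def full_adder_carry(a, b, cin):
    return (a & b) | (a & cin) | (b & cin)


rom = build_rom(3, [full_adder_sum, full_adder_carry])

word = read_rom(rom, 1, 1, 0)      # a=1, b=1, cin=0
sum_bit = word & 1                 # data bit 0 holds the sum
carry_bit = (word >> 1) & 1        # data bit 1 holds the carry
```

Here both outputs of the adder are produced by one read of a single array, which is the sense in which a ROM behaves as one large multi-output lookup table. Deciding which parts of a real circuit to place in the arrays is the harder problem that a mapping algorithm must solve, and this sketch does not address it.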

We have developed two versions of SMAP: one that targets single-port memory arrays and one that targets dual-port memory arrays. We have shown that, on average, the dual-port algorithm packs between 29% and 35% more logic than the algorithm that targets single-port arrays. We have also shown, however, that even with this algorithm, dual-port arrays are still not as area-efficient as single-port arrays when implementing logic.

The benchmark circuits used to evaluate the algorithm are available for download.

Funding for this project has been provided by Cadence Design Systems.


Publications from this research project:

