Title | Hardware Support for Prescient Instruction Prefetch |
Publication Type | Conference Paper |
Year of Publication | 2004 |
Authors | Aamodt, T. M., P. Chow, P. Hammarlund, H. Wang, and J. P. Shen |
Conference Name | High Performance Computer Architecture, 2004. HPCA-10. Proceedings. 10th International Symposium on |
Pagination | 84 - 84 |
Date Published | Feb. |
Abstract | This paper proposes and evaluates hardware mechanisms for supporting prescient instruction prefetch, an approach to improving single-threaded application performance by using helper threads to perform instruction prefetch. We demonstrate the need for enabling store-to-load communication and selective instruction execution when directly pre-executing future regions of an application that suffer I-cache misses. Two novel hardware mechanisms, safe-store and YAT-bits, are introduced that help satisfy these requirements. This paper also proposes and evaluates finite state machine recall, a technique for limiting pre-execution to branches that are hard to predict by leveraging a counted I-prefetch mechanism. On a research Itanium® SMT processor with next line and streaming I-prefetch mechanisms that incurs latencies representative of next generation processors, prescient instruction prefetch can improve performance by an average of 10.0% to 22% on a set of SPEC 2000 benchmarks that suffer significant I-cache misses. Prescient instruction prefetch is found to be competitive against even the most aggressive research hardware instruction prefetch technique: fetch directed instruction prefetch. |
URL | http://dx.doi.org/10.1109/HPCA.2004.10028 |
DOI | 10.1109/HPCA.2004.10028 |