Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs

Title: Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs
Publication Type: Conference Paper
Year of Publication: 2008
Authors: Chen, X. E., and T. M. Aamodt
Conference Name: Microarchitecture, 2008. MICRO-41. 2008 41st IEEE/ACM International Symposium on
Pagination: 59-70
Date Published: Nov.
Keywords: cache storage, data prefetching, hybrid analytical modeling, instruction profile window, instruction trace analysis, latency memory systems, logic design, microprocessor chips, miss status holding register, MSHR, pending cache hits
Abstract

As the number of transistors integrated on a chip continues to increase, a growing challenge is accurately modeling performance in the early stages of processor design. Analytical models have been employed to rapidly search for higher performance designs, and can provide insights that detailed simulators may not. This paper proposes techniques to predict the impact of pending cache hits, hardware prefetching, and realistic miss status holding register (MSHR) resources on superscalar performance in the presence of long latency memory systems when employing hybrid analytical models that apply instruction trace analysis. Pending cache hits are secondary references to a cache block for which a request has already been initiated but has not yet completed. We find pending hits resulting from spatial locality and the fine-grained selection of instruction profile window blocks used for analysis both have non-negligible influences on the accuracy of hybrid analytical models and subsequently propose techniques to account for their effects. We then introduce techniques to estimate the performance impact of data prefetching by modeling the timeliness of prefetches and to account for a limited number of MSHRs by restricting the size of profile window blocks. As with earlier hybrid analytical models, our approach is roughly two orders of magnitude faster than detailed simulations. When modeling pending hits for a processor with unlimited outstanding misses we improve the accuracy of our baseline by a factor of 3.9, decreasing average error from 39.7% to 10.3%. When modeling a processor with data prefetching, a limited number of MSHRs, or both, the techniques result in an average error of 13.8%, 9.5% and 17.8%, respectively.
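To make the definition of a pending cache hit concrete, the following minimal Python sketch labels each access in a toy trace as a miss, a pending hit, or a hit. It is only an illustration of the definition given in the abstract, not the paper's analytical model. The function name `classify_accesses`, the 64-byte block size, the fixed 200-cycle miss latency, and the assumptions of an infinite cache with unlimited MSHRs are all hypothetical choices made for this example.

```python
def classify_accesses(trace, block_size=64, miss_latency=200):
    """Label each (cycle, address) access as 'miss', 'pending_hit', or 'hit'.

    Illustrative assumptions (not taken from the paper): the cache never
    evicts, MSHRs are unlimited, and every miss takes a fixed latency.
    """
    ready_at = {}  # block number -> cycle at which its fill completes
    labels = []
    for cycle, addr in trace:
        block = addr // block_size
        if block not in ready_at:
            # First reference to the block: a primary miss starts the fill.
            ready_at[block] = cycle + miss_latency
            labels.append("miss")
        elif cycle < ready_at[block]:
            # The block was already requested but has not arrived yet.
            labels.append("pending_hit")
        else:
            # The fill has completed, so this is an ordinary hit.
            labels.append("hit")
    return labels


# Toy trace of (cycle, byte address) pairs, chosen only for illustration.
trace = [(0, 0x1000), (10, 0x1008), (50, 0x2000), (250, 0x1010)]
print(classify_accesses(trace))
# prints ['miss', 'pending_hit', 'miss', 'hit']
```

The second access falls in the same 64-byte block as the first and arrives while that block's fill is still in flight, so it is a pending hit arising from spatial locality. The fourth access arrives after the fill has completed and is an ordinary hit.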

URL: http://dx.doi.org/10.1109/MICRO.2008.4771779
DOI: 10.1109/MICRO.2008.4771779
