RASCAL -- Timestamp-Based Selective Cache Allocation
It has been shown that optimal allocation algorithms with knowledge about the future can drastically cut the miss ratio by only allocating data objects that are reused before the objects currently residing in the L1 cache. In this paper we show that the effect of optimal allocation can be further enhanced by the introduction of a small filtering cache, K1, used as a staging buffer in addition to the L1. We further evaluate three options for practical selective cache allocation algorithms in combination with a small staging cache. We show that the RASCAL,Runtime Adaptive Selective Cache ALlocation algorithm, an algorithm based on time stamps, is a practical and efficient option.
The RASCAL algorithm detects a good L1 object by monitoring the duration between two consecutive allocations of a cache block as measured in the new time unit cache allocation ticks, CAT. CAT is shown to be a fairly accurate and application-independent way of detecting good allocation candidates. A reduced RASCAL implementation requires 8 extra bits of time stamp information for each 64 byte cache block in the L2. Resulting in less than 2% SRAM overhead. Given a 2KB streaming buffer the RASCAL algorithm reduces the miss ratio with more than 40 %, for 5 out of 14 applications and performs better or comparable to a conventional cache with optimal allocation in 4 out of fourteen applications.
ASTEC seminar and UART-seminar