
RASCAL -- Timestamp-Based Selective Cache Allocation

Martin Karlsson
Information Technology
Uppsala Architecture Research Team
Uppsala University

Abstract
The behavior of the memory hierarchy is key to high performance in today's GHz microprocessors. The cache level closest to the processor is limited in size and associativity in order to match the short cycle time of the CPU. While a substantial amount of cache research has been devoted to replacement policies that minimize the conflict misses caused by limited associativity, little effort has been put into finding selective cache allocation policies that treat the limited space of the L1 cache with care. Even though only the data objects that are soon reused will benefit from allocation in the small L1 cache, most L1 cache implementations allocate all data objects.

It has been shown that optimal allocation algorithms with knowledge about the future can drastically cut the miss ratio by only allocating data objects that are reused before the objects currently residing in the L1 cache. In this paper we show that the effect of optimal allocation can be further enhanced by the introduction of a small filtering cache, K1, used as a staging buffer in addition to the L1. We further evaluate three options for practical selective cache allocation algorithms in combination with a small staging cache. We show that RASCAL (Runtime Adaptive Selective Cache ALlocation), an algorithm based on time stamps, is a practical and efficient option.

The RASCAL algorithm detects a good L1 object by monitoring the duration between two consecutive allocations of a cache block, as measured in a new time unit, cache allocation ticks (CAT). CAT is shown to be a fairly accurate and application-independent way of detecting good allocation candidates. A reduced RASCAL implementation requires 8 extra bits of time stamp information for each 64-byte cache block in the L2, resulting in less than 2% SRAM overhead. Given a 2KB streaming buffer, the RASCAL algorithm reduces the miss ratio by more than 40% for 5 out of 14 applications and performs better than or comparable to a conventional cache with optimal allocation in 4 out of 14 applications.
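
To make the time stamp idea concrete, here is a minimal Python sketch of a CAT-based allocation filter. It is an illustration under stated assumptions, not the implementation from the paper: the class name CatFilter, the threshold value, the rule that the tick counter advances on every L1 miss, and the use of unbounded integers in place of 8-bit time stamps are all hypothetical. The helper timestamp_overhead only checks the overhead arithmetic quoted in the abstract.

```python
class CatFilter:
    """Hypothetical sketch of time-stamp-based selective allocation."""

    def __init__(self, threshold):
        # Assumed cutoff in cache allocation ticks (CAT); not from the abstract.
        self.threshold = threshold
        # Assumed global CAT counter; here it advances once per handled miss.
        self.tick = 0
        # Time stamp of the previous allocation of each block.
        # Real hardware would keep a few bits per L2 block instead of a dict
        # and would have to deal with counter wraparound.
        self.last_alloc = {}

    def on_miss(self, block):
        """Handle an L1 miss; return True if the block looks worth an L1 slot."""
        previous = self.last_alloc.get(block)
        # A short distance between two consecutive allocations suggests that
        # the block is reused soon and is a good L1 candidate.
        good = previous is not None and self.tick - previous <= self.threshold
        self.last_alloc[block] = self.tick
        self.tick += 1
        return good


def timestamp_overhead(stamp_bits, block_bytes):
    """Fraction of extra storage added by per-block time stamps."""
    return stamp_bits / (block_bytes * 8)


if __name__ == "__main__":
    f = CatFilter(threshold=4)
    trace = ["A", "B", "A", "C", "D", "E", "F", "G", "B"]
    print([f.on_miss(b) for b in trace])
    print(timestamp_overhead(8, 64))  # 8 bits per 64-byte block
```

In this sketch, only the second access to block A falls within the assumed threshold, so it is the only one flagged for L1 allocation; blocks that are not flagged would go to the staging buffer alone. With 8 bits per 64-byte block the overhead is 8/512, about 1.6%, which is consistent with the "less than 2%" figure above.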


ASTEC seminar and UART-seminar
June 12, 2001

Place: Information technology, Uppsala University
Room: 1113
Time: 15.15-16.00 (+ discussions)

Room 1113 is in building 1, floor 1, room 13 (in the southern part of the building).

Help on how to find ASTEC Seminars.

There will be an extended period for discussions after the seminar.

Speakers are encouraged to give a short (5 min) introduction to the subject at the beginning of the talk.
Listeners are excused if they have to leave after 16.00.

Everyone is welcome!

Updated 06-Jun-2001 17:40 by Roland Grönroos
e-mail: info -at- astec.uu.se    Location: http://www.astec.uu.se/Seminars/01/0612.shtml