replacement policy. We then optimized the proposed policy
to further reduce hardware overhead. Overall, the proposed
replacement policy outperforms DRRIP (a non-PC-based policy)
and achieves comparable performance to existing PC-based
replacement policies.
ACKNOWLEDGMENT
The authors would like to thank the anonymous reviewers
for their valuable feedback. The authors would also like to thank
Yasuko Eckert and Gabriel Loh for their feedback that helped
guide this work in its early stages.
REFERENCES
[1] “SimPoint,” http://www.cs.ucsd.edu/users/calder/simpoint/.
[2] “SPEC CPU 2006,” https://www.spec.org/cpu2006/.
[3]
“The 2nd cache replacement championship,” 2017. [Online]. Available:
https://crc2.ece.tamu.edu/
[4]
N. Beckmann and D. Sanchez, “Maximizing cache performance under
uncertainty,” in HPCA, 2017.
[5]
S. Das, T. M. Aamodt, and W. J. Dally, “Reuse distance-based
probabilistic cache replacement,” TACO, vol. 12, no. 4, 2015.
[6]
N. Duong, D. Zhao, T. Kim, R. Cammarota, M. Valero, and A. V.
Veidenbaum, “Improving cache management policies using dynamic
reuse distances,” in MICRO, 2012.
[7]
V. V. Fedorov, S. Qiu, A. N. Reddy, and P. V. Gratz, “Ari: Adaptive
llc-memory traffic management,” TACO, vol. 10, no. 4, 2013.
[8]
M. Ferdman, A. Adileh, O. Kocberber, S. Volos, M. Alisafaee,
D. Jevdjic, C. Kaynak, A. D. Popescu, A. Ailamaki, and B. Falsafi,
“Clearing the clouds: a study of emerging scale-out workloads on
modern hardware,” ACM SIGPLAN Notices, vol. 47, no. 4, 2012.
[9]
Q. Fettes, M. Clark, R. Bunescu, A. Karanth, and A. Louri, “Dynamic
voltage and frequency scaling in nocs with supervised and reinforcement
learning techniques,” IEEE Transactions on Computers, vol. 68, no. 3,
2018.
[10]
Henry, intel 32nm-22nm comparison. [Online]. Available:
http://blog.stuffedcow.net/2012/10/intel32nm-22nm-core-i5-comparison
[11]
A. Jain and C. Lin, “Back to the future: leveraging belady’s algorithm
for improved cache replacement,” in ISCA, 2016.
[12]
A. Jaleel, K. B. Theobald, S. C. Steely Jr, and J. Emer, “High
performance cache replacement using re-reference interval prediction
(rrip),” ACM SIGARCH Computer Architecture News, vol. 38, no. 3,
2010.
[13]
D. A. Jiménez and C. Lin, “Dynamic branch prediction with perceptrons,”
in HPCA, 2001.
[14]
D. A. Jiménez and E. Teran, “Multiperspective reuse prediction,” in
MICRO, 2017.
[15]
G. Keramidas, P. Petoumenos, and S. Kaxiras, “Cache replacement
based on reuse-distance prediction,” in ICCD, 2007.
[16]
S. Khan, A. R. Alameldeen, C. Wilkerson, O. Mutlu, and D. A.
Jiménez, “Improving cache performance using read-write partitioning,”
in 2014 IEEE 20th International Symposium on High Performance
Computer Architecture (HPCA). IEEE, 2014, pp. 452–463.
[17]
S. M. Khan, Y. Tian, and D. A. Jiménez, “Dead block replacement and
bypass with a sampling predictor,” in MICRO, 2010.
[18]
M. Kharbutli and Y. Solihin, “Counter-based cache replacement and
bypassing algorithms,” IEEE Transactions on Computers, vol. 57, no. 4,
2008.
[19]
J. Kim, E. Teran, P. V. Gratz, D. A. Jiménez, S. H. Pugsley, and
C. Wilkerson, “Kill the program counter: Reconstructing program
behavior in the processor cache hierarchy,” 2017.
[20]
H. Liu, M. Ferdman, J. Huh, and D. Burger, “Cache bursts: A new
approach for eliminating dead blocks and increasing cache efficiency,”
in MICRO, 2008.
[21]
V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G.
Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski,
S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran,
D. Wierstra, S. Legg, and D. Hassabis, “Human-level control through
deep reinforcement learning,” Nature, vol. 518, no. 7540, 2015.
[22]
E. Ipek, O. Mutlu, J. F. Martínez, and R. Caruana, “Self-optimizing
memory controllers: A reinforcement learning approach,” in ISCA, 2008.
[23]
M. K. Qureshi, A. Jaleel, Y. N. Patt, S. C. Steely, and J. Emer, “Adaptive
insertion policies for high performance caching,” ACM SIGARCH
Computer Architecture News, vol. 35, no. 2, 2007.
[24]
Z. Shi, X. Huang, A. Jain, and C. Lin, “Applying deep learning to the
cache replacement problem,” in MICRO, 2019.
[25] R. Sutton and A. Barto, Reinforcement Learning. MIT Press, 1998.
[26]
Synopsys, Design Compiler User Guide. [Online]. Available:
http://www.synopsys.com/
[27]
M. Takagi and K. Hiraki, “Inter-reference gap distribution replacement:
an improved replacement algorithm for set-associative caches,” in
Supercomputing, 2004.
[28]
E. Teran, Z. Wang, and D. A. Jiménez, “Perceptron learning for reuse
prediction,” in MICRO, 2016.
[29]
G. Tesauro, “Online resource allocation using decompositional
reinforcement learning,” in AAAI, 2005.
[30]
C.-J. Wu, A. Jaleel, W. Hasenplaugh, M. Martonosi, S. C. Steely Jr,
and J. Emer, “Ship: Signature-based hit predictor for high performance
caching,” in MICRO, 2011.
[31]
C.-J. Wu and M. Martonosi, “Adaptive timekeeping replacement: Fine-
grained capacity management for shared cmp caches,” TACO, vol. 8,
no. 1, 2011.
[32]
J. Yin, Y. Eckert, S. Che, M. Oskin, and G. H. Loh, “Toward more
efficient noc arbitration: A deep reinforcement learning approach,”
AIDArch 2018.
[33]
J. Yin, S. Sethumurugan, Y. Eckert, C. Patel, A. Smith, E. Morton,
M. Oskin, N. E. Jerger, and G. H. Loh, “Experiences with ml-driven
design: A noc case study,” in HPCA, 2020.
[34]
V. Young, C.-C. Chou, A. Jaleel, and M. Qureshi, “Ship++: Enhancing
signature-based hit predictor for improved cache performance,” in CRC,
2017.
[35]
Y. Zeng and X. Guo, “Long short term memory based hardware
prefetcher: a case study,” in MEMSYS, 2017.
[36]
H. Zheng and A. Louri, “An energy-efficient network-on-chip design
using reinforcement learning,” in DAC, 2019.