Biography: Chita Das is a Distinguished Professor of Computer Science and Engineering at the Pennsylvania State University. His main areas of interest include chip multiprocessors (CMPs) and manycore architectures, performance evaluation, fault-tolerant computing, and clouds/datacenters. In particular, he has worked extensively on the design and analysis of interconnection networks and on-chip interconnects. He has published more than 200 papers in these areas, has received several best paper awards, and has served on many program committees and editorial boards. He is a Fellow of the IEEE.
Abstract: Spin-Transfer Torque RAM (STT-RAM) is an emerging non-volatile memory technology with many attractive characteristics, such as high density, low leakage, and low read access latency. However, one of its major drawbacks is its long write latency, which has impeded its widespread adoption for on-chip caches in place of traditional SRAM-based caches. By adopting suitable mechanisms that minimize the latency overhead of STT-RAM writes, it is possible to design energy-efficient, high-density caches for chip multiprocessors (CMPs).

In this talk, I will discuss two complementary techniques to mitigate the write overhead of STT-RAM. The first approach centers on an elegant network-level solution. It is based on the observation that, instead of staggering requests to a write-busy STT-RAM bank, the network should schedule requests to other idle cache banks, effectively hiding the write latency. While the first approach attempts to hide the STT-RAM write latency, the second focuses on reducing it by tuning the data-retention time. We argue that by relaxing the non-volatility of STT-RAM so that its data-retention time is in the range of milliseconds, we can optimize the on-chip cache architecture for CMPs. The advantages of both techniques compared to an SRAM-based cache architecture will be discussed.