Search for: cache-memories
Total 119 records

    A highly fault detectable cache architecture for dependable computing

    , Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) ; Volume 3219 , 2004 , Pages 45-59 ; 03029743 (ISSN); 3540231765 (ISBN); 9783540231769 (ISBN) Zarandi, H. R ; Miremadi, S. G ; Sharif University of Technology
    Springer Verlag  2004
    Abstract
    Information integrity in cache memories is a fundamental requirement for dependable computing. As caches account for much of a CPU's chip area and transistor count, they are likely targets for single and multiple transient faults. This paper presents: 1) a fault detection scheme for the tag arrays of cache memories and 2) a cache architecture that improves dependability as well as performance. In this architecture, the cache space is divided into sets of different sizes with different tag lengths. The error detection scheme and the cache architecture have been evaluated using trace-driven simulation with soft-error injection and SPEC 2000 applications. The results show that error detection... 

    Cache replacement policy based on expected hit count

    , Article IEEE Computer Architecture Letters ; 2017 ; 15566056 (ISSN) Vakil Ghahani, A ; Mahdizadeh Shahri, S ; Lotfi Namin, M ; Bakhshalipour, M ; Lotfi Kamran, P ; Sarbazi Azad, H ; Sharif University of Technology
    Abstract
    Memory-intensive workloads operate on massive amounts of data that cannot be captured by last-level caches (LLCs) of modern processors. Consequently, processors encounter frequent off-chip misses, and hence, lose significant performance potential. One of the components of a modern processor that has a prominent influence on the off-chip miss traffic is the LLC's replacement policy. Existing processors employ a variation of the least recently used (LRU) policy to determine the victim for replacement. Unfortunately, there is a large gap between what LRU offers and what Belady's MIN, the optimal replacement policy, achieves. Belady's MIN requires selecting a victim with the longest reuse distance,... 

    Cache replacement policy based on expected hit count

    , Article IEEE Computer Architecture Letters ; Volume 17, Issue 1 , 2018 , Pages 64-67 ; 15566056 (ISSN) Vakil Ghahani, A ; Mahdizadeh Shahri, S ; Lotfi Namin, M. R ; Bakhshalipour, M ; Lotfi Kamran, P ; Sarbazi Azad, H ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2018
    Abstract
    Memory-intensive workloads operate on massive amounts of data that cannot be captured by last-level caches (LLCs) of modern processors. Consequently, processors encounter frequent off-chip misses, and hence, lose significant performance potential. One of the components of a modern processor that has a prominent influence on the off-chip miss traffic is the LLC's replacement policy. Existing processors employ a variation of the least recently used (LRU) policy to determine the victim for replacement. Unfortunately, there is a large gap between what LRU offers and what Belady's MIN, the optimal replacement policy, achieves. Belady's MIN requires selecting a victim with the longest reuse distance,... 
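    As a worked illustration of the baseline the abstract refers to, the minimal Python sketch below simulates Belady's MIN, which evicts the block whose next reuse lies farthest in the future. It is not the paper's expected-hit-count policy; the function and variable names (simulate_min, trace) are illustrative only.

def simulate_min(trace, capacity):
    """Return the miss count of Belady's MIN on an access trace."""
    # Pre-compute, for every access, the index of the next access
    # to the same block (infinity if it is never used again).
    next_use = [float("inf")] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], float("inf"))
        last_seen[trace[i]] = i

    cache = {}          # block -> index of its next use
    misses = 0
    for i, block in enumerate(trace):
        if block not in cache:
            misses += 1
            if len(cache) >= capacity:
                # Victim: the resident block reused farthest in the future.
                victim = max(cache, key=cache.get)
                del cache[victim]
        cache[block] = next_use[i]
    return misses

# Example: a small trace on a 2-entry cache (4 misses, which is optimal).
print(simulate_min(["A", "B", "A", "C", "B", "A"], capacity=2))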

    An Intelligent L2 Management Method in GPUs

    , M.Sc. Thesis Sharif University of Technology Javadinezhad, Ahmad (Author) ; Sarbazi Azad, Hamid (Supervisor)
    Abstract
    To capture on-chip memory locality, tolerate off-chip memory latency, and expeditiously process memory-bound GPGPU applications, Graphics Processing Units (GPUs) introduce a local L1D cache and a shared L2 cache within and between streaming multiprocessors (SMs), respectively. The L2 cache solves the problem of data coherency and sharing between SMs (unlike the L1D cache). Prior work shows that loading all data into the L2 cache without a proper mechanism to manage the input data rate poses some challenges (e.g., cache contention/thrashing, increased write-back traffic, and bandwidth inefficiency) and ultimately puts a lot of pressure on off-chip memory. In this paper, we make the... 

    Adaptive prefetching using global history buffer in multicore processors

    , Article Journal of Supercomputing ; Volume 68, Issue 3 , June , 2014 , Pages 1302-1320 ; 09208542 (ISSN) Naderan Tahan, M ; Sarbazi Azad, H ; Sharif University of Technology
    Abstract
    Data prefetching is a well-known technique to hide the memory latency in the last-level cache (LLC). Among the many prefetching methods proposed in recent years, the Global History Buffer (GHB) proves to be efficient in terms of cost and speedup. In this paper, we show that a fixed value for detecting patterns and for the prefetch degree causes GHB to (1) be conservative while there are more opportunities to create new addresses and (2) generate wrong addresses in the presence of constant strides. To resolve these problems, we separate the pattern length from the prefetching degree. The result is an aggressive prefetcher that can generate more addresses with a given pattern length. Furthermore, with a variable... 
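    To make the decoupling concrete, here is a minimal Python sketch, under assumptions of my own, of stride detection over a global history buffer in which the pattern length used for detection is separate from the prefetch degree. It is not the paper's prefetcher; all names (ghb, pattern_length, degree) are illustrative.

from collections import deque

def prefetch_addresses(ghb, pattern_length, degree):
    """If the last `pattern_length` deltas in `ghb` form a constant
    stride, return `degree` prefetch addresses; otherwise return []."""
    if len(ghb) < pattern_length + 1:
        return []
    recent = list(ghb)[-(pattern_length + 1):]
    deltas = [b - a for a, b in zip(recent, recent[1:])]
    if len(set(deltas)) != 1 or deltas[0] == 0:
        return []                      # no constant stride detected
    stride = deltas[0]
    last = recent[-1]
    return [last + stride * k for k in range(1, degree + 1)]

# Example: a stream with stride 0x40; short pattern, aggressive degree.
ghb = deque([0x1000, 0x1040, 0x1080, 0x10C0], maxlen=256)
print([hex(a) for a in prefetch_addresses(ghb, pattern_length=2, degree=4)])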

    On the optimality of 0–1 data placement in cache networks

    , Article IEEE Transactions on Communications ; 2017 ; 00906778 (ISSN) Salehi, M. J ; Motahari, S. A ; Hossein Khalaj, B ; Sharif University of Technology
    Abstract
    Considering cache-enabled networks, optimal content placement minimizing the total cost of communication in such networks is studied, leading to a surprising fundamental 0–1 law for non-redundant cache placement strategies, in which the total cache size associated with each file does not exceed the file size. In other words, for such strategies we prove that any non-redundant cache placement strategy can be transformed, with no additional cost, to a strategy in which at every node, each file is either cached completely or not cached at all. Moreover, we obtain a sufficient condition under which the optimal cache placement strategy is in fact non-redundant. This result together with the 0–1 law... 

    RAW-Tag: Replicating in altered cache ways for correcting multiple-bit errors in tag array

    , Article IEEE Transactions on Dependable and Secure Computing ; Volume 16, Issue 4 , 2019 , Pages 651-664 ; 15455971 (ISSN) Farbeh, H ; Mozafari, F ; Zabihi, M ; Miremadi, S. G ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2019
    Abstract
    The tag array in on-chip caches is one of the components most vulnerable to radiation-induced soft errors. Protecting the tag array in some processors is limited to error detection using the parity check, since the overheads of error correcting codes are not affordable in this component. State-of-the-art tag protection schemes combine the parity check with replication to provide error correction capability. Classifying these replication-based schemes into partial replication and full replication, the former offers low-overhead protection in which a large fraction of detectable errors remains uncorrectable, whereas the latter imposes a significant overhead to correct all of the errors. This... 
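    The following minimal Python sketch illustrates the general replication-plus-parity idea described in the abstract, not the RAW-Tag design itself: a parity check detects an error in the primary tag copy, and a replica kept elsewhere supplies the correction. The storage layout, the function names, and the assumption that the replica itself is error-free are mine.

def parity(bits: int) -> int:
    """Even parity over the bits of an integer tag."""
    return bin(bits).count("1") & 1

def read_tag(primary: int, primary_parity: int, replica: int) -> int:
    """Return a usable tag: fall back to the replica when the
    primary copy fails its parity check."""
    if parity(primary) == primary_parity:
        return primary
    return replica          # detected error: recover from the replica

# Example: a single-bit flip in the primary copy is detected and corrected.
tag = 0b1011010
flipped = tag ^ 0b0000100
print(read_tag(flipped, parity(tag), tag) == tag)   # True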

    On the optimality of 0-1 data placement in cache networks

    , Article IEEE Transactions on Communications ; Volume 66, Issue 3 , March , 2018 , Pages 1053-1063 ; 00906778 (ISSN) Salehi, M. J ; Motahari, S. A ; Hossein Khalaj, B ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2018
    Abstract
    Considering cache-enabled networks, optimal content placement minimizing the total cost of communication in such networks is studied, leading to a surprising fundamental 0-1 law for non-redundant cache placement strategies, in which the total cache size associated with each file does not exceed the file size. In other words, for such strategies, we prove that any non-redundant cache placement strategy can be transformed, with no additional cost, to a strategy in which at every node, each file is either cached completely or not cached at all. Moreover, we obtain a sufficient condition under which the optimal cache placement strategy is in fact non-redundant. This result together with the 0-1... 
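    A hedged formalization of the statement, in notation of my own choosing since the abstract gives no symbols: let $x_{v,f} \in [0,1]$ be the fraction of file $f$ cached at node $v$ and $s_f$ the size of file $f$. A placement is non-redundant when
\[
  \sum_{v} x_{v,f}\, s_f \;\le\; s_f \qquad \text{for every file } f,
\]
and the 0-1 law paraphrased above states that any such placement can be transformed, with no additional communication cost, into one with $x_{v,f} \in \{0,1\}$ for every node $v$ and file $f$.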

    Improving the Performance of Network Processors Based on Advanced Processor Schemes

    , M.Sc. Thesis Sharif University of Technology Khajuee, Farokh (Author) ; Jahangir, Amir Hossein (Supervisor)
    Abstract
    Up to now, two kinds of hardware have been used in routers: ASIC processors and general-purpose processors. Both solutions have their own flaws. ASIC hardware is fast but not flexible for implementing or developing new applications, while general-purpose processors are not fast enough for high network data rates. On the other hand, there is an ever-increasing gap between memory access speed and network data rate; at this time, the main bottleneck in packet-processing systems is memory. To solve this problem, two groups of solutions have been introduced. The first group tries to reduce the demand for memory by using caches or longer system words. The second group uses... 

    Performance and Power-Efficient Design of Non-Volatile Shared Caches in Multi-Core Systems

    , M.Sc. Thesis Sharif University of Technology Shafahi, Mohammad Hassan (Author) ; Sarbazi Azad, Hamid (Supervisor)
    Abstract
    Emerging memory technologies such as STT-RAM, PCM, and resistive RAM are promising candidates for the caches and main memories of future multi-core architectures because of their high density, low leakage current, and non-volatility. Nevertheless, the latency and energy overheads of write operations in these technologies remain the main open problems. Previous works have suggested various solutions, at the architecture and circuit levels, to reduce the write overheads. In this research, we study the integration of STT-RAM in 3-dimensional multi-core environments and propose solutions to address the problem of write overheads when using this technology in cache... 

    Reliability Improvement of Non-Volatile Cache Memories Against Wearout

    , M.Sc. Thesis Sharif University of Technology Asadi, Sina (Author) ; Miremadi, Ghassem (Supervisor)
    Abstract
    In recent years we have witnessed a growth in handheld and wearable technologies, among others, and these advancements, together with a persistent trend toward reduced feature sizes, have been testing the limits of SRAM for quite a while. It is becoming clear that, because of its high leakage, SRAM technology does not satisfy the ever-increasing demand for reduced cost and dimensions; as a result, Non-Volatile Memory, with its lower leakage, is a promising alternative to bridge the aforementioned gaps. However, limited endurance is one of its weaknesses. In this paper we propose a novel method dubbed “online write-prevention coding” to incorporate... 

    Reliability Enhancement of Cache Memories Based on Non-Volatile Cells

    , M.Sc. Thesis Sharif University of Technology Ghaemi, Golsana (Author) ; Miremadi, Ghassem (Supervisor)
    Abstract
    Nowadays, leakage energy constitutes up to 80% of total cache energy consumption, and the tag array is responsible for a considerable fraction of static energy consumption. One approach to reducing static energy consumption is to replace SRAMs with STT-RAMs, which have near-zero leakage power. However, a problem of an STT-RAM cell is its limited write endurance. In contrast to previous studies, which have targeted the data array, in this study STT-RAMs are used in the L1 tag array. To solve the write endurance problem, this study proposes an STT-RAM/SRAM tag architecture. Considering the spatial locality of memory references, the less significant bit-lines of the tag are updated more often. The SRAM part handles the... 
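    A minimal Python sketch, under assumptions of my own about the bit split (the abstract is truncated here), of the hybrid idea: because spatial locality makes the less significant tag bits change most often, they are kept in SRAM while the stable upper bits sit in wear-limited STT-RAM. SRAM_BITS and the class layout are illustrative, not the thesis design.

SRAM_BITS = 4                      # assumed width of the SRAM portion

class HybridTagEntry:
    def __init__(self):
        self.sram_low = 0          # frequently updated low-order bits
        self.stt_high = 0          # rarely updated high-order bits
        self.stt_writes = 0        # wear counter for the STT-RAM part

    def write(self, tag: int):
        low = tag & ((1 << SRAM_BITS) - 1)
        high = tag >> SRAM_BITS
        self.sram_low = low        # SRAM absorbs most updates
        if high != self.stt_high:  # STT-RAM is written only when needed
            self.stt_high = high
            self.stt_writes += 1

    def read(self) -> int:
        return (self.stt_high << SRAM_BITS) | self.sram_low

# Example: nearby addresses mostly change only the SRAM part.
e = HybridTagEntry()
for tag in (0x3A0, 0x3A1, 0x3A3, 0x3A7, 0x3B0):
    e.write(tag)
print(e.read() == 0x3B0, e.stt_writes)   # True, only 2 STT-RAM writes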

    A Spatial Locality-based Block Replacement Algorithm in Cache Memories

    , M.Sc. Thesis Sharif University of Technology Ardalani, Newsha (Author) ; Sarbazi-Azad, Hamid (Supervisor)
    Abstract
    From the programmer’s point of view, main memory allocation to a program is contiguous; in physical memory, however, it is non-contiguous and the program is scattered here and there. Assuming main memory is partitioned into regions, each program accesses different regions during its lifetime, which are not necessarily close together and which occupy different fractions of the cache capacity. Depending on which replacement policy is chosen, the cache is partitioned differently among regions; e.g., the commonly used LRU policy partitions the cache among regions on a demand basis, giving more cache resources to regions whose miss ratio is higher, which is not necessarily optimal. In this thesis, we... 

    Fault detection enhancement in cache memories using a high performance placement algorithm

    , Article Proceedings - 10th IEEE International On-Line Testing Symposium, IOLTS 2004, Madeira Island, 12 July 2004 through 14 July 2004 ; 2004 , Pages 101-106 ; 0769521800 (ISBN); 9780769521800 (ISBN) Zarandi, H. R ; Miremadi, S. G ; Sarbazi Azad, H ; Sharif University of Technology
    2004
    Abstract
    Data integrity of words coming out of caches needs to be checked to assure their correctness. This paper proposes a cache placement scheme that provides high performance as well as high fault detection coverage. In this scheme, the cache space is divided into sets of different sizes. Here, the length of the tag field associated with each set is unique and differs from that of the other sets. The remaining tag bits are used to protect the tag with a fault detection scheme, e.g., generalized parity. This protects the cache without compromising performance or area with respect to a comparable fully associative cache. The results obtained from simulating some standard... 
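    The following minimal Python sketch shows one way the leftover tag bits in a shorter-tag set could hold check bits, as the abstract describes; it is an illustration of generalized parity under assumed field widths, not the paper's exact scheme.

TAG_FIELD_WIDTH = 16               # assumed physical width of a tag entry

def encode_tag(tag: int, tag_len: int) -> int:
    """Pack a `tag_len`-bit tag plus check bits into a tag-array entry.
    Check bit i is the parity of the tag bits whose index is congruent
    to i modulo the number of spare bits (a simple generalized parity)."""
    spare = TAG_FIELD_WIDTH - tag_len
    check = 0
    for i in range(spare):
        bits = [(tag >> b) & 1 for b in range(i, tag_len, spare)]
        check |= (sum(bits) & 1) << i
    return (check << tag_len) | tag

def check_tag(entry: int, tag_len: int) -> bool:
    """Re-derive the check bits and compare against the stored entry."""
    tag = entry & ((1 << tag_len) - 1)
    return encode_tag(tag, tag_len) == entry

# Example: a 12-bit tag leaves 4 spare bits for check bits; a single-bit
# flip anywhere in the entry is flagged.
entry = encode_tag(0xABC, tag_len=12)
print(check_tag(entry, 12), check_tag(entry ^ 0x010, 12))   # True False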

    Design and Evaluation of an Efficient Cache Memory Used in Solid-State Disk Drives

    , M.Sc. Thesis Sharif University of Technology Haghdoost, Alireza (Author) ; Asadi, Hossein (Supervisor)
    Abstract
    In the past two decades, there has been a significant performance enhancement in processors by leveraging nano-scale semiconductor technologies and micro-architectural techniques. At the same time, there has been only limited performance improvement in storage devices. This performance gap results in a performance bottleneck in computer systems. To fill this gap, Solid-State Disks (SSDs) have been proposed in previous work. Since they use no mechanical parts, SSDs can provide higher performance and lower power consumption compared to hard disk drives. Typically, SSDs use flash memory chips to store user data. Flash memory has some shortcomings such as limited endurance and low write... 

    System-Level Vulnerability Estimation For Components of Multiprocessor Systems

    , M.Sc. Thesis Sharif University of Technology Saadat, Mohammad Hashem (Author) ; Gorshi, Alireza (Supervisor)
    Abstract
    Cache memory is one of the most important parts of a microprocessor. Caches improve performance by bringing data and instructions near the processor and decreasing the access time to them. But caches are also vulnerable to a kind of error called a soft error. When an error happens in a cache, it can quickly propagate throughout the system and affect the integrity and reliability of the overall system. There are actually two types of errors in cache memories: permanent (hard) errors and transient (soft) errors. Previous studies have shown that about 92% of system reboots are initiated by soft errors occurring in cache memory. Soft errors have two main sources, alpha... 

    Data Tiering in Redundant Array of Independent Disks

    , M.Sc. Thesis Sharif University of Technology Tarihi, Mojtaba (Author) ; Asadi, Hossein (Supervisor) ; Sarbazi Azad, Hamid (Co-Advisor)
    Abstract
    With the advances in silicon technology, the price of flash-based storage devices has fallen significantly, and they are now a main choice for mass data storage. Solid-state disks are, however, still much more expensive per unit capacity than hard disks, and tiering can be utilized to use them cost-effectively. Tiering attempts to exploit the diversity offered by storage devices and backend configurations to support the diverse needs of I/O workloads. Tiering is generally done on top of static storage hierarchies, and as such, moving a data block into a certain tier will dictate its configuration as well. In this research, an architecture is proposed that can independently encode... 

    A Communication Model between SIMT Cores for Improving GPU Performance

    , M.Sc. Thesis Sharif University of Technology Keshtegar, Mohammad Mahdi (Author) ; Hesabi, Shahin (Supervisor)
    Abstract
    In recent years, GPUs have become an ideal candidate for processing a variety of high-performance applications. By relying on thousands of concurrent threads in applications and the computational power of large numbers of computing units, GPGPUs provide high performance and throughput. To achieve the potential computational power of GPGPUs in broader kinds of applications, we need to apply some modifications to their architecture. In the baseline architecture, most of the chip area is devoted to SIMT cores, whose communication is handled through an interconnection network and a slow off-chip memory. Recent research shows that out of many types of miss events the last level... 

    An Efficient Cache Design for Solid-state Drives

    , M.Sc. Thesis Sharif University of Technology Sharifi, Sina (Author) ; Sarbazi Azad, Hamid (Supervisor)
    Abstract
    The read cache of hard disk or solid-state drives has a significant effect on their performance, power consumption, and endurance. However, one of the main problems of the read cache is its low hit ratio, so the read cache must be used more efficiently. Based on previous studies, nearly all of the blocks brought into the read cache are zero-reuse or dead-on-arrival, meaning that after these blocks enter the read cache, they are not accessed again until they are evicted. Thus, a large portion of the disk read cache is left nearly unused because of these dead blocks. In this thesis, we design a disk read cache that does not allow dead blocks to enter the... 
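    A minimal Python sketch, with assumptions of my own rather than the thesis design, of how dead-on-arrival blocks can be kept out of a read cache: a block is admitted only after it has missed once before, as recorded in a small history of recently missed block addresses.

from collections import OrderedDict

class FilteredReadCache:
    def __init__(self, capacity: int, history_size: int):
        self.cache = OrderedDict()        # block -> data, LRU order
        self.seen = OrderedDict()         # recently missed block addresses
        self.capacity = capacity
        self.history_size = history_size

    def access(self, block: int, data=None) -> bool:
        """Return True on a cache hit; on a miss, admit the block only
        if it was already recorded in the miss history."""
        if block in self.cache:
            self.cache.move_to_end(block)
            return True
        if block in self.seen:            # second miss: likely reused, admit
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)
            self.cache[block] = data
            del self.seen[block]
        else:                             # first miss: bypass, remember it
            if len(self.seen) >= self.history_size:
                self.seen.popitem(last=False)
            self.seen[block] = True
        return False

# Example: block 7 is only admitted on its second miss.
c = FilteredReadCache(capacity=2, history_size=8)
print(c.access(7), c.access(7), c.access(7))   # False False True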

    Cache Management in Named Data Networks

    , M.Sc. Thesis Sharif University of Technology Ehsanpour, Mahsa (Author) ; Hemmatyar, Ali Mohammad Afshin (Supervisor)
    Abstract
    The Internet was originally designed for host-to-host communications but is currently being used for data dissemination and retrieval. This structural mismatch makes the Internet an inefficient architecture. To overcome these inefficiencies, NDN has been proposed as a promising architecture over the Internet. One of the fundamental characteristics of this novel architecture is in-network caching. In this thesis, by focusing on the in-network cache management challenge of NDN, we first formally state the problem of social welfare maximization subject to the cache capacity constraint for the cache-enabled nodes, and then propose a distributed in-network caching algorithm based on matching...
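    A hedged formalization of the stated problem, in notation of my own choosing since the abstract gives no symbols: with $x_{v,c} \in \{0,1\}$ indicating whether cache-enabled node $v$ stores content $c$, $s_c$ the content size, $B_v$ the cache capacity of node $v$, and $u_{v,c}$ the welfare gained from that placement, the in-network cache management problem sketched in the abstract is
\[
  \max_{x}\; \sum_{v} \sum_{c} u_{v,c}\, x_{v,c}
  \quad \text{subject to} \quad \sum_{c} s_c\, x_{v,c} \le B_v \;\; \text{for every node } v .
\]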