
MASTER: Reclamation of hybrid scratchpad memory to maximize energy saving in multi-core edge systems

, Article IEEE Transactions on Sustainable Computing ; 2021 ; 23773782 (ISSN) Shekarisaz, M ; Hoseinghorban, A ; Bazzaz, M ; Salehi, M ; Ejlali, A ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2021

Abstract

Most modern multi-core edge devices work in outdoor situations with limited power supplies such as energy harvesters and batteries. Therefore, energy consumption is a fundamental issue in which the memory subsystem has a significant role. Scratchpad memories (SPM) can provide a broad potential for energy saving. Still, due to the insufficient SPM capacity in such edge devices, a rigorous SPM data allocation scheme is necessary to reduce the energy consumption of the memory subsystem. Emerging non-volatile memories (NVMs) are very useful to reduce the energy consumption of the memory subsystem. Therefore, embedded and edge devices can take advantage of hybrid SPM composed of both NVM and SRAM to...

ETICA: Efficient two-level I/O caching architecture for virtualized platforms

, Article IEEE Transactions on Parallel and Distributed Systems ; Volume 32, Issue 10 , 2021 , Pages 2415-2433 ; 10459219 (ISSN) Ahmadian, S ; Salkhordeh, R ; Mutlu, O ; Asadi, H ; Sharif University of Technology

IEEE Computer Society 2021

Abstract

In recent years, increased I/O demand of Virtual Machines (VMs) in large-scale data centers and cloud computing has encouraged system architects to design high-performance storage systems. One common approach to improving performance is to employ fast storage devices such as Solid-State Drives (SSDs) as an I/O caching layer for slower storage devices. SSDs provide high performance, especially on random requests, but they also have limited endurance: They support only a limited number of write operations and can therefore wear out relatively fast due to write operations. In addition to the write requests generated by the applications, each read miss in the SSD cache is served at the cost of...

Virtual reservoir computer using an optical resonator

, Article Optical Materials Express ; Volume 12, Issue 3 , 2022 , Pages 1140-1153 ; 21593930 (ISSN) Boshgazi, S ; Jabbari, A ; Mehrany, K ; Memarian, M ; Sharif University of Technology

The Optical Society 2022

Abstract

Reservoir computing is a machine learning approach that enables us to use recurrent neural networks without involving the complexity of training algorithms and makes hardware implementation possible. We present a novel photonic architecture of a reservoir computer that employs a nonlinear node and a resonator to implement a virtual recurrent neural network. This resonator behaves as an echo generator component that substitutes the delay line in delay-based reservoir computers available in the literature. The virtual neural network formed in our implementation is fundamentally different from the delay-based reservoir computers. Different virtual architectures based on the FSR and the Finesse of...

A case for PIM support in general-purpose compilers

, Article IEEE Design and Test ; Volume 39, Issue 2 , 2022 , Pages 84-89 ; 21682356 (ISSN) Sadeghi, P ; Ejlali, A ; Sharif University of Technology

IEEE Computer Society 2022

Abstract

This work presents a case for general support for processing-in-memory (PIM) in compilers and puts forth an approach to face it along with a simple model. The ultimate goal of the work is to implement the features in a general-purpose compiler that can compile for any homogeneous ISA system, so the benefits from PIM are not limited to niche use-cases. © 2013 IEEE

Adaptive characterisation of a human hand model during interactions with a telemanipulation system

, Article International Conference on Robotics and Mechatronics, ICROM 2015, 7 October 2015 through 9 October 2015 ; 2015 , Pages 688-693 ; 9781467372343 (ISBN) Esfandiari, M ; Sadeghnejad, S ; Farahmand, F ; Vosoughi, G ; Sharif University of Technology

2015

Abstract

Proper modeling of human arm dynamics, as the arm interacts with telemanipulation and haptic systems, is important in enhancing the transparency of these systems. In this article, we introduced an adaptive identifier to estimate the impedance characteristic of a human operator as it interacts with a single translational degree of freedom mechanism. A five-parameter model, including an extra spring and damper for a better approximation of the dynamic behavior of the human arm, has been used. Since the impedance characteristic of the human arm differs from one individual to another, it is important to estimate these parameters for each individual and update the controller to enhance the transparency...

Cluster-based approach for improving graphics processing unit performance by inter streaming multiprocessors locality

, Article IET Computers and Digital Techniques ; Volume 9, Issue 5 , August , 2015 , Pages 275-282 ; 17518601 (ISSN) Keshtegar, M. M ; Falahati, H ; Hessabi, S ; Sharif University of Technology

Institution of Engineering and Technology 2015

Abstract

Owing to a new platform for high performance and general-purpose computing, the graphics processing unit (GPU) is one of the most promising candidates for faster improvement in peak processing speed, low latency and high performance. As GPUs employ multithreading to hide latency, there is a small private data cache in each single instruction multiple thread (SIMT) core. Hence, these cores communicate in many applications through the global memory. Access to this public memory takes a long time and consumes a large amount of power. Moreover, the memory bandwidth is limited, which is quite challenging in parallel processing. The missed memory requests in last level cache that are followed by accesses...

Dynamic shared SPM reuse for real-time multicore embedded systems

, Article ACM Transactions on Architecture and Code Optimization ; Volume 12, Issue 2 , 2015 ; 15443566 (ISSN) Mohajjel Kafshdooz, M ; Ejlali, A ; Sharif University of Technology

Association for Computing Machinery 2015

Abstract

Allocating the scratchpad memory (SPM) space to tasks is a challenging problem in real-time multicore embedded systems that use shared SPM. Proper SPM space allocation is important, as it considerably influences the application worst-case execution time (WCET), which is of great importance in real-time applications. To address this problem, in this article we present a dynamic SPM reuse scheme, where SPM space can be reused by other tasks during runtime without requiring any static SPM partitioning. Although the proposed scheme is applied dynamically at runtime, the required decision making is fairly complex and hence cannot be performed at runtime. We have developed techniques to perform...

Application-based dynamic reconfiguration in optical network-on-chip

, Article Computers and Electrical Engineering ; Volume 45 , July , 2015 , Pages 417-429 ; 00457906 (ISSN) Falahati, H ; Koohi, S ; Hessabi, S ; Sharif University of Technology

Elsevier Ltd 2015

Abstract

We propose a new optical reconfigurable Network-on-Chip (NoC), named ReFaT ONoC (Reconfigurable Flat and Tree Optical NoC). ReFaT is a dynamically reconfigurable architecture, which customizes the topology and routing paths based on the application characteristics. ReFaT, as an all-optical NoC, routes optical packets based on their wavelengths. For this purpose, we propose a novel architecture for the optical switch, which eliminates the need for optical resource reservation, and thus avoids the corresponding latency and area overheads. As a key idea for dynamic reconfiguration, each application is mapped to a specific set of wavelengths and utilizes its dedicated routing algorithm. We...

A morphable phase change memory architecture considering frequent zero values

, Article Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors ; 2011 , Pages 373-380 ; 10636404 (ISSN) ; 9781457719523 (ISBN) Arjomand, M ; Jadidi, A ; Shafiee, A ; Sarbazi Azad, H ; Sharif University of Technology

Abstract

Phase Change Memory (PCM) is emerging as a high-density and power-efficient choice for future main memory systems. While PCM cell size is marching towards minimum achievable feature size, recent prototypes effectively improve device scalability by storing multiple bits per cell. Unfortunately, Multi-Level Cell (MLC) PCM devices offer higher access time and energy when compared to Single-Level Cell (SLC) counterparts, making it difficult to incorporate MLC in main memory. To address this challenge, we propose Zero-value-based Morphable PCM, ZM-PCM for short, a novel MLC-PCM main memory architecture which tries to incorporate the benefits of both MLC and SLC devices within the same structure....

A cache-assisted scratchpad memory for multiple-bit-error correction

, Article IEEE Transactions on Very Large Scale Integration (VLSI) Systems ; Volume 24, Issue 11 , 2016 , Pages 3296-3309 ; 10638210 (ISSN) Farbeh, H ; Sadat Mirzadeh, N ; Farhady Ghalaty, N ; Miremadi, S. G ; Fazeli, M ; Asadi, H ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2016

Abstract

Scratchpad memory (SPM) is widely used in modern embedded processors to overcome the limitations of cache memory. The high vulnerability of SPM to soft errors, however, limits its usage in safety-critical applications. This paper proposes an efficient fault-tolerant scheme, called cache-assisted duplicated SPM (CADS), to protect SPM against soft errors. The main aim of CADS is to utilize cache memory to provide a replica for SPM lines. Using cache memory, CADS is able to guarantee a full duplication of all SPM lines. We also further enhance the proposed scheme by presenting buffered CADS (BCADS) that significantly improves the CADS energy efficiency. BCADS is compared with two well-known...

A hybrid Non-Volatile Cache Design for Solid-State Drives using comprehensive I/O characterization

, Article IEEE Transactions on Computers ; Volume 65, Issue 6 , 2016 , Pages 1678-1691 ; 00189340 (ISSN) Tarihi, M ; Asadi, H ; Haghdoost, A ; Arjomand, M ; Sarbazi Azad, H ; Sharif University of Technology

IEEE Computer Society

Abstract

The emergence of new memory technologies provides us with an opportunity to enhance the properties of existing memory architectures. One such technology is Phase Change Memory (PCM) which boasts superior scalability, power savings, non-volatility, and a performance competitive to Dynamic Random Access Memory (DRAM). In this paper, we propose a write buffer architecture for Solid-State Drives (SSDs) which attempts to exploit PCM as a DRAM alternative while alleviating its issues such as long write latency, high write energy, and finite endurance. To this end and based on thorough I/O characterization of desktop and enterprise applications, we propose a hybrid DRAM-PCM SSD cache design with an...

Reconfigurable multicast routing for Networks on Chip

, Article Microprocessors and Microsystems ; Volume 42 , 2016 , Pages 180-189 ; 01419331 (ISSN) Nasiri, F ; Sarbazi Azad, H ; Khademzadeh, A ; Sharif University of Technology

Elsevier

Abstract

Several unicast and multicast routing protocols have been presented for MPSoCs. Multicast protocols in NoCs are used for cache coherency in distributed shared memory systems, replication, barrier synchronization, or clock synchronization. Unicast routing algorithms are not suitable for multicast, as they increase traffic, congestion and deadlock probability. Famous multicast schemes such as tree-based and path-based schemes have been proposed originally for multicomputers and recently adapted to NoCs. In this paper, we propose a switch tree-based multicast scheme, called STBA. This method supports tree construction with a minimum number of routers. Our evaluation results reveal that, for...

An operating system level data migration scheme in hybrid DRAM-NVM memory architecture

, Article Proceedings of the 2016 Design, Automation and Test in Europe Conference and Exhibition, DATE 2016, 14 March 2016 through 18 March 2016 ; 2016 , Pages 936-941 ; 9783981537062 (ISBN) Salkhordeh, R ; Asadi, H ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2016

Abstract

With the emergence of Non-Volatile Memories (NVMs) and their shortcomings such as limited endurance and high power consumption in write requests, several studies have suggested hybrid memory architecture employing both Dynamic Random Access Memory (DRAM) and NVM in a memory system. By conducting comprehensive experiments, we have observed that such studies fail to consider very important aspects of hybrid memories including the effect of: a) data migrations on performance, b) data migrations on power, and c) the granularity of data migration. This paper presents an efficient data migration scheme at the Operating System level in a hybrid DRAM-NVM memory architecture. In the proposed...

Data block partitioning for recovering stuck-at faults in PCMs

, Article 2017 IEEE International Conference on Networking, Architecture, and Storage, NAS 2017 - Proceedings, 7 August 2017 through 9 August 2017 ; 2017 ; 9781538634868 (ISBN) Asadinia, M ; Jalili, M ; Sarbazi Azad, H ; Sharif University of Technology

Abstract

The main obstacles to DRAM scalability are leakage and charge storage restrictions. Phase Change Memory (PCM) is known as a promising candidate for the replacement of DRAM among competitive non-volatile memories. However, this memory suffers from low cell reliability due to limited write endurance. This problem can lead to some memory cells permanently stuck at either '0' or '1'. Therefore, a robust error recovery scheme is needed to overcome this problem and recover from hard errors. State-of-the-art solutions apply error correction and recovery techniques at inter-line or intra-line level. Precisely, they can improve PCM endurance either by remapping failed lines to spares (in...

OPTIMAS: overwrite purging through in-execution memory address snooping to improve lifetime of NVM-based scratchpad memories

, Article IEEE Transactions on Device and Materials Reliability ; Volume 17, Issue 3 , 2017 , Pages 481-489 ; 15304388 (ISSN) Hosseini Monazzah, A. M ; Farbeh, H ; Miremadi, S. G ; Sharif University of Technology

Abstract

SRAM-based scratchpad memories (SPMs) used in embedded systems impose high leakage power. Designing SPMs based on non-volatile memories (NVMs) has been proposed, as NVMs have negligible leakage power. The main problem of utilizing NVMs across the SPM is their limited number of write cycles (endurance). This problem threatens the reliability of NVM-based SPMs. To alleviate the problem of limited endurance in NVM-based SPMs, this paper proposes a method, called overwrite purging through in-execution memory address snooping (OPTIMAS). The main idea behind the proposed method is to control the lifetime of NVM-based SPMs, directly by a hardware unit, outside of the SPM mapping algorithm. This idea...

High-Performance predictable NVM-based instruction memory for real-time embedded systems

, Article IEEE Transactions on Emerging Topics in Computing ; 2018 ; 21686750 (ISSN) Bazzaz, M ; Hoseinghorban, A ; Poursafaei, F ; Ejlali, A ; Sharif University of Technology

IEEE Computer Society 2018

Abstract

Worst case execution time and energy consumption are two of the most important design constraints of real-time embedded systems. Many recent studies have tried to improve the memory subsystem of embedded systems by using emerging non-volatile memories. However, accessing these memories imposes performance and energy overhead, and using them as the code memory could increase the worst case execution time of the system. In this paper, a new code memory architecture for non-volatile memories is proposed which reduces the effective memory access latency by employing a memory access interleaving technique. Unlike common instruction access latency improvement techniques such as prefetching and...

A resistive RAM-based FPGA architecture equipped with efficient programming circuitry

, Article IEEE Transactions on Circuits and Systems I: Regular Papers ; Volume 65, Issue 7 , 2018 , Pages 2196-2209 ; 15498328 (ISSN) Khaleghi, B ; Asadi, H ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2018

Abstract

Although considerable effort has been put into the application of Non-Volatile Memories (NVMs) in Field-Programmable Gate Arrays (FPGAs), previously suggested designs are not mature enough to substitute the state-of-the-art SRAM-based counterparts, mainly due to the inefficient building blocks and/or the overhead of programming structure which can impair their potential benefits. In this paper, we present a Resistive Random Access Memory (RRAM)-based FPGA architecture employing efficient Switch Box (SB) and Look-Up Table (LUT) designs with programming circuitry integrated in both SB and LUT designs that creates area and power efficient programmable components while precluding performance...

An efficient hybrid I/O caching architecture using heterogeneous SSDs

, Article IEEE Transactions on Parallel and Distributed Systems ; Volume 30, Issue 6 , 2019 , Pages 1238-1250 ; 10459219 (ISSN) Salkhordeh, R ; Hadizadeh, M ; Asadi, H ; Sharif University of Technology

IEEE Computer Society 2019

Abstract

The storage subsystem is considered the performance bottleneck of computer systems in data-intensive applications. Solid-State Drives (SSDs) are emerging storage devices which, unlike Hard Disk Drives (HDDs), do not have mechanical parts and therefore have superior performance compared to HDDs. Due to the high cost of SSDs, entirely replacing HDDs with SSDs is not economically justified. Additionally, SSDs can endure a limited number of writes before failing. To mitigate the shortcomings of SSDs while taking advantage of their high performance, SSD caching is practiced in both academia and industry. Previously proposed caching architectures have only focused on either performance or endurance...

Evaluating reliability of SSD-Based I/O caches in enterprise storage systems

, Article IEEE Transactions on Emerging Topics in Computing ; 2019 ; 21686750 (ISSN) Ahmadian, S ; Taheri, F ; Asadi, H ; Sharif University of Technology

IEEE Computer Society 2019

Abstract

I/O caching techniques are widely employed in enterprise storage systems in order to enhance performance of I/O intensive applications in large-scale data centers. Due to higher performance compared to Hard Disk Drives (HDDs) and lower price and nonvolatility compared to Dynamic Random-Access Memories (DRAM), Flash-based Solid-State Drives (SSDs) are used as the main medium in the caching layer of storage systems. Although SSDs are known as non-volatile devices, recent studies have reported a large number of data failures due to power outages in SSDs. To overcome the reliability implications of SSD-based I/O caching schemes, RAID-1 (mirrored) configuration is commonly used to avoid data loss...