
Dynamically adaptive register file architecture for energy reduction in embedded processors

Article, Microprocessors and Microsystems; Volume 39, Issue 2, March 2015, Pages 49-63; 01419331 (ISSN); Khavari Tavana, M; Ahmadian Khameneh, S; Goudarzi, M; Sharif University of Technology

Elsevier 2015

Abstract

Energy reduction in embedded processors is a must, since most embedded systems run on batteries and reducing processor energy increases usage time between recharges. Register files are among the most power-consuming parts of a processor core. Register file power consumption mainly depends on its size (height as well as width), especially in newer technologies where leakage power is increasing. We provide a register file architecture that, depending on the application behavior, dynamically (i) adapts the width of individual registers, and (ii) puts partitions of temporarily unused registers into low-power mode, so as to save both static and dynamic power. We show that our scheme...

Towards dark silicon era in FPGAs using complementary hard logic design

Article, Conference Digest - 24th International Conference on Field Programmable Logic and Applications, FPL 2014; Sept 2014, pp. 1-6; ISBN: 9783000446450; Ahari, A; Khaleghi, B; Ebrahimi, Z; Asadi, H; Tahoori, M. B; Sharif University of Technology

Abstract

While the transistor density continues to grow exponentially in Field-Programmable Gate Arrays (FPGAs), the increased leakage current of CMOS transistors acts as a power wall for the aggressive integration of transistors in a single die. One recent trend to alleviate the power wall in FPGAs is to turn off inactive regions of the silicon die, referred to as dark silicon. This paper presents a reconfigurable architecture to enable effective fine-grained power gating of unused Logic Blocks (LBs) in FPGAs. In the proposed architecture, the traditional soft logic is replaced with Mega Cells (MCs), each consisting of a set of complementary Generic Reconfigurable Hard Logic (GRHL) and a conventional...

An accurate instruction-level energy estimation model and tool for embedded systems

Article, IEEE Transactions on Instrumentation and Measurement; Volume 62, Issue 7, March 2013, Pages 1927-1934; 00189456 (ISSN); Bazzaz, M; Salehi, M; Ejlali, A; Sharif University of Technology

2013

Abstract

Estimating the energy consumption of applications is a key aspect of optimizing the energy consumption of embedded systems. This paper proposes a simple yet accurate instruction-level energy estimation model for embedded systems. As a case study, the model parameters were determined for a commonly used ARM7TDMI-based microcontroller. The total energy includes the energy consumption of the processor core, Flash memory, memory controller, and SRAM. The model parameters are the instruction opcode, the number of shift operations, register bank bit flips, the instruction weight and Hamming distance, and the different types of memory accesses. The effect of pipeline stalls has also been considered. In order to...

An operating system level data migration scheme in hybrid DRAM-NVM memory architecture

Article, Proceedings of the 2016 Design, Automation and Test in Europe Conference and Exhibition, DATE 2016, 14 March 2016 through 18 March 2016; 2016, Pages 936-941; 9783981537062 (ISBN); Salkhordeh, R; Asadi, H; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2016

Abstract

With the emergence of Non-Volatile Memories (NVMs) and their shortcomings, such as limited endurance and high power consumption in write requests, several studies have suggested hybrid memory architectures employing both Dynamic Random Access Memory (DRAM) and NVM in a memory system. By conducting comprehensive experiments, we have observed that such studies fail to consider very important aspects of hybrid memories, including the effect of: a) data migrations on performance, b) data migrations on power, and c) the granularity of data migration. This paper presents an efficient data migration scheme at the Operating System level in a hybrid DRAM-NVM memory architecture. In the proposed...

Efficient nearest-neighbor data sharing in GPUs

Article, ACM Transactions on Architecture and Code Optimization; Volume 18, Issue 1, 2021; 15443566 (ISSN); Nematollahi, N; Sadrosadati, M; Falahati, H; Barkhordar, M; Drumond, M. P; Sarbazi Azad, H; Falsafi, B; Sharif University of Technology

Association for Computing Machinery 2021

Abstract

Stencil codes (a.k.a. nearest-neighbor computations) are widely used in image processing, machine learning, and scientific applications. Stencil codes incur nearest-neighbor data exchange because the value of each point in the structured grid is calculated as a function of its value and the values of a subset of its nearest-neighbor points. When running on Graphics Processing Units (GPUs), stencil codes exhibit a high degree of data sharing between nearest-neighbor threads. Sharing is typically implemented through shared memories, shuffle instructions, and on-chip caches, and often incurs performance overheads due to the redundancy in memory accesses. In this article, we propose Neighbor Data...