Search for:
high-power-consumption
Highly concurrent latency-tolerant register files for GPUs
Article ACM Transactions on Computer Systems ; Volume 37, Issue 1-4, 2021 ; 07342071 (ISSN) ; Mirhosseini, A ; Hajiabadi, A ; Ehsani, S. B ; Falahati, H ; Sarbazi Azad, H ; Drumond, M ; Falsafi, B ; Ausavarungnirun, R ; Mutlu, O ; Sharif University of Technology
Association for Computing Machinery
2021
Abstract
Graphics Processing Units (GPUs) employ large register files to accommodate all active threads and accelerate context switching. Unfortunately, register files are a scalability bottleneck for future GPUs due to long access latency, high power consumption, and large silicon area provisioning. Prior work proposes a hierarchical register file to reduce register file power consumption by caching registers in a smaller register file cache. Unfortunately, this approach does not improve register access latency due to the low hit rate in the register file cache. In this article, we propose the Latency-Tolerant Register File (LTRF) architecture to achieve low latency in a two-level hierarchical...
Low-power arithmetic unit for DSP applications
Article International Symposium on System on Chip, SoC ; 31 October-2 November 2011, pp. 68-71 ; ISBN: 9781457706721 ; Nikounia, S. H ; Jahangir, A. H ; Sharif University of Technology
Abstract
DSP algorithms are among the most important components of modern embedded computer systems. These applications generally include fixed-point and floating-point arithmetic operations and trigonometric functions, which have long latencies and high power consumption. Nonetheless, DSP applications have some interesting characteristics, such as tolerating a slight loss of accuracy and a high degree of value locality, which can be exploited to improve their power consumption and performance. In this paper, we present an application-specific result-cache that aims to reduce the power consumption and latency of DSP algorithms by reusing the results of the arithmetic operations executed on the same...
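The result-reuse idea in this abstract can be illustrated with a minimal software sketch. It is not the authors' hardware design: the class name `ResultCache`, the LRU replacement policy, and the operand quantization controlled by `frac_bits` are all assumptions chosen for illustration. Operands are rounded to a fixed number of fractional bits before lookup, so nearby inputs share one entry (trading a slight loss of accuracy for more hits).

```python
import math
from collections import OrderedDict


class ResultCache:
    """Hypothetical LRU cache keyed on (operation name, quantized operands)."""

    def __init__(self, capacity=64, frac_bits=8):
        self.capacity = capacity
        self.scale = 1 << frac_bits   # operands are quantized to 1/scale steps
        self.entries = OrderedDict()  # insertion order doubles as LRU order
        self.hits = 0
        self.misses = 0

    def _key(self, op_name, operands):
        # Quantizing the operands lets nearby values map to the same entry.
        return (op_name, tuple(round(x * self.scale) for x in operands))

    def compute(self, op_name, func, *operands):
        key = self._key(op_name, operands)
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)     # mark as most recently used
            return self.entries[key]
        self.misses += 1
        result = func(*operands)              # the expensive operation
        self.entries[key] = result
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return result


cache = ResultCache(capacity=2, frac_bits=4)
a = cache.compute("sin", math.sin, 0.5)
b = cache.compute("sin", math.sin, 0.501)  # same quantized operand: reused
```

With `frac_bits=4`, both 0.5 and 0.501 quantize to the same key, so the second call returns the stored result instead of recomputing. A larger `frac_bits` reduces the accuracy loss but also lowers the hit rate.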
Low-power arithmetic unit for DSP applications
Article 2015 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems, SPICES 2015 ; 2011, Pages 68-71 ; 9781457706721 (ISBN) ; Nikounia, S. H ; Jahangir, A. H ; Sharif University of Technology
Abstract
DSP algorithms are among the most important components of modern embedded computer systems. These applications generally include fixed-point and floating-point arithmetic operations and trigonometric functions, which have long latencies and high power consumption. Nonetheless, DSP applications have some interesting characteristics, such as tolerating a slight loss of accuracy and a high degree of value locality, which can be exploited to improve their power consumption and performance. In this paper, we present an application-specific result-cache that aims to reduce the power consumption and latency of DSP algorithms by reusing the results of the arithmetic operations executed on the same...
An operating system level data migration scheme in hybrid DRAM-NVM memory architecture
Article Proceedings of the 2016 Design, Automation and Test in Europe Conference and Exhibition, DATE 2016, 14 March 2016 through 18 March 2016 ; 2016, Pages 936-941 ; 9783981537062 (ISBN) ; Asadi, H ; Sharif University of Technology
Institute of Electrical and Electronics Engineers Inc
2016
Abstract
With the emergence of Non-Volatile Memories (NVMs) and their shortcomings, such as limited endurance and high power consumption in write requests, several studies have suggested a hybrid memory architecture employing both Dynamic Random Access Memory (DRAM) and NVM in a memory system. By conducting comprehensive experiments, we have observed that such studies fail to consider very important aspects of hybrid memories, including the effect of: a) data migrations on performance, b) data migrations on power, and c) the granularity of data migration. This paper presents an efficient data migration scheme at the Operating System level in a hybrid DRAM-NVM memory architecture. In the proposed...
Energy-Efficient permanent fault tolerance in hard real-time systems
Article IEEE Transactions on Computers ; 2019 ; 00189340 (ISSN) ; Bakhshalipour, M ; Sadrosadati, M ; Sarbazi Azad, H ; Sharif University of Technology
IEEE Computer Society
2019
Abstract
Triple Modular Redundancy (TMR) is a long-established approach for masking various kinds of faults. By employing redundancy and analyzing the results of three separate executions of the same program, TMR is able to attain excellent levels of reliability. While TMR provides a desirable level of reliability, it suffers from the high power consumption of the redundant hardware, a severe detriment to its broad adoption. The energy consumption of TMR can be mitigated if its operations are divided into two stages, and one stage is dropped in the absence of faults. Such an approach, which has been evaluated in recent research, however, quickly fails in the presence of permanent faults, as we...
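As a rough illustration of the two-stage idea mentioned in this abstract, the sketch below runs two replicas first and invokes the third only when they disagree. This is one possible reading, not the paper's scheme: the function names `majority_vote` and `two_stage_tmr`, and the choice to model replicas as plain Python callables, are assumptions made for the example.

```python
from collections import Counter


def majority_vote(results):
    """Return the value produced by at least two of the three executions."""
    value, count = Counter(results).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority among replica results")
    return value


def two_stage_tmr(replicas, x):
    """Hypothetical two-stage TMR: returns (result, executions performed)."""
    # Stage 1: run only two replicas and compare their results.
    first = replicas[0](x)
    second = replicas[1](x)
    if first == second:
        return first, 2  # no fault observed, so the third execution is dropped
    # Stage 2: disagreement, so run the third replica and vote.
    third = replicas[2](x)
    return majority_vote([first, second, third]), 3


def good(x):
    return x * x


def faulty(x):
    return x * x + 1  # models a replica that always returns a wrong value


fault_free = two_stage_tmr([good, good, good], 3)
with_fault = two_stage_tmr([good, faulty, good], 3)
```

In this sketch, a replica that is faulty on every call forces the third execution every time, so the saving from dropping a stage disappears for as long as that replica stays in the first stage.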
An efficient SRAM-Based reconfigurable architecture for embedded processors
Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ; Volume 38, Issue 3, 2019, Pages 466-479 ; 02780070 (ISSN) ; Ebrahimi, Z ; Khaleghi, B ; Asadi, H ; Sharif University of Technology
Institute of Electrical and Electronics Engineers Inc
2019
Abstract
Nowadays, embedded processors are used in a wide range of domains, from low-power to safety-critical applications. By providing prominent features such as support for a variety of peripherals and flexibility for partial or major design modifications, field-programmable gate arrays (FPGAs) are commonly used to implement either an entire embedded system or a hardware description language-based processor, known as a soft-core processor. FPGA-based designs, however, suffer from high power consumption, large die area, and low performance, which hinder the common use of soft-core processors in low-power embedded systems. In this paper, we present an efficient reconfigurable architecture to implement soft-core...
Peak-power-aware energy management for periodic real-time applications
Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ; Volume 39, Issue 4, 2020, Pages 779-788 ; Yeganeh Khaksar, A ; Safari, S ; Ejlali, A ; Sharif University of Technology
Institute of Electrical and Electronics Engineers Inc
2020
Abstract
Two main objectives in designing real-time embedded systems are high reliability and low power consumption. Hardware replication (e.g., standby-sparing) can provide high reliability while keeping the power consumption under control. In this paper, we consider a standby-sparing system where the main tasks on primary cores are scheduled by our proposed peak-power-aware earliest-deadline-first policy while the backup tasks on spare cores are scheduled by our proposed peak-power-aware earliest-deadline-late policy to meet the chip thermal design power (TDP) constraint. These policies provide the best opportunity to shift the task executions as much as possible to minimize execution overlaps...