Search for: register-files

    LTRF: enabling high-capacity register files for GPUs via hardware/software cooperative register prefetching

    , Article 23rd International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2018, 24 March 2018 through 28 March 2018 ; 2018 , Pages 489-502 ; 9781450349116 (ISBN) Sadrosadati, M ; Mirhosseini, A ; Ehsani, S. B ; Sarbazi Azad, H ; Drumond, M ; Falsafi, B ; Ausavarungnirun, R ; Mutlu, O ; Sharif University of Technology
    Association for Computing Machinery  2018
    Abstract
    Graphics Processing Units (GPUs) employ large register files to accommodate all active threads and accelerate context switching. Unfortunately, register files are a scalability bottleneck for future GPUs due to long access latency, high power consumption, and large silicon area provisioning. Prior work proposes a hierarchical register file to reduce the register file power consumption by caching registers in a smaller register file cache. Unfortunately, this approach does not improve register access latency due to the low hit rate in the register file cache. In this paper, we propose the Latency-Tolerant Register File (LTRF) architecture to achieve low latency in a two-level hierarchical... 

    A low energy soft error-tolerant register file architecture for embedded processors

    , Article 11th IEEE High Assurance Systems Engineering Symposium, HASE 2008, Nanjing, 3 December 2008 through 5 December 2008 ; December , 2008 , Pages 109-116 ; 15302059 (ISSN); 9780769534824 (ISBN) Fazeli, M ; Ahmadian, S. N ; Miremadi, S. G ; Nanjing University; IEEE Computer Society; IEEE Reliability Society ; Sharif University of Technology
    2008
    Abstract
    This paper presents a soft error-tolerant architecture to protect embedded processors' register files. The proposed architecture is based on selective duplication of the most vulnerable register values in a cache memory embedded beside the processor register file, the so-called register cache. To do this, two parity bits are added to each register of the processor to detect up to three contiguous errors. To recover the erroneous register value, two distinct cache memories are utilized for storing the redundant copy of the vulnerable registers, one for short-lived registers and the other for long-lived registers. The proposed method has two key advantages as compared to fully ECC protected... 
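The abstract does not say how the two parity bits are arranged. One common arrangement that detects any burst of up to three contiguous bit flips is interleaved parity, with one bit over the even positions and one over the odd positions. The sketch below illustrates that property under this assumption; the function names and the 32-bit width are hypothetical and not taken from the paper.

```python
def interleaved_parity(value, width=32):
    """Return (even_parity, odd_parity) over even- and odd-indexed bits.

    Assumption: two parity bits interleaved over bit positions; the paper's
    actual parity layout is not given in the abstract.
    """
    even = odd = 0
    for i in range(width):
        bit = (value >> i) & 1
        if i % 2 == 0:
            even ^= bit
        else:
            odd ^= bit
    return even, odd


def burst_detected(original, start, length, width=32):
    """Flip `length` contiguous bits from `start`; True if parity changes."""
    mask = ((1 << length) - 1) << start
    corrupted = (original ^ mask) & ((1 << width) - 1)
    return interleaved_parity(original, width) != interleaved_parity(corrupted, width)
```

With this layout, a burst of one, two, or three adjacent flips always changes at least one parity bit, while a burst of four adjacent flips changes neither, which is why the scheme can only detect (not correct) and only up to three contiguous errors.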

    Analyzing fault effects in the 32-bit OpenRISC 1200 microprocessor

    , Article ARES 2008 - 3rd International Conference on Availability, Security, and Reliability, Proceedings, 4 March 2008 through 7 March 2008, Barcelona ; 2008 , Pages 648-652 ; 0769531024 (ISBN); 9780769531021 (ISBN) Mehdizadeh, N ; Shokrolah Shirazi, M ; Miremadi, S. G ; Sharif University of Technology
    2008
    Abstract
    This paper presents an analysis of the effects and propagation of faults in the open-core 32-bit OpenRISC 1200 microprocessor. The analysis is based on a total of 13,000 transient faults injected into 65 parts of the CPU module in the OpenRISC 1200 core described at the RTL level. The effects of faults on the various parts of the CPU are compared, including the pipeline registers, CPU components such as the register file, the control unit, and the ALU, and the data and address buses. It is shown that about 30%, 40% and 27% of injected faults terminated in address, data, and control errors respectively. About 28% of all injected faults resulted in failures. © 2008 IEEE  

    , M.Sc. Thesis Sharif University of Technology Jahanghir, Elahe (Author) ; Jahed, Mehran (Supervisor) ; Miremadi, Ghasem (Supervisor)
    Abstract
    Considering the ever-expanding applications of embedded systems in all aspects of human life, reliability and fault tolerance of these systems have become vital. To increase the reliability of a microprocessor, the most critical component of an embedded system, one should note the essential role played by its register bank. In fact, the register bank is the most critical subcomponent of an embedded system, greatly affecting the reliability of the overall system. The operation of the embedded system also depends critically on efficient usage of power, as most systems rely on batteries. In this project, to evaluate the availability of register banks various... 

    Value-Aware low-power register file architecture

    , Article CADS 2012 - 16th CSI International Symposium on Computer Architecture and Digital Systems ; 2012 , Pages 44-49 ; 9781467314824 (ISBN) Ahmadian, S. N ; Fazeli, M ; Ghalaty, N. F ; Miremadi, S. G ; Sharif University of Technology
    2012
    Abstract
    In this paper, we propose a low power register file architecture for embedded processors. The proposed architecture, "Value-Aware Partitioned Register File (VAP-RF)", employs a partitioning technique that divides the register file into two partitions such that the most frequently accessed registers are stored in the smaller register partition. In our partitioning algorithm, we introduce an aggressive clock-gating scheme based on narrow-value registers to further reduce power. Experimental results on an ARM processor for selected MiBench workloads show that the proposed architecture has an average power saving of 70% over a generic register file structure.

    An efficient technique to tolerate MBU faults in register file of embedded processors

    , Article CADS 2012 - 16th CSI International Symposium on Computer Architecture and Digital Systems ; 2012 , Pages 115-120 ; 9781467314824 (ISBN) Abazari, M. A ; Fazeli, M ; Patooghy, A ; Miremadi, S. G ; Sharif University of Technology
    2012
    Abstract
    This paper presents a Data Width-aware Register file Protection (DWRP) technique to cope with Multiple Bit Upsets (MBUs) occurring in the register file of embedded processors. The DWRP technique is based on the fact that there are often a significant number of bits in the register file that are not occupied by data. The DWRP technique efficiently exploits these available free bits for reliability enhancement purposes. In this regard, every register is equipped with three extra tag bits to specify the number of available free bits in a register. Then the appropriate parity or Hamming code is used based on the information of the tag field to protect the register file... 

    Proposing a Scalable and Energy-aware Architecture for Register File of GPUs

    , Ph.D. Dissertation Sharif University of Technology Sadrosadati, Mohammad (Author) ; Sarbazi-Azad, Hamid (Supervisor)
    Abstract
    Graphics Processing Units (GPUs) employ large register files to accommodate all active threads and accelerate context switching. Unfortunately, register files are a scalability bottleneck for future GPUs due to long access latency, high power consumption, and large silicon area provisioning. In this thesis, we propose the Latency-Tolerant Register File (LTRF) architecture to achieve low latency in a two-level hierarchical structure. We observe that compile-time interval analysis enables us to divide GPU program execution into intervals with an accurate estimate of a warp’s aggregate register working-set within each interval. The key idea of LTRF is to prefetch the estimated register... 

    Improving the Efficiency of GPUs by Reducing Register File Accesses

    , M.Sc. Thesis Sharif University of Technology Mohammadpur Fard, Ali (Author) ; Sarbazi Azad, Hamid (Supervisor)
    Abstract
    Graphics Processing Units (GPUs) use a large register file to support a large number of parallel threads, which is responsible for a large fraction of the device’s total power consumption, and die area. Due to the conventional RISC-like instruction set architecture, a reasonably large fraction of all accesses to the register file are performed to accommodate the memory accesses performed by the threads, which limits the available register file bandwidth for other concurrent accesses, and also keeps at least one register per thread busy for storing loaded values. In this thesis, we propose moving away from the conventional RISC-like architecture and allowing memory operands for some... 

    Dynamically adaptive register file architecture for energy reduction in embedded processors

    , Article Microprocessors and Microsystems ; Volume 39, Issue 2 , March , 2015 , Pages 49-63 ; 01419331 (ISSN) Khavari Tavana, M ; Ahmadian Khameneh, S ; Goudarzi, M ; Sharif University of Technology
    Elsevier  2015
    Abstract
    Energy reduction in embedded processors is a must since most embedded systems run on batteries and processor energy reduction helps increase usage time before needing a recharge. Register files are among the most power consuming parts of a processor core. Register file power consumption mainly depends on its size (height as well as width), especially in newer technologies where leakage power is increasing. We provide a register file architecture that, depending on the application behavior, dynamically (i) adapts the width of individual registers, and (ii) puts partitions of temporarily unused registers into low-power mode, so as to save both static and dynamic power. We show that our scheme... 

    Highly concurrent latency-tolerant register files for GPUs

    , Article ACM Transactions on Computer Systems ; Volume 37, Issue 1-4 , 2021 ; 07342071 (ISSN) Sadrosadati, M ; Mirhosseini, A ; Hajiabadi, A ; Ehsani, S. B ; Falahati, H ; Sarbazi Azad, H ; Drumond, M ; Falsafi, B ; Ausavarungnirun, R ; Mutlu, O ; Sharif University of Technology
    Association for Computing Machinery  2021
    Abstract
    Graphics Processing Units (GPUs) employ large register files to accommodate all active threads and accelerate context switching. Unfortunately, register files are a scalability bottleneck for future GPUs due to long access latency, high power consumption, and large silicon area provisioning. Prior work proposes a hierarchical register file to reduce the register file power consumption by caching registers in a smaller register file cache. Unfortunately, this approach does not improve register access latency due to the low hit rate in the register file cache. In this article, we propose the Latency-Tolerant Register File (LTRF) architecture to achieve low latency in a two-level hierarchical... 

    Design of a Fault Tolerant SPARC Based Micro Processor On FPGA

    , M.Sc. Thesis Sharif University of Technology Hosseini, Morteza (Author) ; Rashidian, Bijan (Supervisor) ; Vosoughi Vahdat, Bijan (Supervisor)
    Abstract
    In this thesis, the LEON processor was chosen for its compatible architecture, which can be implemented on a wide range of FPGAs. The final designed processor aims to overcome soft and hard errors that occur due to cosmic radiation in the SRAM cells of an FPGA. The system can resist all single SEUs that happen in flip-flops and all 4 random errors that take place in each register of the register file. All the flip-flops and latches are protected using a TMR scheme. The information redundancy in the register file to overcome all 4 random errors is 168%, and the errors are corrected by means of a mechanism that is masked from the processor core. In cache memory, each 32 bit data is... 

    Design of a Fault Tolerant ARM-Based Processor on FPGA

    , M.Sc. Thesis Sharif University of Technology Esmaeeli, Siamak (Author) ; Rashidian, Bijan (Supervisor) ; Vosughi Vahdat, Bijan (Supervisor)
    Abstract
    Charged particles in space strike the silicon surface of an embedded system in a satellite and cause faults in its operation, so methods should be employed to reduce the effects of these faults. Methods implemented at the system level are widely used because of their low cost and high reliability. Processors are responsible for performing the main processes in embedded systems. ARM processors are good choices for use in satellites because of their small size, low power consumption and high performance. Also, FPGAs have made a major improvement in embedded system design. So with implementing ... 

    A multi-bit error tolerant register file for a high reliable embedded processor

    , Article 2011 18th IEEE International Conference on Electronics, Circuits, and Systems, ICECS 2011, 11 December 2011 through 14 December 2011 ; December , 2011 , Pages 532-537 ; 9781457718458 (ISBN) Esmaeeli, S ; Hosseini, M ; Vahdat, B. V ; Rashidian, B ; Sharif University of Technology
    2011
    Abstract
    The vulnerability of microprocessors to soft errors is increasing due to the continued shrinking of the fabrication process. Recent studies show that 1-5% of SEUs (single event upsets) can cause MBUs (multiple bit upsets). The probability of MBU generation due to an SEU is increasing because of the reduction in the minimum energy required to flip a memory bit in modern technologies. The register file is the most sensitive component in a microprocessor. In this paper, we present an innovative way to protect registers in a 64-bit register file for a RISC processor using extended Hamming (8, 4) code (SEC-DED code) and narrow-width values. A narrow-width value can be represented by half the number of bits of the... 
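As background for the extended Hamming (8, 4) SEC-DED code named in this abstract, the following is a minimal textbook sketch of encoding and decoding a single 4-bit value. It is not the paper's register-file design: the bit layout, function names, and status strings are assumptions made for illustration, and the way the paper combines the code with narrow-width values is not shown because the abstract is truncated.

```python
def hamming84_encode(nibble):
    """Encode 4 data bits as [p0, p1, p2, d1, p4, d2, d3, d4].

    Index 1..7 is the classic Hamming(7,4) position; index 0 holds an
    overall parity bit that enables double-error detection.
    """
    d1, d2, d3, d4 = [(nibble >> i) & 1 for i in range(4)]
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    word = [0, p1, p2, d1, p4, d2, d3, d4]
    word[0] = sum(word[1:]) % 2  # overall parity over the 7 Hamming bits
    return word


def hamming84_decode(word):
    """Return (nibble, status); status is 'ok', 'corrected' or 'double-error'."""
    w = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if w[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    overall_odd = sum(w) % 2
    if overall_odd:
        # Odd number of flips: assume one; syndrome 0 means p0 itself flipped.
        w[syndrome] ^= 1
        status = "corrected"
    elif syndrome:
        # Even parity but non-zero syndrome: two bits flipped, uncorrectable.
        return None, "double-error"
    else:
        status = "ok"
    nibble = w[3] | (w[5] << 1) | (w[6] << 2) | (w[7] << 3)
    return nibble, status
```

Each 4-bit group costs 4 check bits, so the code doubles storage; this is consistent with the abstract's interest in values that fit in half a register, although the exact mapping used in the paper is not stated here.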

    Robust register caching: An energy-efficient circuit-level technique to combat soft errors in embedded processors

    , Article IEEE Transactions on Device and Materials Reliability ; Volume 10, Issue 2 , February , 2010 , Pages 208-221 ; 15304388 (ISSN) Fazeli, M ; Namazi, A ; Miremadi, S. G ; Sharif University of Technology
    2010
    Abstract
    This paper presents a cost-efficient technique to jointly use circuit- and architecture-level techniques to protect an embedded processor's register file against soft errors. The basic idea behind the proposed technique is robust register caching (RRC), which creates a cache of the most vulnerable registers within the register file in a small and highly robust cache memory built from circuit-level single-event-upset-protected memory cells. To guarantee that the most vulnerable registers are always stored in the robust register cache, the average number of read operations during a register's lifetime is used as a metric to guide the cache replacement policy. A register is vulnerable to soft... 

    An energy efficient circuit level technique to protect register file from MBUs and SETs in embedded processors

    , Article Proceedings of the International Conference on Dependable Systems and Networks, 29 June 2009 through 2 July 2009, Lisbon ; 2009 , Pages 195-204 ; 9781424444212 (ISBN) Fazeli, M ; Namazi, A ; Miremadi, S.G ; Sharif University of Technology
    2009
    Abstract
    This paper presents a circuit-level soft-error-tolerant technique, called RRC (Robust Register Caching), for the register file of embedded processors. The basic idea behind the RRC is to effectively cache the most vulnerable registers in a small, highly robust register cache built from circuit-level SEU- and SET-protected memory cells. To decide which cache entry should be replaced, the average number of read operations during a register's ACE time is used as the criterion. In fact, the victim cache entry is the one which has the maximum read count. To minimize the power overhead of the RRC, the clock gating technique is efficiently exploited for the main register file resulting in... 

    A power efficient approach to fault-tolerant register file design

    , Article Proceedings of the IEEE International Frequency Control Symposium and Exposition, 4 January 2008 through 8 January 2008, Hyderabad ; 2008 , Pages 21-26 ; 0769530834 (ISBN); 9780769530833 (ISBN) Amiri Kamalabad, M ; Miremadi, S. G ; Fazeli, M ; Sharif University of Technology
    2008
    Abstract
    Recently, the trade-off between power consumption and fault tolerance in embedded processors has been highlighted. This paper proposes an approach to reduce dynamic power of conventional high-level fault-tolerant techniques used in the register file of processors, without affecting the effectiveness of the fault-tolerant techniques. The power reduction is based on the reduction of dynamic power of the unaccessed parts of the register file. This approach is applied to three transient fault-tolerant techniques: Single Error Correction (SEC) Hamming code, duplication with parity, and Triple Modular Redundancy (TMR). As a case study, this approach is implemented on the register file of an...
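For readers unfamiliar with two of the baseline techniques listed in this abstract, the sketch below shows their usual read-side behaviour in software form. It illustrates the generic techniques only, not the paper's power-reduction approach; the function names are hypothetical, and duplication with parity is assumed to mean two stored copies, each with its own even-parity bit.

```python
def tmr_vote(a, b, c):
    """Triple Modular Redundancy: bitwise majority of three register copies."""
    return (a & b) | (a & c) | (b & c)


def parity(x):
    """Even-parity bit of a non-negative integer (1 if the popcount is odd)."""
    return bin(x).count("1") % 2


def dup_parity_read(copy0, p0, copy1, p1):
    """Duplication with parity (assumed layout: two copies, one parity bit each).

    Return the first copy whose stored parity bit still matches its contents,
    or None if both copies fail the check.
    """
    if parity(copy0) == p0:
        return copy0
    if parity(copy1) == p1:
        return copy1
    return None
```

TMR masks any error pattern confined to one copy, whereas duplication with parity recovers only when at least one copy passes its parity check, so an even number of flips within a copy goes unnoticed.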