Sharif Digital Repository / Sharif University of Technology / Search result

RMAP: A reliability-aware application mapping for network-on-chips

, Article Proceedings - 3rd International Conference on Dependability, DEPEND 2010, 18 July 2010 through 25 July 2010 ; July , 2010 , Pages 112-117 ; 9780769540900 (ISBN) Patooghy, A ; Tabkhi, H ; Miremadi, S. G ; IARIA ; Sharif University of Technology

2010

Abstract

This paper proposes a reliability-aware application mapping for mesh-based NoCs. The proposed reliable mapping, called RMAP, adds redundant communications to the application graph in order to improve the reliability of packet delivery in NoCs. The RMAP divides the application graph into two sub-graphs which have the lowest possible communication with each other. One of the sub-graphs is mapped on the upper triangular nodes of the NoC and the other is mapped on the lower triangular nodes. In this way, lower traffic load is imposed on some channels which are efficiently used to route packets of redundant communications. This minimizes the overheads imposed to the NoC due to redundant...

AFRA: A low cost high performance reliable routing for 3D mesh NoCs

, Article Proceedings -Design, Automation and Test in Europe, DATE ; 2012 , Pages 332-337 ; 15301591 (ISSN) ; 9783981080186 (ISBN) Akbari, S ; Shafiee, A ; Fathy, M ; Berangi, R ; Sharif University of Technology

2012

Abstract

Three-dimensional network-on-chips are suitable communication fabrics for high-density 3D many-core ICs. Such networks have shorter communication hop count, compared to 2D NoCs, and enjoy fast and power efficient TSV wires in vertical links. Unfortunately, the fabrication process of TSV connections has not matured yet, which results in poor vertical links yield. In this work, we address this challenge and introduce AFRA, a deadlock-free routing algorithm for 3D mesh-based NoCs that tolerates faults on vertical links. AFRA is designed to be simple, high performance, and robust. The simplicity is achieved by applying ZXY and XZXY routings in the absence and presence of fault, respectively....

Implementation-aware model analysis: The case of buffer-throughput tradeoff in streaming applications

, Article Proceedings of the ACM SIGPLAN Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), 18 June 2015 through 19 June 2015 ; Volume 2015-June , 2015 , Pages 108-117 ; 9781450332576 (ISBN) Mirzazad Barijough, K ; Hashemi, M ; Khibin, V ; Ghiasi, S ; Sharif University of Technology

Association for Computing Machinery 2015

Abstract

Models of computation abstract away a number of implementation details in favor of well-defined semantics. While this has unquestionable benefits, we argue that analysis of models solely based on operational semantics (implementation oblivious analysis) is unfit to drive implementation design space exploration. Specifically, we study the tradeoff between buffer size and streaming throughput in applications modeled as synchronous data flow (SDF) graphs. We demonstrate the inherent inaccuracy of implementation-oblivious approach, which only considers SDF operational semantic. We propose a rigorous transformation, which equips the state of the art buffer-throughput tradeoff analysis technique...

Implementation-aware model analysis: The case of buffer-throughput tradeoff in streaming applications

, Article ACM SIGPLAN Notices ; Volume 50, Issue 5 , May , 2015 , Pages 103-112 ; 15232867 (ISSN) Barijough, K. M ; Hashemi, M ; Khibin, V ; Ghiasi, S ; Sharif University of Technology

Association for Computing Machinery 2015

Abstract

Models of computation abstract away a number of implementation details in favor of well-defined semantics. While this has unquestionable benefits, we argue that analysis of models solely based on operational semantics (implementationoblivious analysis) is unfit to drive implementation design space exploration. Specifically, we study the tradeoff between buffer size and streaming throughput in applications modeled as synchronous data flow (SDF) graphs. We demonstrate the inherent inaccuracy of implementationoblivious approach, which only considers SDF operational semantic. We propose a rigorous transformation, which equips the state of the art buffer-throughput tradeoff analysis technique...

A loss aware scalable topology for photonic on chip interconnection networks

, Article Journal of Supercomputing ; Vol. 68, Issue. 1 , April , 2014 , pp. 106-135 ; ISSN: 1573-0484 (online) Reza, A ; Sarbazi Azad, H ; Khademzadeh, A ; Shabani, H ; Niazmand, B ; Sharif University of Technology

Abstract

The demand for robust computation systems has led to the increment of the number of processing cores in current chips. As the number of processing cores increases, current electrical communication means can introduce serious challenges in system performance due to the restrictions in power consumption and communication bandwidth. Contemporary progresses in silicon nano-photonic technology have provided a suitable platform for constructing photonic communication links as an alternative for overcoming such problems. Topology is one of the most significant characteristics of photonic interconnection networks. In this paper, we have introduced a novel topology, aiming to reduce insertion loss in...

New approach to calculate energy on NoC

, Article 2008 International Conference on Computer and Communication Engineering, ICCCE08: Global Links for Human Development, Kuala Lumpur, 13 May 2008 through 15 May 2008 ; 2008 , Pages 1098-1104 ; 9781424416929 (ISBN) Ghadiry, M. H ; Nadi, M ; Rahmati, D ; Sharif University of Technology

2008

Abstract

Low scalability and power efficiency of the shared bus in SoCs is a motivation to use on chip networks instead of traditional buses. In this paper we have modified the Orion power model to reach an analytical model to estimate the average message energy in K-Ary n-Cubes with focus on the number of virtual channels. Afterward by using the power model and also the performance model proposed in [11] the effect of number of virtual channels on Energy-Delay product have been analyzed. In addition a cycle accurate power and performance simulator have been implemented in VHDL to verify the results. ©2008 IEEE

A markovian performance model for networks-on-chip

, Article Proceedings of the 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing, PDP 2008, 13 February 2008 through 15 February 2008, Toulouse ; 2008 , Pages 157-164 ; 0769530893 (ISBN); 9780769530895 (ISBN) Kiasari, A. E ; Rahmati, D ; Sarbazi Azad, H ; Hessabi, S ; Sharif University of Technology

2008

Abstract

Network-on-Chip (NoC) has been proposed as a solution for addressing the design challenges of future high-performance nanoscale architectures. Thus, it is of crucial importance for a designer to ha ve access to fast methods for evaluating the performance of on-chip networks. To this end, we present a Markovian model for evaluating the latency and energy consumption of on-chip networks. We compute the a verage delay due to path contention, virtual channel and crossbar switch arbitration using a queuing-based approach, which can capture the blocking phenomena of wormhole switching quite accurately. The model is then used to estimate the power consumption of all routers in NoCs. The performance...

Effect of number of faults on NoC power and performance

, Article 13th International Conference on Parallel and Distributed Systems, ICPADS, Hsinchu, 5 December 2007 through 7 December 2007 ; Volume 1 , December , 2007 ; 15219097 (ISSN); 9781424418909 (ISBN) Ghadiry, M. H ; Nadi, M ; Manzuri Shalmani, M. T ; Rahmati, D ; Sharif University of Technology

2007

Abstract

According to International Technology Roadmap for Semiconductors (ITRS), before the end of this decade, we will be entering the era of a billion transistors on a single chip. The major threat toward the achievement of billion transistor on a chip is poor scalability of current interconnect infrastructure. With the advent of "Network on Chip (NoC)" various characters and methodologies of traditional networks were hardly considered on-chip. Failure, Power and Area are the major concepts that should be considered when migrating from traditional interconnection networks to NoCs. In this paper we study the effects of faulty links and nodes on power and performance of mesh based NoC, Also several...

Efficient nearest-neighbor data sharing in GPUs

, Article ACM Transactions on Architecture and Code Optimization ; Volume 18, Issue 1 , 2021 ; 15443566 (ISSN) Nematollahi, N ; Sadrosadati, M ; Falahati, H ; Barkhordar, M ; Drumond, M. P ; Sarbazi Azad, H ; Falsafi, B ; Sharif University of Technology

Association for Computing Machinery 2021

Abstract

Stencil codes (a.k.a. nearest-neighbor computations) are widely used in image processing, machine learning, and scientific applications. Stencil codes incur nearest-neighbor data exchange because the value of each point in the structured grid is calculated as a function of its value and the values of a subset of its nearest-neighbor points. When running on Graphics Processing Unit (GPUs), stencil codes exhibit a high degree of data sharing between nearest-neighbor threads. Sharing is typically implemented through shared memories, shuffle instructions, and on-chip caches and often incurs performance overheads due to the redundancy in memory accesses. In this article, we propose Neighbor Data...