Search for: virtual-channel
Total 27 records

    Reducing Power of On-chip Networks by Exploiting Latency Asymmetry of Router’s Pipeline Stages

    , M.Sc. Thesis Sharif University of Technology Sadrosadati, Mohammad (Author) ; Sarbazi Azad, Hamid (Supervisor)
    Abstract
    NoCs contribute a large portion of the power consumption of a many-core SoC. A significant fraction of this power is consumed by the buffers, the crossbar, and the links. This thesis therefore introduces a new method that substantially reduces NoC power consumption. The method exploits the latency asymmetry of router pipeline stages to reduce dynamic power, using different voltage swings for the buffers, links, and crossbar while maintaining performance. Moreover, since static power consumption has gained noticeable importance in recent years, a method for reducing this power component is also... 

    Virtual point-to-point links in packet-switched NoCs

    , Article IEEE Computer Society Annual Symposium on VLSI: Trends in VLSI Technology and Design, ISVLSI 2008, Montpellier, 7 April 2008 through 9 April 2008 ; 2008 , Pages 433-436 ; 9780769531700 (ISBN) Modarressi, M ; Sarbazi Azad, H ; Tavakkol, A ; Sharif University of Technology
    2008
    Abstract
    A method to set up virtual point-to-point links between the cores of a packet-switched network-on-chip is presented in this paper, with the aim of reducing NoC power consumption and delay. The proposed router architecture provides packet switching as well as a number of virtual point-to-point (VIP, for short) connections. This is achieved by designating one virtual channel at each physical channel of a router to bypass the router pipeline. The mapping and routing algorithm exploits these virtual channels and tries to virtually connect the source and destination nodes of high-volume communication flows during task-graph mapping and route selection... 

    Traffic-load-aware virtual channel power-gating in network-on-chips

    , Article Advances in Computers ; 2021 ; 00652458 (ISSN) Sadrosadati, M ; Mirhosseini, A ; Akbarzadeh, N ; Modarressi, M ; Sarbazi Azad, H ; Sharif University of Technology
    Academic Press Inc  2021
    Abstract
    Network-on-Chips (NoCs) employ several virtual channels per input port to mitigate head-of-line blocking when transmitting network packets. Unfortunately, these virtual channels are power-hungry resources that contribute significantly to the total power consumption of NoCs. In particular, we make the key observation that even under high traffic load, a number of virtual channels are idle, imposing significant static power overhead. Prior works use power gating to switch off idle VCs and reduce static power consumption. However, we observe that prior works are mostly suitable for low traffic loads and are ineffective under high traffic loads. In this chapter, we aim to propose a... 

    Traffic-load-aware virtual channel power-gating in network-on-chips

    , Article Advances in Computers ; Volume 124 , 2022 , Pages 1-19 ; 00652458 (ISSN); 9780323856881 (ISBN) Sadrosadati, M ; Mirhosseini, A ; Akbarzadeh, N ; Modarressi, M ; Sarbazi Azad, H ; Sharif University of Technology
    Academic Press Inc  2022
    Abstract
    Network-on-Chips (NoCs) employ several virtual channels per input port to mitigate head-of-line blocking when transmitting network packets. Unfortunately, these virtual channels are power-hungry resources that contribute significantly to the total power consumption of NoCs. In particular, we make the key observation that even under high traffic load, a number of virtual channels are idle, imposing significant static power overhead. Prior works use power gating to switch off idle VCs and reduce static power consumption. However, we observe that prior works are mostly suitable for low traffic loads and are ineffective under high traffic loads. In this chapter, we aim to propose a... 

    The impacts of timing constraints on virtual channels multiplexing in interconnect networks

    , Article 25th IEEE International Performance, Computing, and Communications Conference, 2006, IPCCC 2006, Phoenix, AZ, 10 April 2006 through 12 April 2006 ; Volume 2006 , 2006 , Pages 55-62 ; 1424401976 (ISBN); 9781424401970 (ISBN) Khonsari, A ; Ould Khaoua, M ; Nayebi, A ; Sarbazi Azad, H ; Sharif University of Technology
    2006
    Abstract
    Interconnect networks employing wormhole switching play a critical role in shared-memory multiprocessor systems-on-chip (MPSoC) designs, multicomputer systems, and system area networks. Virtual channels greatly improve the performance of wormhole-switched networks because they reduce blocking by acting as "bypass" lanes for non-blocked messages. Capturing the effects of virtual channel multiplexing has always been a crucial issue for any analytical model proposed for wormhole-switched networks. Dally [8] has developed a model to investigate the behaviour of this multiplexing, which has been widely employed in the subsequent analytical models of most routing algorithms suggested in the... 

    The effect of virtual channel organization on the performance of interconnection networks

    , Article 19th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2005, Denver, CO, 4 April 2005 through 8 April 2005 ; Volume 2005 , 2005 ; 0769523129 (ISBN); 9780769523125 (ISBN) Rezazad, M ; Sarbazi Azad, H ; Sharif University of Technology
    2005
    Abstract
    Most previous studies have assessed performance for regular buffer and virtual channel organizations without considering an overall buffer size constraint. In this paper, the performance of mesh-based interconnection networks (mesh, torus and hypercube networks) under different traffic patterns (uniform, hotspot, and matrix-transpose) is studied. We investigate the effect of the number of virtual channels and their buffer lengths on the performance of these topologies when the total buffer size associated with each physical channel (and thus the router buffer size) is fixed. The results show that the optimal number of virtual channels and buffer length highly depends on the traffic... 

    Power-efficient deterministic and adaptive routing in torus networks-on-chip

    , Article Microprocessors and Microsystems ; Vol. 36, issue. 7 , October , 2012 , pp. 571-585 ; ISSN: 01419331 Rahmati, D ; Sarbazi-Azad, H ; Hessabi, S ; Kiasari, A. E ; Sharif University of Technology
    Abstract
    Modern SoC architectures use NoCs for high-speed inter-IP communication. For NoC architectures, high-performance, efficient routing algorithms with low power consumption are essential for real-time applications. NoCs with mesh and torus interconnection topologies are now popular due to their simple structures. A torus NoC is very similar to a mesh NoC, but has a somewhat smaller diameter. For a routing algorithm to be deadlock-free in a torus, at least two virtual channels per physical channel must be used to avoid cyclic channel dependencies due to the wrap-around links; in a mesh network, however, deadlock freedom can be ensured using only one virtual channel. The employed number of virtual... 

    Power-efficient deterministic and adaptive routing in torus networks-on-chip

    , Article Microprocessors and Microsystems ; Volume 36, Issue 7 , 2012 , Pages 571-585 ; 01419331 (ISSN) Rahmati, D ; Sarbazi Azad, H ; Hessabi, S ; Kiasari, A. E ; Sharif University of Technology
    Elsevier  2012
    Abstract
    Modern SoC architectures use NoCs for high-speed inter-IP communication. For NoC architectures, high-performance, efficient routing algorithms with low power consumption are essential for real-time applications. NoCs with mesh and torus interconnection topologies are now popular due to their simple structures. A torus NoC is very similar to a mesh NoC, but has a somewhat smaller diameter. For a routing algorithm to be deadlock-free in a torus, at least two virtual channels per physical channel must be used to avoid cyclic channel dependencies due to the wrap-around links; in a mesh network, however, deadlock freedom can be ensured using only one virtual channel. The employed number of virtual... 

    Performance evaluation of fully adaptive routing under different workloads and constant node buffer size

    , Article 11th International Conference on Parallel and Distributed Systems Workshops, ICPADS 2005, Fukuoka, 20 July 2005 through 22 July 2005 ; Volume 2 , 2005 , Pages 510-514 ; 15219097 (ISSN); 0769522815 (ISBN) Rezazad, M ; Sarbazi Azad, H ; Ma J ; Yang L. T ; Sharif University of Technology
    2005
    Abstract
    In this paper, the performance of some popular direct interconnection networks, namely the mesh, torus and hypercube, is studied with adaptive wormhole routing for different traffic patterns. We investigate the effect of the number of virtual channels and the depth of their buffers on the performance of such strictly orthogonal topologies under uniform, hot-spot and matrix-transpose traffic patterns for generated messages, while the total buffer size associated with each physical channel is kept constant. In addition, we analyze the effect of escape channel buffer length on the performance of a fully adaptive routing algorithm. It is shown that the optimal number of virtual channels and buffer... 

    P2R2: Parallel Pseudo-Round-Robin arbiter for high performance NoCs

    , Article Integration, the VLSI Journal ; November , 2014 ; ISSN: 01679260 Bashizade, R ; Sarbazi-Azad, H ; Sharif University of Technology
    Abstract
    Networks-on-Chip (NoCs) play an important role in the performance of Chip Multi-Processors (CMPs). Providing the desired performance under the heavy traffic imposed by some applications requires NoC routers to have a large number of Virtual Channels (VCs). Increasing the number of VCs, however, adds to the delay of the critical path of the arbitration logic, and hence restricts the clock frequency of the router. To enjoy the benefits of many VCs while keeping the clock frequency as high as that of a low-VC router, we propose the Parallel Pseudo-Round-Robin (P2R2) arbiter. Our proposal is based on processing multiple groups of requests in parallel. Our... 

    P2R2: Parallel Pseudo-Round-Robin arbiter for high performance NoCs

    , Article Integration, the VLSI Journal ; Volume 50 , 2014 , pp.173–182 ; ISSN: 0167-9260 Bashizade, R ; Sarbazi-Azad, H ; Sharif University of Technology
    Abstract
    Networks-on-Chip (NoCs) play an important role in the performance of Chip Multi-Processors (CMPs). Providing the desired performance under the heavy traffic imposed by some applications requires NoC routers to have a large number of Virtual Channels (VCs). Increasing the number of VCs, however, adds to the delay of the critical path of the arbitration logic, and hence restricts the clock frequency of the router. To enjoy the benefits of many VCs while keeping the clock frequency as high as that of a low-VC router, we propose the Parallel Pseudo-Round-Robin (P2R2) arbiter. Our proposal is based on processing multiple groups of requests in parallel. Our... 

    P2R2: Parallel pseudo-round-robin arbiter for high performance NoCs

    , Article Integration, the VLSI Journal ; Volume 50 , June , 2015 , Pages 173-182 ; 01679260 (ISSN) Bashizade, R ; Sarbazi Azad, H ; Sharif University of Technology
    Elsevier  2015
    Abstract
    Networks-on-Chip (NoCs) play an important role in the performance of Chip Multi-Processors (CMPs). Providing the desired performance under the heavy traffic imposed by some applications requires NoC routers to have a large number of Virtual Channels (VCs). Increasing the number of VCs, however, adds to the delay of the critical path of the arbitration logic, and hence restricts the clock frequency of the router. To enjoy the benefits of many VCs while keeping the clock frequency as high as that of a low-VC router, we propose the Parallel Pseudo-Round-Robin (P2R2) arbiter. Our proposal is based on processing multiple groups of requests in parallel. Our... 

    New approach to calculate energy on NoC

    , Article 2008 International Conference on Computer and Communication Engineering, ICCCE08: Global Links for Human Development, Kuala Lumpur, 13 May 2008 through 15 May 2008 ; 2008 , Pages 1098-1104 ; 9781424416929 (ISBN) Ghadiry, M. H ; Nadi, M ; Rahmati, D ; Sharif University of Technology
    2008
    Abstract
    The low scalability and power efficiency of the shared bus in SoCs motivate the use of on-chip networks instead of traditional buses. In this paper, we modify the Orion power model to obtain an analytical model that estimates the average message energy in k-ary n-cubes, with a focus on the number of virtual channels. Then, using this power model together with the performance model proposed in [11], the effect of the number of virtual channels on the energy-delay product is analyzed. In addition, a cycle-accurate power and performance simulator has been implemented in VHDL to verify the results. ©2008 IEEE  

    Modelling and evaluation of adaptive routing in high-performance n-D tori networks

    , Article Simulation Modelling Practice and Theory ; Volume 14, Issue 6 , 2006 , Pages 740-751 ; 1569190X (ISSN) Sarbazi Azad, H ; Ould-Khaoua, M ; Sharif University of Technology
    2006
    Abstract
    Many fully-adaptive algorithms have been proposed to overcome the performance limitations of deterministic routing in networks used in high-performance multicomputers, such as the well-known regular n-D torus. This paper proposes a simple yet reasonably accurate analytical performance model to predict message communication latency in tori networks. This model requires a running time of O(1) which is the fastest model yet reported in the literature. Extensive simulations reveal that the new performance model maintains a reasonable accuracy when the network operates under different traffic conditions. The model is then used to perform an extensive investigation into the performance merits of... 

    Efficient genetic based topological mapping using analytical models for on-chip networks

    , Article Journal of Computer and System Sciences ; Volume 79, Issue 4 , 2013 , Pages 492-513 ; 00220000 (ISSN) Arjomand, M ; Amiri, S. H ; Sarbazi Azad, H ; Sharif University of Technology
    2013
    Abstract
    Network-on-Chips are now the popular communication medium to support inter-IP communications in complex on-chip systems with tens to hundreds of IP cores. Higher scalability (compared to the traditional shared bus and point-to-point interconnects), throughput, and reliability are among the most important advantages of NoCs. Moreover, NoCs match current CAD methodologies well, since these mainly rely on modular and reusable structures with regular structural patterns. However, since NoCs are resource-limited, determining how to distribute application load over limited on-chip resources (e.g. switches, buffers, virtual channels, and wires) in order to improve the metrics of interest and satisfy... 

    A simple and efficient fault-tolerant adaptive routing algorithm for meshes

    , Article 8th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2008, 9 June 2008 through 11 June 2008 ; Volume 5022 LNCS , 2008 , Pages 54-57 ; 03029743 (ISSN) ; 9783540695004 (ISBN) Shamaei, A ; Nayebi, A ; Sarbazi Azad, H ; Sharif University of Technology
    2008
    Abstract
    The planar-adaptive routing algorithm is a simple method to enhance wormhole routing algorithms for fault tolerance in meshes, but it cannot handle faults on the boundaries of the mesh without excessive loss of performance. In this paper, we show that this algorithm can be further improved using a flag bit introduced for guiding misrouted messages. The proposed algorithm can thus be used to route messages when fault regions touch the boundaries of the mesh. We also show that our scheme does not diminish the performance of the network and that only three virtual channels per physical channel are sufficient for tolerating multiple faulty regions on the boundary. © 2008 Springer-Verlag Berlin Heidelberg... 

    A new routing algorithm for irregular mesh NoCs

    , Article 2008 International SoC Design Conference, ISOCC 2008, Busan, 24 November 2008 through 25 November 2008 ; Volume 1 , 2008 , Pages I260-I264 ; 9781424425990 (ISBN) Samadi Bokharaei, V ; Shamaei, A ; Sarbazi Azad, H ; Abbaspour, M ; Sharif University of Technology
    2008
    Abstract
    Network-on-Chips (NoCs) usually use regular mesh-based topologies. Regular mesh topologies are not always efficient because of power and area constraints which should be considered in designing system-on-chips. To overcome this problem, irregular mesh NoCs are used, for which the design of routing algorithms is an important issue. This paper presents a novel routing algorithm for irregular mesh-based NoCs called "i-route". In contrast to other routing algorithms, this algorithm can be implemented on any arbitrary irregular mesh NoC without any change in the place of IPs. In this algorithm, messages are routed using only 2 classes of virtual channels. Simulation results show that using only 2... 

    An energy-efficient virtual channel power-gating mechanism for on-chip networks

    , Article Proceedings -Design, Automation and Test in Europe, DATE, 9 March 2015 through 13 March 2015 ; Volume 2015-April , March , 2015 , Pages 1527-1532 ; 15301591 (ISSN) ; 9783981537048 (ISBN) Mirhosseini, A ; Sadrosadati, M ; Fakhrzadehgan, A ; Modarressi, M ; Sarbazi Azad, H ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2015
    Abstract
    Power-gating is a promising method for reducing the leakage power of digital systems. In this paper, we propose a novel power-gating scheme for virtual channels in on-chip networks that uses an adaptive method to dynamically adjust the number of active VCs based on the on-chip traffic characteristics. Since virtual channels are used to provide higher throughput under high traffic loads, our method sets the number of virtual channels at each port selectively based on the workload demand, and thereby does not negatively affect performance. Evaluation results show that by using this scheme, about 40% average reduction in static power consumption can be achieved with negligible performance overhead.

    Analytic performance comparison of hypercubes and star graphs with implementation constraints

    , Article Journal of Computer and System Sciences ; Volume 74, Issue 6 , September , 2008 , Pages 1000-1012 ; 00220000 (ISSN) Kiasari, A. E ; Sarbazi Azad, H ; Sharif University of Technology
    2008
    Abstract
    Many theory-based comparison studies, relying on graph structural and algorithmic properties, have been conducted for the hypercube and the star graph. None of these studies, however, has considered real working conditions and implementation limits. We have compared the performance of the star and hypercube networks for different message lengths and numbers of virtual channels, and considered two implementation constraints, namely constant bisection bandwidth and constant node pin-out. We use two accurate analytical models, already proposed for the star graph and hypercube, and implement the parameter changes imposed by technological implementation constraints. When no constraint is... 

    Analysis of k-ary n-cubes with dimension-ordered routing

    , Article CCGrid 2002, Berlin, 21 May 2002 through 24 May 2002 ; Volume 19, Issue 4 , 2003 , Pages 493-502 ; 0167739X (ISSN) Sarbazi Azad, H ; Khonsari, A ; Ould Khaoua, M ; Sharif University of Technology
    2003
    Abstract
    K-ary n-cubes have been one of the most popular interconnection networks for practical multicomputers due to their ease of implementation and ability to exploit communication locality found in many parallel applications. This paper describes an analytical model for k-ary n-cubes with dimension-ordered routing. The main feature of the model is its ability to capture network performance when an arbitrary number of virtual channels is used to reduce message blocking. Simulation experiments reveal that the latency results predicted by the analytical model are in good agreement with those provided by the simulation model. © 2002 Elsevier Science B.V. All rights reserved