An efficient dynamically reconfigurable on-chip network architecture

Article Proceedings - Design Automation Conference, 13 June 2010 through 18 June 2010 ; June , 2010 , Pages 166-169 ; 0738100X (ISSN) ; 9781450300025 (ISBN) Modarressi, M ; Sarbazi Azad, H ; Tavakkol, A ; Sharif University of Technology

2010

Abstract

In this paper, we present a reconfigurable architecture for NoCs on which arbitrary application-specific topologies can be implemented. The proposed NoC can dynamically tailor its topology to the traffic pattern of different applications at run-time. The run-time topology construction mechanism involves monitoring the network traffic and changing the inter-node connections in order to reduce the number of intermediate routers between the source and destination nodes of heavy communication flows. This mechanism should also preserve the NoC connectivity. In this paper, we first introduce the proposed reconfigurable topology and then address the problem of run-time topology reconfiguration....
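
The monitoring step described in the abstract can be illustrated with a minimal sketch. It assumes a 2D mesh with minimal routing and a hypothetical ranking heuristic (traffic volume times hop count); neither the mesh assumption nor the heuristic is taken from the paper, and the names `hops`, `rank_flows`, and `traffic` are illustrative only.

```python
def hops(src, dst):
    """Hop count between two (x, y) nodes on a 2D mesh with minimal routing (assumed topology)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def rank_flows(traffic):
    """Order flows by volume * hops, largest first (hypothetical heuristic)."""
    return sorted(traffic, key=lambda flow: traffic[flow] * hops(*flow), reverse=True)


# Hypothetical monitored traffic: (source, destination) -> packets observed.
traffic = {
    ((0, 0), (3, 3)): 500,
    ((0, 0), (0, 1)): 900,
    ((1, 2), (2, 2)): 50,
}
ranked = rank_flows(traffic)
```

Under these assumptions, the flow from (0, 0) to (3, 3) ranks first: it carries less traffic than the neighbouring flow but crosses far more routers, so shortening its path would save the most router traversals.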

Virtual point-to-point links in packet-switched NoCs

Article IEEE Computer Society Annual Symposium on VLSI: Trends in VLSI Technology and Design, ISVLSI 2008, Montpellier, 7 April 2008 through 9 April 2008 ; 2008 , Pages 433-436 ; 9780769531700 (ISBN) Modarressi, M ; Sarbazi Azad, H ; Tavakkol, A ; Sharif University of Technology

2008

Abstract

This paper presents a method to set up virtual point-to-point links between the cores of a packet-switched network-on-chip (NoC), with the aim of reducing NoC power consumption and delay. The proposed router architecture provides packet switching as well as a number of virtual point-to-point (VIP, for short) connections. This is achieved by designating one virtual channel at each physical channel of a router to bypass the router pipeline. The mapping and routing algorithm exploits these virtual channels and tries to virtually connect the source and destination nodes of high-volume communication flows during task-graph mapping and route selection...

Supporting non-contiguous processor allocation in mesh-based CMPs using virtual point-to-point links

Article Proceedings - Design, Automation and Test in Europe, DATE ; 2011 , p. 413-418 ; ISSN: 15301591 ; ISBN: 9783981080179 Asadinia, M ; Modarressi, M ; Tavakkol, A ; Sarbazi-Azad, H ; Sharif University of Technology

2011

Abstract

In this paper, we propose a processor allocation mechanism for run-time assignment of a set of communicating tasks of input applications onto the processing nodes of a Chip Multiprocessor (CMP), when the arrival order and execution lifetime of the input applications are not known a priori. This mechanism targets on-chip communication and aims to reduce the power and latency of the NoC employed as the communication infrastructure. In this work, we benefit from the advantages of non-contiguous processor allocation mechanisms by allowing the tasks of the input application to be mapped onto disjoint regions (sub-meshes) and then virtually connecting them by bypassing the router pipeline stages of...

Energy-optimized on-chip networks using reconfigurable shortcut paths

Article Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 24 February 2011 through 25 February 2011 ; Volume 6566 LNCS , February , 2011 , Pages 231-242 ; 03029743 (ISSN) ; 9783642191367 (ISBN) Teimouri, N ; Modarressi, M ; Tavakkol, A ; Sarbazi Azad, H ; Sharif University of Technology

2011

Abstract

Topology is an important network attribute that greatly affects the power, performance, cost, and design time/effort of NoCs. In this paper, we propose a novel NoC architecture that can exploit the benefits of both application-specific and regular NoC topologies. To this end, a subset of NoC links bypass the router pipeline stages and directly connect remotely located nodes. This results in an NoC which holds both fixed connections between adjacent nodes and long connections virtually connecting non-adjacent nodes. These shortcut paths are constructed at run-time by employing a simple and fast mechanism composed of two processes: on-chip traffic monitoring and path reconfiguration. The...

Supporting non-contiguous processor allocation in mesh-based CMPs using virtual point-to-point links

Article Proceedings - Design, Automation and Test in Europe, DATE, 14 March 2011 through 18 March 2011 ; 2011 , Pages 413-418 ; 15301591 (ISSN) ; 9783981080179 (ISBN) Asadinia, M ; Modarressi, M ; Tavakkol, A ; Sarbazi Azad, H ; Sharif University of Technology

2011

Abstract

In this paper, we propose a processor allocation mechanism for run-time assignment of a set of communicating tasks of input applications onto the processing nodes of a Chip Multiprocessor (CMP), when the arrival order and execution lifetime of the input applications are not known a priori. This mechanism targets on-chip communication and aims to reduce the power and latency of the NoC employed as the communication infrastructure. In this work, we benefit from the advantages of non-contiguous processor allocation mechanisms by allowing the tasks of the input application to be mapped onto disjoint regions (sub-meshes) and then virtually connecting them by bypassing the router pipeline stages of...

Energy analysis of re-injection based deadlock recovery routing algorithms

Article 2008 International Symposium on System-on-Chip, SOC 2008, Tampere, 5 November 2008 through 6 November 2008 ; 2008 ; 9781424425419 (ISBN) Kooti, H ; Mirza Aghatabar, M ; Hessabi, S ; Tavakkol, A ; Sharif University of Technology

2008

Abstract

There are two strategies for handling deadlock in NoC routing algorithms: deadlock avoidance and deadlock recovery. Some deadlock recovery routing algorithms are re-injection based, such as Compressionless (CR), Software-Based (SW-TFAR), and AFBAR. Although their performance has been compared, no existing research has focused on the energy consumption of the various routing algorithms. We evaluate these routing algorithms according to their energy consumption and latency. Our experimental results show that deadlock recovery routing algorithms offer better performance but worse energy consumption than deadlock avoidance routing algorithms. In addition, the best and worst energy...

Quick generation of SSD performance models using machine learning

Article IEEE Transactions on Emerging Topics in Computing ; Volume 10, Issue 4 , 2022 , Pages 1821-1836 ; 21686750 (ISSN) Tarihi, M ; Azadvar, S ; Tavakkol, A ; Asadi, H ; Sarbazi Azad, H ; Sharif University of Technology

IEEE Computer Society 2022

Abstract

Increasing usage of Solid-State Drives (SSDs) has greatly boosted the performance of storage backends. SSDs perform many internal processes such as out-of-place writes, wear-leveling, and garbage collection. These operations are complex and not well documented, which makes it difficult to create accurate SSD simulators. Our survey indicates that, aside from requiring complex configuration, available SSD simulators do not support both sync and discard requests. Past performance models also ignore the long-term effect of I/O requests on SSD performance, which has been demonstrated to be significant. In this article, we utilize a methodology based on machine learning that extracts history-aware features at...

A Scalable and High-performance Design Architecture for SSD

Ph.D. Dissertation Sharif University of Technology Tavakkol, Arash (Author) ; Sarbazi Azad, Hamid (Supervisor)

Abstract

As a promising replacement for conventional high-latency, low-throughput HDDs, NAND Flash-based solid-state drives have been increasingly used in data center and cloud applications as well as high-end enterprise servers. However, capacity scaling is a controversial challenge for SSD manufacturers seeking to keep pace with competitors in the storage market. To date, SSD designs have been largely based on a multi-channel bus architecture that confronts serious scalability problems in high-end enterprise SSDs with dozens of Flash memory chips and a gigabyte host interface. This forces the community to rapidly change the bus-based inter-Flash standards to respond to ever-increasing...

Design for scalability in enterprise SSDs

Article Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT ; 24-27 August , 2014 , p. 417-429 ; ISSN: 1089795X ; ISBN: 9781450328098 Tavakkol, A ; Arjomand, M ; Sarbazi-Azad, H ; Sharif University of Technology

2014

Abstract

Solid State Drives (SSDs) have recently emerged as a high-speed random-access alternative to classical magnetic disks. To date, SSD designs have been largely based on a multi-channel bus architecture that confronts serious scalability problems in high-end enterprise SSDs with dozens of flash memory chips and a gigabyte host interface. This forces the community to rapidly change the bus-based inter-flash standards to respond to ever-increasing application demands. In this paper, we first take a close look at how different flash parameters and SSD internal designs affect the actual performance and scalability of the conventional architecture. Our experiments show that SSD performance improvement...

Unleashing the potentials of dynamism for page allocation strategies in SSDs

Article SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems ; 16-20 June , 2014 , pp. 551-552 ; ISBN: 9781450327893 Tavakkol, A ; Arjomand, M ; Sarbazi-Azad, H ; Sharif University of Technology

2014

Abstract

In Solid-State Drives (SSDs) with tens of flash chips and a highly parallel architecture, we can speed up I/O operations by utilizing resources well during page allocation. Proposals already exist for using static page allocation, which does not balance the I/O load and whose efficiency depends on access address patterns. To the best of our knowledge, there has been no research thus far to show what happens if one or more internal resources can be freely allocated regardless of the request address. This paper explores the possibility of using different degrees of dynamism in page allocation and identifies key design opportunities that they present to improve SSD characteristics.
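
The contrast between static and dynamic page allocation can be illustrated with a minimal sketch. It assumes a hypothetical SSD with four channels, a static policy that derives the channel from the logical page address by modulo striping, and a dynamic policy that picks the least-loaded channel regardless of the address. These policies and all names (`NUM_CHANNELS`, `static_channel`, `dynamic_channel`, `allocate`) are illustrative assumptions, not the schemes evaluated in the paper.

```python
from collections import Counter

NUM_CHANNELS = 4  # hypothetical SSD with 4 channels


def static_channel(lpa: int) -> int:
    """Static allocation: the channel is fixed by the logical page address (assumed modulo striping)."""
    return lpa % NUM_CHANNELS


def dynamic_channel(load: Counter) -> int:
    """Dynamic allocation: pick the least-loaded channel, ignoring the address."""
    return min(range(NUM_CHANNELS), key=lambda ch: (load[ch], ch))


def allocate(addresses, policy):
    """Return the number of pages placed on each channel under the given policy."""
    load = Counter()
    for lpa in addresses:
        ch = static_channel(lpa) if policy == "static" else dynamic_channel(load)
        load[ch] += 1
    return [load[ch] for ch in range(NUM_CHANNELS)]


# A strided write pattern whose addresses are all multiples of NUM_CHANNELS.
strided = [i * NUM_CHANNELS for i in range(8)]
static_load = allocate(strided, "static")
dynamic_load = allocate(strided, "dynamic")
```

With this strided pattern, the static policy sends all eight pages to channel 0, while the dynamic policy spreads them evenly, two per channel. This is the address-pattern dependence that the abstract attributes to static allocation.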

Network-on-SSD: A scalable and high-performance communication design paradigm for SSDs

Article IEEE Computer Architecture Letters ; Vol. 12, issue 1, Article number 6178186 , 2013 , pp. 5-8 ; ISSN: 15566056 Tavakkol, A ; Arjomand, M ; Sarbazi-Azad, H ; Sharif University of Technology

2013

Abstract

In recent years, flash memory solid state disks (SSDs) have shown great potential to change storage infrastructure because of their advantages of high-speed and high-throughput random access. This promising storage, however, greatly suffers from performance loss because of frequent "erase-before-write" and "garbage collection" operations. Thus, novel circuit-level, architectural, and algorithmic techniques are currently being explored to address these limitations. In parallel with others, the current study investigates replacing shared buses in the multi-channel architecture of SSDs with an interconnection network to achieve scalable, high-throughput, and reliable SSD storage systems. Roughly...

Network-on-SSD: A scalable and high-performance communication design paradigm for SSDs

Article IEEE Computer Architecture Letters ; Volume 12, Issue 1 , January-June , 2013 , Pages 5-8 ; 15566056 (ISSN) Tavakkol, A ; Arjomand, M ; Sarbazi Azad, H ; Sharif University of Technology

2013

Abstract

In recent years, flash memory solid state disks (SSDs) have shown great potential to change storage infrastructure because of their advantages of high-speed and high-throughput random access. This promising storage, however, greatly suffers from performance loss because of frequent "erase-before-write" and "garbage collection" operations. Thus, novel circuit-level, architectural, and algorithmic techniques are currently being explored to address these limitations. In parallel with others, the current study investigates replacing shared buses in the multi-channel architecture of SSDs with an interconnection network to achieve scalable, high-throughput, and reliable SSD storage systems. Roughly speaking,...

Venice: Improving solid-state drive parallelism at low cost via conflict-free accesses

Article Proceedings - International Symposium on Computer Architecture ; 2023 , Pages 504-519 ; 10636897 (ISSN) ; 979-840070095-8 (ISBN) Nadig, R ; Sadrosadati, M ; Mao, H ; Ghiasi, N. M ; Tavakkol, A ; Park, J ; Sarbazi Azad, H ; Luna, J. G ; Mutlu, O ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2023

Abstract

The performance and capacity of solid-state drives (SSDs) are continuously improving to meet the increasing demands of modern data-intensive applications. Unfortunately, communication between the SSD controller and memory chips (e.g., 2D/3D NAND flash chips) is a critical performance bottleneck for many applications. SSDs use a multi-channel shared bus architecture where multiple memory chips connected to the same channel communicate to the SSD controller with only one path. As a result, path conflicts often occur during the servicing of multiple I/O requests, which significantly limits SSD parallelism. It is critical to handle path conflicts well to improve SSD parallelism and performance....

ITAP: Idle-time-aware power management for GPU execution units

Article ACM Transactions on Architecture and Code Optimization ; Volume 16, Issue 1 , 2019 ; 15443566 (ISSN) Sadrosadati, M ; Ehsani, S. B ; Falahati, H ; Ausavarungnirun, R ; Tavakkol, A ; Abaee, M ; Orosa, L ; Wang, Y ; Sarbazi Azad, H ; Mutlu, O ; Sharif University of Technology

Association for Computing Machinery 2019

Abstract

Graphics Processing Units (GPUs) are widely used as the accelerator of choice for applications with massively data-parallel tasks. However, recent studies show that GPUs suffer heavily from resource underutilization, which, combined with their large static power consumption, imposes a significant power overhead. One of the most power-hungry components of a GPU, the execution units, frequently experience idleness when (1) an underutilized warp is issued to the execution units, leading to partial lane idleness, and (2) there is no active warp to be issued for execution due to warp stalls (e.g., waiting for memory access and synchronization). Although large in total, the idle time of...

Performance and power efficient on-chip communication using adaptive virtual point-to-point connections

Article 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip, NoCS 2009, San Diego, CA, 10 May 2009 through 13 May 2009 ; 2009 , Pages 203-212 ; 9781424441433 (ISBN) Modarressi, M ; Sarbazi Azad, H ; Tavakkol, A ; IEEE Circuits and Systems Society; Council for EDA; ACM Special Interest Group on Computer Architecture (SIGARCH); ACM Special Interest Group on Embedded Systems (SIGBED); ACM Special Interest Group on Design Automation (SIGDA); Silistix, Inc ; Sharif University of Technology

2009

Abstract

In this paper, we propose a packet-switched network-on-chip (NoC) architecture which can provide a number of low-power, low-latency virtual point-to-point connections for communication flows. The work aims to improve the power and performance metrics of packet-switched NoC architectures and benefits from the power and resource utilization advantages of NoCs and superior communication performance of point-to-point dedicated links. The virtual point-to-point connections are set up by bypassing the entire router pipeline stages of the intermediate nodes. This work addresses constructing the virtual point-to-point connections at run-time using a light-weight setup network. It involves monitoring...

Intelligent semi-active vibration control of eleven degrees of freedom suspension system using magnetorheological dampers

Article Journal of Mechanical Science and Technology ; Volume 26, Issue 2 , 2012 , Pages 323-334 ; 1738494X (ISSN) Zareh, S. H ; Sarrafan, A ; Khayyat, A. A. A ; Zabihollah, A ; Sharif University of Technology

2012

Abstract

A novel intelligent semi-active control system is proposed for the eleven-degrees-of-freedom suspension system of a passenger car, using magnetorheological (MR) dampers with a neuro-fuzzy (NF) control strategy to enhance the desired suspension performance. In comparison with earlier studies, an improvement in problem modeling is made. The proposed method consists of two parts: a fuzzy control strategy to establish an efficient controller to improve ride comfort and road handling (RCH) and an inverse mapping model to estimate the force needed for a semi-active damper. The fuzzy logic rules are extracted based on the Sugeno inference engine. The inverse mapping model is based on an artificial neural network...

Performance and exhaust emission characteristics of a spark ignition engine operated with gasoline and CNG blend

Article Proceedings of the Spring Technical Conference of the ASME Internal Combustion Engine Division ; 2012 , Pages 179-187 ; 15296598 (ISSN) ; 9780791844663 (ISBN) Dashti, M ; Hamidi, A. A ; Mozafari, A. A ; Sharif University of Technology

2012

Abstract

Using CNG as an additive for gasoline is a suitable choice because CNG-enriched gasoline has a higher octane number than gasoline alone. As a result, it is possible to use gasoline with a lower octane number in the engine. This also allows the compression ratio of SI engines to be increased, resulting in higher performance and lower gasoline consumption. Over the years, the use of simulation codes to model the thermodynamic cycle of an internal combustion engine has provided tools for more efficient engine designs and fuel combustion. In this study, a thermodynamic cycle simulation of a conventional four-stroke spark-ignition engine has been developed. The model is used to study the...

A comparative study of the performance of a SI engine fuelled by natural gas as alternative fuel by thermodynamic simulation

Article 2009 ASME Internal Combustion Engine Division Fall Technical Conference, ICEF 2009, Lucerne, 27 September 2009 through 30 September 2009 ; 2009 , Pages 49-57 ; 9780791843635 (ISBN) Dashti, M ; Hamidi, A. A ; Mozafari, A. A ; Sharif University of Technology

American Society of Mechanical Engineers (ASME) 2009

Abstract

With declining energy resources and increasing pollutant emissions, a great deal of effort has been focused on the development of alternatives to fossil fuels. One of the promising alternative fuels to gasoline in the internal combustion engine is natural gas [1-5]. The application of natural gas in current internal combustion engines is realistic due to its many benefits. The higher thermal efficiency due to the higher octane value, and the lower exhaust emissions (including CO2) resulting from the lower carbon-to-hydrogen ratio of the fuel, are two important features of using CNG as an alternative fuel. It is well known that computer simulation codes are valuable economically as a cost...

Analytical and experimental analyses of nonlinear vibrations in a rotary inverted pendulum

Article Nonlinear Dynamics ; Volume 107, Issue 3 , 2022 , Pages 1887-1902 ; 0924090X (ISSN) Dolatabad, M.R ; Pasharavesh, A ; Khayyat, A. A. A ; Sharif University of Technology

Springer Science and Business Media B.V 2022

Abstract

Gaining insight into possible vibratory responses of dynamical systems around their stable equilibria is an essential step, which must be taken before their design and application. The results of such a study can significantly help prevent instability in closed-loop stabilized systems by avoiding the excitation of the system in the neighborhood of its resonance. This paper investigates nonlinear oscillations of a rotary inverted pendulum (RIP) with a full-state feedback controller. Lagrange’s equations are employed to derive an accurate 2-DoF mathematical model, whose parameter values are extracted by both the measurement and 3D modeling of the real system components. Although the governing...