Application-aware deadlock-free oblivious routing based on extended turn-model
, Article IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD, 7 November 2011 through 10 November 2011, San Jose, CA ; 2011 , Pages 213-218 ; 10923152 (ISSN) ; 9781457713989 (ISBN) ; Zolghadr, M ; Arjomand, M ; Sarbazi-Azad, H ; Sharif University of Technology
2011
Abstract
Programmable hardware is gaining popularity as it can keep pace with growing performance demand in tight power budget, design and test cost, and serious reliability concerns of future multiprocessor embedded systems. Compatible with this trend, Network-on-Chip, as a potential bottleneck of future multi-cores, should also support programmability. Here, we address this issue in design and implementation of routing algorithm for two-dimensional mesh. To this end, we allocate paths based on input traffic pattern and in parallel with customizing routing restriction for deadlock freedom. To achieve this, we propose extended turn model (ETM), a novel parametric deadlock-free routing for 2D meshes...
Efficient processor allocation in a reconfigurable CMP architecture for dark silicon era
, Article Proceedings of the 34th IEEE International Conference on Computer Design, ICCD 2016, 2 October 2016 through 5 October 2016 ; 2016 , Pages 336-343 ; 9781509051427 (ISBN) ; Hoveida, M ; Arjomand, M ; Jalili, M ; Sarbazi Azad, H ; Sharif University of Technology
Institute of Electrical and Electronics Engineers Inc
2016
Abstract
The continuance of Moore's law and failure of Dennard scaling force future chip multiprocessors (CMPs) to have considerable dark regions. How to use up available dark resources is an important concern for computer architects. In harmony with these changes, we must revise processor allocation schemes that severely affect the performance of a parallel on-chip system. A suitable allocation algorithm should reduce runtime and increase the power efficiency with proper thermal distribution to avoid hotspots. With this motivation, this paper proposes a power-efficient and high performance general purpose infrastructure for which a Dark Silicon Aware Processor Allocation (DSAPA) scheme is proposed...
An efficient STT-RAM last level cache architecture for GPUs
, Article Proceedings - Design Automation Conference ; 2-5 June , 2014 , pp. 1-6 ; ISSN: 0738100X ; ISBN: 9781479930173 ; Abbasitabar, H ; Arjomand, M ; Sarbazi-Azad, H ; Sharif University of Technology
2014
Abstract
In this paper, having investigated the behavior of GPGPU applications, we present an efficient L2 cache architecture for GPUs based on STT-RAM technology. With the increase of processing cores count, larger on-chip memories are required. Due to its high density and low power characteristics, STT-RAM technology can be utilized in GPUs where numerous cores leave a limited area for on-chip memory banks. They have however two important issues, high energy and latency of write operations, that have to be addressed. Low data retention time STT-RAMs can reduce the energy and delay of write operations. However, employing STT-RAMs with low retention time in GPUs requires a thorough investigation on...
Drug nano-particles formation by supercritical rapid expansion method; operational condition effects investigation
, Article Iranian Journal of Chemistry and Chemical Engineering ; Volume 30, Issue 1 , 2011 , Pages 7-15 ; 10219986 (ISSN) ; Akbarnejad, M. M ; Vaziri Yazdi, A ; Arjomand, M ; Safekordi, A. A ; Sharif University of Technology
2011
Abstract
Dissolution pressure and nozzle temperature effects on particle size and distribution were investigated for the RESS (Rapid Expansion of Supercritical Solution) process. Supercritical CO2 was used as the solvent and Ibuprofen was applied as the model component in all runs. The resulting Ibuprofen nano-particles (about 50 nm in optimized runs) were analyzed by SEM and laser diffraction particle size analyzer systems. Results show that in low supercritical pressure ranges, depending on the solvent and solid component properties (lower than 105 bar for the Ibuprofen-CO2 system), nozzle temperature should be as low as possible (80-90 °C for the Ibuprofen-CO2 system). On the other hand, in high supercritical...
Voltage-frequency planning for thermal-aware, low-power design of regular 3-D NoCs
, Article Proceedings of the IEEE International Conference on VLSI Design ; 2010 , p. 57-62 ; ISSN: 10639667 ; ISBN: 9780769539287 ; Sarbazi-Azad, H ; Sharif University of Technology
2010
Abstract
Network-on-Chip combined with Globally Asynchronous Locally Synchronous paradigm is a promising architecture for easy IP integration and utilization with multiple voltage levels. For power reduction, multiple voltage-frequency levels are successfully applied to 2-D NoCs, but never with a generic approach to 3-D counterparts; in which low heat conductivity of insulator layers makes high dense temperature distribution at layers away from heat sink. In this paper, a thermal-aware methodology for regular 3-D NoCs based on multiple voltage levels is proposed. Given an application task graph, this methodology determines an efficient mapping of tasks onto network tiles, considering inherent...
Power-performance analysis of networks-on-chip with arbitrary buffer allocation schemes
, Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ; Vol. 29, issue. 10 , 2010 , p. 1558-1571 ; ISSN: 02780070 ; Sarbazi-Azad, H ; Sharif University of Technology
2010
Abstract
End-to-end delay, throughput, energy consumption, and silicon area are the most important design metrics of networks-on-chip (NoCs). Although several analytical models have been previously proposed for predicting such metrics in NoCs, very few of them consider the effect of message waiting time in the buffers of network routers for predicting overall power consumptions and none of them consider structural heterogeneity of network routers. This paper introduces two inter-related analytical models to compute message latency and power consumption of NoCs with arbitrary topology, buffering structure, and routing algorithm. Buffer allocation scheme defines the buffering space for each individual...
AdaBoost-based face detection in color images with low false alarm
, Article ICCMS 2010 - 2010 International Conference on Computer Modeling and Simulation, 22 January 2010 through 24 January 2010, Sanya ; Volume 2 , 2010 , Pages 107-111 ; 9780769539416 (ISBN) ; Kasaei, S ; Sharif University of Technology
2010
Abstract
In this paper, we have proposed a new face detection method which combines the AdaBoost algorithm with skin color information and support vector machine (SVM). First, a cascade classifier based on AdaBoost is used to detect faces in images. Due to noise and illumination changes, some nonfaces might be detected too; therefore, we have used a skin color model in the YCbCr color space to remove some of the detected nonfaces. Finally, we have utilized SVM to detect faces more accurately. Experimental results show that the performance of the proposed method is higher than the basic AdaBoost in the sense of detecting fewer nonfaces.
Voltage-frequency planning for thermal-aware, low-power design of regular 3-D NoCs
, Article Proceedings of the IEEE International Conference on VLSI Design, 3 January 2010 through 7 January 2010, Bangalore ; 2010 , Pages 57-62 ; 10639667 (ISSN) ; 9780769539287 (ISBN) ; Sarbazi Azad, H ; Sharif University of Technology
2010
Abstract
Network-on-Chip combined with Globally Asynchronous Locally Synchronous paradigm is a promising architecture for easy IP integration and utilization with multiple voltage levels. For power reduction, multiple voltage-frequency levels are successfully applied to 2-D NoCs, but never with a generic approach to 3-D counterparts; in which low heat conductivity of insulator layers makes high dense temperature distribution at layers away from heat sink. In this paper, a thermal-aware methodology for regular 3-D NoCs based on multiple voltage levels is proposed. Given an application task graph, this methodology determines an efficient mapping of tasks onto network tiles, considering inherent...
Power-performance analysis of networks-on-chip with arbitrary buffer allocation schemes
, Article IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ; Volume 29, Issue 10 , September , 2010 , Pages 1558-1571 ; 02780070 (ISSN) ; Sarbazi Azad, H ; Sharif University of Technology
2010
Abstract
End-to-end delay, throughput, energy consumption, and silicon area are the most important design metrics of networks-on-chip (NoCs). Although several analytical models have been previously proposed for predicting such metrics in NoCs, very few of them consider the effect of message waiting time in the buffers of network routers for predicting overall power consumptions and none of them consider structural heterogeneity of network routers. This paper introduces two inter-related analytical models to compute message latency and power consumption of NoCs with arbitrary topology, buffering structure, and routing algorithm. Buffer allocation scheme defines the buffering space for each individual...
Performance evaluation of butterfly on-chip network for MPSoCs
, Article 2008 International SoC Design Conference, ISOCC 2008, Busan, 24 November 2008 through 25 November 2008 ; Volume 1 , 2008 , Pages I296-I299 ; 9781424425990 (ISBN) ; Sarbazi Azad, H ; Sharif University of Technology
2008
Abstract
With technology improvements, tens or hundreds of IP cores, operating complex functions with different frequencies, are mapped on-chip. This results in heterogeneous Multiprocessor System-on-Chip (MPSoC). Most MPSoC design challenges are due to the infrastructure interconnect. Network-on-Chip (NoC) with multiple constraints to be satisfied is a promising solution for these challenges. It has been shown that infrastructure topology, routing and switching schemes have great effects on overall interconnect performance under different synthetic and real-life traffic patterns. In this paper, we evaluate Butterfly network with arbitrary extra stages as MPSoC infrastructure. Different routing and...
Revisiting processor allocation and application mapping in future CMPs in dark silicon era
, Article Advances in Computers ; Volume 110 , 2018 , Pages 35-81 ; 00652458 (ISSN); 9780128153581 (ISBN) ; Aghaaliakbari, F ; Jalili, M ; Bashizade, R ; Arjomand, M ; Sarbazi Azad, H ; Sharif University of Technology
Academic Press Inc
2018
Abstract
With technology advances and the emergence of new fabrication and VLSI technologies, current and future chip multiprocessors (CMPs) are expected to have tens to hundreds of processing elements and Gigabytes of on-chip caches, which are connected by a high bandwidth network-on-chip (NoC). Unfortunately, due to the limited power budget of a computing system, especially for its processing element(s), it is impossible to keep all cores, caches, and network elements working at the highest voltage level; this results in the dark silicon computing era, where, by employing system-level or architecture-level techniques, one can keep a great portion of a CMP's elements OFF (or in dim mode) to meet the power...
A hybrid Non-Volatile Cache Design for Solid-State Drives using comprehensive I/O characterization
, Article IEEE Transactions on Computers ; Volume 65, Issue 6 , 2016 , Pages 1678-1691 ; 00189340 (ISSN) ; Asadi, H ; Haghdoost, A ; Arjomand, M ; Sarbazi Azad, H ; Sharif University of Technology
IEEE Computer Society
2016
Abstract
The emergence of new memory technologies provides us with opportunity to enhance the properties of existing memory architectures. One such technology is Phase Change Memory (PCM) which boasts superior scalability, power savings, non-volatility, and a performance competitive to Dynamic Random Access Memory (DRAM). In this paper, we propose a write buffer architecture for Solid-State Drives (SSDs) which attempts to exploit PCM as a DRAM alternative while alleviating its issues such as long write latency, high write energy, and finite endurance. To this end and based on thorough I/O characterization of desktop and enterprise applications, we propose a hybrid DRAM-PCM SSD cache design with an...
Efficient mapping of applications for future chip-multiprocessors in dark silicon era
, Article ACM Transactions on Design Automation of Electronic Systems ; Volume 22, Issue 4 , 2017 ; 10844309 (ISSN) ; Aghaaliakbari, F ; Bashizade, R ; Arjomand, M ; Sarbazi Azad, H ; Sharif University of Technology
2017
Abstract
The failure of Dennard scaling has led to the utilization wall that is the source of dark silicon and limits the percentage of a chip that can actively switch within a given power budget. To address this issue, a structure is needed to guarantee the limited power budget along with providing sufficient flexibility and performance for different applications with various communication requirements. In this article, we present a general-purpose platform for future many-core Chip-Multiprocessors (CMPs) that benefits from the advantages of clustering, Network-on-Chip (NoC) resource sharing among cores, and power gating the unused components of clusters. We also propose two task mapping methods for...
Face Detection in Color Images
, M.Sc. Thesis Sharif University of Technology ; Kasaei, Shohreh (Supervisor)
Abstract
Human face detection is an important research area with several applications such as human computer interface (HCI), face recognition, surveillance systems, security systems, and content-based image retrieval (CBIR). Face detection problem can be stated as “determining whether there are human faces in the image” and if there are “returning the location of each human face in the image” regardless of its position, size, scale, orientation, and lighting condition. In this thesis, we have proposed a new face detection method which combines the AdaBoost algorithm with skin color information and support vector machine (SVM). First, a cascade classifier based on AdaBoost is used to detect faces in...
A Comprehensive Approach for the Validation of Lumbar Spine Finite Element Models Investigating Post-Fusion Adjacent Segment Effects
, M.Sc. Thesis Sharif University of Technology ; Arjomand, Navid (Supervisor)
Abstract
Spinal fusion surgery is usually followed by accelerated degenerative changes in the unfused segments above and below the treated segment(s), i.e., adjacent segment disease (ASD). While a number of risk factors for ASD have been suggested, its exact pathogenesis remains to be identified. Finite element (FE) models are indispensable tools to investigate mechanical effects of fusion surgeries on post-fusion changes in the adjacent segment kinematics and kinetics. Existing modeling studies validate only their intact FE model against in vitro data and subsequently simulate post-fusion in vivo conditions. The present study provides a novel approach for the comprehensive validation of a lumbar...
A High-Performance and Power-Efficient Design of Memory Hierarchy in Multi-Core Systems Using Non-Volatile Technologies
, Ph.D. Dissertation Sharif University of Technology ; Sarbazi-Azad, Hamid (Supervisor)
Abstract
The ever-increasing number of on-chip processors, coupled with the trend towards rising memory footprints of programs, increases the demand for larger cache and main memory to hide the long latency of the disk system. During the last three decades, SRAM- and DRAM-based memory successfully kept pace with this capacity demand by exponential reduction in cost per bit. However, feedback from industry confirms that, on entering the sub-20nm technology era with its dominant role of leakage power, SRAM and DRAM memories are confronting serious scalability and power limitations. To this end, researchers continually pursue circuit-level and architectural proposals for incorporating non-volatile technologies in...
An analytical performance evaluation for WSNs using loop-free bellman ford protocol
, Article 2009 International Conference on Advanced Information Networking and Applications, AINA 2009, Bradford, 26 May 2009 through 29 May 2009 ; 2009 , Pages 568-571 ; 1550445X (ISSN); 9780769536385 (ISBN) ; Hajisheykhi, R ; Arjomand, M ; Jahangir, A. H ; IEEE Computer Society ; Sharif University of Technology
2009
Abstract
Although several analytical models have been proposed for wireless sensor networks (WSNs) with different capabilities, very few of them consider the effect of general service distribution as well as design constraints on network performance. This paper presents a new analytical model to compute message latency in a WSN with loop-free Bellman Ford routing strategy. The model considers limited buffer size for each node using M/G/1/k queuing system. Also, contention probability and resource utilization are suitably modeled. The results obtained from simulation experiments confirm that the model exhibits a high degree of accuracy for various network configurations. © 2009 IEEE
A generic FPGA prototype for on-chip systems with network-on-chip communication infrastructure
, Article Computers and Electrical Engineering ; Vol. 40, issue. 1 , 2014 , pp. 158-167 ; ISSN: 00457906 ; Boroumand, A ; Sarbazi Azad, H ; Sharif University of Technology
2014
Abstract
As System-on-Chips (SoCs) grow in complexity and size, proposals of networks-on-chip (NoCs) as the on-chip communication infrastructure are justified by reusability, scalability, and energy efficiency provided by the interconnection networks. Simulation and mathematical analysis offer flexibility for the evaluations under various network configurations. However, the accuracy of such analyzing methods largely depends on the approximations made. On the other hand, prototyping can be used to improve the evaluation accuracy by bringing the design closer to reality. In this paper, we propose an FPGA prototype that is general enough to model different video-processing SoCs where different cores...
A comprehensive power-performance model for NoCs with multi-flit channel buffers
, Article Proceedings of the International Conference on Supercomputing, 8 June 2009 through 12 June 2009, Yorktown Heights, NY ; 2009 , Pages 470-478 ; 9781605584980 (ISBN) ; Sarbazi-Azad, H ; ACM SIGARCH ; Sharif University of Technology
2009
Abstract
Large Multi-Processor Systems-on-Chip use Networks-on-Chip with a high degree of reusability and scalability for message communication. Therefore, network infrastructure is a crucial element affecting the overall system performance. On the other hand, technology improvements may lead to much energy consumption in micro-routers of an on-chip network. This necessitates an exhaustive analysis of NoCs for future designs. This paper presents a comprehensive analytical model to predict message latency for different data flows traversing across the network. This model considers channel buffers of multiple flits which were not previously studied in NoC context. Also, architectural descriptions of...
Efficient genetic based topological mapping using analytical models for on-chip networks
, Article Journal of Computer and System Sciences ; Volume 79, Issue 4 , 2013 , Pages 492-513 ; 00220000 (ISSN) ; Amiri, S. H ; Sarbazi Azad, H ; Sharif University of Technology
2013
Abstract
Network-on-Chips are now the popular communication medium to support inter-IP communications in complex on-chip systems with tens to hundreds of IP cores. Higher scalability (compared to the traditional shared bus and point-to-point interconnects), throughput, and reliability are among the most important advantages of NoCs. Moreover, NoCs can well match current CAD methodologies mainly relying on modular and reusable structures with regularity of structural pattern. However, since NoCs are resource-limited, determining how to distribute application load over limited on-chip resources (e.g. switches, buffers, virtual channels, and wires) in order to improve the metrics of interest and satisfy...