Search for: speed-ups
Total 23 records

    Unleashing the potentials of dynamism for page allocation strategies in SSDs

    , Article SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems ; 16-20 June , 2014 , pp. 551-552 ; ISBN: 9781450327893 Tavakkol, A ; Arjomand, M ; Sarbazi-Azad, H ; Sharif University of Technology
    Abstract
    In Solid-State Drives (SSDs) with tens of flash chips and highly parallel architecture, we can speed up I/O operations by making good use of resources during page allocation. Proposals already exist for using static page allocation, which does not balance the I/O load and whose efficiency depends on access address patterns. To the best of our knowledge, there has been no research thus far to show what happens if one or more internal resources can be freely allocated regardless of the request address. This paper explores the possibility of using different degrees of dynamism in page allocation and identifies key design opportunities that they present to improve SSD characteristics.

    GPU implementation of split-field finite difference time-domain method for Drude-Lorentz dispersive media

    , Article Progress in Electromagnetics Research ; Volume 125 , 2012 , Pages 55-77 ; 10704698 (ISSN) Shahmansouri, A ; Rashidian, B ; Sharif University of Technology
    2012
    Abstract
    The split-field finite-difference time-domain (SF-FDTD) method can overcome the limitation of ordinary FDTD in analyzing periodic structures under oblique incidence. On the other hand, the huge run times of 3D SF-FDTD are a major practical burden in its use for the analysis and design of nanostructures, particularly with dispersive media. Here, details of a parallel implementation of the 3D SF-FDTD method for dispersive media, combined with the total-field/scattered-field (TF/SF) method for injecting an oblique plane wave, are discussed. A graphics processing unit (GPU) has been used for this purpose, and very large speed-up factors have been achieved. Also a previously reported formulation of SF-FDTD based...

    Applying elastic unfolding technique in nonlinear inverse finite element method for sheet forming modeling

    , Article Advanced Materials Research, 8 July 2011 through 11 July 2011 ; Volume 341-342 , July , 2012 , Pages 242-246 ; 10226680 (ISSN) ; 9783037852521 (ISBN) Farahani, M. K ; Assempour, A ; Sharif University of Technology
    2012
    Abstract
    A simplified, efficient finite element method called the inverse approach (IA) has been developed to estimate the initial blank and strain distribution in sheet metal forming. This algorithm is an inverse method, since the position of points in the final shape is known and their corresponding position in the initial blank must be determined. The approach deals with the geometric compatibility of finite elements, plastic deformation theory, and the virtual work principle. The method, often based on implicit static algorithms, sometimes causes convergence problems because of strong nonlinearities. This paper introduces an initial guess to speed up the convergence of the Newton-Raphson solution. The...

    Almost-optimum signature matrices in binary-input synchronous overloaded CDMA

    , Article 2011 18th International Conference on Telecommunications, ICT 2011, 8 May 2011 through 11 May 2011, Ayia Napa ; 2011 , Pages 195-200 ; 9781457700248 (ISBN) Khoozani, M. H ; Rashidinejad, A ; Froushani, M. H. L ; Pad, P ; Marvasti, F ; Sharif University of Technology
    2011
    Abstract
    The everlasting bandwidth limitations in wireless communication networks have directed researchers toward analyzing the prospect of overloaded Code Division Multiple Access (CDMA). In this paper, we propose a Genetic Algorithm to search for optimum signature matrices for binary-input synchronous CDMA. The main measure of optimality considered in this paper is the per-user channel capacity of the overall multiple access system. Our resulting matrices differ from the renowned Welch Bound Equality (WBE) codes in that our attention is specifically aimed at binary, rather than Gaussian, input distributions. Since design based on channel capacity is...

    DWM-CDD: Dynamic weighted majority concept drift detection for spam mail filtering

    , Article World Academy of Science, Engineering and Technology ; Volume 80 , 2011 , Pages 291-294 ; 2010376X (ISSN) Nosrati, L ; Pour, A. N ; Sharif University of Technology
    Abstract
    Although e-mail is the most efficient and popular communication method, unwanted and mass unsolicited e-mails, also called spam mail, endanger the existence of the mail system. This paper proposes a new algorithm called Dynamic Weighted Majority Concept Drift Detection (DWM-CDD) for content-based filtering. The design goals of DWM-CDD are, first, to improve on the accuracy of previously proposed algorithms and, second, to shorten the time needed to construct the model. The results show that DWM-CDD can detect both sudden and gradual changes quickly and accurately. Moreover, the time needed for model construction is less than that of previously proposed algorithms.

    A fast enhanced algorithm of PRI transform

    , Article Proceedings - 6th International Symposium on Parallel Computing in Electrical Engineering, PARELEC 2011, 4 April 2011 through 5 April 2011 ; April , 2011 , Pages 179-184 ; 9780769543970 (ISBN) Mahdavi, A ; Pezeshk, A. M ; Sharif University of Technology
    2011
    Abstract
    The problem of estimating the pulse repetition interval (PRI) of an interleaved pulse train, which consists of several independent radar signals, is the main issue of signal processing in electronic support systems. The PRI Transform algorithm is one of the well-known and effective methods of PRI detection. It is capable of detecting several close jittered signals and suppressing subharmonics, but has some drawbacks, especially a small PRI dynamic range and heavy computations. In this paper a modified PRI transform is introduced which manages a wide range of PRIs simultaneously and speeds up the algorithm by significantly reducing the computations. Moreover an efficient threshold is set for...

    Soft error rate estimation of digital circuits in the presence of Multiple Event Transients (METs)

    , Article Proceedings -Design, Automation and Test in Europe, DATE, 14 March 2011 through 18 March 2011 ; March , 2011 , Pages 70-75 ; 15301591 (ISSN) ; 9783981080179 (ISBN) Fazeli, M ; Ahmadian, S. N ; Miremadi, S. G ; Asadi, H ; Tahoori, M. B ; Sharif University of Technology
    2011
    Abstract
    In this paper, we present a very fast and accurate technique to estimate the soft error rate of digital circuits in the presence of Multiple Event Transients (METs). In the proposed technique, called Multiple Event Probability Propagation (MEPP), a four-value logic and probability set are used to accurately propagate the effects of multiple erroneous values (transients) due to METs to the outputs and obtain soft error rate. MEPP considers a unified treatment of all three masking mechanisms i.e., logical, electrical, and timing, while propagating the transient glitches. Experimental results through comparisons with statistical fault injection confirm accuracy (only 2.5% difference) and... 

    Accelerating the performance of parallel depth-first-search branch-and-bound algorithm in transportation network design problem

    , Article ICORES 2015 - 4th International Conference on Operations Research and Enterprise Systems, Proceedings, 10 January 2015 through 12 January 2015 ; January , 2015 , Pages 359-366 ; 9789897580758 (ISBN) Zarrinmehr, A ; Shafahi, Y ; Sharif University of Technology
    SciTePress  2015
    Abstract
    The Transportation Network Design Problem (TNDP) aims at selecting a subset of proposed urban projects under a budget constraint to minimize the network users' total travel time. This is a well-known resource-intensive problem in the transportation planning literature. Parallel computing can therefore be useful for obtaining the exact solution of TNDP. This paper investigates how the performance of a parallel Branch-and-Bound (B&B) algorithm with a Depth-First-Search (DFS) strategy can be accelerated. The paper suggests assigning greedy solutions to idle processors at the start of the algorithm. A greedy solution, considered in this paper, is a budget-wise feasible selection...

    A scalable offset-cancelled current/voltage sense amplifier

    , Article ISCAS 2010 - 2010 IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems, 30 May 2010 through 2 June 2010, Paris ; 2010 , Pages 3853-3856 ; 9781424453085 (ISBN) Attarzadeh, H ; SharifKhani, M ; Jahinuzzaman, S. M ; Sharif University of Technology
    2010
    Abstract
    The application of current sense amplifiers in scaled SRAM design is limited by two factors: the DC offset due to device mismatch and limited voltage headroom. The presented scheme reduces the effect of offset by proposing an extra phase for offset cancellation before current sensing takes place. A twofold reduction of the cell access time is achieved compared to the conventional scheme under similar cell current and bitline capacitance. The offset cancellation phase takes place in parallel with the wordline decoding time in order to speed up the current sensing. The proposed scheme requires a small power budget due to a self shut-off mechanism. In addition to presenting a comparison with...

    Importance of KPI in BI system, case study: Iranian industries

    , Article ITNG2010 - 7th International Conference on Information Technology: New Generations, 12 April 2010 through 14 April 2010 ; April , 2010 , Pages 1245-1246 ; 9780769539843 (ISBN) Seify, M ; Premier Hall for Advancing Science and Engineering, Inc. (PHASE) ; Sharif University of Technology
    2010
    Abstract
    In today's competitive world, having an effective business intelligence (BI) system for monitoring and evaluating industrial and product-oriented organizations is vital. A good BI system must help managers not only to speed up their decision-making process but also to increase the quality of their decisions. But how? One of the main requirements of an effective BI system is providing managers with information in the correct format and at the correct time. Critical Success Factors (CSFs) are significant factors that must be considered in attaining an organization's goal; and key performance indicators (KPIs) are a type of quantitative and measurable CSF; and one character of an effective BI is...

    Dynamic FPGA-accelerator sharing among concurrently running virtual machines

    , Article Proceedings of 2016 IEEE East-West Design and Test Symposium, EWDTS 2016, 14 October 2016 through 17 October 2016 ; 2017 ; 9781509006939 (ISBN) Nasiri, H ; Goudarzi, M ; Sharif University of Technology
    Abstract
    Using an FPGA as a hardware accelerator to speed up compute-intensive workloads has become prevalent. However, employing an accelerator in a virtualized environment adds complexity, because accessing the accelerator from virtual machines has significant overhead and sharing it needs some considerations. We have implemented adequate infrastructure to share an FPGA-based accelerator between multiple virtual machines with negligible access overhead, which dynamically implements the virtual machines' accelerators. In our architecture each user process from a virtual machine can directly access the FPGA over a PCIe link and reconfigure its accelerator in the specified part of the FPGA at run-time. The...

    Efficient and safe path planning for a mobile robot using genetic algorithm

    , Article 2009 IEEE Congress on Evolutionary Computation, CEC 2009, Trondheim, 18 May 2009 through 21 May 2009 ; 2009 , Pages 2091-2097 ; 9781424429592 (ISBN) Naderan Tahan, M ; Manzuri Shalmani, T ; Sharif University of Technology
    2009
    Abstract
    In this paper, a new method for path planning is proposed using a genetic algorithm (GA). Our method has two key advantages over existing GA methods. The first is a novel environment representation which allows a more efficient method for obstacles dilation in comparison to current cell based approaches that have a tradeoff between speed and accuracy. The second is the strategy we use to generate the initial population in order to speed up the convergence rate which is completely novel. Simulation results show that our method can find a near optimal path faster than computational geometry approaches and with more accuracy in smaller number of generations than GA methods. © 2009 IEEE  

    Improvement of fault detection in wireless sensor networks

    , Article 2009 Second ISECS International Colloquium on Computing, Communication, Control, and Management, CCCM 2009, Sanya, 8 August 2009 through 9 August 2009 ; Volume 4 , 2009 , Pages 644-646 ; 9781424442461 (ISBN) Khazaei, E ; Barati, A ; Movaghar, A ; Yangzhou University; Guangdong University of Business Studies; Wuhan Institute of Technology; IEEE SMC TC on Education Technology and Training; IEEE Technology Management Council ; Sharif University of Technology
    2009
    Abstract
    This paper presents a centralized fault detection algorithm for wireless sensor networks. Faulty sensor nodes are identified based on comparisons between neighboring nodes and their own central node, and on dissemination of the decision made at each node. A residue number system (RNS) is used to tolerate transient faults in sensing and communication. In this system, arithmetic operations act in parallel on residues, the remainders of dividing the original number by several fixed moduli. Computations are therefore performed on residues that are smaller than the original number, which speeds up arithmetic and decreases power consumption. ©2009 IEEE
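
The residue arithmetic mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the moduli (7, 11, 13) and all function names are assumptions chosen for illustration, and reconstruction uses the standard Chinese Remainder Theorem. Each residue channel is computed independently, which is what makes the operations parallelizable.

```python
from math import prod

# Hypothetical pairwise-coprime moduli (an assumption, not from the paper).
# Values are representable uniquely in the range [0, 7 * 11 * 13) = [0, 1001).
MODULI = (7, 11, 13)

def to_rns(x, moduli=MODULI):
    """Encode an integer as its remainders modulo each modulus."""
    return tuple(x % m for m in moduli)

def rns_add(a, b, moduli=MODULI):
    """Add channel by channel; no carries cross between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))

def rns_mul(a, b, moduli=MODULI):
    """Multiply channel by channel, again independently per modulus."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))

def from_rns(residues, moduli=MODULI):
    """Decode with the Chinese Remainder Theorem (needs Python 3.8+)."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        partial = big_m // m
        # pow(partial, -1, m) is the modular inverse of partial modulo m.
        total += r * partial * pow(partial, -1, m)
    return total % big_m

if __name__ == "__main__":
    a, b = to_rns(123), to_rns(456)
    print(from_rns(rns_add(a, b)))                    # sum, valid while below 1001
    print(from_rns(rns_mul(to_rns(25), to_rns(40))))  # product, valid while below 1001
```

Results are only correct while the true value stays below the product of the moduli; a practical design would choose moduli to cover the sensor data range.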

    Speeding up of Genetic Structural Variation Detection

    , M.Sc. Thesis Sharif University of Technology Akbari Nejad Mousavi, Shaya (Author) ; Goudarzi, Maziar (Supervisor)
    Abstract
    Large differences in chromosome structure, compared to the reference genome, are one of the essential causes of genetic variation. These differences, called structural variations, are associated with numerous diseases, including schizophrenia, cancer development, and autism. Therefore, calling these variations is of utmost importance in the next stages of analysis. However, because discovering these variations is computationally intensive, calling structural variations is lagging behind the data produced by sequencers. Hence, discovering these variations with proper accuracy and in a reasonable time is of paramount importance. In this research, we implement a fast, yet accurate,...

    Mapping the probability of microlensing detection of extra-solar planets

    , Article 59th International Astronautical Congress 2008, IAC 2008, Glasgow, 29 September 2008 through 3 October 2008 ; Volume 3 , 2008 , Pages 1865-1877 ; 9781615671601 (ISBN) Tabeshian, M ; Molaverdikhani, K ; Sharif University of Technology
    2008
    Abstract
    The growing rate of discovery of extra-solar planets, which has raised enthusiasm for exploring the universe in the hope of finding Earth-like planets, has resulted in the wide use of Gravitational Microlensing as a planet detection method. However, until September 2008, only 7 of the 307 discovered exoplanets had been detected through Microlensing, a fact which shows that this method is relatively new in the detection of extra-solar planets. Therefore, preparing a map of the sky which pinpoints the regions with higher probability of planet detection by this method and is drawn based on the available equipment and other regional factors...

    Accelerating 3-D capacitance extraction in deep sub-micron VLSI design using vector/parallel computing

    , Article 13th International Conference on Parallel and Distributed Systems, ICPADS, Hsinchu, 5 December 2007 through 7 December 2007 ; Volume 2 , December , 2007 ; 15219097 (ISSN); 9781424418909 (ISBN) Shahbazi, N ; Sarbazi Azad, H ; Sharif University of Technology
    2007
    Abstract
    The widespread application of deep sub-micron and multilayer routing techniques makes the interconnection parasitic influence become the main factor to limit the performance of VLSI circuits. Therefore, fast and accurate 3D capacitance extraction is essential for ultra deep sub-micron design (UDSM) of integrated circuits. Parallel processing provides an approach to reducing the simulation turn-around time. In this paper, we present parallel formulations for 3D capacitance extraction based on P-FFT algorithm, on a personal computer (PC) or on a network of PCs. We implement both vector and parallel versions of 3D capacitance extraction algorithm simultaneously and evaluate our implementation... 

    Fast wavelet-based photoacoustic microscopy

    , Article Journal of the Optical Society of America A: Optics and Image Science, and Vision ; Volume 38, Issue 11 , 2021 , Pages 1673-1680 ; 10847529 (ISSN) Abbasi, H ; Mostafavi, S. M ; Kavehvash, Z ; Sharif University of Technology
    The Optical Society  2021
    Abstract
    A novel photoacoustic microscopy (PAM) structure, based on Haar wavelet patterns, is proposed in this paper. Its main goal is to mitigate the PAM imaging resolution and thus the time of its sampling process without compromising the image quality. Owing to the intrinsic nature of wavelet transform, this structure collects spatial and spectral components simultaneously, and this feature speeds up the sampling process by 33%. The selection of these patterns helps in better control of required conditions, such as multi-resolution imaging, to guarantee adequate image quality in comparison to previous microscopic structures. Simulation results prove the superior quality of the proposed approach... 

    FPGA-based fault injection into switch-level models

    , Article Microprocessors and Microsystems ; Volume 28, Issue 5-6 SPEC. ISS , 2004 , Pages 317-327 ; 01419331 (ISSN) Ejlali, A ; Miremadi, S. G ; Sharif University of Technology
    2004
    Abstract
    This article presents a method for fast fault injection into switch-level circuits using FPGA chips. In this method, gates model switch-level circuits and we can emulate mixed gate-switch-level models. By the use of this method, FPGA chips can be used to accelerate the fault-injection campaigns into switch-level models. The approach has been evaluated experimentally by injecting a set of faults into a pipelined RISC processor. The experimental results show that significant speed-ups with respect to fully simulation-based fault-injection methods can be achieved. © 2004 Elsevier B.V. All rights reserved