Search for: parallel-implementations

    3-point RANSAC for fast vision based rotation estimation using GPU technology

    , Article IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, 9 February 2017 ; 2017 , Pages 212-217 ; 9781467397087 (ISBN) Kamran, D ; Manzuri, M. T ; Marjovi, A ; Karimian, M ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2017
    Abstract
    In many sensor fusion algorithms, the vision based RANdom SAmple Consensus (RANSAC) method is used for estimating motion parameters for autonomous robots. Usually such algorithms estimate translation and rotation parameters together, which makes them inefficient when only rotation needs to be estimated. This paper presents a novel 3-point RANSAC algorithm for estimating only the rotation parameters between two camera frames, which can be utilized as a high rate source of information for a camera-IMU sensor fusion system. The main advantage of our proposed approach is that it performs fewer computations and requires fewer iterations to achieve the best result. Despite many... 

    Real-time Implementation of Vision-aided Navigation on GPU

    , M.Sc. Thesis Sharif University of Technology Kamran, Danial (Author) ; Manzuri Shalmani, Mohammad Taghi (Supervisor)
    Abstract
    Knowing the exact position of a robot in the real world is one of the crucial aspects of its navigation process. For this purpose, several inertial sensors such as gyroscopes, accelerometers and compasses have been used; however, each of these sensors has its own drawbacks, which cause inaccuracies in specific situations. Moreover, the Global Positioning System (GPS) is not available in indoor environments and is also not accurate in outdoor places. All of these reasons have persuaded researchers to use camera frames captured from the top of the robot as new information for estimating the motion parameters of the robot. The main challenge for vision aided localization algorithms is... 

    Parallel Implementation of Telecommunication Decodings in Real-time

    , M.Sc. Thesis Sharif University of Technology Jafarzadeh, Ali (Author) ; Hashemi, Matin (Supervisor)
    Abstract
    Many chip manufacturers have recently introduced high-performance deep-learning hardware accelerators. In modern GPUs, programmable tensor cores accelerate the heavy operations involved in deep neural networks. This paper presents a novel solution to re-purpose tensor cores in modern GPUs for high-throughput implementation of turbo decoders. Turbo codes closely approach Shannon’s limit on channel capacity, and are widely used in many state-of-the-art wireless systems including satellite communications and mobile communications. Experimental evaluations show that the proposed solution achieves about 1.2 Gbps throughput, which is higher than that of previous GPU-accelerated solutions.

    Stochastic successive convex approximation for non-convex constrained stochastic optimization

    , Article IEEE Transactions on Signal Processing ; Volume 67, Issue 16 , 2019 , Pages 4189-4203 ; 1053587X (ISSN) Liu, A ; Lau, V. K. N ; Kananian, B ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2019
    Abstract
    This paper proposes a constrained stochastic successive convex approximation (CSSCA) algorithm to find a stationary point for a general non-convex stochastic optimization problem, whose objective and constraint functions are non-convex and involve expectations over random states. Most existing methods for non-convex stochastic optimization, such as the stochastic (average) gradient and stochastic majorization-minimization, only consider minimizing a stochastic non-convex objective over a deterministic convex set. The proposed CSSCA algorithm can also handle stochastic non-convex constraints in optimization problems, and it opens the way to solving more challenging optimization problems that... 

    GIM: GPU accelerated RIS-Based influence maximization algorithm

    , Article IEEE Transactions on Parallel and Distributed Systems ; Volume 32, Issue 10 , 2021 , Pages 2386-2399 ; 10459219 (ISSN) Shahrouz, S ; Salehkaleybar, S ; Hashemi, M ; Sharif University of Technology
    IEEE Computer Society  2021
    Abstract
    Given a social network modeled as a weighted graph G, the influence maximization problem seeks k vertices to become initially influenced, to maximize the expected number of influenced nodes under a particular diffusion model. The influence maximization problem has been proven to be NP-hard, and most proposed solutions to the problem are approximate greedy algorithms, which can guarantee a tunable approximation ratio for their results with respect to the optimal solution. The state-of-the-art algorithms are based on the Reverse Influence Sampling (RIS) technique, which can offer both computational efficiency and a non-trivial (1-1/e-ϵ)-approximation ratio guarantee for any ϵ > 0.... 
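The RIS idea mentioned in this abstract can be illustrated with a minimal CPU-only sketch. It is not the GIM algorithm from the paper: it assumes the independent cascade diffusion model, uses a fixed, hypothetical number of samples instead of a bound derived from ϵ, and the names `random_rr_set` and `ris_seed_selection` are made up for illustration. It samples reverse-reachable (RR) sets and then greedily picks the k vertices that cover the most sets.

```python
import random


def random_rr_set(nodes, in_edges, rng):
    """Sample one reverse-reachable set (independent cascade model assumed).

    in_edges maps a node v to a list of (u, p) pairs, meaning the edge
    u -> v is live with probability p.
    """
    root = rng.choice(nodes)
    rr, stack = {root}, [root]
    while stack:
        v = stack.pop()
        for u, p in in_edges.get(v, []):
            # Walk the edge backwards if it turns out to be live.
            if u not in rr and rng.random() < p:
                rr.add(u)
                stack.append(u)
    return rr


def ris_seed_selection(nodes, in_edges, k, num_samples, seed=0):
    """Greedy maximum coverage over sampled RR sets (illustrative only)."""
    rng = random.Random(seed)
    rr_sets = [random_rr_set(nodes, in_edges, rng) for _ in range(num_samples)]
    covered = [False] * len(rr_sets)
    seeds = []
    for _ in range(k):
        # Count, for each node, how many uncovered RR sets contain it.
        counts = {}
        for i, rr in enumerate(rr_sets):
            if not covered[i]:
                for u in rr:
                    counts[u] = counts.get(u, 0) + 1
        if not counts:
            break  # every sampled set is already covered
        best = max(counts, key=counts.get)
        seeds.append(best)
        for i, rr in enumerate(rr_sets):
            if best in rr:
                covered[i] = True
    return seeds


# Toy graph: node 0 influences nodes 1, 2 and 3 with certainty; node 4 is isolated.
toy_nodes = [0, 1, 2, 3, 4]
toy_in_edges = {1: [(0, 1.0)], 2: [(0, 1.0)], 3: [(0, 1.0)]}
print(ris_seed_selection(toy_nodes, toy_in_edges, k=1, num_samples=200))
```

In this toy graph node 0 appears in every RR set rooted at nodes 0 to 3, so it is the natural first seed. A GPU version would presumably parallelize the sampling and counting steps, but how GIM does so is not described in the truncated abstract.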

    GPU implementation of split-field finite difference time-domain method for Drude-Lorentz dispersive media

    , Article Progress in Electromagnetics Research ; Volume 125 , 2012 , Pages 55-77 ; 10704698 (ISSN) Shahmansouri, A ; Rashidian, B ; Sharif University of Technology
    2012
    Abstract
    The split-field finite-difference time-domain (SF-FDTD) method can overcome the limitation of ordinary FDTD in analyzing periodic structures under oblique incidence. On the other hand, the huge run times of 3D SF-FDTD are a major practical burden in its usage for analysis and design of nanostructures, particularly when dispersive media are involved. Here, details of a parallel implementation of the 3D SF-FDTD method for dispersive media, combined with the total-field/scattered-field (TF/SF) method for injecting an oblique plane wave, are discussed. A graphics processing unit (GPU) has been used for this purpose, and very large speed-up factors have been achieved. Also a previously reported formulation of SF-FDTD based... 

    High performance GPU implementation of k-NN based on Mahalanobis distance

    , Article CSSE 2015 - 20th International Symposium on Computer Science and Software Engineering, 18 August 2015 ; 2015 ; 9781467391818 (ISBN) Gavahi, M ; Mirzaei, R ; Nazarbeygi, A ; Ahmadzadeh, A ; Gorgin, S ; Sharif University of Technology
    Abstract
    The k-nearest neighbor (k-NN) algorithm is a widely used classification technique and has significant applications in various domains. The most challenging issues in the k-nearest neighbor algorithm are high dimensional data, reasonable accuracy of results and suitable computation time. Nowadays, using parallel processing and deploying many-core platforms like GPUs is considered one of the popular approaches to addressing these issues. In this paper, we present a novel and accurate parallel implementation of k-NN based on the Mahalanobis distance metric on the GPU platform. We design and implement k-NN for the GPU architecture and utilize mathematical and algorithmic techniques to eliminate repetitive... 
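For orientation, k-NN classification under the Mahalanobis distance can be sketched sequentially as follows. This is a plain CPU illustration, not the paper's GPU implementation; it assumes the inverse covariance matrix is already available, and the function names `mahalanobis` and `knn_classify` are hypothetical.

```python
import math
from collections import Counter


def mahalanobis(x, y, inv_cov):
    """Mahalanobis distance sqrt((x - y)^T * inv_cov * (x - y)).

    inv_cov is assumed to be a precomputed inverse covariance matrix,
    given as a list of rows.
    """
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    q = sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(q, 0.0))


def knn_classify(query, points, labels, inv_cov, k=3):
    """Return the majority label among the k nearest training points."""
    dists = sorted(
        (mahalanobis(query, p, inv_cov), lbl) for p, lbl in zip(points, labels)
    )
    votes = Counter(lbl for _, lbl in dists[:k])
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters, identity inverse covariance
# (with the identity matrix the metric reduces to Euclidean distance).
train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["a", "a", "a", "b", "b", "b"]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(knn_classify((0.2, 0.1), train, train_labels, identity, k=3))
```

Each query-to-point distance is independent of the others, which is what makes this computation a natural fit for many-core platforms such as GPUs.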

    Developing 3D neutron transport kernel for heterogeneous structures in an improved method of characteristic (MOC) framework

    , Article Progress in Nuclear Energy ; Volume 127 , 2020 Porhemmat, M. H ; Hadad, K ; Salehi, A. A ; Moghadam, A ; Sharif University of Technology
    Elsevier Ltd  2020
    Abstract
    Given the importance and complexity of solving the three-dimensional (3D) neutron transport equation, in the current research, a new Modular Ray Tracing (MRT) algorithm and a 3D characteristic kernel for heterogeneous structures are presented. Improvements in memory management and cache coherency are achieved to an acceptable level. Also, a parallel implementation of the transport algorithm using OpenMP causes a significant reduction in runtime. To validate our algorithm, first, a self-constituted pin cell and a lattice arrangement are modeled and the results are compared with Monte Carlo simulation. Second, the well-known 3D benchmark, Takeda model one and two, are investigated and results compared...