Search for: state-of-the-art
Total 214 records

    Recurrent poisson factorization for temporal recommendation

    , Article Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 13 August 2017 through 17 August 2017 ; Volume Part F129685 , 2017 , Pages 847-855 ; 9781450348874 (ISBN) Hosseini, S. A ; Alizadeh, K ; Khodadadi, A ; Arabzadeh, A ; Farajtabar, M ; Zha, H ; Rabiee, H. R ; Sharif University of Technology
    Poisson factorization is a probabilistic model of users and items for recommendation systems, where the so-called implicit consumer data is modeled by a factorized Poisson distribution. There are many variants of Poisson factorization methods that show state-of-the-art performance on real-world recommendation tasks. However, most of them do not explicitly take into account the temporal behavior and the recurrent activities of users, which is essential for recommending the right item to the right user at the right time. In this paper, we introduce the Recurrent Poisson Factorization (RPF) framework, which generalizes the classical PF methods by utilizing a Poisson process for modeling the implicit... 

    Speaker recognition with random digit strings using uncertainty normalized HMM-Based i-Vectors

    , Article IEEE/ACM Transactions on Audio Speech and Language Processing ; Volume 27, Issue 11 , 2019 , Pages 1815-1825 ; 23299290 (ISSN) Maghsoodi, N ; Sameti, H ; Zeinali, H ; Stafylakis, T ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2019
    In this paper, we combine Hidden Markov Models (HMMs) with i-vector extractors to address the problem of text-dependent speaker recognition with random digit strings. We employ digit-specific HMMs to segment the utterances into digits, to perform frame alignment to HMM states and to extract Baum-Welch statistics. By making use of the natural partition of input features into digits, we train digit-specific i-vector extractors on top of each HMM and we extract well-localized i-vectors, each modelling merely the phonetic content corresponding to a single digit. We then examine ways to perform channel and uncertainty compensation, and we propose a novel method for using the uncertainty in the... 

    Deep submodular network: An application to multi-document summarization

    , Article Expert Systems with Applications ; Volume 152 , 2020 Ghadimi, A ; Beigy, H ; Sharif University of Technology
    Elsevier Ltd  2020
    Employing deep learning makes it possible to learn high-level features from raw data, resulting in more precise models. On the other hand, submodularity makes the solution scalable and provides the means to guarantee a lower bound for its performance. In this paper, a deep submodular network (DSN) is introduced, which is a deep network meeting submodularity characteristics. DSN lets modular and submodular features participate in constructing a tailored model that best fits a problem. Various properties of DSN are examined and its learning method is presented. By proving that the cost function used for the learning process is convex, it is concluded that minimization can be... 

    Unsupervised image segmentation by mutual information maximization and adversarial regularization

    , Article IEEE Robotics and Automation Letters ; Volume 6, Issue 4 , 2021 , Pages 6931-6938 ; 23773766 (ISSN) Mirsadeghi, S. E ; Royat, A ; Rezatofighi, H ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2021
    Semantic segmentation is one of the basic, yet essential scene understanding tasks for an autonomous agent. The recent developments in supervised machine learning and neural networks have enjoyed great success in enhancing the performance of the state-of-the-art techniques for this task. However, their superior performance is highly reliant on the availability of a large-scale annotated dataset. In this letter, we propose a novel fully unsupervised semantic segmentation method, the so-called Information Maximization and Adversarial Regularization Segmentation (InMARS). Inspired by human perception which parses a scene into perceptual groups, rather than analyzing each pixel individually, our... 

    A fast iterative method for removing impulsive noise from sparse signals

    , Article IEEE Transactions on Circuits and Systems for Video Technology ; Volume 31, Issue 1 , 2021 , Pages 38-48 ; 10518215 (ISSN) Sadrizadeh, S ; Zarmehi, N ; Kangarshahi, E. A ; Abin, H ; Marvasti, F ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2021
    In this paper, we propose a new method to reconstruct a signal corrupted by noise where both signal and noise are sparse but in different domains. The main contribution of our algorithm is its low complexity; it has much lower run-time than most other algorithms. The reconstruction quality of our algorithm is, both objectively (in terms of PSNR and SSIM) and subjectively, better than or comparable to that of other state-of-the-art algorithms. We provide a cost function for our problem, present an iterative method to find its local minimum, and provide the analysis of the algorithm. As an application of this problem, we apply our algorithm to Salt-and-Pepper noise (SPN) and Random-Valued Impulsive Noise... 

    Step response analysis of third order OpAmps with slew-rate

    , Article IEEE/IFIP International Conference on VLSI and System-on-Chip, VLSI-SoC ; 2013 , Pages 62-63 ; 23248432 (ISSN); 9781479905249 (ISBN) Hassanpourghadi, M ; Sharifkhani, M ; Sharif University of Technology
    IEEE Computer Society  2013
    Drawing an accurate relationship between settling time and the power consumption of the amplifier is a challenging problem in switched-capacitor circuits, especially when it includes non-linear effects. In this paper, a new method is proposed for estimating this relationship, including both non-linear settling as a result of slew-rate and small-signal settling in the 3rd-order amplifier. The results show that the proposed settling time estimation is more accurate than other conventional methods when it is compared with circuit-level simulations. The proposed method has an error smaller than 10% for the third-order OpAmp in estimating settling error. This is about two times more accurate... 

    Speaker models reduction for optimized telephony text-prompted speaker verification

    , Article Canadian Conference on Electrical and Computer Engineering, 3 May 2015 through 6 May 2015 ; Volume 2015-June, Issue June , May , 2015 , Pages 1470-1474 ; 08407789 (ISSN) Kalantari, E ; Sameti, H ; Zeinali, H ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2015
    In this article, a new scheme is proposed for using the mean supervector in a text-prompted speaker verification system. In this scheme, a subsystem is constructed for each month name, and a final score based on the passphrase is computed by combining the scores of these subsystems. Results from the telephony dataset of Persian month names show that the proposed method significantly reduces EER in comparison with the state-of-the-art State-GMM-MAP method. Furthermore, it is shown that, based on the training and testing sets, we can use 12 models per speaker instead of 220. Therefore, this scheme reduces both EER and computational burden. In addition, the use of HMM instead of GMM as words' model improves... 

    Extractive summarization of multi-party meetings through discourse segmentation

    , Article Natural Language Engineering ; Volume 22, Issue 1 , 2016 , Pages 41-72 ; 13513249 (ISSN) Bokaei, M. H ; Sameti, H ; Liu, Y ; Sharif University of Technology
    Cambridge University Press  2016
    In this article, we tackle the problem of multi-party conversation summarization. We investigate the role of discourse segmentation of a conversation on meeting summarization. First, an unsupervised function segmentation algorithm is proposed to segment the transcript into functionally coherent parts, such as Monologue_i (which indicates a segment where speaker i is the dominant speaker, e.g., lecturing all the other participants) or Discussion_{x1,x2,...,xn} (which indicates a segment where speakers x1 to xn are involved in a discussion). Then the salience score for a sentence is computed by leveraging the score of the segment containing the sentence. Performance of our proposed segmentation and... 

    3D human pose estimation from image using couple sparse coding

    , Article Machine Vision and Applications ; Volume 25, Issue 6 , 2014 , Pages 1489-1499 Zolfaghari, M ; Jourabloo, A ; Gozlou, S. G ; Pedrood, B ; Manzuri-Shalmani, M. T ; Sharif University of Technology
    Recent studies have demonstrated that high-level semantics in data can be captured using sparse representation. In this paper, we propose an approach to human body pose estimation in static images based on sparse representation. Given a visual input, the objective is to estimate 3D human body pose using feature space information and geometrical information of the pose space. On the assumption that each data point and its neighbors are likely to reside on a locally linear patch of the underlying manifold, our method learns the sparse representation of the new input using both feature and pose space information and then estimates the corresponding 3D pose by a linear combination of the bases... 

    History based unsupervised data oriented parsing

    , Article International Conference Recent Advances in Natural Language Processing, RANLP ; September , 2013 , Pages 453-459 ; 13138502 (ISSN) Mesgar, M ; Ghasem Sani, G ; Sharif University of Technology
    Grammar induction is a basic step in natural language processing. Based on the volume of information that is used by different methods, we can distinguish three types of grammar induction methods: supervised, unsupervised, and semi-supervised. Supervised and semi-supervised methods require large treebanks, which may not currently exist for many languages. Accordingly, many researchers have focused on unsupervised methods. Unsupervised Data Oriented Parsing (UDOP) is currently the state of the art in unsupervised grammar induction. In this paper, we show that the performance of UDOP in free word order languages such as Persian is inferior to that in fixed word order languages such as English. We... 

    Key splitting for random key distribution schemes

    , Article Proceedings - International Conference on Network Protocols, ICNP ; 2012 ; 10921648 (ISSN) ; 9781467324472 (ISBN) Ehdaie, M ; Alexiou, N ; Ahmadian, M ; Aref, M. R ; Papadimitratos, P ; Sharif University of Technology
    A large number of Wireless Sensor Network (WSN) security schemes have been proposed in the literature, relying primarily on symmetric key cryptography. To enable those, Random Key pre-Distribution (RKD) systems have been widely accepted. However, WSN nodes are vulnerable to physical compromise. Capturing one or more nodes operating with RKD would give the adversary keys to compromise communication of other benign nodes. Thus the challenge is to enhance resilience of WSN to node capture, while maintaining the flexibility and low-cost features of RKD. We address this problem, without any special-purpose hardware, proposing a new and simple idea: key splitting. Our scheme does not increase... 

    Traffic-aware buffer reconfiguration in on-chip networks

    , Article IEEE/IFIP International Conference on VLSI and System-on-Chip, VLSI-SoC, 5 October 2015 through 7 October 2015 ; Volume 2015-October , 2015 , Pages 201-206 ; 23248432 (ISSN) ; 9781467391405 (ISBN) Bashizade, R ; Sarbazi-Azad, H ; Sharif University of Technology
    IEEE Computer Society  2015
    Networks-on-Chip (NoCs) play a crucial role in the performance of Chip Multi-Processors (CMPs). Routers are one of the main components determining the efficiency of NoCs. As various applications have different communication characteristics and, hence, buffering requirements, it is difficult to make proper decisions in this regard at design time. In this paper, we propose a traffic-aware reconfigurable router which can adapt its buffer structure to the changes in the traffic of the network. Our proposed router manages to achieve up to 18.8% and 44.4% improvements in terms of postponing saturation rate under synthetic traffic patterns, and average packet latency for PARSEC applications,... 

    Fast aggregation scheduling in wireless sensor networks

    , Article IEEE Transactions on Wireless Communications ; Volume 14, Issue 6 , 2015 , Pages 3402-3414 ; 15361276 (ISSN) Yousefi, H ; Malekimajd, M ; Ashouri, M ; Movaghar, A ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2015
    Data aggregation is a key, yet time-consuming functionality introduced to conserve energy in wireless sensor networks (WSNs). In this paper, to minimize time latency, we focus on the aggregation scheduling problem and propose an efficient distributed algorithm that generates a collision-free schedule with the least number of time slots. In contrast to others, our approach, named FAST, mainly contributes to both tree construction, where former studies employ Connected 2-hop Dominating Sets, and aggregation scheduling, which was previously addressed through the Competitor Sets computation. We prove that the latency of FAST under the protocol interference model is upper-bounded by 12R+Δ-2, where R... 

    Extractive meeting summarization through speaker zone detection

    , Article 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015, 6 September 2015 through 10 September 2015 ; Volume 2015-January , January , 2015 , Pages 2724-2728 ; 2308457X (ISSN) Bokaei, M. H ; Sameti, H ; Liu, Y ; Sharif University of Technology
    International Speech and Communication Association  2015
    In this paper, we investigate the role of discourse analysis in the extractive meeting summarization task. Specifically, our proposed method comprises two distinct steps. First, we use a meeting segmentation algorithm to detect various functional parts of the input meeting. Afterwards, a two-level scoring mechanism in a graph-based framework is used to score each dialogue act in order to extract the most valuable ones and include them in the extracted summary. We evaluate our proposed method on the AMI and ICSI corpora and compare it with other state-of-the-art graph-based algorithms according to various evaluation metrics. The experimental results show that our algorithm outperforms the... 

    MDL-CW: A multimodal deep learning framework with cross weights

    , Article 2016 IEEE Conference on Computer Vision and Pattern Recognition, 26 June 2016 through 1 July 2016 ; Volume 2016-January , 2016 , Pages 2601-2609 ; 10636919 (ISSN) ; 9781467388511 (ISBN) Rastegar, S ; Soleymani Baghshah, M ; Rabiee, H. R ; Shojaee, S. M ; Sharif University of Technology
    IEEE Computer Society 
    Deep learning has received much attention as one of the most powerful approaches for multimodal representation learning in recent years. An ideal model for multimodal data can reason about missing modalities using the available ones, and usually provides more information when multiple modalities are being considered. All the previous deep models contain separate modality-specific networks and find a shared representation on top of those networks. Therefore, they only consider high-level interactions between modalities to find a joint representation for them. In this paper, we propose a multimodal deep learning framework (MDL-CW) that exploits the cross weights between representation of... 

    Multi-label learning in the independent label sub-spaces

    , Article Pattern Recognition Letters ; Volume 97 , 2017 , Pages 8-12 ; 01678655 (ISSN) Barezi, E. J ; Kwok, J. T ; Rabiee, H. R ; Sharif University of Technology
    The objective in multi-label learning problems is the simultaneous prediction of many labels for each input instance. In past years, many embedding-based approaches have been proposed to solve this problem by considering label dependencies and decreasing learning and prediction cost. However, compressing the data loses part of the information contained in the label space. The idea in this work is to divide the whole label space into independent small groups, which leads to independent learning and prediction for each small group in the main space, rather than transforming to the compressed space. We use subspace clustering approaches to extract these small partitions such that the... 

    An attribute learning method for zero-shot recognition

    , Article 2017 25th Iranian Conference on Electrical Engineering, ICEE 2017, 2 May 2017 through 4 May 2017 ; 2017 , Pages 2235-2240 ; 9781509059638 (ISBN) Yazdanian, R ; Shojaee, S. M ; Soleymani Baghshah, M ; Sharif University of Technology
    Recently, the problem of integrating side information about classes has emerged in learning settings such as zero-shot learning. Although using multiple sources of information about the input space has been investigated in the last decade and many multi-view and multi-modal learning methods have already been introduced, attribute learning for classes (output space) is a new problem that has received attention only in the last few years. In this paper, we propose an attribute learning method that can use different sources of descriptions for classes to find new attributes that are better suited for use as class signatures. Experimental results show that the attributes learned by the proposed... 

    Effective cache bank placement for GPUs

    , Article 20th Design, Automation and Test in Europe, DATE 2017, 27 March 2017 through 31 March 2017 ; 2017 , Pages 31-36 ; 9783981537093 (ISBN) Sadrosadati, M ; Mirhosseini, A ; Roozkhosh, S ; Bakhishi, H ; Sarbazi Azad, H ; ACM Special Interest Group on Design Automation (ACM SIGDA); Electronic System Design Alliance (ESDA); et al.; European Design and Automation Association (EDAA); European Electronic Chips and Systems Design Initiative (ECSI); IEEE Council on Electronic Design Automation (CEDA) ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2017
    The placement of the Last Level Cache (LLC) banks in the GPU on-chip network can significantly affect the performance of memory-intensive workloads. In this paper, we attempt to offer a placement methodology for the LLC banks to maximize the performance of the on-chip network connecting the LLC banks to the streaming multiprocessors in GPUs. We argue that an efficient placement needs to be derived based on a novel metric that considers the latency hiding capability of the GPUs through thread level parallelism. To this end, we propose a throughput aware metric, called Effective Latency Impact (ELI). Moreover, we define an optimization problem to formulate our placement approach based on the... 

    HNP3: A hierarchical nonparametric point process for modeling content diffusion over social media

    , Article 16th IEEE International Conference on Data Mining, ICDM 2016, 12 December 2016 through 15 December 2016 ; 2017 , Pages 943-948 ; 15504786 (ISSN); 9781509054725 (ISBN) Hosseini, S. A ; Khodadadi, A ; Arabzadeh, A ; Rabiee, H. R ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2017
    This paper introduces a novel framework for modeling temporal events with complex longitudinal dependency that are generated by dependent sources. This framework takes advantage of multidimensional point processes for modeling time of events. The intensity function of the proposed process is a mixture of intensities, and its complexity grows with the complexity of temporal patterns of data. Moreover, it utilizes a hierarchical dependent nonparametric approach to model marks of events. These capabilities allow the proposed model to adapt its temporal and topical complexity according to the complexity of data, which makes it a suitable candidate for real world scenarios. An online inference... 

    Deep relative attributes

    , Article 13th Asian Conference on Computer Vision, ACCV 2016, 20 November 2016 through 24 November 2016 ; Volume 10115 LNCS , 2017 , Pages 118-133 ; 03029743 (ISSN); 9783319541921 (ISBN) Souri, Y ; Noury, E ; Adeli, E ; Sharif University of Technology
    Springer Verlag  2017
    Visual attributes are great means of describing images or scenes, in a way both humans and computers understand. In order to establish a correspondence between images and to be able to compare the strength of each property between images, relative attributes were introduced. However, since their introduction, hand-crafted and engineered features were used to learn increasingly complex models for the problem of relative attributes. This limits the applicability of those methods for more realistic cases. We introduce a deep neural network architecture for the task of relative attribute prediction. A convolutional neural network (ConvNet) is adopted to learn the features by including an...