Search for: audio-signal
Total 22 records

    Two techniques for audio watermarking based on a novel transformation

    , Article 2007 IEEE International Conference on Signal Processing and Communications, ICSPC 2007, Dubai, 14 November 2007 through 27 November 2007 ; 2007 , Pages 1139-1142 ; 9781424412365 (ISBN) Feizi khankandi, S ; Akhaee, M. A ; Marvasti, F ; Sharif University of Technology
    2007
    Abstract
    The main purpose of this paper is to embed data in the transform domain of audio signals. Data embedding is similar to the Quantization Index Modulation (QIM) approach, but the embedding process is performed in a novel transformation named PPG (Point to Point Graph), which converts the audio signal to a set of points in Cartesian coordinates. Two watermarking schemes are proposed in this paper. The first approach applies the QIM method to the radius of the PPG points, while the second approach employs the logical operand for this aim. Simulation results show that these two methods are highly robust against common attacks (such as white Gaussian noise, echo, and filtering). Subjective... 

    A novel technique for audio signals watermarking in the wavelet and Walsh transform domains

    , Article 2006 International Symposium on Intelligent Signal Processing and Communications, ISPACS'06, Yonago, 12 December 2006 through 15 December 2006 ; 2006 , Pages 171-174 ; 0780397339 (ISBN); 9780780397330 (ISBN) Akhaee, M. A ; Ghaemmaghami, S ; Khademi, N ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2006
    Abstract
    This paper presents a novel approach to audio signal watermarking in the wavelet or the Walsh transform domain. The idea is to embed watermark data in the coefficients of some scales of the transform domain. The overall bit rate of this method is about 90 bps. Due to the low computational complexity of the suggested approach, particularly in the Walsh domain, this algorithm can be implemented in real time. Experimental results show robustness of the proposed method at low SNRs and also against some typical attacks, such as MP3 compression, echo, and filtering. Subjective evaluation confirms transparency of the watermarked audio signals. © 2006 IEEE  

    Design and Implementation of the Multiplicative Watermarking Technique for Multimedia Signals

    , Ph.D. Dissertation Sharif University of Technology Akhaee, Mohammad Ali (Author) ; Marvasti, Farrokh (Supervisor)
    Abstract
    Additive and multiplicative methods are among the most effective and robust watermarking algorithms. Although the detectors of additive watermarking methods are simpler than those of multiplicative methods, additive methods do not benefit from the human visual or auditory systems. This is the main drawback of additive watermarking techniques. On the other hand, the main advantage of multiplicative watermarking methods is that the power of the watermark is proportional to the power of the host signal. In this thesis, we have introduced a new multiplicative watermarking technique for audio and image signals. For the audio signal, the embedding is performed on the wavelet coefficients. We used Maximum likelihood rule... 

    Universal adversarial attacks on text classifiers

    , Article 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019, 12 May 2019 through 17 May 2019 ; Volume 2019-May , 2019 , Pages 7345-7349 ; 15206149 (ISSN); 9781479981311 (ISBN) Behjati, M ; Moosavi Dezfooli, S. M ; Baghshah, M. S ; Frossard, P ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2019
    Abstract
    Despite the vast success neural networks have achieved in different application domains, they have been proven to be vulnerable to adversarial perturbations (small changes in the input), which lead them to produce the wrong output. In this paper, we propose a novel method, based on gradient projection, for generating universal adversarial perturbations for text, namely a sequence of words that can be added to any input in order to fool the classifier with high probability. We observed that text classifiers are quite vulnerable to such perturbations: inserting even a single adversarial word at the beginning of every input sequence can drop the accuracy from 93% to 50%. © 2019 IEEE  

    A novel pruning approach for bagging ensemble regression based on sparse representation

    , Article 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020, 4 May 2020 through 8 May 2020 ; Volume 2020 , May , 2020 , Pages 4032-4036 Khorashadi Zadeh, A. E ; Babaie Zadeh, M ; Jutten, C ; The Institute of Electrical and Electronics Engineers, Signal Processing Society ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2020
    Abstract
    This work proposes an approach for pruning a bagging ensemble regression (BER) model based on sparse representation, which we call sparse representation pruning (SRP). First, a BER model with a specific number of subensembles is trained. Then, the BER model is pruned by our sparse representation idea. For this type of regression problem, pruning means removing the subensembles that do not have a significant effect on prediction of the output. The pruning problem is cast as a sparse representation problem, which is solved by the orthogonal matching pursuit (OMP) algorithm. Experiments show that the pruned BER with only 20% of the initial subensembles has a better... 

    Quantization based audio watermarking in a new transform domain

    , Article 2008 International Symposium on Telecommunications, IST 2008, Tehran, 27 August 2008 through 28 August 2008 ; October , 2008 , Pages 682-687 ; 9781424427512 (ISBN) Akhaee, M. A ; Nikooienejad, A ; Marvasti, F ; Sharif University of Technology
    2008
    Abstract
    In this paper, a novel blind watermarking technique based on quantization is proposed. Quantization is performed in a special domain, named Point to Point Graph (PPG), which converts a one-dimensional signal to a 2-D one. The method is based on separating this domain into two portions, only one of which is quantized. Furthermore, in the dewatermarking procedure, the embedded data can be extracted by using the unquantized portion and the zero norm. The performance of the proposed method is analytically investigated and verified by simulation with artificial Gaussian signals. Experimental results over several audio signals show the great robustness of the technique in comparison with... 

    Audio classification based on sinusoidal model: a new feature

    , Article 2008 IEEE Region 10 Conference, TENCON 2008, Hyderabad, 19 November 2008 through 21 November 2008 ; 2008 ; 1424424089 (ISBN); 9781424424085 (ISBN) Shirazi, J ; Ghaemmaghami, S ; Sharif University of Technology
    2008
    Abstract
    In this paper, a new feature set is presented and evaluated based on sinusoidal modeling of audio signals. The duration of the longest sinusoidal model frequency track, as a measure of harmony, is used and compared to typical features as input to an audio classifier. The performance of this sinusoidal model feature is evaluated through classification of audio into speech and music using both the GMM and the SVM classifiers. Classification results show that the proposed feature, which is used for the first time in such audio classification, is quite successful in speech/music classification. Experimental comparisons with popular features for audio classification, such as HZCRR and LSTER,... 

    Audio watermarking based on quantization index modulation in the frequency domain

    , Article 2007 IEEE International Conference on Signal Processing and Communications, ICSPC 2007, Dubai, 14 November 2007 through 27 November 2007 ; 2007 , Pages 1127-1130 ; 9781424412365 (ISBN) Khademi, N ; Akhaee, M. A ; Ahadi, S. M ; Moradi, M ; Kashi, A ; Sharif University of Technology
    2007
    Abstract
    In this paper, our main purpose is to embed data in the frequency domain of audio signals. Data is embedded by means of Quantization Index Modulation (QIM) in the frequency domain. To this end, the spectrum of the audio signal is divided into two parts. The first part consists of the first 19 Barks and the second includes the remaining 6 Barks. Each of these parts has a different quantization step size. In order to allow large quantization step sizes, which yield more robustness, the Human Auditory System (HAS) has been used. The decoder detects the watermark signal without using the original audio signal. Simulation results have shown that this watermarking scheme has better robustness against... 
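As an illustration of the QIM idea mentioned in this abstract, the following is a minimal sketch of generic scalar QIM, not the authors' scheme. The function names `qim_embed` and `qim_decode`, the single step size `delta`, and the use of two lattices offset by `delta / 2` are assumptions made for the example; the Bark-band split and the HAS-driven step sizes described above are not modeled.

```python
import numpy as np

def qim_embed(values, bits, delta):
    """Quantize each value onto the lattice selected by its bit.

    Bit 0 uses the lattice {k * delta}; bit 1 uses {k * delta + delta / 2}.
    """
    values = np.asarray(values, dtype=float)
    bits = np.asarray(bits, dtype=int)
    offset = bits * (delta / 2.0)
    return delta * np.round((values - offset) / delta) + offset

def qim_decode(received, delta):
    """Blind decoding: pick the bit whose lattice is closest to each value."""
    received = np.asarray(received, dtype=float)
    nearest0 = delta * np.round(received / delta)
    nearest1 = delta * np.round((received - delta / 2.0) / delta) + delta / 2.0
    d0 = np.abs(received - nearest0)
    d1 = np.abs(received - nearest1)
    return (d1 < d0).astype(int)
```

In this sketch, decoding needs only `delta`, not the original values, and it tolerates perturbations smaller than `delta / 4` per coefficient. A larger step therefore gives more robustness at the cost of more distortion.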

    Localization of Loose Parts on Primary Circuit of Bushehr Nuclear Power Plant by using Acoustic Signals of Sensors

    , M.Sc. Thesis Sharif University of Technology Mahmoudabadi, Saeed (Author) ; Ghofrani, Mohammad Bagher (Supervisor)
    Abstract
    Loose parts in the primary circuit of a nuclear power plant can damage the fuel rods and other equipment, so early localization and mass estimation of these pieces can inform safety measures for the reactor after an event. Acoustic signals emitted from the location of these parts provide enough information to estimate their location and mass. The loose part location can thus be estimated from the time-of-arrival differences between sensors and the sound velocity. In this thesis, signals of the sensors in the Bushehr power plant monitoring system are analyzed. To estimate the loose part locations, the time delays between sensors must be calculated. The time difference between the sensors... 
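The abstract is truncated before it says how the time delays are computed. As a hedged illustration only, the sketch below estimates the delay between two sensor signals from the peak of their cross-correlation, a common generic approach that is not necessarily the one used in the thesis. The names `estimate_delay` and `path_difference`, the sampling rate `fs`, and `wave_speed` are hypothetical.

```python
import numpy as np

def estimate_delay(sig_a, sig_b, fs):
    """Delay of sig_b relative to sig_a, in seconds.

    A positive result means the event reaches sensor B after sensor A.
    Real sensor data would normally be filtered and detrended first;
    that step is omitted here.
    """
    a = np.asarray(sig_a, dtype=float)
    b = np.asarray(sig_b, dtype=float)
    corr = np.correlate(b, a, mode="full")
    # In "full" mode, index 0 corresponds to a lag of -(len(a) - 1) samples.
    lag = int(np.argmax(corr)) - (len(a) - 1)
    return lag / fs

def path_difference(delay_seconds, wave_speed):
    """Convert a time-of-arrival difference into a propagation path difference."""
    return delay_seconds * wave_speed
```

Each sensor pair yields one path difference. Combining several pairs with the known sensor positions constrains where the impact occurred.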

    Closure of sets: A statistically hypersensitive system for steganalysis of least significant bit embedding

    , Article IET Signal Processing ; Volume 5, Issue 4 , July , 2011 , Pages 379-389 ; 17519675 (ISSN) Khosravirad, S. R ; Eghlidos, T ; Ghaemmaghami, S ; Sharif University of Technology
    2011
    Abstract
    This study introduces a new scheme for steganalysis of least significant bit (LSB) embedding, based on the idea of closure of sets (CoS), which is independent of the type of cover signal and applicable to both spatial and transform domains. CoS refers to certain special subsets that can be found in a common space whose elements relate to higher-order statistical properties of the signal. The proposed scheme is used for steganalysis of the LSB steganography of greyscale TIFF and JPEG images and audio signals, employing a set of accurate and monotone features that are extracted based on the CoS definition. It is shown that significant improvement to the detection accuracy in... 

    Comparison of uniform and random sampling for speech and music signals

    , Article 2017 12th International Conference on Sampling Theory and Applications, SampTA 2017, 3 July 2017 through 7 July 2017 ; 2017 , Pages 552-555 ; 9781538615652 (ISBN) Zarmehi, N ; Shahsavari, S ; Marvasti, F ; Sharif University of Technology
    Abstract
    In this paper, we will provide a comparison between uniform and random sampling for speech and music signals. There are various sampling and recovery methods for audio signals. Here, we only investigate uniform and random schemes for sampling and basic low-pass filtering and iterative method with adaptive thresholding for recovery. The simulation results indicate that uniform sampling with cubic spline interpolation outperforms other sampling and recovery methods. © 2017 IEEE  

    Gray-scale image colorization using cycle-consistent generative adversarial networks with residual structure enhancer

    , Article 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020, 4 May 2020 through 8 May 2020 ; Volume 2020 , May , 2020 , Pages 2223-2227 Johari, M. M ; Behroozi, H ; The Institute of Electrical and Electronics Engineers, Signal Processing Society ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2020
    Abstract
    The colorization of gray-scale images has always been a challenging task in computer vision. Recently, novel approaches have been introduced for unsupervised image translation between two domains using Generative Adversarial Networks (GANs). Since one can consider gray-scale and color images as two separate domains, we propose a two-stage cycle-consistent network architecture to produce convincing images. First, an intermediate image is generated with a relatively uncomplicated objective function at the output. Next, at the second stage, the intermediate image is enhanced via a residual network structure with a more complicated objective function. Furthermore, by employing two... 

    Low mutual and average coherence dictionary learning using convex approximation

    , Article 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020, 4 May 2020 through 8 May 2020 ; Volume 2020-May , 2020 , Pages 3417-3421 Parsa, J ; Sadeghi, M ; Babaie Zadeh, M ; Jutten, C ; The Institute of Electrical and Electronics Engineers, Signal Processing Society ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2020
    Abstract
    In dictionary learning, a desirable property for the dictionary is to be of low mutual and average coherences. Mutual coherence is defined as the maximum absolute correlation between distinct atoms of the dictionary, whereas the average coherence is a measure of the average correlations. In this paper, we consider a dictionary learning problem regularized with the average coherence and constrained by an upper-bound on the mutual coherence of the dictionary. Our main contribution is then to propose an algorithm for solving the resulting problem based on convexly approximating the cost function over the dictionary. Experimental results demonstrate that the proposed approach has higher... 
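To make the two coherence measures in this abstract concrete, here is a small sketch. Mutual coherence follows the definition given above (maximum absolute correlation between distinct atoms). The abstract does not give an exact formula for average coherence, so the mean absolute off-diagonal correlation used below is an assumption for illustration and may differ from the paper's definition. The function name `coherence_stats` is hypothetical.

```python
import numpy as np

def coherence_stats(dictionary):
    """Return (mutual_coherence, mean_abs_correlation) for a dictionary.

    Columns of `dictionary` are the atoms; they are normalized to unit
    norm before the correlations are computed.
    """
    D = np.asarray(dictionary, dtype=float)
    D = D / np.linalg.norm(D, axis=0, keepdims=True)
    gram = np.abs(D.T @ D)
    # Keep only correlations between distinct atoms (off-diagonal entries).
    off_diagonal = gram[~np.eye(gram.shape[0], dtype=bool)]
    return off_diagonal.max(), off_diagonal.mean()
```

For an orthonormal dictionary both values are zero. They grow as atoms become more alike.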

    Multi-view face detection and recognition under varying illumination conditions by designing an illumination effect cancelling filter

    , Article 12th AES Symposium on New Trends in Audio and Video, NTAV 2008, Joined with the 12th IEEE Conference on Signal Processing: Algorithms, Architectures, Arrangements, and Applications, SPA 2008, Poznan, 25 September 2008 through 27 September 2008 ; 2008 , Pages 27-32 ; 9781457716607 (ISBN) Shoja Ghiass, R ; Fatemizadeh, E ; Sharif University of Technology
    2008
    Abstract
    This paper presents a novel approach for detection and recognition of multi-view faces whose location is unknown and whose illumination conditions vary. Faces are detected after the effect of the various illumination conditions is canceled by a proposed filter. Because the approach is independent of facial skin color, people of any skin color are detected, even in completely dark environments. Next, the detected faces are recognized. It is a well-known technique to combine the feature based methods with the template based methods in face recognition. Our experiments show that we can combine some proposed aspects of the feature based... 

    Improvements in audio classification based on sinusoidal modeling

    , Article 2008 IEEE International Conference on Multimedia and Expo, ICME 2008, Hannover, 23 June 2008 through 26 June 2008 ; 2008 , Pages 1485-1488 ; 9781424425716 (ISBN) Shirazi, J ; Ghaemmaghami, S ; Razzazi, F ; Sharif University of Technology
    2008
    Abstract
    In this paper, a set of features is presented and evaluated based on sinusoidal modeling of audio signals. Amplitude, frequency, and phase parameters of the sinusoidal model are used and compared as input features to an audio classifier system. The performance of sinusoidal model features is evaluated for classification of audio into speech and music classes using both the Gaussian and the GMM (Gaussian Mixture Model) classifiers. Experimental results show the superiority of the amplitude parameters of the sinusoidal model, which are used for the first time for such audio classification, over the popular cepstral features. By using a set of 40 sinusoidal features, we achieved... 

    Birth-death frequencies variance of sinusoidal model: a new feature for audio classification

    , Article SIGMAP 2008 - International Conference on Signal Processing and Multimedia Applications, Porto, 26 July 2008 through 29 July 2008 ; 2008 , Pages 139-144 ; 9789898111609 (ISBN) Ghaemmaghami, S ; Shirazi, J ; Sharif University of Technology
    2008
    Abstract
    In this paper, a new feature set for audio classification is presented and evaluated based on sinusoidal modeling of audio signals. The variance of the birth-death frequencies in the sinusoidal model of the signal, as a measure of harmony, is used and compared to typical features as the input to an audio classifier. The performance of this sinusoidal model feature is evaluated through classification of audio into speech and music using both the GMM and the SVM classifiers. Classification results show that the proposed feature is quite successful in speech/music classification. Experimental comparisons with popular features for audio classification, such as HZCRR and LSTER, are presented and discussed.... 

    A fast iterative method for removing impulsive noise from sparse signals

    , Article IEEE Transactions on Circuits and Systems for Video Technology ; Volume 31, Issue 1 , 2021 , Pages 38-48 ; 10518215 (ISSN) Sadrizadeh, S ; Zarmehi, N ; Kangarshahi, E. A ; Abin, H ; Marvasti, F ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc  2021
    Abstract
    In this paper, we propose a new method to reconstruct a signal corrupted by noise where both signal and noise are sparse but in different domains. The main contribution of our algorithm is its low complexity; it has much lower run-time than most other algorithms. The reconstruction quality of our algorithm is, both objectively (in terms of PSNR and SSIM) and subjectively, better than or comparable to that of other state-of-the-art algorithms. We provide a cost function for our problem, present an iterative method to find its local minimum, and provide the analysis of the algorithm. As an application of this problem, we apply our algorithm for Salt-and-Pepper noise (SPN) and Random-Valued Impulsive Noise... 

    Improvement to speech-music discrimination using sinusoidal model based features

    , Article Multimedia Tools and Applications ; Volume 50, Issue 2 , November , 2010 , Pages 415-435 ; 13807501 (ISSN) Shirazi, J ; Ghaemmaghami, S ; Sharif University of Technology
    2010
    Abstract
    This paper addresses a model-based audio content analysis for classification of speech-music mixed audio signals into speech and music. A set of new features is presented and evaluated based on sinusoidal modeling of audio signals. The new feature set, including variance of the birth frequencies and duration of the longest frequency track in sinusoidal model, as a measure of the harmony and signal continuity, is introduced and discussed in detail. These features are used and compared to typical features as inputs to an audio classifier. Performance of these sinusoidal model features is evaluated through classification of audio into speech and music using both the GMM (Gaussian Mixture Model)... 

    Robust audio and speech watermarking using Gaussian and Laplacian modeling

    , Article Signal Processing ; Volume 90, Issue 8 , 2010 , Pages 2487-2497 ; 01651684 (ISSN) Akhaee, M. A ; Khademi Kalantari, N ; Marvasti, F ; Sharif University of Technology
    2010
    Abstract
    In this paper, a semi-blind multiplicative watermarking approach for audio and speech signals is presented. At the receiver end, the optimal maximum likelihood (ML) detector, aided by the archived information, is designed and implemented for Gaussian and Laplacian signals in a noisy environment. The performance of the proposed scheme is analytically calculated and verified by simulation. Then, we adapt the proposed scheme to speech and audio signals. To improve robustness, the algorithm is applied to low frequency components of the host signal. In addition, the power of the watermark is controlled to ensure inaudibility using perceptual evaluation of audio quality (PEAQ) and perceptual... 

    A Distributed 1-bit compressed sensing algorithm robust to impulsive noise

    , Article IEEE Communications Letters ; Volume 20, Issue 6 , 2016 , Pages 1132-1135 ; 10897798 (ISSN) Zayyani, H ; Korki, M ; Marvasti, F ; Sharif University of Technology
    Institute of Electrical and Electronics Engineers Inc 
    Abstract
    This letter proposes a sparse diffusion algorithm for 1-bit compressed sensing (CS) in wireless sensor networks, and the algorithm is inherently robust against impulsive noise. The approach exploits the diffusion strategy from distributed learning in the 1-bit CS framework. To estimate a common sparse vector cooperatively from only the sign of measurements, a steepest descent method that minimizes the suitable global and local convex cost functions is used. A diffusion strategy is suggested for distributive learning of the sparse vector. A new application of the proposed algorithm to sparse channel estimation is also introduced. The proposed sparse diffusion algorithm is compared with both...