Robust phoneme recognition using MLP neural networks in various domains of MFCC features

, Article 2010 5th International Symposium on Telecommunications, IST 2010, 4 December 2010 through 6 December 2010, Tehran ; 2010 , Pages 755-759 ; 9781424481835 (ISBN) Dabbaghchian, S ; Sameti, H ; Ghaemmaghami, M. P ; BabaAli, B ; Sharif University of Technology

2010

Abstract

This paper focuses on enhancing MFCC features using a set of MLP NNs in order to improve phoneme recognition accuracy under different noise types and SNRs. An NN can be placed in different domains (between any pair of MFCC feature extraction blocks), namely the FFT, MEL, LOG, DCT and DELTA domains. The domains differ in complexity and achieve different degrees of improvement. A comparative study is done in this paper in order to find the best domain. Furthermore, a set of MLP NNs, instead of one NN, is used to enhance various noise types with different levels of SNRs. In this case, each NN is trained with a special noise type and SNR. The database used in the simulations is created by artificially...
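
As a rough illustration of what "between any pair of MFCC feature extraction blocks" can mean, the sketch below chains the five stages named in the abstract and lets a caller apply an enhancement function after one chosen stage. It is not the authors' implementation: the function names, the naive DFT, the filterbank size, and the `enhance` callback (a stand-in for a trained MLP) are all assumptions made for the example.

```python
import cmath
import math

def power_spectrum(frame):
    """FFT domain: power of the first N/2+1 DFT bins (naive DFT for clarity)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spec.append(abs(s) ** 2)
    return spec

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters whose centers are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    # Map edge frequencies to fractional bin positions.
    pos = [f / (sample_rate / 2.0) * (n_bins - 1) for f in edges_hz]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = pos[j - 1], pos[j], pos[j + 1]
        row = []
        for b in range(n_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def dct2(values, n_coeffs):
    """DCT domain: unnormalized DCT-II, keeping the first n_coeffs terms."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_coeffs)]

def deltas(frames):
    """DELTA domain: central difference between neighboring frames."""
    last = len(frames) - 1
    return [[(b - a) / 2.0 for a, b in zip(frames[max(t - 1, 0)], frames[min(t + 1, last)])]
            for t in range(len(frames))]

def mfcc_with_hook(frames, sample_rate, enhance=None, domain=None, n_filters=8, n_coeffs=5):
    """Run FFT -> MEL -> LOG -> DCT -> DELTA, applying `enhance` after `domain`."""
    def hook(name, vec):
        return enhance(vec) if (enhance is not None and name == domain) else vec

    n_bins = len(frames[0]) // 2 + 1
    bank = mel_filterbank(n_filters, n_bins, sample_rate)
    cepstra = []
    for frame in frames:
        spec = hook("FFT", power_spectrum(frame))
        mel = hook("MEL", [sum(w * p for w, p in zip(row, spec)) for row in bank])
        logmel = hook("LOG", [math.log(max(e, 1e-10)) for e in mel])
        cepstra.append(hook("DCT", dct2(logmel, n_coeffs)))
    delta = [hook("DELTA", d) for d in deltas(cepstra)]
    return [c + d for c, d in zip(cepstra, delta)]

# Three 64-sample frames of a 440 Hz tone at an assumed 8 kHz sampling rate.
sr = 8000
frames = [[math.sin(2 * math.pi * 440 * (t + 64 * i) / sr) for t in range(64)] for i in range(3)]
features = mfcc_with_hook(frames, sr)
# Placeholder "enhancer" in the LOG domain: adds 1.0 to every log-mel value.
shifted = mfcc_with_hook(frames, sr, enhance=lambda v: [x + 1.0 for x in v], domain="LOG")
```

Where the hook sits determines the dimensionality and dynamic range the enhancer has to handle, which is one way the domains can differ in complexity.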

Large-scale testing on specific fracture energy determination of dam concrete

, Article International Journal of Fracture ; Volume 141, Issue 1-2 , 2006 , Pages 247-254 ; 03769429 (ISSN) Ghaemmaghami, A ; Ghaemian, M ; Sharif University of Technology

2006

Abstract

The specific fracture energy of dam concrete is a basic material characteristic needed for the prediction of concrete dam behavior. Data on fracture properties of dam concrete are quite limited to date. A series of tests based on the size effect was carried out on a number of geometrically similar notched specimens of various sizes. The experimental tests include three-point bending tests. The specimens were of square cross section with a span to depth ratio of 2/5. Three different specimens with depths of 200, 400 and 800 mm were considered for the purpose of testing. Concrete mixtures are provided from the Caroon 3 dam project site using river gravel or commonly crushed stones from...

Shaking table test on small-scale retrofitted model of Sefid-rud concrete buttress dam

, Article Earthquake Engineering and Structural Dynamics ; Volume 39, Issue 1 , 2010 , Pages 109-118 ; 00988847 (ISSN) Ghaemmaghami, A. R ; Ghaemian, M ; Sharif University of Technology

2010

Abstract

Sefid-rud concrete buttress dam, with a height of 106 m, was damaged during the devastating 1990 Manjil earthquake. The dam was repaired and strengthened using epoxy grouting of cracks and the installation of post-tensioned anchors. In a previous study, the nonlinear seismic response of the highest monolith with empty reservoir was investigated experimentally through model testing. A geometric-scaled model of 1:30 was tested on a shaking table to study dynamic cracking of the model. As a result of the similarity between the model and prototype cracking patterns, the model was retrofitted according to the prototype retrofitting plan after the Manjil earthquake and re-tested on the shaking table to estimate the...

Experimental seismic investigation of Sefid-rud concrete buttress dam model on shaking table

, Article Earthquake Engineering and Structural Dynamics ; Volume 37, Issue 5 , 2008 , Pages 809-823 ; 00988847 (ISSN) Ghaemmaghami, A. R ; Ghaemian, M ; Sharif University of Technology

John Wiley and Sons Ltd 2008

Abstract

Owing to the devastating M7.6 earthquake of 20 June 1990 that occurred in the northern province of Iran, Sefid-rud concrete buttress dam located near the epicenter was severely shaken. The crack penetrated throughout the dam thickness near slope discontinuity, causing severe leakage, but with no general failure. In this study, nonlinear seismic response of the highest monolith with empty reservoir is investigated experimentally through model testing. A geometric-scaled model of 1:30 was tested on a shaking table with high-frequency capability to study dynamic cracking of the model and serve as data for nonlinear computer model calibration. Three construction joints are set up in the model to...

On the effect of spatial to compressed domains transformation in LSB-based image steganography

, Article 7th IEEE/ACS International Conference on Computer Systems and Applications, AICCSA-2009, Rabat, 10 May 2009 through 13 May 2009 ; 2009 , Pages 260-264 ; 9781424438068 (ISBN) Sarreshtedari, S ; Ghotbi, M ; Ghaemmaghami, S ; Sharif University of Technology

2009

Abstract

This paper introduces an efficient scheme for image steganography that inserts the hidden message (payload) in the spatial domain and then transforms the stego-image to the compressed domain. We apply a recently-proposed LSB method in order to obtain better statistical behavior of the stego-message; subsequently, the obtained stego-image is transformed and quantized in order to enhance the security of hiding. Performance analysis comparisons confirm a higher efficiency for our proposed method. Compared to recently-proposed approaches, our method offers the advantage that it combines an efficient LSB method with transform domain security. © 2009 IEEE
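
For readers unfamiliar with spatial-domain LSB embedding, a minimal sketch follows. It uses plain LSB replacement, which is a stand-in: the abstract does not name the "recently-proposed LSB method", and the later transform and quantization step is not shown. The function names and sample pixel values are hypothetical.

```python
def embed_lsb(pixels, bits):
    """Return a copy of `pixels` with the LSB of the first len(bits) values replaced."""
    if len(bits) > len(pixels):
        raise ValueError("payload longer than cover")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the payload bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego

def extract_lsb(pixels, n_bits):
    """Read back the first n_bits payload bits from the pixel LSBs."""
    return [p & 1 for p in pixels[:n_bits]]

cover = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical 8-bit gray levels
payload = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, payload)
recovered = extract_lsb(stego, len(payload))
```

Each pixel changes by at most one gray level, which is why the embedding is visually imperceptible but can still leave statistical traces.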

Interpolative coding of speech parameters using hierarchical temporal decomposition

, Article Digital Signal Processing: A Review Journal ; Volume 13, Issue 3 , 2003 , Pages 433-456 ; 10512004 (ISSN) Ghaemmaghami, S ; Deriche, M ; Sridharan, S ; Sharif University of Technology

Elsevier Inc 2003

Abstract

A new method for temporal decomposition (TD) of speech parameters for very low rate coding applications is developed. Unlike typical TD, the phonetic relevance is not considered here; instead, we represent the spectral parameters of speech using pre-defined interpolation functions. These functions are located at instants which give maximum correlation with the true event structure. In this method, no event refinement is required, which significantly reduces the computational complexity of the coder and makes real-time implementation possible. The method is also highly flexible and can comply with diverse coding system attributes such as bit-rate, accuracy, delay, and complexity. A spectral...

Non-Smooth regularization: improvement to learning framework through extrapolation

, Article IEEE Transactions on Signal Processing ; Volume 70 , 2022 , Pages 1213-1223 ; 1053587X (ISSN) Amini, S ; Soltanian, M ; Sadeghi, M ; Ghaemmaghami, S ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2022

Abstract

Deep learning architectures employ various regularization terms to handle different types of priors. Non-smooth regularization terms have shown promising performance in the deep learning architectures and a learning framework has recently been proposed to train autoencoders with such regularization terms. While this framework efficiently manages the non-smooth term during training through proximal operators, it is limited to autoencoders and suffers from low convergence speed due to several optimization sub-problems that must be solved in a row. In this paper, we address these issues by extending the framework to general feed-forward neural networks and introducing variable extrapolation...

A low complexity NSAF algorithm

, Article IEEE Signal Processing Letters ; Volume 19, Issue 11 , August , 2012 , Pages 716-719 ; 10709908 (ISSN) Rabiee, M ; Attari, M. A ; Ghaemmaghami, S ; Sharif University of Technology

IEEE 2012

Abstract

This letter proposes a novel normalized subband adaptive filter (NSAF) algorithm, which applies variable step sizes to subband filters to improve the convergence performance of the conventional NSAF and updates only a subset of the subbands per iteration to reduce its computational complexity. The selection process for each subband is based on the amount of improvement it makes to the mean square deviation at every iteration. Simulation results show significant reduction in computational complexity, faster convergence rate, and lower misadjustment error achieved using the proposed scheme.

Robust video watermarking using maximum likelihood decoder

, Article European Signal Processing Conference, 29 August 2011 through 2 September 2011, Barcelona ; 2011 , Pages 2044-2048 ; 22195491 (ISSN) Diyanat, A ; Akhaee, M. A ; Ghaemmaghami, S ; Sharif University of Technology

2011

Abstract

In this paper, a robust multiplicative video watermarking scheme is presented. We segment the video signal into 3-D blocks like cubes, and then apply a 3-D wavelet transform to each block. The watermark is inserted by multiplying the low frequency wavelet coefficients by a constant parameter that controls the power of the watermark. The proposed watermark extraction procedure is based on the maximum likelihood rule applied to the watermarked wavelet coefficients.

Audio segmentation and classification based on a selective analysis scheme

, Article Proceedings - 10th International Multimedia Modelling Conference, MMM 2004, Brisbane, 5 January 2004 through 7 January 2004 ; 2004 , Pages 42-48 ; 0769520847 (ISBN); 9780769520841 (ISBN) Ghaemmaghami, S ; Sharif University of Technology

2004

Abstract

This paper addresses a new approach to segmentation and classification of audio through analysis of a smaller set of selective frames, which are identified by temporal decomposition (TD). These frames are located at the most steady instants, or event centroids, within a given block of the signal, which yield the maximal diversity over the set of selected features. Based on this selection scheme, the number of frames used in the analysis is reduced by at least 40%, while the temporal resolution is doubled as compared to that in typical audio classifiers. By constructing a classification system to segment audio into speech, music, speech-music, and others, it is shown that the proposed method...

Toward naturalness in narrow-band speech compression

, Article 2000 IEEE International Conference on Multimedia and Expo (ICME 2000), New York, NY, 30 July 2000 through 2 August 2000 ; Issue I/MONDAY , 2000 , Pages 440-443 Ghaemmaghami, S ; Sharif University of Technology

2000

Abstract

This paper addresses a new mixed model for characterizing LPC excitation on a 3-band basis through analyzing the harmonic structure of the residual signal. In addition, a sub-frame based analysis is developed for detecting both aperiodic pulses and noisy signals, which plays a major role in reducing the perceptual errors introduced by certain consonants. Preliminary results show that near natural speech is achieved at 1050 bps, allocated to the excitation parameters, suggesting superiority of the proposed coding scheme to the MELP-2400 coding standard in the sense of perceptual quality of the reconstructed speech.

Towards higher detection accuracy in blind steganalysis of JPEG images

, Article 24th Iranian Conference on Electrical Engineering, ICEE 2016, 10 May 2016 through 12 May 2016 ; 2016 , Pages 1860-1864 ; 9781467387897 (ISBN) Zohourian, M ; Heidari, M ; Ghaemmaghami, S ; Gholampour, I ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2016

Abstract

A new steganalysis system for JPG-based image data hiding is proposed in this paper. We use features extracted from both wavelet and DCT domains that are refined later in the sense of utmost discrimination between the clear and stego images in the classification system. Statistical properties of the SVD of wavelet sub-bands are combined with the extended DCT-Markov features, and the features that are most sensitive to the data embedding are chosen through an SVM-RFE based selection algorithm. Experimental results show significant improvement over baseline methods, especially for steganalysis of Perturbed Quantization (PQ), which is known to be one of the most secure JPG-based steganography...

Multi-dimensional correlation steganalysis

, Article MMSP 2011 - IEEE International Workshop on Multimedia Signal Processing ; 2011 ; 9781457714337 (ISBN) Farhat, F ; Diyanat, A ; Ghaemmaghami, S ; Aref, M. R ; Sharif University of Technology

2011

Abstract

Multi-dimensional spatial analysis of image pixels has not been much investigated for the steganalysis of LSB steganographic methods. Pixel distribution based steganalysis methods could be thwarted by intelligently compensating statistical characteristics of image pixels, as reported in several papers. Simple LSB replacement methods have been improved by introducing smarter LSB embedding approaches, e.g. LSB matching and LSB+ methods, but they are basically the same in the sense of the LSB alteration. A new analytical method to detect LSB stego images is proposed in this paper. Our approach is based on the relative locations of image pixels that are essentially changed in an LSB...

Birth-death frequencies variance of sinusoidal model a new feature for audio classification

, Article SIGMAP 2008 - International Conference on Signal Processing and Multimedia Applications, Porto, 26 July 2008 through 29 July 2008 ; 2008 , Pages 139-144 ; 9789898111609 (ISBN) Ghaemmaghami, S ; Shirazi, J ; Sharif University of Technology

2008

Abstract

In this paper, a new feature set for audio classification is presented and evaluated based on sinusoidal modeling of audio signals. The variance of the birth-death frequencies in the sinusoidal model of the signal, as a measure of harmony, is used and compared to typical features as the input to an audio classifier. The performance of this sinusoidal model feature is evaluated through classification of audio into speech and music using both GMM and SVM classifiers. Classification results show that the proposed feature is quite successful in speech/music classification. Experimental comparisons with popular features for audio classification, such as HZCRR and LSTER, are presented and discussed....
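
The two baseline features named above can be sketched as follows. The sketch assumes the definitions commonly cited in the speech/music discrimination literature: HZCRR as the fraction of frames whose zero-crossing rate exceeds 1.5 times the window average, and LSTER as the fraction of frames whose short-time energy falls below 0.5 times the window average. The abstract does not state which definitions the paper used, and the toy frames are made up for the example.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

def hzcrr(frames):
    """High zero-crossing rate ratio: share of frames with ZCR > 1.5 x mean ZCR."""
    zcr = [zero_crossing_rate(f) for f in frames]
    mean = sum(zcr) / len(zcr)
    return sum(1 for z in zcr if z > 1.5 * mean) / len(zcr)

def lster(frames):
    """Low short-time energy ratio: share of frames with STE < 0.5 x mean STE."""
    ste = [short_time_energy(f) for f in frames]
    mean = sum(ste) / len(ste)
    return sum(1 for e in ste if e < 0.5 * mean) / len(ste)

# Toy window: three loud, steady frames and one quiet, rapidly alternating frame.
window = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [0.1, -0.1, 0.1, -0.1]]
hz = hzcrr(window)
ls = lster(window)
```

Under these definitions, speech tends to score higher on both ratios than music because it alternates between voiced, unvoiced and silent segments.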

Deep Learning Based on Sparse Coding for Data Classification

, Ph.D. Dissertation Sharif University of Technology Amini, Sajjad (Author) ; Ghaemmaghami, Shahrokh (Supervisor)

Abstract

Deep neural networks had not made comparable progress until the last decade, due to computational complexity and principal challenges such as gradient vanishing. Thanks to newly designed hardware architectures and great breakthroughs in the 2000s leading to the solution of the principal challenges, we currently face a tsunami of deep architecture utilization in various machine learning applications. Sparsity of a representation, as a feature that makes it more descriptive, has been considered in different deep learning architectures, leading to different formulations where sparsity is imposed on specific representations. Due to the gradient based optimization methods for training deep architectures, smooth regularizers...

Noise reduction algorithm for robust speech recognition using MLP neural network

, Article PACIIA 2009 - 2009 2nd Asia-Pacific Conference on Computational Intelligence and Industrial Applications, 28 November 2009 through 29 November 2009 ; Volume 1 , 2009 , Pages 377-380 ; 9781424446070 (ISBN) Ghaemmaghami, M. P ; Razzazi, F ; Sameti, H ; Dabbaghchian, S ; BabaAli, B ; Sharif University of Technology

2009

Abstract

We propose an efficient and effective nonlinear feature domain noise suppression algorithm, motivated by the minimum mean square error (MMSE) optimization criterion. A Multi Layer Perceptron (MLP) neural network in the log spectral domain minimizes the difference between noisy and clean speech. By using this method as a pre-processing stage of a speech recognition system, the recognition rate in noisy environments is improved. We can extend the application of the system to different environments with different noises without re-training it. We need only to train the preprocessing stage with a small portion of noisy data, which is created by artificially adding different types of noises from the...
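
The idea of training an MLP to map noisy log-spectral vectors to clean ones under a squared-error criterion can be sketched in a few lines. Everything concrete here is an assumption for illustration: the single tanh hidden layer, the layer size, the learning rate, and the synthetic data (a linear distortion plus Gaussian noise standing in for real noisy and clean speech features). The paper's actual architecture and training setup are not given in the abstract.

```python
import math
import random

def train_denoiser(noisy, clean, hidden=4, lr=0.05, epochs=200, seed=0):
    """Fit a one-hidden-layer MLP by per-sample gradient descent on squared error."""
    rng = random.Random(seed)
    d = len(noisy[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(d)]
    b2 = [0.0] * d

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
        y = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(w2, b2)]
        return h, y

    def mse():
        total = sum((yi - ti) ** 2
                    for x, t in zip(noisy, clean)
                    for yi, ti in zip(forward(x)[1], t))
        return total / (len(noisy) * d)

    history = [mse()]
    for _ in range(epochs):
        for x, t in zip(noisy, clean):
            h, y = forward(x)
            err = [yi - ti for yi, ti in zip(y, t)]  # output error (gradient up to a factor 2)
            # Backpropagate through the tanh layer before touching w2.
            gh = [(1 - h[j] ** 2) * sum(err[k] * w2[k][j] for k in range(d))
                  for j in range(hidden)]
            for k in range(d):
                for j in range(hidden):
                    w2[k][j] -= lr * err[k] * h[j]
                b2[k] -= lr * err[k]
            for j in range(hidden):
                for i in range(d):
                    w1[j][i] -= lr * gh[j] * x[i]
                b1[j] -= lr * gh[j]
        history.append(mse())
    return (lambda x: forward(x)[1]), history

# Synthetic stand-in for (noisy, clean) log-spectral training pairs.
data_rng = random.Random(1)
clean = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
noisy = [[0.8 * c + 0.3 + data_rng.gauss(0, 0.05) for c in row] for row in clean]
denoise, history = train_denoiser(noisy, clean)
```

The returned `denoise` function would sit in front of the recognizer as a pre-processing stage, and `history` records the mean squared error before training and after each epoch.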

Robust speech recognition using MLP neural network in log-spectral domain

, Article IEEE International Symposium on Signal Processing and Information Technology, ISSPIT 2009, 14 December 2009 through 16 December 2009, Ajman ; 2009 , Pages 467-472 ; 9781424459506 (ISBN) Ghaemmaghami, M. P ; Sameti, H ; Razzazi, F ; BabaAli, B ; Dabbaghchian, S ; Sharif University of Technology

2009

Abstract

In this paper, we have proposed an efficient and effective nonlinear feature domain noise suppression algorithm, motivated by the minimum mean square error (MMSE) optimization criterion. A Multi Layer Perceptron (MLP) neural network in the log spectral domain has been employed to minimize the difference between noisy and clean speech. By using this method as a pre-processing stage of a speech recognition system, the recognition rate in noisy environments has been improved. We extended the application of the system to different environments with different noises without retraining the HMM model. We trained the feature extraction stage with a small portion of noisy data which was created by...

Improve Performance of Higher Order Statistics in Spatial and Frequency Domains in Blind Image Steganalysis

, M.Sc. Thesis Sharif University of Technology Shakeri, Ehsan (Author) ; Ghaemmaghami, Shahrokh (Supervisor)

Abstract

Blind image steganalysis is a technique that requires no prior information about the steganographic method applied to the stego image to determine whether the image contains an embedded message. The basic idea of blind steganalysis is to extract some features sensitive to information hiding, and then exploit classifiers for judging whether a given test image contains a secret message. The main focus of this research is to design and choose features sensitive to the embedding changes. In fact, we use high order moments in different domains, such as the spatial, DCT and multi-resolution domains, in order to improve the performance of existing steganalyzers. Accordingly, first, we...

Information Hiding of Visual Multimedia Signals Based on an Entropic Transcript

, M.Sc. Thesis Sharif University of Technology Diyanat, Abolfazl (Author) ; Ghaemmaghami, Shahrokh (Supervisor)

Abstract

Steganography is the art and science of writing hidden messages in such a way that no one, apart from the sender and intended recipient, suspects the existence of the message, a form of security through obscurity. In this thesis, we focus on the entropic aspects of multimedia signals in two branches of information hiding, namely steganography and watermarking. Block selection and noise estimation in watermarking, and analysis of the singular value decomposition in steganography, are examples of the entropic issues we address in this thesis. Two new designs for AVI video signals are presented for watermarking. For both proposed methods, first the AVI video signal will be divided...

Analysis of Sensitivity of Features to Data Embedding in Blind Image Steganalysis

, M.Sc. Thesis Sharif University of Technology Heidari, Mortaza (Author) ; Ghaemmaghami, Shahrokh (Supervisor)

Abstract

Steganalysis is the science of detecting covert communication. It is called blind (universal) if it is designed to detect stego images produced by a wide range of embedding methods. In this approach, statistical properties of the image are explored, regardless of the embedding procedure employed. The main problem for image steganalysis is to find sensitive features and characteristics of the image which make a statistically significant difference between the clean and stego images. In this thesis we propose a blind image steganalysis method based on the singular value decomposition (SVD) of the discrete cosine transform (DCT) coefficients that are revisited in this work in order to enhance the...