Hierarchical concept score postprocessing and concept-wise normalization in CNN-based video event recognition

, Article IEEE Transactions on Multimedia ; Volume 21, Issue 1 , 2019 , Pages 157-172 ; 15209210 (ISSN) Soltanian, M ; Ghaemmaghami, S ; Sharif University of Technology

Institute of Electrical and Electronics Engineers Inc 2019

Abstract

This paper is focused on video event recognition based on frame level convolutional neural network (CNN) descriptors. Using transfer learning, the image trained descriptors are applied to the video domain to make event recognition feasible in scenarios with limited computational resources. After fine-tuning of the existing CNN concept score extractors, pretrained on ImageNet, the output descriptors of the different fully connected layers are employed as frame descriptors. The resulting descriptors are hierarchically postprocessed and combined with novel and efficient pooling and normalization methods. As major contributions of this paper to the video event recognition, we present a...

Mammogram image retrieval via sparse representation

, Article 2011 1st Middle East Conference on Biomedical Engineering, MECBME 2011, Sharjah, 21 February 2011 through 24 February 2011 ; 2011 , Pages 63-66 ; 9781424470006 (ISBN) Siyahjani, F ; Ghaffari, A ; Fatemizadeh, E ; Sharif University of Technology

2011

Abstract

In recent years there has been a great effort to enhance computer-aided diagnosis systems. Since proven similar pathologies from past cases play an important role in the diagnosis of current cases, content-based medical image retrieval has emerged. In this work we have designed a decision-making machine that utilizes a sparse representation technique to preserve semantic category relevance between the retrieved images and the query image. This machine comprises optimized wavelets (adapted using the lifting scheme) to extract appropriate visual features in order to grasp the visual content of the images. Afterwards, by using some classical methods, raw data vectors become applicable for sparse...

Content based mammogram image retrieval based on the multiclass visual problem

, Article 2010 17th Iranian Conference of Biomedical Engineering, ICBME 2010 - Proceedings, 3 November 2010 through 4 November 2010, Isfahan ; 2010 ; 9781424474844 (ISBN) Siyahjani, F ; Fatemizadeh, E ; Sharif University of Technology

2010

Abstract

Since expertise elicited from past resolved cases plays an important role in medical applications, and images acquired from various cases contribute greatly to the diagnosis of abnormalities, content-based medical image retrieval has become an active research area for many scientists. In this article we propose a new framework to retrieve visually similar images from a large database, in which visual relevance is regarded as much as semantic category similarity. We used an optimized wavelet transform as the multi-resolution analysis of the images and extracted various statistical SGLDM features from different resolutions; then, after reducing the feature space, we used error correcting codes...

Using minimum matching for clustering with balancing constraints

, Article 2009 Second ISECS International Colloquium on Computing, Communication, Control, and Management, CCCM 2009, Sanya, 8 August 2009 through 9 August 2009 ; Volume 1 , 2009 , Pages 225-228 ; 9781424442461 (ISBN) Shirali Shahreza, S ; Abolhassani, H ; Shirali Shahreza, M. H ; Yangzhou University; Guangdong University of Business Studies; Wuhan Institute of Technology; IEEE SMC TC on Education Technology and Training; IEEE Technology Management Council ; Sharif University of Technology

2009

Abstract

Clustering is a major task in data mining which is used in many applications. However, general clustering is inappropriate for many applications where some constraints should be applied. One category of these constraints is the cluster size constraint. In this paper, we propose a new algorithm for solving clustering with balancing constraints by using minimum matching. We compare our algorithm with the method proposed by Banerjee and Ghosh that uses stable matching and show that our algorithm converges to the final solution in fewer iterations. ©2009 IEEE
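
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of a balanced assignment step posed as a minimum-cost matching. It assumes each cluster is expanded into equally many "slots" and that points are matched to slots so that the total squared distance is minimal. The function names (`sq_dist`, `balanced_assign`) are hypothetical, and the brute-force search over permutations is a stand-in for a proper matching solver that is practical only for very small inputs.

```python
from itertools import permutations


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def balanced_assign(points, centroids):
    """Assign points to clusters of equal size with minimum total cost.

    Each centroid is replicated into n/k slots, and every one-to-one
    matching of points to slots is tried (brute force, tiny inputs only).
    Returns (labels, total_cost).
    """
    n, k = len(points), len(centroids)
    if n % k != 0:
        raise ValueError("number of points must be a multiple of the number of clusters")
    size = n // k
    # slots[s] is the cluster index that slot s belongs to.
    slots = [c for c in range(k) for _ in range(size)]
    best_cost, best_labels = float("inf"), None
    for perm in permutations(range(n)):
        labels = [slots[perm[i]] for i in range(n)]
        cost = sum(sq_dist(points[i], centroids[labels[i]]) for i in range(n))
        if cost < best_cost:
            best_cost, best_labels = cost, labels
    return best_labels, best_cost
```

In a full clustering loop, this assignment step would alternate with a centroid update, as in k-means. For realistic data sizes the permutation search would have to be replaced by a polynomial-time assignment solver.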

M-quiz by SMS

, Article 6th International Conference on Advanced Learning Technologies, ICALT 2006, Kerkrade, 5 July 2006 through 7 July 2006 ; Volume 2006 , 2006 , Pages 726-729 ; 0769526322 (ISBN); 9780769526324 (ISBN) Shahreza, M. S ; Sharif University of Technology

2006

Abstract

Virtual learning is a new idea that has taken a new form with the emergence of new technologies such as wireless networks. The mobile phone (cell phone) is a device that is used by most people nowadays. Therefore, one can use the mobile phone for virtual learning on a wide scale. One of the popular and at the same time simple and inexpensive services on the mobile phone is SMS (Short Message Service). In this paper I propose a method for taking multiple-choice quizzes by using SMS on mobile phones. In the provision of these tests, some SMS messages were sent to the student along with the answers to the questions, which were hidden in an image using steganography. The student, after receiving...

Fast content based color image retrieval system based on texture analysis of edge map

, Article Advanced Materials Research, 8 July 2011 through 11 July 2011 ; Volume 341-342 , July , 2012 , Pages 168-172 ; 10226680 (ISSN) ; 9783037852521 (ISBN) Salehian, H ; Zamani, F ; Jamzad, M ; Sharif University of Technology

Abstract

In this paper we propose a method for CBIR based on the combination of texture, edge map and color. As the texture of edges yields important information about the images, we utilized an adaptive edge detector that produces a binary edge image. Also, using the statistics of color in two different color spaces provides complementary information to retrieve images. Our method is time efficient since we have applied texture calculations on the binary edge image. Our experimental results on the SIMPLIcity database showed both higher accuracy and lower time complexity for our method compared with similar related works.

Secure steganography based on embedding capacity

, Article International Journal of Information Security ; Volume 8, Issue 6 , 2009 , Pages 433-445 ; 16155262 (ISSN) Sajedi, H ; Jamzad, M ; Sharif University of Technology

2009

Abstract

The embedding capacity of steganography methods is mostly assessed in terms of non-zero DCT coefficients. Due to the unequal distribution of non-zero DCT coefficients in images with different contents, images with the same number of non-zero DCT coefficients may have different actual embedding capacities. This paper introduces embedding capacity as a property of images in the presence of multiple steganalyzers, and discusses a method for computing the embedding capacity of cover images. Using the capacity constraint, embedding can be done more securely than when the embedder does not know how much data can be hidden securely in an image. In our proposed approach, an ensemble system that uses...

User adaptive clustering for large image databases

, Article Proceedings - International Conference on Pattern Recognition, 23 August 2010 through 26 August 2010, Istanbul ; 2010 , Pages 4271-4274 ; 10514651 (ISSN) ; 9780769541099 (ISBN) Saboorian, M. M ; Jamzad, M ; Rabiee, H. R ; Sharif University of Technology

2010

Abstract

Searching large image databases is a time consuming process when done manually. Current CBIR methods mostly rely on training data in specific domains. When source and domain of images are unknown, unsupervised methods provide better solutions. In this work, we use a hierarchical clustering scheme to group images in an unknown and large image database. In addition, the user should provide the current class assignment of a small number of images as a feedback to the system. The proposed method uses this feedback to guess the number of required clusters, and optimizes the weight vector in an iterative manner. In each step, after modification of the weight vector, the images are reclustered. We...

Multi-modal deep distance metric learning

, Article Intelligent Data Analysis ; Volume 21, Issue 6 , 2017 , Pages 1351-1369 ; 1088467X (ISSN) Roostaiyan, S. M ; Imani, E ; Soleymani Baghshah, M ; Sharif University of Technology

IOS Press 2017

Abstract

In many real-world applications, data contain heterogeneous input modalities (e.g., web pages include images, text, etc.). Moreover, data such as images are usually described using different views (i.e. different sets of features). Learning a distance metric or similarity measure that originates from all input modalities or views is essential for many tasks such as content-based retrieval ones. In these cases, similar and dissimilar pairs of data can be used to find a better representation of data in which similarity and dissimilarity constraints are better satisfied. In this paper, we incorporate supervision in the form of pairwise similarity and/or dissimilarity constraints into...

Toward real-time image annotation using marginalized coupled dictionary learning

, Article Journal of Real-Time Image Processing ; Volume 19, Issue 3 , 2022 , Pages 623-638 ; 18618200 (ISSN) Roostaiyan, S. M ; Hosseini, M. M ; Mohammadi Kashani, M ; Amiri, S. H ; Sharif University of Technology

Springer Science and Business Media Deutschland GmbH 2022

Abstract

In most image retrieval systems, images include various high-level semantics, called tags or annotations. Virtually all the state-of-the-art image annotation methods that handle imbalanced labeling are search-based techniques which are time-consuming. In this paper, a novel coupled dictionary learning approach is proposed to learn a limited number of visual prototypes and their corresponding semantics simultaneously. This approach leads to a real-time image annotation procedure. Another contribution of this paper is that it utilizes a marginalized loss function instead of the squared loss function that is inappropriate for image annotation with imbalanced labels. We have employed a marginalized...

Automatic image annotation by a loosely joint non-negative matrix factorisation

, Article IET Computer Vision ; Volume 9, Issue 6 , November , 2015 , Pages 806-813 ; 17519632 (ISSN) Rad, R ; Jamzad, M ; Sharif University of Technology

Institution of Engineering and Technology 2015

Abstract

Nowadays, the number of digital images has increased so much that the management of this volume of data needs an efficient system for browsing, categorising and searching. Automatic image annotation is designed for assigning tags to images for more accurate retrieval. Non-negative matrix factorisation (NMF) is a traditional machine learning technique for decomposing a matrix into a set of basis vectors and coefficients under non-negativity constraints. In this study, the authors propose a two-step algorithm for designing an automatic image annotation system that employs the NMF framework for its first step and a variant of K-nearest neighbourhood as its second step. In the first step, a new multimodal...
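
To make the decomposition concrete, here is a minimal sketch of plain NMF using the standard multiplicative update rules for the squared (Frobenius) reconstruction error. It is not the multimodal or multi-view variant proposed in the papers listed here. The helper names (`matmul`, `transpose`, `frob_error`, `nmf`) and the parameters `rank`, `iters`, `eps` and `seed` are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, eps=1e-9, seed=0):
    """Factorise a non-negative matrix V (m x n) as W (m x rank) times H (rank x n)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    # Strictly positive random initialisation keeps the updates well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(m)]
    return w, h
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout. The columns of W play the role of basis vectors and the columns of H hold the coefficients of each data column in the latent space.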

A multi-view-group non-negative matrix factorization approach for automatic image annotation

, Article Multimedia Tools and Applications ; 2017 , Pages 1-21 ; 13807501 (ISSN) Rad, R ; Jamzad, M ; Sharif University of Technology

Abstract

In automatic image annotation (AIA) different features describe images from different aspects or views. Part of the information embedded in some views is common to all views, while other parts are individual and specific. In this paper, we present the Mvg-NMF approach, a multi-view-group non-negative matrix factorization (NMF) method for an AIA system which considers both common and individual factors. The NMF framework discovers a latent space by decomposing data into a set of non-negative basis vectors and coefficients. The views are divided into homogeneous groups and latent spaces are extracted for each group. After mapping the test images into these spaces, a unified distance matrix is...

Image annotation using multi-view non-negative matrix factorization with different number of basis vectors

, Article Journal of Visual Communication and Image Representation ; Volume 46 , 2017 , Pages 1-12 ; 10473203 (ISSN) Rad, R ; Jamzad, M ; Sharif University of Technology

Academic Press Inc 2017

Abstract

Automatic Image Annotation (AIA) helps image retrieval systems by predicting tags for images. In this paper, we propose an AIA system using the Non-negative Matrix Factorization (NMF) framework. The NMF framework discovers a latent space by factorizing data into a set of non-negative basis vectors and coefficients. To model the images, multiple features are extracted, each of which represents images from a specific view. We use multi-view graph regularization NMF and allow NMF to choose a different number of basis vectors for each view. For tag prediction, each test image is mapped onto the multiple latent spaces. The distances of images in these spaces are used to form a unified distance matrix. The...

A multi-view-group non-negative matrix factorization approach for automatic image annotation

, Article Multimedia Tools and Applications ; Volume 77, Issue 13 , 2018 , Pages 17109-17129 ; 13807501 (ISSN) Rad, R ; Jamzad, M ; Sharif University of Technology

Springer New York LLC 2018

Abstract

In automatic image annotation (AIA) different features describe images from different aspects or views. Part of the information embedded in some views is common to all views, while other parts are individual and specific. In this paper, we present the Mvg-NMF approach, a multi-view-group non-negative matrix factorization (NMF) method for an AIA system which considers both common and individual factors. The NMF framework discovers a latent space by decomposing data into a set of non-negative basis vectors and coefficients. The views are divided into homogeneous groups and latent spaces are extracted for each group. After mapping the test images into these spaces, a unified distance matrix is...

Application of 3D-wavelet statistics to video analysis

, Article Multimedia Tools and Applications ; Volume 65, Issue 3 , 2013 , Pages 441-465 ; 13807501 (ISSN) Omidyeganeh, M ; Ghaemmaghami, S ; Shirmohammadi, S ; Sharif University of Technology

2013

Abstract

Video activity analysis is used in various video applications such as human action recognition, video retrieval, and video archiving. In this paper, we propose to apply 3D wavelet transform statistics to natural video signals and employ the resulting statistical attributes for video modeling and analysis. From the 3D wavelet transform, we investigate the marginal and joint statistics as well as the Mutual Information (MI) estimates. We show that marginal histograms are approximated quite well by Generalized Gaussian Density (GGD) functions, and that the MI between coefficients decreases when the activity level increases in videos. Joint statistics attributes are applied to scene activity grouping,...

Autoregressive video modeling through 2D Wavelet Statistics

, Article Proceedings - 2010 6th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIHMSP 2010, 15 October 2010 through 17 October 2010 ; October , 2010 , Pages 272-275 ; 9780769542225 (ISBN) Omidyeganeh, M ; Ghaemmaghami, S ; Shirmohammadi, S ; Sharif University of Technology

2010

Abstract

We present an Autoregressive (AR) modeling method for video signal analysis based on 2D Wavelet Statistics. The video signal is assumed to be a combination of spatial feature time series that are temporally approximated by the AR model. The AR model yields a linear approximation to the temporal evolution of a stationary stochastic process. Generalized Gaussian Density (GGD) parameters, extracted from 2D wavelet transform subbands, are used as the spatial features. Wavelet transform efficiently resembles the Human Visual System (HVS) characteristics and captures more suitable features, as compared to color histogram features. The AR model describes each spatial feature vector as a linear...
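
As a rough illustration of approximating the temporal evolution of spatial features with an AR model, the sketch below fits a first-order model, x[t] ≈ a · x[t-1], independently to each feature dimension by least squares. The model order, the independence between dimensions, and the names `fit_ar1`, `fit_features` and `predict_next` are all simplifying assumptions. The abstract does not state the order or the estimation procedure used in the paper, and extracting GGD parameters from wavelet subbands is not shown.

```python
def fit_ar1(series):
    """Least-squares AR(1) coefficient a for the model x[t] ~ a * x[t-1]."""
    num = sum(cur * prev for prev, cur in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    if den == 0:
        raise ValueError("series has no energy; the AR(1) fit is undefined")
    return num / den


def fit_features(frames):
    """Fit one AR(1) coefficient per feature dimension.

    frames is a list of per-frame feature vectors of equal length.
    """
    return [fit_ar1(list(dim)) for dim in zip(*frames)]


def predict_next(frames, coeffs):
    """One-step-ahead prediction of the next frame's feature vector."""
    return [a * x for a, x in zip(coeffs, frames[-1])]
```

A higher-order or vector AR model would replace the scalar ratio with the solution of a small linear system, but the structure stays the same: fit coefficients on past feature vectors, then predict the next one as a linear combination.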

Video activity analysis based on 3D wavelet statistical properties

, Article 11th International Conference on Advanced Communication Technology, ICACT 2009, Phoenix Park, 15 February 2009 through 18 February 2009 ; Volume 3 , 2009 , Pages 2054-2058 ; 17389445 (ISSN); 9788955191387 (ISBN) Omidyeganeh, M ; Ghaemmaghami, S ; Khalilain, H ; IEEE Communications Society, IEEE ComSoc; IEEE Region 10 and IEEE Daejeon Section; Korean Institute of Communication Sciences, KICS; IEEK Communications Society, IEEK ComSoc; Korean Institute of Information Scientists and Engineers, KIISE; et al ; Sharif University of Technology

2009

Abstract

A video activity analysis is presented based on the 3D wavelet transform. Marginal and joint statistics as well as mutual information estimates are extracted. Marginal histograms are approximated by Generalized Gaussian Density (GGD) functions. The mutual information between coefficients, as a quantitative estimate of joint statistics, decreases when the activity in the video increases. The relationship between kurtosis graphs, extracted from joint distributions, and video activity is deduced. Results show that the type of activity in the video can be figured out from kurtosis curves. The GGD and the Kullback-Leibler distance (KLD) are used to retrieve and locate 96% of videos properly.
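
The abstract names the GGD and the KLD without giving formulas. The sketch below assumes the common zero-mean parameterisation p(x) = β / (2αΓ(1/β)) · exp(−(|x|/α)^β), with scale α and shape β, and uses the closed-form KLD between two such densities that is widely used in wavelet-based retrieval. The function names are hypothetical, and estimating α and β from wavelet coefficients is not shown.

```python
import math


def ggd_pdf(x, alpha, beta):
    """Zero-mean generalized Gaussian density with scale alpha and shape beta."""
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x) / alpha) ** beta))


def ggd_kld(alpha1, beta1, alpha2, beta2):
    """Closed-form Kullback-Leibler divergence D(p1 || p2) between two GGDs."""
    log_term = math.log(
        (beta1 * alpha2 * math.gamma(1.0 / beta2))
        / (beta2 * alpha1 * math.gamma(1.0 / beta1))
    )
    ratio_term = ((alpha1 / alpha2) ** beta2
                  * math.gamma((beta2 + 1.0) / beta1)
                  / math.gamma(1.0 / beta1))
    return log_term + ratio_term - 1.0 / beta1
```

With β = 2 and α = σ√2 the density reduces to a zero-mean Gaussian with standard deviation σ, which gives a quick sanity check against the familiar Gaussian KLD. For retrieval, a query would typically be compared with each candidate by summing such divergences over subbands and ranking by the total.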

Tensor-based face representation and recognition using multi-linear subspace analysis

, Article 2009 14th International CSI Computer Conference, CSICC 2009, 20 October 2009 through 21 October 2009, Tehran ; 2009 , Pages 658-663 ; 9781424442621 (ISBN) Mohseni, H ; Kasaei, S ; Sharif University of Technology

Abstract

Discriminative subspace analysis is a popular approach for a variety of applications. There is a growing interest in subspace learning techniques for face recognition. Principal component analysis (PCA) and eigenfaces are two important subspace analysis methods that have been widely applied in a variety of areas. However, the excessive dimension of the data space often causes the curse of dimensionality dilemma, expensive computational cost, and sometimes the singularity problem. In this paper, a new supervised discriminative subspace analysis is presented by encoding a face image as a high-order general tensor. As face space can be considered as a nonlinear submanifold embedded in the tensor space, a...

Fuzzy Adaptive Resonance Theory for content-based data retrieval

, Article 2006 Innovations in Information Technology, IIT, Dubai, 19 November 2006 through 21 November 2006 ; 2006 ; 1424406749 (ISBN); 9781424406746 (ISBN) Milani Fard, A ; Akbari, H ; Akbarzadeh-T., M. R ; Sharif University of Technology

2006

Abstract

In this paper we propose a content-based text and image retrieval architecture using Fuzzy Adaptive Resonance Theory neural network. This method is equipped with an unsupervised mechanism for dynamic data clustering to deal with incremental information without metadata such as in web environment. Results show noticeable average precision and recall over search results. © 2006 IEEE
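
The abstract gives no details of the retrieval architecture, so the following is only a minimal sketch of the standard Fuzzy ART clustering step it builds on: complement coding, a choice function, a vigilance test, and a weight update. The class name `FuzzyART` and the parameter names `vigilance`, `choice` and `rate` are assumptions, and inputs are assumed to be feature vectors already scaled to [0, 1].

```python
def complement_code(a):
    """Complement coding: append 1 - a_i for every component of the input."""
    return list(a) + [1.0 - x for x in a]


def fuzzy_and(u, v):
    """Component-wise minimum (the fuzzy AND operator)."""
    return [min(x, y) for x, y in zip(u, v)]


class FuzzyART:
    def __init__(self, vigilance=0.75, choice=0.001, rate=1.0):
        self.vigilance = vigilance  # minimum match ratio to accept a category
        self.choice = choice        # small constant in the choice function
        self.rate = rate            # learning rate; 1.0 means fast learning
        self.weights = []           # one weight vector per category

    def train_one(self, a):
        """Present one input and return the index of the category it joins."""
        i = complement_code(a)
        norm_i = sum(i)
        # Rank existing categories by the choice function |I ^ w| / (choice + |w|).
        order = sorted(
            range(len(self.weights)),
            key=lambda j: -sum(fuzzy_and(i, self.weights[j]))
            / (self.choice + sum(self.weights[j])),
        )
        for j in order:
            match = fuzzy_and(i, self.weights[j])
            # Vigilance test: accept the category only if it matches well enough.
            if sum(match) / norm_i >= self.vigilance:
                self.weights[j] = [
                    self.rate * m + (1.0 - self.rate) * w
                    for m, w in zip(match, self.weights[j])
                ]
                return j
        # No category passed the test: create a new one from the coded input.
        self.weights.append(i)
        return len(self.weights) - 1
```

Because new categories are created whenever no existing one passes the vigilance test, the number of clusters does not have to be fixed in advance, which suits incremental data. A higher vigilance yields more, tighter clusters.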