Search for: visual
Total 298 records

    Bag of Words-based Feature Learning for Image Classification Systems

    M.Sc. Thesis, Sharif University of Technology; Najibi Kohneh Shahri, Mahyar (Author); Rabiee, Hamid Reza (Supervisor)
    Abstract
    Bag of words-based image classification systems have recently achieved state-of-the-art accuracy in image classification. These systems can be decomposed into four separate subsystems, each with its own objective: feature extraction, feature learning and coding, pooling, and classification. The effect of the feature learning stage, in which each extracted feature is represented as a linear combination of several visual words, on the success of the whole system cannot be neglected. The importance of this stage has led several researchers to develop different methods to alleviate the existing issues. Although several methods have been proposed so far, there...

    EEG-based Personalized Interpretable Visual Attention Prediction

    M.Sc. Thesis, Sharif University of Technology; Behnamnia, Armin (Author); Rabiee, Hamid Reza (Supervisor)
    Abstract
    Human visual attention is a mapping that determines which regions of an image a person's eyes focus on most while perceiving it. Personalized visual attention is visual attention computed for a specific individual. The importance of visual attention lies in its wide range of applications in computer vision and cognitive science, such as neural encoding, image captioning, self-driving cars, video anomaly detection, image classification, and visual design. One important aspect of visual attention is personalization, the ability to assign every individual their own specialized attention map. In this project we aim to utilize EEG signals measured from people's brains to predict their...

    Implementation of Virtual Structure Approach in Multiple Spacecraft Formation Flight Using Visual Sensors

    M.Sc. Thesis, Sharif University of Technology; Saberi Tavakkoli, Mohammad (Author); Saghafi, Fariborz (Supervisor)
    Abstract
    This research implements the Virtual Structure method for formation flight of multiple spacecraft. In this method, a virtual solid frame with a virtual center of mass is considered, and agents are arranged in a formation with respect to the virtual center. A formation-keeping control system is implemented in which feedback from the formation to the virtual structure, and vice versa, is considered. The method is first implemented in a non-gradient field and then extended to a circular orbit in order to investigate in-orbit effects. An algorithm for collision avoidance, based on relative distances and relative velocities, was also developed. The algorithm was...

    Processing the Local Field Potential Signals in Comparison to Neighboring Simple and Complex Neurons of Primary Visual Cortex

    M.Sc. Thesis, Sharif University of Technology; Eftekhar, Morteza (Author); Lashgari, Reza (Supervisor)
    Abstract
    In the neural systems of living organisms, beyond differences in the anatomical structure of cells, there are also differences in the physiological functions of analogous cells. Specifying and categorizing neurons based on physiological function is one of the objectives of neuroscience. Applications of categorizing neural cells include the study of cognitive behaviors, the systematic study of the neural system, modeling, and practical use in neural prosthesis design. Neural signals can be studied through the spike rate of a single neuron's activity or the Local Field Potential (LFP) of a finite number of neurons. In previous studies, neurons of the primary visual cortex are divided into two groups of simple and...

    Visual Question Answering

    M.Sc. Thesis, Sharif University of Technology; Salari, Arsalan (Author); Manzuri, Mohammad Taghi (Supervisor)
    Abstract
    Visual Question Answering (VQA) deep-learning systems tend to capture superficial statistical correlations in the training data because of strong language priors, and they fail to generalize to test data with a significantly different question-answer (QA) distribution. To address this issue, we introduce a Visually Directed Question Encoder to replace the commonly used RNNs in base models. Our method uses visual features alongside word embeddings of question words to encode each word. As a result, the model is forced to look at the visual information relevant to each word, and it no longer produces answers based on just the question itself. We evaluate our approach on the VQA generalization task...

    Answering Questions about Image Contents by Deep Networks

    M.Sc. Thesis, Sharif University of Technology; Chavoshian, Mohammad (Author); Soleymani Baghshah, Mahdieh (Supervisor)
    Abstract
    Due to recent advances in learning from multimodal data, people tend to use computer systems to solve more complex problems. One such problem is Visual Question Answering (VQA), where the goal is to find the answer to a question asked about the visual contents of a given image. This is an interdisciplinary problem spanning Computer Vision, Natural Language Processing and Reasoning. Because of the recent achievements of Deep Neural Networks in these areas, recent works have used them to address the VQA task. In this thesis, three different methods are proposed, each of which can be added to existing solutions to the VQA problem to improve their results. The first method tries to...

    Free-Viewpoint Soccer Match Video from Overlapping Cameras

    M.Sc. Thesis, Sharif University of Technology; Zarean, Ali (Author); Kasaei, Shohreh (Supervisor)
    Abstract
    TV broadcasts of soccer matches attract a lot of viewers. As technology advances, broadcasting is also undergoing changes. One of these changes is the use of free-viewpoint video, in which users can interactively choose any view from which to watch the match. In general, the free-viewpoint problem is defined as follows: given sequential frames of a soccer match video from different views as input, we want to create sequential frames of the same video from a different view as output. There are several methods for solving this problem. We can classify these methods into two general approaches: image-based and geometry-based. We will examine these approaches in...

    Fusion of Audio and Visual Occurrences using Fuzzy Logic for Improving Perception Quality of Events

    Ph.D. Dissertation, Sharif University of Technology; Faraji, Mohammad Mahdi (Author); Bagheri Shouraki, Saeed (Supervisor)
    Abstract
    The ability of humans to analyze the environment around them has been an inspiring source for event analysis research. Since human perception of the environment is formed in a multi-modal space, many efforts have been made to fuse information intelligently. In this study, we aim to better understand the environment using information fusion. For this purpose, fuzzy fusion of audio and video signals based on the ink drop spread operator is performed to recognize and track targets in several scenarios of the AV16.3 dataset. We then focus on the fusion of audio data for sound source localization, one of the most important applications in evaluating fusion...

    Video Watermarking and Capacity Analysis: Information Theoretic Approach

    M.Sc. Thesis, Sharif University of Technology; Khalilian, Hanieh (Author); Ghaemmaghami, Shahrokh (Supervisor)
    Abstract
    Data hiding in digital media has been widely investigated over the past decade because it serves several crucial applications. Among digital multimedia signals, video has received special attention, and data hiding in video signals has improved significantly in recent years. This thesis introduces an information-theoretic analysis method for calculating the capacity of data hiding in video signals, one of the most challenging issues in this area. This analysis is expected to establish a reasonable basis for the design and analysis of data hiding algorithms. We study and investigate the data hiding problems that could be specific to video signals and its...

    Visual Simultaneous Localization and Mapping using an RGB-D Camera

    M.Sc. Thesis, Sharif University of Technology; Rashidi, Hossein (Author); Kasaei, Shohreh (Supervisor)
    Abstract
    Simultaneous localization and mapping (SLAM) is the task of detecting a robot's pose in an unknown environment while building a map of that environment from input data captured by the robot's sensors. In visual SLAM, the robot's input data is limited to camera sensors. SLAM is currently one of the main challenges in robotics research. For autonomous action, the robot's pose in the map of the environment is needed. Map production in indoor environments, where no GPS data is available, has been one of the research issues in the robotics community over the last decade. In this thesis, a new and efficient method is proposed for SLAM at the level of objects. The maps produced by state-of-the-art methods don't have a...

    Visual Odometry using RGB-D Cameras

    M.Sc. Thesis, Sharif University of Technology; Mohammadi Kaji, Mahsa (Author); Kasaei, Shohreh (Supervisor)
    Abstract
    Vision-based localization and 3D orientation estimation of a moving camera has long been a vast research area, including robot localization and mapping, virtual reality, and structure from motion. With the introduction of RGB-D cameras in 2010, many sparse methods, which are based on key-point extraction and tracking, moved towards dense methods. Dense methods utilize the RGB-D depth and gray-scale values in the images and define the odometry estimation problem as an image registration optimization, without the need to establish key-point correspondences between images. Although RGB-D cameras impose specific constraints such as limited depth, depth errors and medium resolution, dense methods have shown...

    Experimental Study on Liquid Breakup Process in Slinger Injection

    M.Sc. Thesis, Sharif University of Technology; Rezayat, Sajjad (Author); Farshchi, Mohammad (Supervisor)
    Abstract
    In small turbojet engines, it is important to find a suitable fuel injector with good spray quality. A rotating fuel injection system can potentially provide high atomization quality without a high-pressure fuel pump by using the centrifugal forces of the engine shaft. The spray characteristics of rotary atomization for small gas turbines can be investigated using a high-speed camera. To analyze the breakup process of the liquid column and liquid film, spray visualization tests should be performed under varied test conditions. In this research, an experimental study of the liquid breakup process was performed on the rotary atomizer of the J402 turbojet engine, known as the slinger injector, to...

    Visualization of Flow Pattern and Experimental Investigation of Thermal Performance of Pulsating Heat Pipe with Proposed Fluid

    M.Sc. Thesis, Sharif University of Technology; Gandomkar, Amir Reza (Author); Saeedi, Mohammad Hassan (Supervisor); Shafii, Mohammad Behshad
    Abstract
    A pulsating heat pipe (PHP) is a two-phase device for transferring high heat fluxes and is used extensively for electronics cooling. In this study, the different flow regimes in a PHP with different fluids have been investigated. Three different fluids, namely pure fluids, ferrofluid, and surfactant solution, were used at a 50% filling ratio. For the ferrofluid, 5 different concentrations and 3 types of magnetic field were used in 2 different heat pipes. Results show that the ferrofluid is more stable in the Pyrex-made heat pipe over long periods of time, and that the no-magnet mode has the best thermal performance due to the high conductivity of the fluid. In the copper-made heat pipe...

    Neural Synchrony of Spiking Local Neurons and Local Field Potential Activities in Primary Visual Cortex of Awake Monkeys

    M.Sc. Thesis, Sharif University of Technology; Masoudian, Saeed (Author); Jafari, Mehdi (Supervisor); Rabiee, Hamid Reza (Supervisor); Lashgari, Reza (Co-Supervisor)
    Abstract
    The primary visual cortex plays a paramount role in processing visual information. After primitive processing by the LGN and thalamus, visual information is sent to the primary visual cortex, which is called area V1. Because V1 neurons are sensitive to the edges and contrast of visual stimuli, their behavior is highly selective for certain features of visual stimuli, especially contrast. These neurons respond differently to contrast; in fact, most of them behave nonlinearly with respect to this feature. Normally we expect their response, or spiking rate, to grow linearly as contrast increases, but in many cases they have some...

    Modeling of Visual Attention Mechanism by Brain Signals

    M.Sc. Thesis, Sharif University of Technology; Pahlevan Aghababa, Fatemeh (Author); Beigy, Hamid (Supervisor)
    Abstract
    Attention is a cognitive process in which the mind reacts to certain stimuli in the environment while other environmental stimuli are ignored. Attention may be an overt or a covert process. Overt attention is a process in which, based on a purpose, we selectively choose an object or place among other objects and places to focus on, and we are aware of it. Covert attention, however, originates from a hidden source, and we are not aware of it. In fact, covert attention causes a clear and rapid movement of the eye toward the stimulus or space to be taken into consideration, and once the eye has moved, overt attention has occurred. Visual attention is given...

    A Semantic Valency Lexicon for Persian Predicates and Visualization of their Relations

    M.Sc. Thesis, Sharif University of Technology; Salimifar, Saeedeh (Author); Khosravi Zadeh, Parvaneh (Supervisor); Shojaei, Razieh (Supervisor)
    Abstract
    The highest and most difficult layer of Natural Language Processing is the understanding of meaning. As a result, lexicons and annotated corpora are of the utmost importance in this area. However, the lack of such semantic resources, especially in Abstract Meaning Representation (AMR), is one of the main issues in this field for the Persian language. This work, modeled on PropBank, a semantic valency lexicon for English predicates, is the first step towards building such lexicons for the Persian language with a focus on AMR. Thus, a guideline describing how to annotate Persian predicates is provided, which first evaluates the common structures between the two languages and then focuses on the...

    Designing a SSVEP based Brain-Computer Interface to Control the Keyboard

    M.Sc. Thesis, Sharif University of Technology; Basere, Naser (Author); Shamsollahi, Mohammad Bagher (Supervisor)
    Abstract
    A Brain Computer Interface (BCI) is a communication system between the activity of the human brain and the user's environment, such as a prosthetic hand, a wheelchair, or other controlled devices. In this thesis we introduce a new and easy spelling system using a BCI based on the Steady-State Visual Evoked Potential (SSVEP). The system spells the numbers of a telephone keyboard using four flickers and a clock shape. The user spells numbers by gazing at one of four flickers (Clockwise rotation, Counter Clockwise rotation, Accept and Backspace): Clockwise rotates the clock hand to the right, Counter Clockwise rotates the clock hand to the left, Accept is used...

    Designing a Hybrid Brain Computer Interface System

    M.Sc. Thesis, Sharif University of Technology; Mashayekh Bakhsh, Tara (Author); Shamsollahi, Mohammad Bagher (Supervisor)
    Abstract
    A Brain Computer Interface (BCI) is a communication system between the human brain and a computer or peripheral device which, by recording brain signals directly, sends messages and commands from the human brain to the computer. BCIs are divided into different types according to the brain activity patterns in the EEG. The most important of these patterns is the ERP (Event-Related Potential), which appears in the EEG signal after particular events. A significant ERP pattern is the P300 potential, which occurs when a patient recognizes oddball stimuli. The SSVEP (Steady-State Visual Evoked Potential) is another type of pattern and is the response of the brain to optical stimulation with certain frequencies and a strong...

    Design a Content-Based Color Image Retrieval Using Attention Driven Saliency Map

    M.Sc. Thesis, Sharif University of Technology; Ebrahimi, Davood (Author); Fatemizadeh, Emadeddin (Supervisor)
    Abstract
    Content Based Image Retrieval (CBIR) is in effect an image search engine that operates on image content. The aim of this thesis was to use human visual attention to detect the objects in an image. First, a saliency image of the most important things in the image is created; after an initial separation, the other features (details) in the image are used for the final recognition. Visual attention models and saliency maps have for some time been widely considered in designing interfaces between humans and machines, but they do not have a satisfying history in the design of CBIR systems. In this thesis I have...

    Design and Implementation of a Face Model in Video-realistic Speech Animation for Farsi Language

    M.Sc. Thesis, Sharif University of Technology; Ghasemi Naraghi, Zeinab (Author); Jamzad, Mansour (Supervisor)
    Abstract
    With the increasing use of computers in everyday life, improved communication between machines and humans is needed. To communicate properly and to understand a human face rendered in a graphical environment, audio and visual projects such as lip reading, audio and visual speech recognition, and lip modelling need to be implemented. The main goal of this project is the natural representation of sequences of lip movements for the Farsi language. The lack of a complete audio and visual database for this application in Farsi led us to build a new, complete Farsi database for this project, called SFAVD. It is a unique audio and visual database which covers the most applicable words, all...