Short Term Traffic State Forecasting for Travel Time Estimation

M.Sc. Thesis, Sharif University of Technology. Badrestani, Ebrahim (Author); Beigy, Hamid (Supervisor)

Abstract

Real-time travel time estimation is a major requirement in many transportation-related systems. One of the main challenges is to estimate the traffic speed and then forecast it over a short horizon. A valuable data source for this task is the instantaneous location of moving cars, captured using the global positioning system (GPS) and sent over the internet in an online manner. The main problem is that the resulting traffic data is severely sparse and also contains a lot of noise. Previous research on this type of data is mostly based on matrix or tensor factorization. In this work it is shown that, despite the large fraction of missing values, it is possible to use neural networks for this problem with some...

Expert Finding in Community Question Answering

M.Sc. Thesis, Sharif University of Technology. Miri, Mohammad (Author); Beigy, Hamid (Supervisor)

Abstract

Community question answering systems are systems in which users can express their needs by asking questions. Moreover, they can share their own knowledge with others by answering their questions. The wide spread of such communities and the growth in questions and answers have created some challenges. One of these challenges is finding appropriate users who can answer questions. For instance, a user might ask a question and have to wait some time to receive another user's response. On the other hand, users who have expertise in some fields have to spend a lot of time searching for related questions. Therefore, expert finding systems are used to meet these needs. The main issue is...

Estimating Protein-Protein Interaction Network Similarity through Sampling

M.Sc. Thesis, Sharif University of Technology. Naseri, Shervin (Author); Beigy, Hamid (Supervisor)

Abstract

In examining protein-protein interaction networks, we often encounter similar and repetitive patterns. Examination of these patterns, which often appear in the form of motifs and similar structures, reveals important information such as the type of protein linkage and many of the internal similarities between these networks. The ability to recognize these similarities plays an important role in identifying the function of genes, recognizing the relationships between diseases, and making drugs. The subgraph isomorphism problem is NP-hard, so exact algorithms are time-consuming and infeasible on large networks; therefore, in practice, approximate and heuristic algorithms are used and...

Hybrid Generative Models of Social Networks

M.Sc. Thesis, Sharif University of Technology. Mahdavi, Hamed (Author); Beigy, Hamid (Supervisor)

Abstract

With the advent of graph neural networks, a new class of models has emerged that has a high ability to learn powerful representations. There have also been popular probabilistic latent variable models for representation learning and solving graph problems. Graph neural networks do not necessarily provide meaningful representations in their hidden layers and also do not have the ability to estimate uncertainty. Learning probabilistic models is usually a slow process, and there is no specific way to add general features to these models. Therefore, combinations of neural network models and probabilistic network models have recently been developed that can partially answer these...

Information and Influence Diffusion in Social Network

Ph.D. Dissertation, Sharif University of Technology. Sepehr, Arman (Author); Beigy, Hamid (Supervisor)

Abstract

People use social networks to share millions of stories every day, but these stories rarely become viral. Can we estimate the probability that a story becomes a viral cascade? If so, can we find a set of users that are more likely to trigger viral cascades? Many factors influence message virality. In this thesis, we investigate the effect of graph structure, diffusion pattern, and message text on the virality measure. Finally, we address both source localization and inference of the COVID-19 network using the proposed methods. First, we investigate probability estimation and maximization of cascade virality. In this section, we develop an efficient viral cascade...

Predicting Novelty Concepts in Data Streams

M.Sc. Thesis, Sharif University of Technology. Soudani, Heydar (Author); Beigy, Hamid (Supervisor)

Abstract

Many real-world environment challenges are not considered in laboratory-controlled models. Although different and powerful models have been developed for object detection and classification in diverse applications, many fail in the real world. One of the most important challenges is dealing with unknown data at inference time. The second challenge is the change in the characteristics of the data distribution over time, known as concept drift. These two important challenges are explored in the data stream environment, along with many of the events that a model may face in the real world. To address the challenges of learning in a data stream environment, this thesis first designs a...

Fraud Detection in Financial Transactions

M.Sc. Thesis, Sharif University of Technology. Haghighat, Mohammad (Author); Beigy, Hamid (Supervisor)

Abstract

With the development of electronic payment infrastructures and the resulting increase in payment transactions, abuse of these infrastructures and fraudulent activity have increased. The problem of “Fraud Detection in Financial Transactions” is to find these illegal or abnormal transactions among many legitimate ones. The goal of this thesis is to provide a method for fraud detection in financial transactions using representation learning. Many approaches are used for fraud detection, including classic data mining algorithms and deep-learning-based methods, which are compared in this thesis. We also cover diverse feature engineering and representation learning ideas for improving...

Image Anomaly Detection based on Deep Learning

M.Sc. Thesis, Sharif University of Technology. Lagzian, Arash (Author); Beigy, Hamid (Supervisor)

Abstract

The detection of unusual events, called anomalies, is very important in various fields such as industry, medicine, art, and agriculture. Its applications include food quality detection, inconsistency detection in work environments, disease detection in medical images, detection of counterfeit artworks, and detection of unhealthy agricultural products. By detecting anomalies, the damage they cause can be reduced. There are many challenges when detecting anomalies in images, including: the input image is not rich enough to learn a suitable representation, there are not enough samples to train the model, the high...

Expert Recommendation in Community Question Answering

M.Sc. Thesis, Sharif University of Technology. Esmaeili, Elyas (Author); Beigy, Hamid (Supervisor)

Abstract

Expert finding is an important task in community question answering (CQA) websites, enabling the routing of new questions towards users who have the highest level of expertise in the relevant topic. This routing helps question askers receive satisfactory responses in a shorter time and makes it easier for answerers to find questions they are interested in and have enough expertise to answer. The primary goal in expert finding is to learn the representation of questions and expert candidates based on the history of answered questions. Many existing approaches generate a unique representation for users without considering the specific question asked. Additionally, many of these approaches...

Concept Drift Handling in Data Stream using Domain Adaptation Approach

Ph.D. Dissertation, Sharif University of Technology. Karimian, Mahmood (Author); Beigy, Hamid (Supervisor)

Abstract

The escalating volume of data generated across diverse platforms underscores the necessity for robust methodologies in data stream classification. Predicting data streams becomes particularly challenging amidst evolving concepts, processing time constraints, and memory limitations. Concept drift, characterized by shifts in data distribution over time, significantly impacts prediction accuracy. This dissertation delves into data stream prediction and implicit concept drift management through a domain adaptation approach. To address these challenges, we examine two distinct scenarios. Firstly, we investigate data stream prediction problems wherein multiple sources contribute to the stream,...

Improving Density Peaks Clustering Algorithm

M.Sc. Thesis, Sharif University of Technology. Masumi, Mostafa (Author); Beigy, Hamid (Supervisor)

Abstract

Clustering algorithms, as an unsupervised learning method, are widely used in fields like bioinformatics, natural language processing, image processing, and data mining. The Density Peaks clustering algorithm, introduced in 2014, is a notable density-based method. Its main advantage is the ability to detect clusters of any shape efficiently. The algorithm identifies cluster centers as points with high density compared to their neighbors and that are distant from other centers. However, its performance can degrade when clusters have multiple density peaks and it is sensitive to input parameters. In this research, three new algorithms based on the idea of local peaks were introduced. The first...
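
The center-selection idea described in this abstract can be illustrated with a minimal sketch. It follows the commonly cited 2014 formulation of density peaks (a local density and a distance to the nearest denser point), not the three algorithms proposed in the thesis. The function names, the cutoff `d_c`, the `rho * delta` ranking, and the toy points are all assumptions made for illustration.

```python
import math

def density_peaks(points, d_c):
    """Return (rho, delta) for each point.

    rho[i]   = number of other points closer than the cutoff d_c (local density)
    delta[i] = distance to the nearest point of strictly higher density;
               if no denser point exists, the largest distance to any point
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(denser) if denser else max(dist[i]))
    return rho, delta

def pick_centers(rho, delta, k):
    # Assumed heuristic: rank points by rho * delta and keep the top k.
    order = sorted(range(len(rho)), key=lambda i: rho[i] * delta[i], reverse=True)
    return sorted(order[:k])

# Hypothetical toy data: two small square clusters, each with a central point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (5.05, 5.05)]
rho, delta = density_peaks(points, d_c=0.08)
centers = pick_centers(rho, delta, k=2)  # indices of the two central points
```

With this toy data the two central points have the highest density and are far from each other, so they are picked as centers. The dependence of the result on `d_c` shows the parameter sensitivity that the abstract mentions.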

Cost-Sensitive Classifiers and Their Applications

M.Sc. Thesis, Sharif University of Technology. Ahmadi, Zahra (Author); Beigy, Hamid (Supervisor)

Abstract

Decisions often have different effects and outcomes of unequal importance. Most classifiers try to minimize the rate of misclassified instances. These classifiers assume equal costs for different misclassification types. However, this assumption does not hold in many real-world problems, where different misclassification types have different costs. These differences can be taken into account by introducing cost into the learning process. In this way, the total cost of misclassification becomes the evaluation metric for classification. To apply this metric, new learning algorithms are needed. Cost-sensitive learning is the area of machine learning which deals with...
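
The difference between error rate and total misclassification cost can be made concrete with a small sketch. The cost matrix, labels, and function names below are hypothetical and are not taken from the thesis; the second function shows the standard minimum-expected-cost decision rule as one possible way to introduce cost into prediction.

```python
def total_cost(y_true, y_pred, cost):
    # cost[t][p] = cost of predicting class p when the true class is t
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))

def min_expected_cost_class(probs, cost):
    # probs[t] = estimated probability that the true class is t.
    # Choose the prediction p that minimizes sum_t probs[t] * cost[t][p].
    n = len(cost)
    expected = [sum(probs[t] * cost[t][p] for t in range(n)) for p in range(n)]
    return min(range(n), key=expected.__getitem__)

# Hypothetical costs: missing a positive (true 1, predicted 0) costs 10,
# a false alarm (true 0, predicted 1) costs 1, correct predictions cost 0.
cost = [[0, 1],
        [10, 0]]

y_true = [0, 0, 1, 1]
pred_a = [0, 0, 0, 1]  # one error (a missed positive)
pred_b = [1, 1, 1, 1]  # two errors (both false alarms)

cost_a = total_cost(y_true, pred_a, cost)
cost_b = total_cost(y_true, pred_b, cost)
choice = min_expected_cost_class([0.8, 0.2], cost)
```

Here `pred_a` makes fewer errors than `pred_b` but incurs a higher total cost, and the expected-cost rule predicts class 1 even though class 0 is more probable.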

Data Stream Classification in Presence of Concept Drift Using Ensemble Learning

M.Sc. Thesis, Sharif University of Technology. Sobhani, Parinaz (Author); Beigy, Hamid (Supervisor)

Abstract

Traditional classification techniques in machine learning assume that data have stationary distributions. This assumption no longer holds for recent problems in which tremendous amounts of data are generated at unprecedented rates with evolving patterns. Classification of data streams has become an important area of machine learning as the number of applications facing these challenges increases. Examples of such data stream applications include text streams, surveillance video streams, credit card fraud detection, market basket analysis, information filtering, and computer security. An appropriate method for such problems should adapt to drifting concepts by revising and refining the...

Using Transductive Learning Classification in Bioinformatics

M.Sc. Thesis, Sharif University of Technology. Tajari, Hossein (Author); Beigy, Hamid (Supervisor)

Abstract

Classification is one of the most important problems in machine learning. Reliable and successful classification is essential for diagnosing patients for further treatment. In many applications, such as bioinformatics, unlabeled data is abundant and readily available, whereas labeled data is much more difficult and expensive to obtain. This dissertation presents a novel transductive approach for the development of robust microarray data classification. The transduction problem is to estimate the value of the classification function at the given points in the working set. This contrasts with the standard inductive learning problem of estimating the classification function at all possible values and...

Concept Drift Detection in Data Streams Using Ensemble Classifiers

M.Sc. Thesis, Sharif University of Technology. Dehghan, Mahdie (Author); Beigy, Hamid (Supervisor)

Abstract

Concept drift is a challenging problem in data stream processing. As a result of the increasing number of data stream applications, including network intrusion detection, weather forecasting, and detection of unconventional behavior in financial transactions, numerous studies have been conducted in the field of concept drift detection. To solve the problem of concept drift detection, an ideal method should be able to identify a variety of changes quickly and correctly and adapt quickly to new concepts, under limited memory and processing power. In this thesis, a new explicit concept drift detection method based on ensemble classifiers is proposed for data...

Call Admission Control Schemes in WiMAX Networks

M.Sc. Thesis, Sharif University of Technology. Mokhtari, Zeinab (Author); Beigy, Hamid (Supervisor)

Abstract

The rapid growth of broadband wireless access (BWA) has increased the demand for new applications such as VoIP, video conferencing, and online gaming, each of which has different quality of service requirements. Because of the limited bandwidth available in these networks, one of the most important issues is how effectively bandwidth is managed in order to support requests. Quality of service is an important indicator of effective bandwidth management. Call admission control mechanisms are a commonly accepted way to balance quality of service against resource utilization in cellular mobile networks. In fact, ...

Multi-Label Text Classification

M.Sc. Thesis, Sharif University of Technology. Kamali, Sajjad (Author); Beigy, Hamid (Supervisor)

Abstract

Nowadays, with the increasing size of data, it is impossible for humans to collect and classify data quickly, and interest in automated classification and data analysis is growing. Data classification is a process in which training data, along with their class labels, are given to a learning agent, which learns the relation between the instances and the labels and then predicts the labels of unseen data. In this thesis we study the classification of multi-label data. Multi-label data have more than one label; in other words, each instance appears with a vector of labels. In this thesis, a method based on nearest neighbors is proposed to classify the multi-label...

An Active Learning Algorithm for Spam Filtering

M.Sc. Thesis, Sharif University of Technology. Shadloo, Maryam (Author); Beigy, Hamid (Supervisor)

Abstract

The content-based spam filtering problem is defined as classifying incoming emails into spam and legitimate emails, so it is considered an application of supervised learning. Supervised learning methods often require a large training set of labelled emails to attain good accuracy, so users have to label a huge number of emails. In reality, it is not reasonable to expect users to do this. To address this issue and reduce the number of labelling requests made to the user, active learning techniques can be used. The goal of active learning algorithms is to achieve appropriate accuracy using less labelled data than supervised learning methods. In this thesis two active learning...

An Outlier Detection and Cleaning Algorithm in Classification Applications

M.Sc. Thesis, Sharif University of Technology. Kasaeian, Mojtaba (Author); Beigy, Hamid (Supervisor)

Abstract

The growth of information in the real world calls for special tools for data storage, cleaning, and processing. Data cleaning is an important step in machine learning applications and includes various procedures such as duplicate detection, filling in missing values, and outlier detection. An outlier is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism. Much research has been carried out in the machine learning field on outlier detection, which has real-world applications such as intrusion detection for network security, fraud detection in credit cards, fault detection for security in critical...

Multi-class Semi-supervised Classification of Data Streams

M.Sc. Thesis, Sharif University of Technology. Sepehr, Arman (Author); Beigy, Hamid (Supervisor)

Abstract

Recent advances in storage and processing have made it possible to gather information automatically, which in turn leads to a fast and continuous flow of data. Data produced and stored in this way are called data streams. They have many applications, such as processing financial transactions, the data recorded by various sensors, or the data collected by web services. Data streams are produced at high speed, in large volumes, and with much dynamism, and they have some unique properties which make them applicable to the precise modeling of many real data mining applications. The main challenge of data streams is the occurrence of concept drift, which can be of four types: sudden, gradual, incremental or...