Single-Cell RNA-seq Dropout Imputation and Noise Reduction by Machine Learning

Moinfar, Amir Ali | 2019

  1. Type of Document: M.Sc. Thesis
  2. Language: Farsi
  3. Document No: 52816 (19)
  4. University: Sharif University of Technology
  5. Department: Computer Engineering
  6. Advisor(s): Soleymani Baghshah, Mahdih; Sharifi Zarchi, Ali; Goodarzi, Hani
  7. Abstract: Single-cell RNA sequencing (scRNA-seq) technologies have enabled the study of gene expression at single-cell resolution. These technologies are based on barcoding single cells and sequencing their transcriptomes with next-generation sequencing. Single-cell resolution is especially important when the target population is complex or heterogeneous, which is the case for most biological samples, including tissue samples and tumor biopsies.

     Single-cell technologies suffer from high levels of noise and from missing values, generally known as dropouts. These artifacts can affect key downstream analyses such as differential expression analysis, reconstruction of cell trajectories, and clustering. Some methods exist to impute the dropouts and reduce the noise present in the data. However, we believe that these methods can be improved with new algorithms that are better suited to the problem. In addition, the justification and benchmarking of previous methods are limited, and a unified set of criteria for systematically evaluating the methods is missing.

     Thus, alongside our method, we introduce a benchmarking framework for the systematic evaluation of single-cell dropout imputation and noise reduction methods.

     Here, we introduce an attention-based deep neural network to impute the missing values and reduce the noise of scRNA-seq experiments. The key advantage of our method is its structure, which benefits from embedding genes in a low-dimensional space. Embedding genes and cells simultaneously with the imputation task improves performance and makes our model more interpretable. Besides imputation, we also address the estimation of per-cell library sizes. Using the introduced benchmarking framework, we have benchmarked previous methods and compared them with our proposed method. The results suggest that the proposed methods work well, especially in experiments with unique molecular identifiers (UMIs).
  8. Keywords: Machine Learning; Noise Reduction; Benchmarking; Error Correction; RNA Sequencing; Single Cell Sequencing; Gene Expression Data; Dropout Imputation
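
The abstract describes benchmarking imputation methods on data with dropouts and estimating per-cell library sizes. One common way to evaluate such methods, sketched below, is to hide known nonzero counts, impute them, and measure the error on the hidden entries only. This is a minimal illustration and not the thesis's framework or model: every function name here is hypothetical, the gene-mean imputer is a deliberately simple stand-in for any method under test, and the toy matrix is made up for demonstration.

```python
import math
import random


def library_sizes(counts):
    """Total count per cell; `counts` is a list of per-cell gene-count lists."""
    return [sum(cell) for cell in counts]


def mask_entries(counts, rate, rng):
    """Zero out each nonzero entry with probability `rate` to mimic dropouts.

    Returns the corrupted copy and the list of masked (cell, gene) positions.
    """
    corrupted = [list(cell) for cell in counts]
    masked = []
    for i, cell in enumerate(counts):
        for j, value in enumerate(cell):
            if value > 0 and rng.random() < rate:
                corrupted[i][j] = 0
                masked.append((i, j))
    return corrupted, masked


def gene_mean_impute(corrupted):
    """Stand-in baseline (assumption, not the thesis model):
    replace each zero with the gene's mean over cells where it is nonzero."""
    n_genes = len(corrupted[0])
    means = []
    for j in range(n_genes):
        nonzero = [cell[j] for cell in corrupted if cell[j] > 0]
        means.append(sum(nonzero) / len(nonzero) if nonzero else 0.0)
    return [[v if v > 0 else means[j] for j, v in enumerate(cell)]
            for cell in corrupted]


def masked_rmse(truth, estimate, masked):
    """Root-mean-square error restricted to the artificially masked entries."""
    if not masked:
        return 0.0
    squared = sum((truth[i][j] - estimate[i][j]) ** 2 for i, j in masked)
    return math.sqrt(squared / len(masked))


if __name__ == "__main__":
    # Toy count matrix: 4 cells x 3 genes (made-up values).
    truth = [[5, 0, 2], [4, 1, 3], [6, 0, 2], [5, 2, 4]]
    rng = random.Random(0)  # fixed seed so the run is repeatable
    corrupted, masked = mask_entries(truth, rate=0.3, rng=rng)
    imputed = gene_mean_impute(corrupted)
    print("library sizes:", library_sizes(truth))
    print("masked positions:", masked)
    print("RMSE on masked entries:", masked_rmse(truth, imputed, masked))
```

Scoring only the masked positions keeps true biological zeros out of the error, which is why the mask is applied to nonzero entries alone. Any other imputer can be substituted for `gene_mean_impute` as long as it returns a matrix of the same shape.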
