Introducing an Approach to Build a Phrase Structure Treebank Using Persian Dependency Treebank
, M.Sc. Thesis Sharif University of Technology ; Bahrani, Mohammad (Supervisor) ; Eslami, Moharram (Co-Advisor)
Abstract
Treebanks are useful in many applications of natural language processing, such as machine translation, speech recognition, and information extraction. They are also used in theoretical linguistics to study languages. For instance, they are valuable for developing syntactic theories, calculating the frequency of syntactic rules, and evaluating and comparing statistical models.
Most treebanks are based on either phrase structure grammar or dependency grammar. In a phrase structure treebank, a sentence is divided into phrases, each composed of several words. In a dependency treebank, by contrast, the connection between two words is based on the...
Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science (M.Sc.) in Computer Engineering, Artificial Intelligence
, M.Sc. Thesis Sharif University of Technology ; Sameti, Hossein (Supervisor)
Abstract
Punctuation marks are an important part of text in every language, and omitting them makes the text ambiguous. The output of an automatic speech recognition (ASR) system is typically a raw sequence of words containing no punctuation marks. This makes the text difficult or even impossible for humans to make sense of, and it hampers any further text processing tasks. The goal of this thesis is to perform automatic punctuation insertion in Persian texts lacking punctuation marks. To the best of our knowledge, this is the first work on this task for the Persian language. For this purpose, we first assembled a state-of-the-art corpus to train and...
Lexical Bundles and Generic Moves in the Discussion Sections of Research Articles by L1 and L2 Writers in Applied Linguistics: Adopting a Bundle-Move Approach
, M.Sc. Thesis Sharif University of Technology ; Jahangard, Ali (Supervisor) ; Hassanzadeh, Mohammad (Supervisor)
Abstract
English research articles (henceforth RAs) have evolved into a vital medium for conveying and disseminating scientific knowledge. Mastering textual and linguistic conventions of academic prose assists English native and non-native authors in writing scientific RAs. Lexical bundles (LBs) refer to word combinations that co-occur frequently in a particular register (Biber, Johansson, Leech, Conrad & Finegan, 1999). In fact, they are an integral component of “fluent linguistic production” (Hyland, 2018, p. 4). For more than a century, LBs have attracted researchers' attention in corpus-driven studies, although the extent to which LBs vary between English native and non-native authors remains...