Process Of Machine Translation

Given their importance, demand for multilingual parallel resources is increasing, particularly for those that include under-resourced languages. However, building a balanced mix of multilingual texts in sufficient quantity and with high translation quality is becoming an ever more central problem. This bottleneck becomes especially prohibitive when further processing, such as sentence alignment or Part-of-Speech (PoS) tagging, is involved. Despite the difficulty of building such corpora, they are very valuable for many applications in the field of Natural Language Processing (NLP), given that progress in most of these applications is driven by the available data (Tiedemann 2007). Statistical Machine Translation (SMT) is one of…
Hu (2016) explains in more detail how parallel or comparable corpora can be used in translation teaching, focusing on establishing a corpus-based mode of translation teaching and on the use of corpora in compiling translation textbooks. PoS tagging and its natural successor, parsing, are basic tasks in NLP as well as in corpus linguistics. Because different ambiguity patterns are likely to occur in different places across languages, combining information from many languages creates a clearer picture of each (Naseem et al. 2014). When a parallel corpus is available, cross-lingual PoS tagging can be used to assess the effectiveness of cross-linguistic projection of morphological features onto an under-specified target language (Sylak-Glassman et al. 2015). With the help of parallel treebanks, syntactic annotation may also achieve a notable impact (Xing et al. 2016); one example is the syntactic annotation applied using the parallel Prague Czech-English Dependency Treebank (Bojar et al. 2012; Hajic et al. 2012). A parallel corpus is more valuable for some NLP tasks if it is aligned at the level of sentences as well as words. For example, in Word Sense Disambiguation, word senses can be derived from word alignments on a parallel corpus instead of from a pre-defined monolingual sense inventory such as WordNet (Lefever et al. 2011). Yet, exploiting the text