Setrans: a machine translator from English to Sinhala

dc.contributor.author: Herath, H. M. P. U.
dc.contributor.author: Junaideen, M. Z.
dc.contributor.author: Elkaduwe, D.
dc.date.accessioned: 2024-09-30T11:39:06Z
dc.date.available: 2024-09-30T11:39:06Z
dc.date.issued: 2013-07-04
dc.description.abstract: We present a syntax-based language model for the natural language processing needs of the Sinhala language. Specifically, we present an English-to-Sinhala translation scheme that uses the Stanford parser, free and open-source software, to parse the English sentence. Sinhala has not yet been properly analyzed, nor has data been gathered for a statistical approach, whereas English has been studied extensively and databases of part-of-speech-tagged information are readily available. We therefore chose to use existing software to parse English sentences and a rule-based approach to generate the Sinhala translation. The first phase of the translation scheme parses the English sentence with the Stanford parser, which uses statistical methods to tag parts of speech and to generate the most probable parse tree. The generated tree is then traversed level by level, and at each level the part-of-speech sequence is matched against a database to find a possible reordering for that sequence. The reordered tree is then traversed to replace the English words with their most probable Sinhala translations. Each word is translated individually, or a whole phrase is translated directly into a Sinhala phrase or word. Unlike in statistical translation, these phrases are limited to the syntactic phrases recognized by conventional natural language grammars. The bilingual dictionary used for translation consists of a set of words, their part-of-speech tags, and the Sinhala word or phrase that translates each of them. It may also contain a set of English phrases, each with the Sinhala word or phrase that gives the correct translation. Ambiguity remains an unresolved problem in this research. In natural languages, the same word is used with different meanings; this is called the ambiguity of the language, and it is problematic for machine translation.
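The pipeline described in the abstract (level-by-level reordering of the parse tree, followed by dictionary lookup that prefers whole phrases) can be illustrated with a minimal Python sketch. Nothing below is taken from the Setrans implementation: the `Node` class, the `REORDER_RULES` table, the two dictionaries and their romanized Sinhala entries are all hypothetical placeholders, and the parse tree is built by hand rather than produced by the Stanford parser.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Parse-tree node: a phrase/POS label, child nodes, and a word on leaves."""
    label: str
    children: list = field(default_factory=list)
    word: Optional[str] = None


# Hypothetical rule database: child tag sequence -> new order of the children.
REORDER_RULES = {
    ("VBZ", "NP"): (1, 0),  # verb-object becomes object-verb
    ("IN", "NP"): (1, 0),   # preposition becomes postposition
}

# Hypothetical dictionary entries (romanized placeholders, not vetted translations).
WORD_DICT = {("she", "PRP"): "eya", ("reads", "VBZ"): "kiyawanawa"}
PHRASE_DICT = {("the", "book"): "potha"}


def reorder(root: Node) -> Node:
    """Visit the tree level by level and reorder children that match a rule."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        tags = tuple(child.label for child in node.children)
        order = REORDER_RULES.get(tags)
        if order is not None:
            node.children = [node.children[i] for i in order]
        queue.extend(node.children)
    return root


def leaves(node: Node) -> list:
    """Lower-cased words under a node, in current tree order."""
    if node.word is not None:
        return [node.word.lower()]
    return [w for child in node.children for w in leaves(child)]


def translate(node: Node) -> list:
    """Prefer a phrase-level entry; otherwise translate word by word."""
    phrase = PHRASE_DICT.get(tuple(leaves(node)))
    if phrase is not None:
        return [phrase]
    if node.word is not None:
        # Unknown words are passed through unchanged.
        return [WORD_DICT.get((node.word.lower(), node.label), node.word)]
    return [w for child in node.children for w in translate(child)]


# Hand-built parse of "She reads the book" (stands in for parser output).
tree = Node("S", [
    Node("NP", [Node("PRP", word="She")]),
    Node("VP", [
        Node("VBZ", word="reads"),
        Node("NP", [Node("DT", word="the"), Node("NN", word="book")]),
    ]),
])

result = " ".join(translate(reorder(tree)))
print(result)
```

Keying the word dictionary on a (word, part-of-speech tag) pair mirrors the dictionary layout described in the abstract. It separates senses that differ in part of speech, but it does not address the ambiguity between senses that share a tag.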
dc.identifier.citation: Peradeniya University Research Sessions PURSE - 2012, Book of Abstracts, University of Peradeniya, Sri Lanka, Vol. 17, July 4, 2012, p. 149
dc.identifier.isbn: 9789555891646
dc.identifier.issn: 13914111
dc.identifier.uri: https://ir.lib.pdn.ac.lk/handle/20.500.14444/1333
dc.language.iso: en
dc.publisher: The University of Peradeniya
dc.subject: Computer engineering
dc.subject: Engineering
dc.subject: Translator
dc.subject: Setrans
dc.title: Setrans: a machine translator from English to Sinhala
dc.type: Article
Files
Name: H.M.P.U.Herath.pdf
Size: 201.41 KB
Format: Adobe Portable Document Format