Pronunciation conversion, which derives a word's pronunciation from its written form, plays an important role in speech technology. This paper presents the algorithm, implementation, and experimental results of a rule-based tool for converting text to pronunciation. Unlike earlier comparable work, the tool can distinguish the pronunciations of words that are spelled identically but pronounced differently (homographs), and it can retrieve the pronunciations of irregular words and foreign words from a pronunciation dictionary. In an experiment counting homographs among words widely used in Mongolian, we found 92 homographs among a total of 75 thousand words.
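The abstract does not give the tool's rules or data structures, so the following is only a minimal sketch of the general idea: check a homograph table, then an exception dictionary, then fall back to rewrite rules. All entries, the phoneme symbols, the `tag` argument used to pick a homograph reading, and the function names are hypothetical placeholders, not real Mongolian rules and not the authors' implementation.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy data: placeholder graphemes and phoneme symbols only.
LETTER_RULES: Dict[str, str] = {"sh": "S", "a": "a", "b": "b", "t": "t", "o": "o"}
EXCEPTIONS: Dict[str, str] = {"taxi": "t a k s i"}  # irregular or foreign words
HOMOGRAPHS: Dict[Tuple[str, str], str] = {
    ("bat", "NOUN"): "b a t",
    ("bat", "VERB"): "b a: t",
}


def apply_rules(word: str) -> str:
    """Rewrite graphemes to phoneme symbols with greedy longest match."""
    phones: List[str] = []
    max_len = max(len(g) for g in LETTER_RULES)
    i = 0
    while i < len(word):
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in LETTER_RULES:
                phones.append(LETTER_RULES[chunk])
                i += size
                break
        else:
            # No rule matched at this position.
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return " ".join(phones)


def to_pronunciation(word: str, tag: Optional[str] = None) -> str:
    """Homograph table first, then exception dictionary, then rules."""
    key = word.lower()
    if tag is not None and (key, tag) in HOMOGRAPHS:
        return HOMOGRAPHS[(key, tag)]
    if key in EXCEPTIONS:
        return EXCEPTIONS[key]
    return apply_rules(key)
```

Putting the dictionary lookups ahead of the rules keeps irregular entries from being overridden by a general rule; how the actual tool decides which homograph reading applies is not stated in the abstract.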
This paper presents pioneering work on building a Named Entity Recognition (NER) system for the Mongolian language. While state-of-the-art NER methods have produced results close to human performance for well-studied languages, the approaches that work for them typically fare much worse when applied directly to languages such as Mongolian, which has an agglutinative morphology and a subject-object-verb word order. Our work searches for the best-performing feature set among a wide range of features. We also applied various existing machine learning methods and searched for an optimal ensemble of classifiers using a genetic algorithm. The classifiers used different feature representations. The resulting system constitutes the first usable software package for Mongolian NER, and our experimental evaluation will also serve as a much-needed basis of comparison for further research.
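The abstract does not describe the genetic algorithm's encoding or operators. As a rough illustration of the general technique, the sketch below evolves a bit mask over a pool of classifiers and scores each mask by the accuracy of a majority vote on held-out predictions. The function names, the truncation selection, the one-point crossover, and all parameter values are assumptions made for this example.

```python
import random
from typing import List, Sequence, Tuple


def vote_accuracy(mask: Sequence[bool],
                  preds: Sequence[Sequence[str]],
                  gold: Sequence[str]) -> float:
    """Accuracy of a majority vote over the classifiers selected by mask."""
    chosen = [p for keep, p in zip(mask, preds) if keep]
    if not chosen:
        return 0.0  # an empty ensemble predicts nothing
    correct = 0
    for i, label in enumerate(gold):
        votes = [p[i] for p in chosen]
        # Ties go to the tied label that appears first among the votes.
        winner = max(votes, key=votes.count)
        correct += winner == label
    return correct / len(gold)


def select_ensemble(preds: Sequence[Sequence[str]],
                    gold: Sequence[str],
                    pop_size: int = 20,
                    generations: int = 30,
                    mutation_rate: float = 0.1,
                    seed: int = 0) -> Tuple[List[bool], float]:
    """Search for a classifier subset; needs >= 2 classifiers, pop_size >= 4."""
    rng = random.Random(seed)
    n = len(preds)

    def score(mask: Sequence[bool]) -> float:
        return vote_accuracy(mask, preds, gold)

    population = [[rng.random() < 0.5 for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # survivors are kept (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]           # bit-flip mutation
            children.append(child)
        population = parents + children
    best = max(population, key=score)
    return best, score(best)
```

In practice `preds` would hold each classifier's labels on a validation set that is separate from the final test set, since the mask is tuned to whatever data it is scored on.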
This paper describes the development of a finite-state morphological transducer for Mongolian and discusses the linguistic issues in Mongolian morphology that were encountered and how they were dealt with. The work includes all the morphophonological rules needed for Mongolian nominals and verbs. Nominal morphotactics are implemented completely, and verbal morphotactics cover one level of continuation lexica. An evaluation based on the analysis of two separate corpora shows high-level and medium-level coverage, respectively. The transducer is more elaborate and accurate than previous implementations of its kind.
Efficient analysis of traffic flow data plays an important role in achieving better transportation services. The aim of this work is to discover passengers' travel patterns from incomplete transport access data. Our proposed big data analytical model, which predicts the endpoints of regular travel, gives a significantly improved representation of live traffic behavior. We investigated nearly 38.3k patterns in three months of data comprising 35M recorded boarding actions.
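The abstract does not specify how trip endpoints are predicted. One common heuristic for boarding-only records is trip chaining, in which the destination of a trip is taken to be the stop where the same card boards next, and the last trip of the day is assumed to return to the first stop. The sketch below illustrates that heuristic only; the record layout, function names, and the `min_count` threshold are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical record layout: (card_id, day, minute_of_day, boarding_stop)
Boarding = Tuple[str, str, int, str]
Trip = Tuple[str, str, str]  # (card_id, origin_stop, inferred_destination)


def infer_trips(boardings: List[Boarding]) -> List[Trip]:
    """Infer destinations by chaining each card's boardings within a day."""
    by_card_day: Dict[Tuple[str, str], List[Tuple[int, str]]] = defaultdict(list)
    for card, day, minute, stop in boardings:
        by_card_day[(card, day)].append((minute, stop))

    trips: List[Trip] = []
    for (card, _day), events in by_card_day.items():
        events.sort()  # chronological order within the day
        stops = [stop for _, stop in events]
        if len(stops) < 2:
            continue  # a single boarding gives no hint about the endpoint
        for i, origin in enumerate(stops):
            # Next boarding stop; the last trip wraps around to the first stop.
            dest = stops[(i + 1) % len(stops)]
            if dest != origin:
                trips.append((card, origin, dest))
    return trips


def regular_patterns(trips: List[Trip], min_count: int = 2) -> Dict[Trip, int]:
    """Keep origin-destination pairs a card repeats at least min_count times."""
    counts = Counter(trips)
    return {trip: c for trip, c in counts.items() if c >= min_count}
```

For example, a card that boards at stop A in the morning and at stop B in the evening on two different days yields the repeated pairs A to B and B to A, while a card with a single boarding contributes nothing.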