
Digital Armenian

(Coordinator: Marat Yavrumyan) Virtual (digital) reality and “intelligent technologies” that use elements of artificial intelligence are an integral part of our lives today. Under these conditions, the priority is to ensure that Armenian functions as a language capable of serving virtual (digital) reality, the digital economy, digital public services, digital culture and everyday life, and of creating digital reality itself. Despite individual projects, Armenian today does not keep pace with current trends in language technology. As a result, the Armenian language and the cultural heritage created in it are poorly equipped to meet modern challenges, with all the ensuing consequences. To close the gap of past years and create the necessary preconditions for short-term development, the following four directions of action are proposed:
  1. Infrastructure building and development
In the long run, the goal is a vibrant digital language ecosystem that can identify trends and directions of development while responding to challenges adequately and quickly. To that end, the functions of research, development and implementation of state language policy need to be combined. This can be done either by creating a joint department or by adopting a state action plan (for example, the Estonian model could be localized, based on the priority given to digitalization of the economy).
  2. System-building projects
The digital transformation of the economy, public services, education and state systems presupposes new, much larger scales and volumes of digital data and information circulating in Armenian. Tools for processing data of this magnitude are now offered by the field of “natural language processing” (NLP), which builds on research in mathematics and computer science since the 1960s. The proposed programs aim to create machine tools with artificial intelligence elements that serve Armenian data and content, making it possible to turn Armenian from a printed language into a digital language.
  • “Armenian Treebank” project, which integrates Armenian into the global systems for machine processing of languages that use elements of artificial intelligence (Universal Dependencies, Stanford NLP, spaCy, etc.);
  • “Ngram” national system based on the digital program of the National Library. The system makes it possible to visualize the reality reflected in texts through the chronology, frequency and context of word-pair usage, among other measures;
  • Open Text-to-Speech (TTS) and Speech-to-Text (STT) systems (e.g., on Mozilla Foundation technology platforms);
  • Dictionaries of both modern and professional Armenian vocabulary in a digital wiki environment.
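As a rough illustration of the kind of statistic an “Ngram” system could chart, the following minimal Python sketch counts word pairs (bigrams) per publication year and reports their relative frequency. The sample corpus, the function names and the whitespace tokenization are assumptions made for illustration only; they do not describe the National Library's actual data or software.

```python
from collections import Counter, defaultdict


def bigrams(text):
    """Split text on whitespace and return adjacent word pairs."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def bigram_counts_by_year(corpus):
    """Map each year to a Counter of the bigrams found in that year's texts."""
    counts = defaultdict(Counter)
    for year, text in corpus:
        counts[year].update(bigrams(text))
    return counts


def relative_frequency(counts, pair):
    """Return {year: share of that year's bigrams equal to `pair`}."""
    result = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        result[year] = counts[year][pair] / total if total else 0.0
    return result


# Hypothetical two-document corpus: (publication year, text).
sample_corpus = [
    (1990, "հայոց լեզու և հայոց պատմություն"),
    (2020, "թվային հայոց լեզու և թվային տնտեսություն"),
]

counts = bigram_counts_by_year(sample_corpus)
freq = relative_frequency(counts, ("հայոց", "լեզու"))
for year, share in freq.items():
    print(year, share)
```

A real system would need proper Armenian tokenization and normalization (punctuation, inflected forms) before counting; plain whitespace splitting is used here only to keep the sketch short.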
  3. State (interstate) language policy
In the medium term, the infrastructure created will make it possible to carry out activities within the EEU: introducing a trilingual (Armenian, Russian, English) system of professional terminology together with the necessary digital tools (machine translation for professional and narrow fields), in other words, assuming certain functions of state language policy for the EEU member states (by localizing the experience of the Estonian Institute).
  4. Educational resources
In the medium term, the human resources gathered around the above-mentioned programs will make it possible to introduce modern educational modules (“Computational Linguistics”, “Natural Language Processing”, “Digital Tools in the Humanities”, etc.) at different Armenian universities, including those in the provinces (marzes) of Armenia.