Technological advancements are often viewed as threats to the vitality of Indigenous languages.
But researchers at the University of the Immaculate Conception (UIC) in Davao City, together with the Department of Science and Technology-Philippine Council for Industry, Energy, and Emerging Technology Research and Development (DOST-PCIEERD), are harnessing emerging technologies to preserve and promote Indigenous languages in Mindanao.
The establishment of the Mindanao Natural Language Processing Research and Development Laboratory at UIC-Bangkerohan Campus in Davao City marks a significant step toward this goal, leveraging artificial intelligence to develop tools that support language preservation while driving technological innovation.
During its recent inauguration, DOST-PCIEERD Executive Director Enrico Paringit said the laboratory represents a significant contribution of science and technology to ongoing efforts to document and preserve Indigenous languages in the country, especially as these languages are gradually disappearing due to the decline in active speakers.
“Several languages now in the Philippines are endangered, so this means that they are at risk of being extinct, and we need to find a drastic way to preserve them. So one way to address that is to have these documented and keep them alive through teaching and developing tools like what this laboratory is doing,” he said.
DOST-PCIEERD supported the laboratory through a P5-million grant under its Institutional Development Program.
The laboratory focuses on Natural Language Processing (NLP), the intersection of artificial intelligence and human language, to facilitate communication between humans and machines. According to MinNa LProc project leader Kirstine Mae Adlaon, the team has used NLP to develop technologies such as machine translation and dialogue systems for Indigenous languages.
The machine translation system allows seamless translation of phrases or terms from Indigenous languages to other languages and vice versa, while the dialogue systems, in the form of chatbots, enable users to request assistance in their respective Indigenous languages.
“Ang ginagawa namin dito sa laboratory, we make technologies, apps, like chatbot, machine translation systems na particularly for Mindanaoan languages and we make it sure that these technologies, like chatbots, machine translation, ay magamit talaga sya,” she shared.
(What we do here in the laboratory is we make technologies and apps, like chatbots and machine translation systems that are particularly for Mindanaoan languages, and we make sure that these technologies can really be used.)
Although NLP has been widely explored, Adlaon emphasized that the laboratory's focus on Indigenous languages makes its work especially relevant, noting that it contributes to the preservation and revitalization of these languages through the collection, digitization, storage and sharing of language resources.
At present, the laboratory is processing three Indigenous languages in the Davao region: Manobo, Mansaka and Kalagan. However, it plans to expand its scope to cover all six endangered languages in Mindanao, namely Subanen, Butuanon, Kinamiging Manobo, Bagobo-Klata, Kagan and Kalagan.
“Ang goal ko (my goal) is for Mindanao; we have identified six endangered languages here in Mindanao; that’s the goal that’s the next target,” she shared.
While DOST-PCIEERD has been instrumental in providing the necessary computing equipment, Adlaon emphasized that the team's coordination with the National Commission on Indigenous Peoples XI (NCIP XI) paved the way for the project's realization, especially in obtaining language data, which is among the project's essential elements.
She said NCIP XI assisted them in identifying Indigenous Peoples communities for their target Indigenous languages and facilitated their immersion within these communities from data collection to output validation.
“Every time we go to the community, nandiyan si NCIP na sumasama talaga sa amin (NCIP is always there with us). They’re the ones communicating to the tribe,” she said.
Meanwhile, Adlaon said that the Indigenous communities they visited welcomed them and supported the project by cooperating in the collection process.
For instance, Datu Edris Mamukid, Overall Datu of the Indigenous Cultural Communities of Kagan in Banaybanay, Davao Oriental, was delighted with the project, citing its vital role in preserving the Kagan language, which is listed among the endangered languages in Mindanao.
“With the people like you who make the extra mile to engage and immerse in our unique language, we are confident that despite the fast-changing environment, our language, our way of life as IPs are preserved and respected,” he said during the laboratory’s inauguration.
He is optimistic that the project will allow the public to learn more about not only the beauty of the Kagan language but also the culture and identity of the Kagan people.
While the project has been successful, Adlaon emphasized that they will continue their work, particularly on building the dataset for the Indigenous languages, noting that this is crucial for ensuring the reliability of the tools they have developed.
“Nasa thousands pa lang yung dataset na na-gather namin, unlike large language models like chatGPT models and Google, nasa billion parameter yung ginagamit nila. But at least we are happy that we started it, simula pa lang ito, we have to do a lot of things pa,” she shared.
(The dataset we’ve gathered so far is in the thousands, unlike large language models like ChatGPT and Google, which use billions of parameters. However, we are happy that we’ve started. This is just the beginning, and there’s still much more to be done.)
Also, she said they will begin working on other endangered Indigenous languages in Mindanao in partnership with other organizations, such as WikiClub Zamboanga for the Subanen language.
Adlaon also believes that the laboratory's establishment will open new opportunities for research and development that leverage science and technology to preserve and promote Indigenous languages.
She said they are open to supporting and collaborating with other researchers who are working on, or wish to work on, the same project or any related undertakings. (ASO, PIA XI)