DOST broaches ‘ISIP Program,’ ‘Project Marayum’ for natural language research

Written by: Abram Josh Marcelo

Edited by: Elijah Mejilla


DOST and AI-related institutions discuss Large Language Models for Philippine Languages. Photo from DOST-PCIEERD.


Published: September 18, 2024
Updated: September 18, 2024

In August 2024, the DOST Philippine Council for Industry, Energy, and Emerging Technology Research and Development (PCIEERD) held a multi-disciplinary meeting tackling various projects to lay out a roadmap for natural language processing (NLP) and large language model development focused on Philippine languages.

The forum highlighted the Interdisciplinary Signal Processing for Pinoys: Software Applications for Education (ISIP: SAFE) Program, which developed reading and writing tutors for children and captioning systems for Philippine languages. Researchers from the Electrical and Electronics Engineering Institute of the University of the Philippines Diliman (UPD) and the College of Computer Studies of De La Salle University (DLSU-CCS) implemented ISIP: SAFE.

Speakers also drew attention to Project Marayum, a community-built and curated online dictionary for languages in the country, developed by UPD graduates. FilWordNet by DLSU-CCS also gained recognition. It is a language resource that tracks word sense changes on the internet using network science and NLP.

DOST PCIEERD Executive Director Enrico C. Paringit emphasized the significance of these efforts to the country’s cultural heritage.

“This is an opportunity for us to maximize the use of technologies available to us in preserving and propagating our different languages. Through this roadmap, we can identify research gaps and possible solutions in terms of technology, human resource, and policy,” said Paringit.

Also mentioned at the meeting was the DOST-funded Mindanao Natural Language Processing Research and Development Laboratory, which aims to preserve endangered languages from Mindanao using NLP.

Projects built collaboratively by researchers from other universities and industry groups were also given time at the forum.

“We thank our researchers and partners from the academe, government, and industry in making solid efforts in providing solutions and opening opportunities for the Filipino people to utilize science, technology, and innovation in this initiative,” DOST Secretary Renato U. Solidum, Jr. added.

Representatives from the Komisyon sa Wikang Filipino, AI Singapore, and academic institutions such as the Ateneo Social Computing Science Laboratory and Ateneo Center for Computing Competency and Research attended the event and endorsed the planned research roadmap.

References:

DOST-PCIEERD. (2024). DOST leads charge for natural language research roadmap. https://pcieerd.dost.gov.ph/articles/latest-articles/583-dost-leads-charge-for-natural-language-research-roadmap