Maori-led AI initiative can revolutionize language preservation, says Peter-Lucas Jones

Peter-Lucas Jones is the CEO of Te Hiku Media.
  • Peter-Lucas Jones, a Time100 AI 2024 List honoree, discussed AI's role in preserving the Te Reo Maori language at the World Economic Forum in Davos.
  • His team at Te Hiku Media developed Te Reo Maori natural language processing tools and an automatic speech recognition model.

Davos, Switzerland: Peter-Lucas Jones, a Kaitaia native named to the prestigious Time100 AI 2024 List, was among the contributors to the discussions at the World Economic Forum in Davos last week. Jones leads the kaimahi team that created a suite of Te Reo Maori NLP tools and its own automatic speech recognition model for the indigenous language. He is the CEO of Te Hiku Media, a tribal radio station based in New Zealand. In an exclusive interview with TRENDS, he spoke about AI’s role in preserving the Te Reo language and other ramifications of the technology in today’s digital world.

How did you come up with the idea of using AI as a tool to preserve the language?

When we look at the digital transition and its impact on the archiving and preservation of Māori knowledge that’s been documented since the colonization of Aotearoa, New Zealand, we saw an opportunity to provide new and meaningful ways to our tribes to access traditional knowledge.

In Te Hiku Media we have an exceptional archive of film and audio items, thousands of them. These recordings have been conducted over the last 30 years, and they are the voices of elders, chiefs, and members of our tribes, those who are knowledgeable and knowledge keepers about all the different aspects of our culture, including but not limited to our native language. We’ve focused on interviewing our people about our biodiversity, which is a really important topic for the members of our tribes because our language is the home of all our nomenclature, our narratives about the mountains, about fresh water, about how it flows to the ocean, how it is our blood that pumps oxygen into every living thing.

We personify it, we name it. We believe it is part of us, and we are part of it. And part of our fiduciary responsibility as indigenous people is to protect it.

The way we embarked on this journey of teaching computers how to speak Māori was because we wanted to transcribe all of the information that we had captured and recorded over 30 years. The reason we wanted to do that was because, due to language discrimination, systemic bias, and colonization in Aotearoa, our language is starting to diminish among families and tribes. We saw this as a way to enable pathways of learning, learning language, and culture as a way of uplifting identity, as a way of reconnecting people with their genealogy. We saw it as a way of reconnecting people with a belief system and a philosophical worldview that is steeped in hundreds and sometimes thousands of years of observations, systematic measurements, and understandings that are contained within the traditional wisdom that our elders have shared with us. But when we started to transcribe that information, we quickly realized that there was no way we were going to get through 30 years of archives.

We decided to create a language platform. We decided to teach computers how to speak Māori to speed up the process. So originally, it was about enhancing the way that we did our work. It was about increasing productivity. It was about delivering on the expectations that our people had for us. We had been ordained by the members of our tribe, by our elders, to digitize our archives. But in digitizing those archives, we also realized that there are different ways of consuming information. Some people like to listen and hear, but some like to read. And those are different types of language skills. There are active skills, which are, of course, speaking and writing, and there are passive skills which are listening and reading. We wanted to provide access to material that people could read. So, on a very small budget but with an exceptional will that we could find a way, we created technology that performs at a 92% accuracy rate, a world’s first for an indigenous language. It outperforms the attempts by major technology companies throughout the world to document indigenous languages. People ask how we did it. But the important question is why we did it.

Because it represents an expression of indigenous self-determination. What would the very people and corporations that represent systemic policies and systemic regimes that were responsible for beating native languages out of generations of people, do to our language? What was it about our language that was so important? We realized that our language is our culture. Our culture is the home of all the information about the biodiversity in our forests and our oceans. It is our understanding of the environment. Language springs from the landscape, the life, and the environment that it survives in. We have survived in the Pacific for thousands of years. We learned that traditional wisdom about the climate and environment has commercial value. Why should we let international corporations mine our traditional knowledge for commercial value when we have seen the reality of the digital divide? And when we think about the age of intelligence, it’s not going to be a divide. It’s going to be a process that sees some people advance and some left behind. We decided that we do not want to be left behind, and we would like to lead initiatives and will lead initiatives that support the revitalization of our language and ways to transmit that language intergenerationally through digital means. Our people no longer live the way that we used to.

I grew up in an intergenerational home. I had access to my grandmother’s wisdom. That wisdom is something money can’t buy because it’s lived experience. When we think about our knowledge, and that we are guardians of that knowledge, we are reminded that the very people that saw the confiscation of our lands saw the Christianization of our people and enabled our people to become alienated from our resources, our language, and our culture.

Why on earth would we support people like that to sell the language back to us as a service through some big tech company? We recognize that as a risk, but that risk, too, is an opportunity. Because today’s generation understands that past injustices were not right. And when I come to a place like the World Economic Forum and hear minds share ideas, we realize that we are not that different. We are all living together on this planet. Through developing technology and, of course, data licensing to provide predictions, so that history does not repeat itself and we do not see our resources and intellectual property exploited just as our natural resources and land were taken away from us, we still see an opportunity to collaborate. We still see an opportunity to cooperate. And that’s what brought me here, to share the story of data governance, to share the story of quality, to share the story about how important it is to have tools and agents that we audit, not every year but every day and every moment. We audit the quality, we audit the precision, we audit the accuracy, and through that, we learn how to improve.

But through that, we also learn how to share. We’ve shared so much. We’ve come so far in what we have developed in terms of natural language processing tools for Te reo Māori, speech-to-text, and text-to-speech. Text-to-speech has been an important feature in the new development of indigenous language technology. It operates exceptionally well, and it can scale. What we’ve developed for Te reo Māori can be scaled for other indigenous languages.

How did you engage with AI for Good?

Papa Reo is the name of our platform. I have been in conversations about AI for Good, which is something that is talked about in different industries. Our industry is very much about corpus gathering, collating data, and curating data. But what sets us apart is that we are a tribal broadcasting institution. We are a monument to those courageous people who fought for our rights and interests to be recognized by the New Zealand government. We have a treaty, the Treaty of Waitangi, between the Māori people and the Crown, the British who came to New Zealand and colonized it. The treaty guarantees indigenous peoples’ rights and interests over our taonga, amongst other things. A taonga is a treasure, something very, very important and sacred; it has meaning, and it has guardians. When we think about our role and language, we are the guardians of the language, which is the last bastion, the cornerstone of a native society.

What is special about the native languages in New Zealand? 

There is one language, the Māori language, but there are many tribal variations and cultures.

Different tribal groups have different words of preference, idiomatic expressions, colloquialisms, and their own way of describing their unique relationship with their environment. We share an ontology. For example, our way of categorizing the world is unique, but it is related to our sister languages in the Pacific like Hawaiian, Tahitian, Kukai, and the language of the Marquesas Islands.

What we’ve achieved for Te Reo Māori opens up a plethora of information and traditional wisdom. Think about how that can be applied, not only in the discovery of new medicines that are perhaps focused on things that come from the sea floor, but also to our forests, the insects that are around us, and the fish that we know, and about how we’ve used those resources for many, many generations.

When we think about AI for good, we understand what harm looks like because we’ve experienced it. Our language has been harmed through the colonization process. We’re looking at artificial intelligence and our development of language technology as a way to reverse harm. The philosophy of reversing harm is not new, but how we apply our philosophy is based on our values and our principles. For example, we develop data licenses with companies that would like to work with us. We provide access to our APIs to those companies that may want to develop new and meaningful experiences through using our language technology. Our data license significantly objects to (certain things). We do not allow our language technology to be used to surveil our people, to further discriminate against our people, or to mine Māori data or indigenous data, and we will not allow our language technology to be used to further diminish our ability to rise economically in a world that we are all part of.

Instead, we want to find meaningful ways to grow capability, to grow capacity, to grow skills for new jobs, to grow talent for new jobs that are going to be seen over the next decade in this age that is fast becoming known as the age of intelligence. When we think about the digital divide and about the global north and the global south, we cannot forget about the plight of indigenous people and the unique contribution that we have to make in saving this planet. Our understanding of the ocean, our understanding of biodiversity, and our comprehensive attention to the environment around us were not just important in the past; they are important now. Together, we can do more. We have achieved so much. And today, when you think about the World Economic Forum and we hear about the contributions of many people and how we can work together in the future, I believe we have come too far not to go further.

What are some of the positive moments you’re taking back with you that are going to frame 2025?

Whilst my view on things is very much driven by my values and my principles, I’m mindful that we are in an age that is going to require more computers. More computers are going to impact the parameters of models, big, large, small, and little. Whether we like it or not, there will be a strain on natural resources to create the energy that is going to power the data centers that are required for AI to improve, and to amplify the need to become more productive in a world that is very much focused on and driven by commercial objectives. I am reminded that there are perhaps ways of accessing traditional knowledge: solar, the ocean. How do we sustainably harness more natural resources so that we start to contribute to conversations that are focused on renewable energy rather than just burning coal? The reality is that cooling costs money. Cooling centers that will house computers are going to cost money. That is one of the things that I’m taking into the conversations that I’m having this year, because we need to remember that this is not a romantic activity.

AI for Good can be a wonderful experience and something that can be used to help us solve world problems. But the first problem we need to solve is how we continue to survive in a world that forgets that energy costs money. Money is either going to be the thing that drives this conversation or the only thing. We need to balance the conversation, particularly around renewable energy and the opportunity there to develop shared solutions. For example, we often hear about data centers being built in certain places. And in those places, we might think, ‘Gosh! Is there a possibility to create, alongside this data center, a renewable energy facility that can also service the people who live in and around that data center?’ How do we start to address community needs and commercial needs at the same time? Why does the solution only need to be commercially driven?

When we think about what a shared network of smaller data centers looks like in places like the Pacific, we could address electricity needs for those places with people who still do not have secure access to energy for their households and schools.

When I think about AI for Good, I’m not talking about tunnel vision. I’m talking about opening our eyes up and looking right around. But let’s not look alone. Let’s look together. (Edited by Hilal Mir)