This is the Linguist's Assistant.

As a grammatical analyst at All The Word, I work with this semi-automatic document authoring machine every day.

Let me show you what that looks like!

My Role

Grammatical Analyst

Time

5 months, ongoing

Team

Mettasari Njoto Sandjaja (the MTT)

Product

A first draft Indonesian Bible

In my current phase, I start with a source sentence and a target language translation.

I write rules that get the software to generate a translation matching the one I receive from a Mother Tongue Translator (MTT).

It may seem redundant to generate a translation of a sentence that has already been translated, but once enough of these are done, further bodies of text can be generated much faster and with impressive accuracy!

Let's look at an example of how this works.

The source sentence that we are translating from is not exactly English, as you can see here.

This sentence is supposed to mean "They gave books to each other." in English, but it is represented here as three noun phrases and one verb phrase in a main clause. Each phrase also contains a noun or a verb in its own brackets.

Every element here, whether clause, phrase, or word, has a set of values assigned to it.

For instance, our sentence's verb, 'to give', is marked as having reciprocal reflexivity.

This means that the 'subject' and the 'indirect object' here are each both the 'agent' and the 'destination' of the marked verb: both are giving and receiving books.

Languages like English would turn the grammatical destination noun into a reciprocal pronoun, like 'each other'. In the translation we were given by our MTT, this does not seem to be the case for Indonesian.

Instead, the word 'saling' appears between 'mereka' and 'memberi'. From our previously generated sentences, we already know that 'mereka' means 'they' and 'memberi' means 'to give'.

From what we have seen so far, Indonesian is an SVO (Subject, Verb, Object) language like English, so we have little reason to believe that 'saling' is a pronoun that appears before the verb.

That means it is most likely a reciprocal adjective or adverb. We then ask our MTT and find out that 'saling' is more closely associated with the noun, meaning it should be a reciprocal adjective. So we write a rule that deletes what would become the reciprocal pronoun and inserts a reciprocal adjective with the surface form 'saling'.
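To give a feel for what such a rule does, here is a minimal sketch in Python. It is not how the Linguist's Assistant stores or applies rules. The class names, the role labels ('agent', 'verb', 'patient', 'destination'), and the word 'buku' for 'books' are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    category: str                      # e.g. "noun", "verb", "adjective"
    surface: str                       # target-language form
    features: dict = field(default_factory=dict)


@dataclass
class Phrase:
    role: str                          # hypothetical labels, not the tool's own
    words: list


def apply_reciprocal_rule(clause):
    """Drop the destination phrase and mark the agent with 'saling'."""
    verb = next(p for p in clause if p.role == "verb").words[0]
    if verb.features.get("reflexivity") != "reciprocal":
        return clause                  # rule does not apply, clause unchanged
    result = []
    for phrase in clause:
        if phrase.role == "destination":
            continue                   # this would have become the reciprocal pronoun
        if phrase.role == "agent":
            marker = Word("adjective", "saling", {"reciprocal": True})
            phrase = Phrase(phrase.role, phrase.words + [marker])
        result.append(phrase)
    return result


def render(clause):
    """Join the surface forms in constituent order."""
    return " ".join(w.surface for p in clause for w in p.words)


clause = [
    Phrase("agent", [Word("noun", "mereka")]),
    Phrase("verb", [Word("verb", "memberi", {"reflexivity": "reciprocal"})]),
    Phrase("patient", [Word("noun", "buku")]),          # 'buku' is assumed here
    Phrase("destination", [Word("noun", "mereka")]),
]
print(render(apply_reciprocal_rule(clause)))
```

In this sketch the constituents are already listed in SVO order, so the rule only has to delete one phrase and add one word.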

We click 'generate', and the sentence matches! The red confirmation text is always cause for a sigh of relief.

Of course, the rule we wrote is not the only thing going on behind the scenes. For example, the program only knew where to put our reciprocal adjective because of a phrase structure rule that orders our constituents. 'Memberi' was generated from the root 'beri': the prefix 'mem-' was added because the verb is transitive and the clause is in the active voice.

Now our newly made rule will make future sentences easier to generate, and we keep working on more sentences until the program can generate larger texts all on its own!
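The step from 'beri' to 'memberi' can be sketched in a few lines of Python. This is a simplified illustration of the general pattern for the Indonesian active prefix, not the rule set from our project. The function name and its parameters are hypothetical, and it ignores exceptions such as one-syllable roots, loanwords, and roots that begin with a digraph.

```python
def active_transitive_form(root, transitive, voice):
    """Attach the active prefix to a root (simplified sketch)."""
    if not (transitive and voice == "active"):
        return root                    # other forms are out of scope here
    first = root[0]
    if first in "bfv":
        return "mem" + root            # beri -> memberi
    if first == "p":
        return "mem" + root[1:]        # initial p is dropped
    if first in "dcjz":
        return "men" + root
    if first == "t":
        return "men" + root[1:]        # initial t is dropped
    if first in "ghaeiou":
        return "meng" + root
    if first == "k":
        return "meng" + root[1:]       # initial k is dropped
    if first == "s":
        return "meny" + root[1:]       # initial s is dropped
    return "me" + root                 # l, r, m, n, w, y


print(active_transitive_form("beri", transitive=True, voice="active"))
```

The point is only that the form of the prefix depends on the first sound of the root, which is why the program needs more than one rule to produce a single word.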

Overall, the Indonesian grammar has been a thoroughly awesome project to work on. On the right, you can see me meeting with my MTT and supervisor. I have learned so much from them, and I hope I'll keep learning more.

Thanks for reading!