In an earlier post, we described how we dynamically generate all the texts used in the POSTHCARD simulation. This gives us flexibility when creating scenarios, because the text simply adjusts to the context in which it is generated. If you haven’t read that post yet, I suggest starting there, as it introduces the concept of dynamic text generation in more detail.

In this blog post, I’ll tell you a bit more about the templating language we’re developing so that texts can be reused. The final system will be able to generate text in English, French and Dutch.

ExpReal

The software component that converts our templates into the appropriate text is called ExpReal (for Expressive Realiser). A simple version of ExpReal was already built for another project, but we’ve extended it to make it suitable for our simulation. It takes a large list of templates, searches for the most relevant one based on conditions, populates that template and finally sends the dynamic parts to SimpleNLG, which makes sure the texts are grammatically correct. This procedure is the same for all three languages (but with different grammar rules, of course).
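To make the selection step a bit more concrete, here is a minimal Python sketch. It is not ExpReal’s actual code: the Template class, the select_template function, the condition keys and the ‘Good morning’ template are all made up for illustration, and I’m assuming that ‘most relevant’ means ‘the applicable template with the most conditions’.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """A template text plus the conditions under which it may be used (hypothetical)."""
    text: str
    conditions: dict = field(default_factory=dict)


def select_template(templates, context):
    """Return the most specific template whose conditions all hold in `context`."""
    applicable = [
        t for t in templates
        if all(context.get(key) == value for key, value in t.conditions.items())
    ]
    if not applicable:
        return None
    # Assumption: the template that satisfies the most conditions is the most relevant.
    return max(applicable, key=lambda t: len(t.conditions))


# Hypothetical template list; the condition names are invented for this example.
TEMPLATES = [
    Template("Hi, {object: $listener}!", {"intent": "greeting"}),
    Template(
        "Good morning, {object: $listener}!",
        {"intent": "greeting", "time_of_day": "morning"},
    ),
    Template("Please, go $task.", {"intent": "request"}),
]

chosen = select_template(TEMPLATES, {"intent": "greeting", "time_of_day": "morning"})
print(chosen.text)
```

In this sketch, a generic greeting and a morning greeting both apply in the morning, and the more specific one wins. Populating the chosen template and handing the dynamic parts to SimpleNLG would be separate steps.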

Templates

One of the challenging aspects of ExpReal is making it support multiple languages while keeping the template format the same for all of them. At the same time, we want the templates to be easy for humans to read, write and understand. Let me explain how:

By definition, text templates contain pieces of text that can be replaced by something else (variables). In ExpReal’s templating language, there are three types of such variables: grammatical blocks, entities and tasks/arguments.

‘Grammatical blocks’ are pieces of text in the template that require some form of extra grammatical processing, such as verbs that need to be inflected or nouns that need to be replaced with a pronoun (‘him’, ‘her’, etc.). These blocks are enclosed in curly brackets {}, for example: {subject: Julia}. Every grammatical block contains its grammatical role as a key (‘subject’, ‘object’, ‘verb’, etc.) along with a value. This value has a specific format, which we won’t describe in detail yet, but here’s a taste of it (with square brackets [] denoting optional values):

{role: [determiner] [premodifiers] mainNoun [< ownerNoun]}

For example:

{subject: the big red jacket < Julia}

This results in: Julia’s big red jacket

When some modifiers belong after the main noun, as in French, vertical bars | mark off the main noun:

{role: [determiner] [premodifiers] | mainNoun | [postmodifiers] [< ownerNoun]}

For example, for ‘la grande veste rouge de Julia’:

{subject: la grande | veste | rouge < Julia}
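To show how such a block could be taken apart, here is a small Python sketch of a parser for the two formats above. It is only an illustration under a few assumptions: the GrammaticalBlock class, the parse_block function and the hard-coded determiner list are hypothetical, and the real format has more features than this sketch handles.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Assumption: a tiny hard-coded determiner list, just enough for this illustration.
DETERMINERS = {"the", "a", "an", "le", "la", "les", "un", "une", "de", "het", "een"}

# Matches "{role: value}" and captures the role and the value.
BLOCK_PATTERN = re.compile(r"\{\s*(\w+)\s*:\s*([^{}]+?)\s*\}")


@dataclass
class GrammaticalBlock:
    role: str
    main_noun: str
    determiner: Optional[str] = None
    premodifiers: list = field(default_factory=list)
    postmodifiers: list = field(default_factory=list)
    owner: Optional[str] = None


def parse_block(block):
    """Split one grammatical block into its parts."""
    match = BLOCK_PATTERN.fullmatch(block.strip())
    if match is None:
        raise ValueError(f"not a grammatical block: {block!r}")
    role, value = match.group(1), match.group(2)

    # Everything after "<" names the owner of the main noun.
    value, _, owner = value.partition("<")
    owner = owner.strip() or None

    if "|" in value:
        # Explicit form: words before | main noun | words after.
        before, main_noun, after = (part.strip() for part in value.split("|"))
        leading = before.split()
        postmodifiers = after.split()
    else:
        # Short form: the last word is the main noun.
        *leading, main_noun = value.split()
        postmodifiers = []

    determiner = None
    if leading and leading[0].lower() in DETERMINERS:
        determiner, leading = leading[0], leading[1:]
    return GrammaticalBlock(role, main_noun, determiner, leading, postmodifiers, owner)


print(parse_block("{subject: the big red jacket < Julia}"))
print(parse_block("{subject: la grande | veste | rouge < Julia}"))
```

The parsed parts are language-independent, which is the point of keeping one template format: turning them into ‘Julia’s big red jacket’ or ‘la grande veste rouge de Julia’ is left to the grammar rules of each language.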

The second type of variable is the entity. Entities are denoted by a percent sign % prefix and are used for people and similar entities. They can be used inside grammatical blocks. For example, if we use {subject: %julia}, ExpReal will look up how it has to refer to the entity ‘julia’. This can be a simple capitalised form (‘Julia’), but, depending on who is speaking, it could also be e.g. ‘mom’.

There are two special entity variables: $speaker and $listener. These indicate that the variable should be replaced by a reference to whoever is speaking (or listening, respectively) in the current context. This can be considered an extra step that runs before the % entities are resolved. For example, “Hi, {object: $listener}!” can result in “Hi, Julia!” or “Hi, Marion!” depending on who is listening. Note that these two variables have a different prefix: $. This indicates that they need an extra preprocessing step, just like our last category of variables: tasks and arguments.
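The two-step lookup can be sketched in a few lines of Python. Again, this is a simplified illustration and not ExpReal’s real code: the REFERENCES table and the functions refer_to and resolve_entities are hypothetical, and I’m assuming for the example that Marion is the one who calls Julia ‘mom’.

```python
import re

# Hypothetical lookup table: (speaker, entity) -> how that speaker refers to the entity.
# Assumption for this example: Marion calls Julia "mom".
REFERENCES = {("marion", "julia"): "mom"}


def refer_to(entity, speaker):
    """Return how `speaker` refers to `entity`; default to the capitalised name."""
    return REFERENCES.get((speaker, entity), entity.capitalize())


def resolve_entities(template, speaker, listener):
    """Resolve $speaker/$listener first, then every %entity."""
    # Step 1: rewrite $speaker and $listener into ordinary %entity variables.
    roles = {"speaker": speaker, "listener": listener}
    template = re.sub(
        r"\$(speaker|listener)\b", lambda m: "%" + roles[m.group(1)], template
    )
    # Step 2: replace each %entity with a referring expression.
    return re.sub(r"%(\w+)", lambda m: refer_to(m.group(1), speaker), template)


print(resolve_entities("Hi, {object: $listener}!", speaker="marion", listener="julia"))
print(resolve_entities("Hi, {object: $listener}!", speaker="julia", listener="marion"))
```

Note that the result still contains the grammatical block: in this sketch, entity resolution only fills in the value, and the grammatical processing of the block happens afterwards.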

Tasks and arguments are also prefixed with a dollar sign: “Please, go $task.” Tasks can be any action a character in the simulation can perform (such as ‘walk towards something’), while arguments describe the object of those actions (‘walk towards the closet’). These variables are replaced by (part of) a sentence that is selected based on their value in the current context. In our example, replacing $task by “do the dishes” would result in “Please, go do the dishes.” With other values, it could be “Please, go walk the dog.” or “Please, go wash your hands.” etc. Like a $task, an $argument is replaced based on the context. Typically, arguments are replaced by objects that are then inserted into a $task or sentence. For example, $task = “wash $argument” is an instruction to wash something, which could be hands, face, clothes, the dog… We then have a dynamic sentence within a dynamic sentence, which means even more possible outcomes!
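The ‘dynamic sentence within a dynamic sentence’ idea boils down to repeated substitution. Here is a minimal Python sketch of it; the expand function, its max_depth guard and the plain dictionary used as context are assumptions made for this example, not how ExpReal necessarily does it.

```python
import re

VARIABLE = re.compile(r"\$(\w+)")


def expand(template, context, max_depth=10):
    """Repeatedly replace $name variables with their values from `context`.

    Unknown variables are left untouched. `max_depth` is a safety net against
    values that keep referring to themselves.
    """
    for _ in range(max_depth):
        expanded = VARIABLE.sub(
            lambda m: context.get(m.group(1), m.group(0)), template
        )
        if expanded == template:
            # Nothing changed, so there is nothing left to replace.
            return expanded
        template = expanded
    raise ValueError("expansion did not settle; is a variable defined in terms of itself?")


# A task that itself contains an argument: a dynamic sentence within a dynamic sentence.
context = {"task": "wash $argument", "argument": "your hands"}
print(expand("Please, go $task.", context))
```

Changing only the argument (for instance to ‘the dog’) changes the final sentence without touching the template or the task.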

Conclusion

While the system is still under development, this post should have given you a little insight into how our text generation system currently works. Designing a templating language involves many kinds of choices. The language has to be simple enough for a human to read and write, but at the same time contain enough information for correct generation. On top of that, we have the challenge of supporting English, French and Dutch, which all have their peculiarities.