The Language Gamer
2020-05-30
Perceptual Grounding and Second Language Acquisition
Some years ago, my sister and I set out to make a language learning game. We didn't want to build just another app that told people what sentences in their second language meant in their native language. Instead, we wanted to allow people to explore and discover sentence meaning for themselves. Being told how a second language is translated into a native language slows down comprehension. Seeing translations isn't how we learn our native language, and it plays an auxiliary role in the most effective means of learning a second language - immersion.

So, what game mechanic could allow a player to explore the meaning of a language? At the time, I believed the key to learning language was perceptual grounding:

"Our knowledge of and beliefs about the world stem from our perceptions of the parts of the world with which we come in contact. In particular, our knowledge of our native tongue comes from experience and hence rests on perception."
- Jon Barwise, from "Scenes and Other Situations" in The Journal of Philosophy

The importance of perceptual grounding drove the design of the game. We knew that simple picture-and-word association pairs weren't enough. At most, that would allow the player to learn nouns for basic level categories. The mechanic we eventually came up with was to give the player a collection of images, and allow the player to arrange them however they wanted. Towards the top of the screen there is a description of the current scene in the second language (the language the player is trying to learn). Above that, the player sees a "target" description, also in the second language. The goal of each level is to arrange objects so that the description of the scene matches the target description. Here's a demo of what that looked like:



In many ways, I'm proud of what we built here. At the same time, in many ways, it was a failure. We weren't able to teach an adequately broad swath of the target language to really bootstrap someone's learning. In total, our game Stagecraft only taught around 200 words. So what went wrong?

1) Poor programming decisions

When we set out, I had relatively little programming experience. I fell victim to what I now know are common pitfalls for programmers: building everything around a baroque class hierarchy, and premature optimization. Every different category of linguistic construction had its own class: independent clauses, nouns, verbs, adjectives, prepositional phrases. The complexity multiplied as new languages were added with different word orders, and different syntaxes that mapped to the semantics of the game in different ways. And I thought I was quite clever as I found ways to minimize the need to destroy and re-create these objects that modelled the natural language syntax and semantics. Before performance was an issue. Without profiling. These things created a brittle codebase that was difficult to maintain and extend as we tried to add more content.

2) Inadequate UI

In Stagecraft, the player interacts with the game by moving images around. That means that the scene that's created is wholly determined by the relative position of those images. In particular, for any given positioning, images could only relate in a single way. For example, if a person is juxtaposed with a door, the person could be knocking on the door, opening the door, or just standing next to the door. Our UI limited us to choosing only one of those interactions - and hence to teaching only one of the associated verbs.
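
To make the limitation concrete, here is a minimal Python sketch of a position-driven scene describer. It is not Stagecraft's actual code: the names `Sprite`, `describe`, and `level_solved`, and the pixel threshold `NEAR`, are all hypothetical, and English output stands in for the second language. Because the description is a pure function of the positions, a given arrangement can only ever yield one relation.

```python
from dataclasses import dataclass

NEAR = 50  # hypothetical distance (in pixels) below which two images count as adjacent


@dataclass(frozen=True)
class Sprite:
    name: str
    x: float
    y: float


def describe(a: Sprite, b: Sprite) -> str:
    """Map the relative position of two images to exactly one description."""
    dx, dy = b.x - a.x, b.y - a.y
    if abs(dx) <= NEAR and abs(dy) <= NEAR:
        relation = "is next to"
    elif abs(dx) >= abs(dy):
        relation = "is left of" if dx > 0 else "is right of"
    else:
        # Screen coordinates: y grows downward.
        relation = "is above" if dy > 0 else "is below"
    return f"The {a.name} {relation} the {b.name}."


def level_solved(a: Sprite, b: Sprite, target: str) -> bool:
    """A level is solved when the scene description matches the target."""
    return describe(a, b) == target


person = Sprite("person", 100, 200)
door = Sprite("door", 130, 200)
print(describe(person, door))
```

Juxtaposing the person and the door always produces the same sentence here; nothing in the input distinguishes knocking from opening. Teaching more than one verb per arrangement would need some input beyond position.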

3) Paying too much attention to Gottlob Frege, and not enough to J L Austin.

In the history of natural language semantics, there's a bias towards statements as the object of study. By statements, I mean declarative sentences that are either true or false, depending on what the actual world is like. Statements are definitely a significant kind of utterance in natural language, but as the philosopher J L Austin insisted, there are all sorts of things we do with language other than describe the world: we greet, promise, threaten, and congratulate each other; we apologize, express attitudes, and so much more. How do you draw a picture of those things?

4) Non-Conversational Interaction

Natural language is perceptually grounded, but it's also socially grounded. When we learn our native language we are not, in general, making changes to our environment and then hearing it re-described by our parents. Rather, we're paying close attention to our environment and to the people in it. We're paying attention to what our parents (and others) are paying attention to, keeping track of their facial expressions, making inferences about what they're thinking, and then using that to make inferences about what they're saying. Then we start testing what we've learned by speaking, generally with the aim of having needs filled, and then seeing how the world reacts. These social and conversational aspects of language learning just didn't align with the game mechanic of Stagecraft.

5) Stubbornness

There are two basic ways that a piece of media keeps us engaged: narrative and rewards. Stagecraft doesn't have narrative; there's no semantic connection from one level to the next. Rather, like most language learning apps these days, the content of each level was guided by spaced repetition, the idea that we maximize word learning by repeating our exposure to each word at exponentially increasing intervals. So without narrative, we needed rewards, the common things of learning apps - a levelling system, streaks, badges. At the time, I was too stubborn to add these. I thought the content would be too useful to ignore, and those superficial tactics were unnecessary. I was wrong, and we suffered for it.
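
As a rough illustration of what "exponentially increasing intervals" means, the following Python sketch computes review times for a single word. The base interval of one day and the doubling factor are assumptions chosen for illustration, not the values Stagecraft used, and `review_schedule` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def review_schedule(first_seen, reviews, base=timedelta(days=1), factor=2.0):
    """Return the times at which a word should be reviewed.

    The gap before review n (counting from 0) is base * factor**n, so
    the gaps grow exponentially: 1 day, 2 days, 4 days, ... by default.
    """
    times = []
    current = first_seen
    for n in range(reviews):
        current = current + base * (factor ** n)
        times.append(current)
    return times


start = datetime(2020, 5, 30)
for t in review_schedule(start, 4):
    print(t.date())
```

With these defaults the gaps between reviews are 1, 2, 4, and 8 days. Practical schedulers usually also shorten the interval after a failed recall, which this sketch leaves out.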

Looking over these problems, 3 and 4 relate to limitations in the underlying idea. But 1, 2, and 5 are failures of execution. Could a game with broader UI affordances, coded in a more general way, and offering rewards to players - but based on a mechanic similar to Stagecraft's - be successful? That's the question I've been asking myself. To answer it, I've been going through lists of the most frequent English words and trying to determine which ones could potentially be taught in a Stagecraft-like way. It's clear that a learner couldn't be brought to fluency in this manner, but perhaps it could still work to bootstrap their understanding of a new language.