Technically 'They'

MCM, Friday, March 1, 2019

I should be writing more of The Anti-Anti-Anti-Christs, but instead I am battling a very irksome issue with the code, brought to light by the fact that the pronouns "they/them" are creating garbage grammar everywhere. Being as distractible as I am, I haven't done much else but try to figure this out.

To understand the issue, I need to explain how this technically works.

First of all, Prism fiction is written in a modified version of markdown, which is basically plain text with some little formatting tricks that are converted into HTML (and thus styled). So you have:

This bit of text *is bold*

which renders as

This bit of text is bold

So far so good? Cool. Now we add the Prism layer, which runs like so:

[hero-fname] runs down the street.

Prism reads the square brackets and breaks the component pieces out to understand what's being requested. In this case, we're talking about the "hero", and specifically their first name. The engine converts that block into a special HTML tag like so:

<div data-user="hero" data-ref="fname"></div>

What happens at runtime is that, after you define your hero's first name, the site automatically populates all the relevant tags with the name you entered. So far so good.
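Here's a rough Python sketch of those two steps, purely for illustration. It isn't the actual Prism code: the function names (to_placeholders, fill), the regexes, and the shape of the values dictionary are all assumptions.

```python
import re

# Matches Prism-style blocks such as [hero-fname]: a character id, a hyphen, a field.
TAG = re.compile(r"\[([a-z]+)-([A-Za-z]+)\]")

# Matches the empty placeholder tags produced by to_placeholders().
PLACEHOLDER = re.compile(r'<div data-user="([a-z]+)" data-ref="([A-Za-z]+)"></div>')

def to_placeholders(text):
    """Convert [who-field] blocks into empty HTML placeholder tags."""
    return TAG.sub(r'<div data-user="\1" data-ref="\2"></div>', text)

def fill(html, values):
    """Populate each placeholder with the reader's value; unknown ones stay empty."""
    def repl(m):
        user, ref = m.group(1), m.group(2)
        value = values.get((user, ref), "")
        return f'<div data-user="{user}" data-ref="{ref}">{value}</div>'
    return PLACEHOLDER.sub(repl, html)

html = to_placeholders("[hero-fname] runs down the street.")
filled = fill(html, {("hero", "fname"): "Goodface"})
```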

Then we get a bit more complicated, by writing something like:

[hero-fname] reaches back and grabs [hero-zer] sandwich.

In this case, we're converting pronouns. When writing the story, it's best to use the same pronouns for every character, so they're easier to search and replace after the fact (trust me: writing a story with square brackets everywhere is not worth the effort).

Now, choosing the baseline pronoun set was actually a bit of a trial all on its own. He/him turned into a nightmare because of how easily "he" can appear as a subset of other words, but she/her gets wacky because of things like:

Who did it? Her.
She ate her sandwich.

Stupid English, using the same word for two different modes. Grr. If you want to know what "difficult conversion" looks like, it's trying to figure out how to change 900 instances of "her" in a document, and forgetting how to do it every 5 instances.

The easiest solution was to pick a pronoun set that was distinct, familiar, and short. Zie/zer fit the bill perfectly. The thing is: this isn't just a system that's going to be used once and then thrown away. If it's going to be useful, it needs to be repeatable and relatively intuitive. It's harder at the start, but easier in the long run.

So now our text is converted as such:

<div data-user="hero" data-ref="fname"></div> reaches back and grabs <div data-user="hero" data-ref="zer"></div> sandwich.

The user chooses the pronoun set, the Prism engine looks up what those pronoun uses look like, and it updates the values in the HTML accordingly. Super simple, and everyone is happy.
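The lookup itself can be pictured as a small table, sketched below in Python. The table layout and the resolve helper are assumptions for illustration; only the "zie" (subject) and "zer" (possessive) slots come from the examples above.

```python
# Hypothetical pronoun table keyed by the baseline forms used while writing.
# Other slots (object, reflexive, ...) would be added the same way.
PRONOUNS = {
    "she/her":   {"zie": "she",  "zer": "her"},
    "he/him":    {"zie": "he",   "zer": "his"},
    "they/them": {"zie": "they", "zer": "their"},
}

def resolve(ref, pronoun_set):
    """Look up a baseline pronoun, keeping a leading capital (Zie -> She)."""
    word = PRONOUNS[pronoun_set][ref.lower()]
    return word.capitalize() if ref[0].isupper() else word
```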

Except no. Except "they". That is the outlier. Look:

"[hero-fname] won't survive it," [villain-fname] said. "[hero-Zie] isn't strong enough."

...converts into...

"Goodface won't survive it," Badface said. "They isn't strong enough."

Dammit, English!

Suddenly, nothing works the way it should. Someone choosing "they/them" as their pronouns is going to have a really miserable time reading the story, so obviously I need to fix it... but how?

There are a few bad options, as far as I can tell.

Option 1: Tag Loading

In this version, we load the tags themselves with the verb in question, and let Prism convert the words as necessary. So:

[hero-Zie:is]n't strong enough.

or maybe

[hero-Zie:isn't] strong enough.

The engine sees the verb at the end, looks it up against a table of exceptions, and returns the right value. So "is" would become "are", while most other verbs would just add or drop an "s" at the end ("he jumps" vs "they jump").
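A minimal Python sketch of that lookup, assuming a tiny exceptions table and a deliberately naive "drop the s" rule. The names (EXCEPTIONS, pluralize_verb, render_loaded) are made up for the example, and the subjects dictionary simply maps each character to their subject pronoun.

```python
import re

# Irregular verbs that need a lookup; a real table would be much longer.
EXCEPTIONS = {"is": "are", "isn't": "aren't", "was": "were", "has": "have", "does": "do"}

def pluralize_verb(verb):
    """Return the form of a singular verb that agrees with 'they'."""
    if verb in EXCEPTIONS:
        return EXCEPTIONS[verb]
    # Naive regular rule: "jumps" -> "jump". Verbs like "goes" or "tries" need more rules.
    return verb[:-1] if verb.endswith("s") else verb

# Matches a loaded pronoun tag such as [hero-Zie:isn't] or [hero-zie:is].
LOADED = re.compile(r"\[([a-z]+)-([Zz]ie):([a-z']+)\]")

def render_loaded(text, subjects):
    """Replace each loaded tag with 'pronoun verb', conjugating only for 'they'."""
    def repl(m):
        who, ref, verb = m.groups()
        pronoun = subjects[who]  # e.g. "she", "he" or "they"
        if pronoun == "they":
            verb = pluralize_verb(verb)
        if ref[0].isupper():
            pronoun = pronoun.capitalize()
        return f"{pronoun} {verb}"
    return LOADED.sub(repl, text)
```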

This works, except when the phrasing is anything more than basic. Like so:

They just aren't strong enough.

The "just" breaks the system, and the whole thing falls apart. And any system that breaks that obviously is probably not gonna withstand a lot of use cases. Next!

Option 2: Verb Tagging

Instead of trying to include verbs in the pronoun tags, maybe the solution is to make them their own tags. It's the most flexible, and the least prone to accidents. Like this:

[hero-Zie] [hero:isn't] strong enough.

The engine would look up any blocks with colons as their delimiter and do a simple calculation: if the "hero" is "they", then look up the verb in the conversion list; otherwise, just output the word as-entered.
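In Python, that calculation might look something like the sketch below. The conversion list, the regex, and the render_verbs name are assumptions, not what Prism actually runs.

```python
import re

# Hypothetical conversion list: singular verb -> form that agrees with "they".
THEY_FORMS = {"isn't": "aren't", "is": "are", "has": "have", "runs": "run"}

# Matches colon-delimited verb blocks such as [hero:isn't].
VERB_TAG = re.compile(r"\[([a-z]+):([a-z']+)\]")

def render_verbs(text, pronouns):
    """Swap tagged verbs only for characters whose pronoun is 'they'."""
    def repl(m):
        who, verb = m.groups()
        if pronouns.get(who) == "they":
            # Fall back to the word as written if it isn't in the list.
            return THEY_FORMS.get(verb, verb)
        return verb  # output the word as entered
    return VERB_TAG.sub(repl, text)
```

Because the verb tag stands alone, a sentence like "They just [hero:isn't] strong enough." comes out correctly no matter what sits between the pronoun and the verb.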

This has a big advantage in that it's extremely extensible (it doesn't matter what sentence structure you choose, it'll always adapt). But it's burdensome to use over time: every verb has to be tagged, and suddenly your sentences become a huge mess of verbose syntax, making it incredibly hard to actually read your work.

Again, in the short run, this does a great job. In the longer run, it's possibly a deal-killer in terms of other writers being willing to adopt the technology.

Let's try something else.

Option 3: Contextual Verb Tagging

This one is similar to the one above, but it's slightly less obtrusive. Instead of tagging each verb with the subject and choice, we can do something simpler, like so:

[hero-Zie] [isn't] strong enough.

It's a small change, but trust me, it makes a huge difference when you're dealing with a 30,000-word document. In this case, we don't say who the verb belongs to; we let the engine assume it's connected to the hero by their proximity. It reads the sentences sequentially, changing its expectations as it goes.

The trick with this method is that the engine needs to crawl through the text sequentially, essentially saying "who's this? oh, ok, then..." over and over again until it reaches the end. That's not necessarily a dealbreaker, but it makes the system more complicated and slower than just saying "hey, everyone matching these variables! change!"
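To make the crawl concrete, here's a hedged Python sketch of a single left-to-right pass. It assumes bare verb blocks like [isn't] and character blocks like [hero-Zie] are the only bracketed things in the text; the names (BLOCK, render_contextual) and the tiny verb list are invented for the example.

```python
import re

# Hypothetical conversion list: singular verb -> form that agrees with "they".
THEY_FORMS = {"isn't": "aren't", "is": "are", "has": "have"}

# One pattern for both kinds of block: [who-ref] names a character, [verb] is a bare verb.
BLOCK = re.compile(r"\[(?:([a-z]+)-([A-Za-z]+)|([a-z']+))\]")

def render_contextual(text, pronouns):
    """Walk the text in order, remembering which character was mentioned last."""
    current = None  # most recently referenced character, if any

    def repl(m):
        nonlocal current
        who, ref, verb = m.groups()
        if who is not None:
            current = who          # "who's this? oh, ok, then..."
            return m.group(0)      # leave character tags for the normal pronoun pass
        if current is not None and pronouns.get(current) == "they":
            return THEY_FORMS.get(verb, verb)
        return verb
    # re.sub visits matches left to right, so `current` always reflects the nearest earlier tag.
    return BLOCK.sub(repl, text)
```

The state carried in current is exactly what makes this slower and more fragile than a blanket replace: every verb's result depends on whatever tag came before it.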

The other issue — which I can't say for certain happens, but am pretty sure will — is that English is such a stupid language that the contextual tags might not necessarily work. There's gotta be a situation where we'll be swapping in a random name, but still referring to someone else in the verbs. In those cases, maybe an explicitly-tagged verb would be a good workaround — but at what point is the whole thing user-unfriendly and prone to failure?

No Conclusions

I'm trying to think of smarter ways to do this, but I'm coming up empty. The third option is probably my best bet, overall, but until I figure out the performance issues, I can't deploy it. The answer might be to create a better piece of editing software, to take some of the burden off the writer, but that's a massive can of worms I don't want to open unless I have to.

All of which is to say: I'm sorry to anyone using they/them on AAAC. I am trying very hard to solve this problem. I'm just not sure I'm smart enough to do it right :)