How Synthesized Speech Ruins Punctuation for the Blind
February 28, 2015
Oh, My Achin' Head
My brain is fried, and it's all the fault of the Hyphen key--that sneaky little fleck of plastic next to Mr. Equals, sitting there with his straight little lines, looking all innocent, like he has no idea what he's done to me. It's also the fault of every speech synthesizer I've ever met. Oh, and it's Blind Bargains' fault, too. And my English teachers. And my own. And we can't forget about my old Braillenote, he's part of this, too. And oh yeah, it's the internet's fault. And I think I need to go eat a cookie. And we're out of cookies--I'm too good a cook, so they never last. And I'm rambling, because Textbroker.com broke my brain. So it's their fault too.
Alright, what am I yammering about this time? Well, it's not about braille notetakers (mostly), which will be a relief to many of you. No, friends, today we're going to look at the weird world of how, at least in my personal surveys*, punctuation and sentence structure can be thrown off--or even reversed--by synthesized speech.
* These surveys have a sample size of 1 (read: yours truly), so they might not be fully representative of the BVI population. You know what, though, it's my blog, and I'm in a mood right now, so we're gonna take these insanely skewed, non-scientific results, and we're gonna run with them. Ready? Good.
What the -- Did You Say?
For the uninitiated reader, and I assume there aren't many of you since you're reading the ramblings of some blind dude on the internet, screen readers are programs that let people who can't see use computers. The screen reader speaks aloud what's on the screen, what the user types, what appears in popup boxes, you get the idea. It does this by harnessing the increasingly sophisticated technology known as speech synthesis, where a program generates sounds that approximate human speech. I'm over-simplifying it significantly, but I have to, because it's magic to me, and it's almost midnight--if you want to know more, hit up Google or DuckDuckGo.
Okay, so you're blind, and you're cruising around the web using synthetic speech to know what you're reading. As you're going along, you come across a bit of text that goes something like this:
I tried--and failed--to find more cookies.
If you're sighted, you're wondering why I used two hyphens there, and not an em dash (the big long dash thingy). If you're using a screen reader, and your punctuation level is set high enough, you heard "dash dash" twice, didn't you? Are you seeing the first problem yet? I just told sighted visitors they'd see hyphens, but I told screen reader users they'd hear dashes. But a dash is made of two hyphens, and some sorcery done by the computer when it sees those two hyphens together, right? So why would screen readers say "dash" when they're actually reading "hyphen" characters?
Welcome to problem number one! Every screen reader and every speech synthesizer I've ever met has done this. Yes, I'm aware that some might not, that there's probably some magical land where this isn't a problem, and citizens there who read this are shrugging at each other while raising their eyebrows in confusion. They're now realizing that, as they're blind, the eyebrow thing isn't doing much of anything. In places where American English is used (sample size of me, remember), this is a huge, huge problem. If you grow up hearing that you need to use dashes, and your screen reader tells you many times a day that a hyphen is a dash, what are you going to wind up thinking? Add to that the fact that the — symbol (in case your synth didn't say it, I put an em dash there) is either not spoken at all--hence the parenthetical--or spoken as one thing--which it is, even though you make it with two hyphens--and you're in a world of bafflement by fifth grade. By the way, I'm using the double hyphen in this blog instead of the proper em dash for just that reason: many synths ignore em dash and en dash completely. To sighted readers, it would look fine, but most blind readers would think I completely forgot a punctuation mark and created some of the worst run-on sentences ever.
It doesn't help that all the people in my scientifically sound, methodical study happened to use a Humanware Braillenote for most of their schooling, even into college (what are the chances?). It also doesn't help that every single one of them--yep, 100%--preferred to use the Keynote Gold synthesizer over the newer Eloquence synthesizer. You don't need to know what either of those synths sound like, because there's one vital (for our purposes) thing KNG did that Eloquence didn't: "it had the most beautiful way of handling dashes I've ever heard." I quoted that because, amazingly, everyone in this study said exactly those words.
When KNG was reading along and hit a dash, it would do this quick pause. It didn't end the part of the sentence before the dash like it had hit a period or a comma, it just… paused… then kept reading. It was perfect, the exact thing you'd want any reader (human or electronic) to do when they hit that punctuation mark.
So why am I complaining? Well, funny thing about that awesome feature: it didn't work on dashes! It didn't handle the em dash, the en dash, or two hyphens stuck together. The only way to achieve that glorious phonetic effect was to put a space, a single hyphen, and then another space into your text. That, and only that, would trigger that extraordinary pause thing. You've heard that the best way to proofread your work is to hear it read aloud, and that's exactly what I--um, I mean all the people in my survey--did. KNG would read the work, and if a pause was required, the only way to get it was to use a space-hyphen-space. Couple that with the fact that the hyphen was spoken as either "dash" or "dots 3 6" (which, in braille, it is) and why would anyone use the two-hyphen symbol that didn't read properly? It's all dashes anyway, right?
Can You Feel It?
"But Mr. Author, sir," I can hear you saying (no really, I can hear you… Hello). "Mr. Author, this is exactly why braille is so vitally important!"
Braille is vital, and I'd love nothing more than to see its ridiculously high prices drop to the point where everyone could learn and use it. Thing is, it didn't help anyone in my surveys (i.e. me), and here's why.
Way back in the day (of the mid 20 zeros), the Braillenote could handle the common braille codes of the time--no UEB, no Nemeth (yeah, Nemeth was common then, but this is a braille notetaker we're talking about; they can't be expected to be too modern). You'd write up your essay or test answers or whatever, and you'd read through them to check that they were okay. In braille, at least in American contracted, you had no special dash signs, you just had dots 3 6. Once was a hyphen, twice was one of those dashes between words, and space-hyphen-space was the magical formula that got you that awesome pause. Again, too, everything was spoken as "dash", just as it was with Windows at the time. So, to me, a hyphen and a dash were two words describing the same thing (dots 3 6) and sometimes you'd use two of those signs between words instead of one, for some reason. When I'd proofread my work in braille, then, why would I figure there was a difference?
Oh, and speaking of braille codes… Remember that, in American contracted braille, putting dots 3 6 before another letter is a new symbol that means "com". The word "come", for instance, is written as dots 3 6 followed by the letter e. Since the hyphen is also dots 3 6, you'd often wind up with "com" in place of your second hyphen if you tried to use a proper dash, which was yet another reason for me to not use them. For instance:
You are--and always will be--my friend.
That sentence might translate in braille as this:
You are-comand always will be-commy friend.
That's just plain weird to read, isn't it? Leaving the space around the hyphen meant it wouldn't expand to "com", so why would I not?
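For the code-inclined, here is a rough sketch of how that mix-up could come about. It is a toy model and not how any real braille translator works: the rule it uses (dots 3 6 reads as "com" when a letter follows it and no letter comes before it) is my own simplification, and the names back_translate and to_cells are made up for the example.

```python
# Toy model only: real braille back-translation uses far larger rule tables.
CELL_36 = "\u2824"  # Unicode braille pattern for dots 3 6


def to_cells(text: str) -> str:
    """Hypothetical helper: write every print hyphen as a dots 3 6 cell."""
    return text.replace("-", CELL_36)


def back_translate(cells: str) -> str:
    """Turn dots 3 6 cells back into print, using one simplified rule."""
    out = []
    for i, ch in enumerate(cells):
        if ch != CELL_36:
            out.append(ch)
            continue
        prev_is_letter = i > 0 and cells[i - 1].isalpha()
        next_is_letter = i + 1 < len(cells) and cells[i + 1].isalpha()
        # Assumed rule: a dots 3 6 cell that opens a run of letters is "com";
        # anywhere else it is a plain hyphen.
        out.append("com" if next_is_letter and not prev_is_letter else "-")
    return "".join(out)


print(back_translate(to_cells("You are--and always will be--my friend.")))
print(back_translate(to_cells("You are - and always will be - my friend.")))
```

Under that assumed rule, the second cell of each double hyphen sits right before a letter and gets read as "com", while the spaced-out single hyphen touches no letters and stays a hyphen.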
It wasn't until I got online with my Braillenote that I started to see "Em dash" and "En dash" signs on webpages. When I say I saw them, by the way, I don't mean I saw some new symbol I hadn't seen before. I mean I literally read the words, written out. Imagine reading a sentence like:
He tried to hold still, to be silent--but that proved to be his fatal mistake.
Only, instead of seeing a couple hyphens or an em dash (—), you saw this:
He tried to hold still, to be silent Em dash but that proved to be his fatal mistake.
No, I didn't mess up there, that's exactly what I would read on the display, the words "Em dash" in place of some kind of symbol. There's a technical reason for why that would happen, but I won't go into it here. No matter the reason, though, you can probably appreciate how jarring and distracting that was. You're reading along, and suddenly, some punctuation mark comes along that's so full of itself it has to show up in words, rather than a few dots. You'd grow to resent that punk(tuation) mark, wouldn't you? You'd come to see it as a space-waster that thinks it's too important to be relegated to the likes of unknown symbols, too arrogant to be expressed in anything less than full, space-hogging, evil glory. You'd look at (well, feel, but we all know what I mean) that pretentious little jerk, and you'd think of the one or two "3 6" signs, and you'd wonder why anyone would use this ridiculous waste of space when much more compact options are right there, and so much easier to type. Notetakers don't auto-replace things like modern computers, you know, so whereas typing "--" into MS Word or Pages will give you the em dash, doing so on a notetaker would give you exactly what you typed--two hyphens. If you wanted a "real" em dash, you had to hunt it down in the Unicode character tables, or assign it to a special keystroke. Plus, remember that these notetakers can show only 18 or 32 characters at a time, so wasting eight or nine spaces on a single dash is nearing egregious levels. Because of all this, back to my hyphens--still called dashes by all my equipment--I went, and there I've stayed ever since.
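If you like seeing the arithmetic, here is a small sketch of the space problem. The function names autocorrect and spell_out are invented for illustration, and the padding with a space on each side is an assumption based on how the words showed up on my display, not a description of what the notetaker's software actually did internally.

```python
EM_DASH = "\u2014"  # the single em dash character


def autocorrect(text: str) -> str:
    """Mimic a word processor swapping two hyphens for one em dash."""
    return text.replace("--", EM_DASH)


def spell_out(text: str) -> str:
    """Mimic a display that writes the symbol's name instead of a sign.

    Assumption: the name is padded with one space on each side.
    """
    return text.replace(EM_DASH, " Em dash ")


def cells_used(mark: str) -> int:
    """Count the display cells a mark takes once it has been spelled out."""
    return len(spell_out(mark))


typed = "He tried to hold still, to be silent--but that proved to be his fatal mistake."
shown = spell_out(autocorrect(typed))

print(shown)
print(cells_used(EM_DASH), "cells for the spelled-out dash")
print(cells_used("--"), "cells for two plain hyphens")
```

With those assumptions, the spelled-out dash eats nine cells, which is half of an 18-cell display, while the two hyphens take two.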
Are We Done Yet?
No, dear reader, we're not. I still have to tell you how Blind Bargains broke my brain, remember? Well, the first thing you need to know is that the awesome folks behind double B have a new, really great podcast that you need to go subscribe to right now. Go ahead, I'll wait. Done? Cool, you won't regret it.
Anyway, I was listening to the latest episode, #7, and one thing mentioned by a guest was a website that would pay authors to write for people who needed, well, writing done. The site, Text Broker, seemed like something right up my alley, so I went to sign up. They want a sample article, of course, and I figured that would be a breeze. In the notes, they mention that they score submissions based on multiple criteria… Among them, "overuse of dashes".
Whoa, wait, overuse of dashes? I better make sure I know what they're talking about. After some research (read: thirty seconds on DuckDuckGo), my virtual friend the Grammar Girl came to my rescue… I thought. She told me that you use a colon to lead into an expected list, much like a period ends an expected sentence. A dash replaces the colon when that dash precedes something unexpected. Basically, it seems that the dash is the exclamation mark of the list-preceding world. Example:
I unwrapped the candy bar: it was chocolate, with nuts.
Expected ending, nothing funny here. But…
I unwrapped the candy bar--inside the wrapper was a hundred dollar bill!
Use the dash, because holy chocolate, there's money in your candy bar!
Okay, that made sense. But then… Then, the brain melting began.
In less than five minutes, my whole world was flipped upside down. I learned that a hyphen is what every screen reader everywhere (the lying little bits of computer instructions) calls a dash; that a dash is what most people call the result of a computer's messing with two hyphens put side by side; and that there's not only a difference between a regular dash and a fancy one, but there's a difference between the two types of fancy ones (the em and en dashes). As they say, or at least as I think I read somewhere: mind==blown.
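For anyone who wants to check what their synth is glossing over, Python's standard unicodedata module will report the official Unicode name of each character. This is only a quick sketch; the MARKS table and the describe function are names I made up for the example.

```python
import unicodedata

# The three marks discussed above, keyed by what I have been calling them.
MARKS = {
    "the key next to Equals": "-",
    "en dash": "\u2013",
    "em dash": "\u2014",
}


def describe(char: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


for label, char in MARKS.items():
    print(f"{label}: {describe(char)}")
```

The key next to Equals comes back as HYPHEN-MINUS, and the two fancy ones come back as EN DASH and EM DASH, so they really are three separate characters no matter what the speech calls them.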
My main point is that speech synthesizers can mess with how I perceive punctuation, and I've harped on dashes in place of colons a lot. It goes beyond that, though. Remember Keynote Gold, the synthesizer from the Braillenote? Well, there's one other thing it did really well: parentheses. When it saw text in parentheses, it would lower its pitch just a bit, making it obvious to the reader that the (I love using this word) parenthetical text was a kind of side note to the sentence. I grew up hearing that, and loved the effect it gave. Even though it's perfectly legal to use dashes around that kind of text, I always used parentheses, because that's what sounded best to me and so it's what I got used to. I haven't used a Braillenote in years, but the habits have been formed. Even though all the speech programs I use nowadays fail to do anything cool with dashes or parentheses, what KNG seemed to tell me was the proper way of writing has been engrained.
As I talked this over with my family, though, they both agreed that there is a place for each. Parentheses are for a real side note, something that would be muttered or whispered if you were to say the whole sentence aloud. Dashes are for text you'd say at the same volume, but that is still not quite part of the sentence. Who knew? I didn't… Well, I probably did at one time, but forgot because I liked the way parentheses were said.
Now that I mostly use the Mac, and the Alex voice from OS X has arrived on iOS, I'm faced with another punctuational problem: the quotation mark. You see, much like Keynote Gold would read a certain kind of dash (hyphen… Whatever) in a really cool way, Alex reads text surrounded by apostrophes in a really cool way. It's hard to describe, but it makes quoted words in the middle of a sentence stand out in exactly the way you want them to. The problem is that, while this effect happens for both the standard double quote (") and the apostrophe, my punctuation level is set to "most". That means that, instead of hearing the awesome pause around a quoted phrase if the quotes are the double kind, I hear the word "quote" instead. This ruins the effect entirely. BUT Alex doesn't do this if you use single quotes (apostrophes) instead, because that punctuation level ignores that kind of quote. As a result, I've started getting into the habit of putting single quotes around words I want to make stand out. I have no clue when or if I should quote words, and if I am supposed to, whether to use the single or double quote. Similarly, I haven't the faintest idea if I'm supposed to use the actual quote/apostrophe symbols, or the new ones. These new ones are read as many things: left-angle quotation mark, single quotation mark, left quote, left double quotation mark, and more. Other screen readers don’t always handle these fancy quotes very well, so I just don’t use them, even though it often feels like everyone else is.
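If you would rather sidestep the fancy quotes altogether, a few lines of Python can turn them back into the plain keyboard kind. This is a minimal sketch that only covers the four most common curly quotes; the table name STRAIGHT and the function straighten are my own inventions, and other quote-like characters are left alone.

```python
# Map the four common curly quotes to their plain keyboard equivalents.
STRAIGHT = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (also used as an apostrophe)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def straighten(text: str) -> str:
    """Replace curly quotes with straight ones; everything else is untouched."""
    return text.translate(STRAIGHT)


print(straighten("I don\u2019t use \u201cfancy\u201d quotes."))
```

Whether straight quotes are the right choice for a given piece of writing is a separate question; this only makes the text predictable for synths that stumble over the curly ones.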
So, it turns out that I've been doing it wrong for many years. I've always used the "space-hyphen-space" format, figuring that it was the same thing as the double dash. No wait, not double dash, double hyphen, because that key next to Equals is a hyphen, even if every synth I know insists that it's a dash. In reality, the dash is the result of computer trickery when you type two hyphens together, assuming you have the feature enabled (I don't, because it replaces other things I don't want it to replace). You also shouldn't use parentheses nearly as often as I do, preferring the double hyphen thing in many cases. What a fool I feel now, let me tell you! Oh yeah, and there are new quotes in there somewhere that most modern computers use by default, but screen readers often read them in odd ways so I don’t prefer them. Not only that, but I use single quotes (AKA apostrophes, because it’s not confusing enough without giving things two names) in place of quotes because I like how it sounds. I’m good with spelling and--usually--grammar, but I’m only now seeing how thoroughly electronic speech has messed with my understanding of, and common usage of, punctuation.
There. It's my fault, for not researching all this a long time ago; it's the Hyphen Key's fault, for not turning into a proper dash when you add shift to it (no, I don't know where we'd get the underscore if this happened, but it's almost one in the morning so I don't care right now); it's my teachers' fault, for never correcting it (in fairness, they probably did, and I just don't remember it); it's the internet's fault, for having so many writing styles and so few punctuation rules; it's Blind Bargains' fault, for making me investigate Text Broker; it's Text Broker's fault, for making me finally investigate this mess. But mostly it's my fault, because if I'd just figured this out back in middle or high school, none of this would've happened. But hey, I figured it out now, so despite the headache, I'm now wiser than I was six hours ago. And really, thanks should go to Eden, Blind Bargains, Text Broker, and Grammar Girl, without whom I would still be happily using hyphens where I should be using dashes. I therefore remove them from my fault list and add them to my thanks list. Of course, I still don’t know about the quote thing, but I’m too tired to go investigate it.
After all this, though, questions remain. When do you use dashes in place of commas or semicolons? Is the rule about parenthetical text being whispered correct? When, if at all, are you allowed to use an ellipsis instead of a dash? If your machine can't, or is set to not, replace a double hyphen with a proper dash, is the double hyphen okay? As the Tootsie Pop commercials always said: "the world may never know" (and it looks like that might be true: we may never know).