Caption Fail, Fail? or the difference between /f/ and /th/

I’m waiting for my video to load the closed captions on YouTube. I tried to speak in the clearest, most enunciated language of which I am capable. I want caption success. That said, I’m listening to myself right now, and I sound silly and kind of insulting. Let’s say the closed captions completely reflect my intended words. What would that mean? Should I send it to the original caption fail guys and say, boom! Enunciation, my friends?

Here is the address:

I really did like the caption fail on Christmas carols that was in Multimodality in Motion. I wonder if it would be more difficult to do songs. I’m sure someone has tried to do a caption of Lil Wayne. I just looked to see if I could, and I can’t figure it out. Can people caption videos that they don’t upload? That would seem to be a key feature of adaptability.

UPDATE: Sunday, February 10th

Okay. My overall reaction is that this tool is useful on the whole, and maybe the mirth at its expense is just mirth. To be fair, I tried to do the opposite of the caption fail guys and speak slowly and clearly. And that probably isn’t realistic. Also, since Melanie told us that the software listens for the voices of middle-class white guys from the Midwest, well, there it is. I think I also have a good shot at being a national, accent-neutral weather person. The actual text is in the blog below (#1 – #3).

1) Found poetry. The mistaken words / phrases in my caption fail(?) have a cool, oblique quality to them. The grammar is not impossible, just the word combinations. For example, “in poor taste” became “indian port eight”. It sounds kind of like a rock lyric (e.g. “I thought it was in poor taste / we met at indian port eight” with some fuzzy guitars) that everyone thought meant my drug dealer’s house but was actually my mom’s street name. Now, the most obvious caption fail was the exchange of ‘deaf’ for ‘death.’ The line that stands on its own (“She is deaf”) becoming “She is death” is stark enough, but the swap actually happens three times. The irony is pretty inescapable: captions that people who are deaf might use to access videos on YouTube change ‘deaf’ into ‘death.’ There is something to be said for consistency, I suppose. Of course, since my original purpose was to be clear and I was trying pretty hard for a faithful rendering of what I said out loud, my caption fail didn’t yield as much found poetry. I got the feeling that the guys we watched in class were speaking quickly / in an affected way in order to find some fun / poetry. In this case, at least, I was trying to be a party pooper, at which I largely succeeded.

2) Stir-fry text? I’m going to stay with the similarities between caption fail and stir-fry. Andrews suggests that stir-frys allow you to make your own text with “spastic interactivity.” He means that, once he designs it, readers make their own. With caption fail, it’s like I make the original and YouTube makes its own with that interactivity. I would imagine that if I read the same passage more quickly or with an affected accent, I might get more words. In other words, I might get ideas for a new stir-fry if I did a whole bunch of caption fails. One other comment: YouTube capital-F Fails at names. Try to match these (one name garnered two different fails):

1) Zdenick          a) “j ke e”
2) Anne Gere’s      b) “bruce taylor”
3) Ruth Anna        c) “this a demic”
4) JPEE             d) “retainer”
                    e) “in years”

Though I imagine you might get them, it seems noteworthy that, like the stir-fry that switched the author of the quote at each mouse touch, proper names (except Michigan and Caribbean…) changed.

3) Patterns. Like I said, the caption tool worked (for the most part). One pattern was the inability to recognize proper nouns. Another is the confusion of the /f/ and /th/ sounds. At least with ‘deaf,’ the mistake happened every time. Verb endings also change on a pretty regular basis; for example, ‘describe’ in the original becomes ‘describing’ in the text. That is something I read over / ignored at first, but when nuance is important, that lack of access to the full range of verb forms might be an issue.



