I have two dictation options which, being a horrible typist, I use frequently. The first is Dragon NaturallySpeaking on my laptop, which works fairly well. The second is dictating email to my phone, which does not. Capitalization rules seem to be based on some kind of random number generator. Homonyms are, of course, a problem, but so are misheard and missing words. Correcting these mistakes can eat up most, and sometimes all, of the time saved.
It is also tedious as hell.
I decided to let ChatGPT take a crack at it and see how well it worked. Here’s the prompt I used.
"Edit the following paragraphs with explanations in brackets after each change with explanations. : "
How did it work? It depends. On the part I was most interested in — homonyms, weird capitalization, and misheard or missing words — it caught almost everything I wanted it to. The other revisions it suggested weren't particularly helpful. I believe I used just one of them, and that was because I had used the same word twice in the paragraph, not because of the reason given in the explanation.
One of those unused suggestions struck me as a particularly interesting example of how differently ChatGPT "thinks." Here is the paragraph in question:
He used the windfall from the sale of PayPal along with funding from other investors to establish SpaceX, but the people actually in charge were highly respected aerospace veterans. They sometimes let the money guy wear an engineer’s cap and blow the whistle, but no one, including Musk himself, really thought he was running the train.

The only change the algorithm suggested was substituting "operation" for "train." Normally, I wouldn't have been that surprised that it didn't catch the analogy — LLMs aren't really capable of creating true analogies — but I assumed it would at least associate the terms "engineer’s cap" and "blow the whistle" with the word "train."
The bigger point here is that large language models do represent an impressive advance in the way we talk to computers and they talk to us. While they come nowhere near living up to the hype, they can provide us with some genuinely useful tools, as long as we keep their limitations in mind.
So there. I’ve now said nice things about LLMs in two posts. I hope you’re satisfied.
Given what you said you wanted, I would ask GPT to proofread rather than edit. Let it know that you are correcting the output of a speech recognizer. Then tell it what you told the blog reader you were most concerned about: capitalization, misspellings, and homonyms. Then give it an example of the style you want the answer marked up in. In general, that's the recipe: context, instructions, and one or more specific examples.
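For instance, something along these lines (the wording here is only illustrative, not a prompt I've actually tested on your text):

"You are proofreading paragraphs that were dictated to a speech recognizer. Fix only capitalization, misspellings, homonyms, and misheard or missing words; do not rewrite for style. Put a short explanation in brackets after each change, like this: 'The board [capitalized; replaced homonym "bored"] meets on Tuesday.' Here are the paragraphs: "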
Also, which LLM did you use? 4o should be fine for this, but o1 and o3 are a lot smarter. And having spent two days on the Claude jailbreak challenge, I have concluded Claude is much better with the English language.