Spaced out

Some time ago I wrote about advanced searching with Word, and also about advanced use of replace. Here's another one: a quick way to remove extra spaces. If you're like me, you occasionally put three spaces after a full stop (that's a period for US readers), or simply hit the space bar twice in the middle of a sentence. You can go blind searching by eye for extra white space on the screen, so save your eyesight and let the computer do it for you.

Open the Find/Replace dialog in Word. That's Ctrl-F on the keyboard or Edit --> Find on the menu bar.

The string you're searching for is this:


[!\!\.\"\?^l]  


I've inverted the colours and made it large so you can see there are two spaces at the end of the search string. You must include those two spaces in the search, or this won't work. Make sure you click the More button and then check the Use wildcards checkbox. Here's a screenshot:




If you don't care about the technicalities, you can stop reading now and just use the search as I described. Otherwise, here's how it works.

The two spaces at the end of the string will match any two spaces in your manuscript.

The problem is we want to catch two spaces in the middle of sentences, but two spaces is always correct when a sentence ends and a new one begins. We have to exclude the start of sentences; that's what the gobbledygook does.

Everything inside the square brackets [ ... ] will match a single character. The exclamation mark ! right at the start means match any one character except those that follow. A sentence could end in a full stop (that's the \.), an exclamation mark (that's the \!), or a question mark (\?). We also have to deal with the situation where a sentence ends with dialogue, in which case the final character is a speech mark (\"). And we don't care about spaces that come straight after a line break (that's the ^l). The backslash character \ before the punctuation in each case is merely an instruction to Word to treat the following character as a literal, because those characters can have a special meaning when wildcards are turned on. The \ escapes the special meaning. The ^l is the special symbol for a manual line break.

So three spaces will be caught anywhere. Two spaces will be ignored if they're preceded by standard end-of-sentence punctuation. Two spaces preceded by any other character will be caught, because the square brackets will match anything except the end of a sentence.
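If you think more easily in ordinary regular expressions, here's a minimal sketch of the same logic in Python. It's only a rough translation, not something Word runs: the function name find_extra_spaces and the sample text are made up for illustration, and I'm assuming \n is a fair stand-in for Word's ^l.

```python
import re

# Rough equivalent of the Word wildcard search [!\!\.\"\?^l] plus two spaces.
# [^...] plays the role of Word's [!...]: any one character NOT in the list.
# Assumption: "\n" stands in for Word's ^l (manual line break).
EXTRA_SPACES = re.compile(r'[^!.?"\n] {2}')


def find_extra_spaces(text):
    """Return the offset of each character that is followed by unwanted spaces."""
    return [match.start() for match in EXTRA_SPACES.finditer(text)]


sample = 'One sentence.  Two  spaces here.   Three after a stop.'

# Offset 17 is the "o" in "Two" (two spaces mid-sentence).
# Offset 32 is the first of the three spaces after "here."
print(find_extra_spaces(sample))  # prints [17, 32]
```

The two spaces after "One sentence." are left alone because the character before them is a full stop. The three spaces after "here." are caught because the first space is itself a character that isn't end-of-sentence punctuation, and it's followed by two more spaces.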


10 comments:

Steph Schmidt said...

Wait, does this mean all regular expressions work within word for searching?

But this is fantastic for catching accidental typos.

Sarah W said...

Gary, even if you didn't write fascinating posts about ancient history (and a certain favorite mystery series), I'd still visit this blog just to pick up tips on wringing the last bit of usefulness from MS Word.

You've saved me so much time -- thank you!

Gary Corby said...

Steph, they're not the same as the Unix regular expressions that you probably know and love (me too), but Word does have usable regular expressions.

I'm glad if it helps, Sarah! I'm just passing on the bits I use myself. My wife said I had some extra spaces in the glossary of the third book, but I couldn't see them, so I worked out this expression to catch them.

Lexi said...

Gary, I'm amazed you still use two spaces at the end of sentences. Doesn't it drive your editor mad?

I've gone over to one space on the advice of an editor friend (I blogged about it at the time) and never regretted it - particularly when formatting text for print. Plus it's simplicity itself to weed out the odd double with Find and Replace.

Nancy Kelley said...

I wondered at the double spaces as well, Lexi. Several people blogged and/or linked to an article about the single space earlier this spring. I think it was this article in The Slate: http://www.slate.com/id/2281146/

Is the typeset convention different in Australia, Gary?

Gary Corby said...

Hi Lexi & Nancy. I've yet to have a copyeditor complain about two spaces. I think that's probably the difference between a manuscript and typesetting. It doesn't actually matter what I do, because in these days of variable width fonts, the typesetting software will do its own thing anyway.

I believe the typesetting standard used to be two spaces but became one due to computer fonts. Speaking personally, I dislike the single space gap and rather wish printed books would increase it. I certainly find it much easier to read my own ms with the two spaces. If you check my tweets on twitter, you'll see even there I use two spaces unless I need to squeeze.

L. T. Host said...

You are awesome, Gary. I love all the Word tips you share with us! :)

Steve MC said...

And here I'd thought I was clever 'cause I'd written a macro to make sure there's two spaces (and only two spaces) between each sentence.

I prefer it that way, as well, 'cause you can see the shape of a sentence much more clearly, and it also gives more room when editing by hand.

Thanks for the tip!

Gary Corby said...

Thanks LT.

Maine, that's actually a good idea. I guess you wrote a macro to automatically insert two spaces every time you press the . key?

Anonymous said...

Thanks for this tip. I was afraid I was going to have to go through my 85,000 word manu and have to delete spaces. You are a life saver!