Advanced searching in Microsoft Word

Writers are not always the most technical of people, and fair enough, but there's one techie thing worth learning about because it makes global editing easier, and that's regular expressions.

Let's say - to pick a random example, not that this would ever happen to me - that your dear agent thinks you have too many verbs of the form was ---ing. Was walking, was looking, was defenestrating, and so on. How to find them?

You can read through the whole ms. Which will take forever. Or you can search for was. This will cut the search time and you won't miss any, but you'll have to check 100 times more was words than you want. You might think to search for "ing ". Because you can have spaces in searches. But like searching for was, there'll be a lot of wasted time.

Or you can click Use wildcards on the find dialog (you need to click More to find it) and write a regular expression. They work like this:

* matches any number of characters

If you click Use wildcards and search for was *ing then that matches was followed by a space, followed by any number of characters, followed by ing. So was looking matches but was crooked doesn't.

You might think that would find everything we want, and indeed it will, but it will also match text like "The fence was crooked but John wasn't looking". Everything from was crooked through to looking matches, because the * matches any sequence of characters, including spaces. We want something slightly trickier. We want to match was followed by a space, followed by any number of text characters ending in ing.
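If you'd like to see the over-matching outside Word, here is a minimal Python sketch. It is only an approximation: it assumes Word's * behaves roughly like the lazy .*? in Python's re module, and the variable names are made up for illustration. The sample sentences are the ones used above.

```python
import re

# Assumption: Word's "*" wildcard is approximated by Python's lazy ".*?",
# which also happily matches spaces.
pattern = re.compile(r"was .*?ing")

good = "was looking"
bad = "The fence was crooked but John wasn't looking"

# The first sample matches as intended.
print(pattern.search(good).group(0))

# The second sample matches too, and the match swallows most of the sentence.
print(pattern.search(bad).group(0))
```

The second print shows the match starting at was crooked and running on until the ing in looking, which is exactly the problem.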

[a-z] matches any one character from a to z.

You can follow this by a @ to mean one or more of the characters between square brackets. So

[a-z]@ matches one or more characters between a and z.

What if the first letter of the word after was is a capital? Not a problem, you can have more than one range inside the square brackets.

[a-z,A-Z]@ matches one or more alphabet characters, big or small.

So a search string of was [a-z,A-Z]@ing will do the job beautifully. If you're totally paranoid about there being a weird character in the present participle (beats me why, but still...) then you can do this instead:

[!a-z] means any one character except a to z. The ! means match the opposite. So

[! ] matches any character except a space. If you can't see it, I typed a space between the ! and the ].

So a search string of was [! ]@ing is what I'm using to weed the excessive present participles out of my manuscript. It might seem like a lot of effort to work this out, but believe me, it's heaps faster than checking every was within 90,000 words.
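For anyone who wants to try the two search strings outside Word, here is a rough Python equivalent. Treat it as a sketch under assumptions, not as a description of how Word works internally: it assumes Word's @ corresponds to + in Python's re module and that [! ] corresponds to [^ ]. The comma is dropped from the letter range because in Python a comma inside square brackets is just one more character to match. The sample sentence is invented.

```python
import re

sample = "He was re-reading it while she was Walking, but the fence was crooked."

# Word: was [a-z,A-Z]@ing  -> one or more letters, then "ing"
letters_only = re.findall(r"was [A-Za-z]+ing", sample)

# Word: was [! ]@ing  -> one or more of anything except a space, then "ing"
no_spaces = re.findall(r"was [^ ]+ing", sample)

print(letters_only)
print(no_spaces)
```

The letters-only pattern skips was re-reading because of the hyphen, while the anything-but-a-space pattern catches it. Neither one matches was crooked.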

There are more wildcards than these. They're all listed on the Special button in the search dialog, so you don't have to remember them all.

53 comments:

Stephanie Thornton said...

Whoa! I had no idea MS Word would do that. I'll keep that in mind.

Now if I could just get the darn program to delete the random page breaks it inserted. Every time I think I've gotten rid of them, they pop back in. Thankfully it's only in about 15 pages.

Groan.

Gary Corby said...

Hi Stephanie. Not sure about your page breaks; have you used some sort of style formatting or maybe a table? That could do it.

Word is hard to use, no doubt about it. I have to say too, the recent versions which replaced the menu bar with that ribbon are, as far as I'm concerned, utterly unusable.

I'm sticking with 2003 until the wheels fall off. After that I'll look around for a replacement. Maybe OpenOffice, but having used both I'd say OpenOffice right now has more problems than Office 2003.

Carrie Clevenger said...

Quite possibly the most useful blog entry of the year to me so far. More tricks please?

Cheers,

Carrie

Gary Corby said...

Hi Carrie, there are regular expressions in OpenOffice too, though slightly different.

Searching for was [a-z,A-Z]*ing works in Writer. It doesn't have the ! operator, but it does have others and in some ways is more flexible than Word.

scaryazeri said...

Great, thanks Gary. Agree with Carrie, a very useful blog you have created here!

Gary Corby said...

Thanks Scary. It's really just a tidbit I happened to be using myself at that moment, but glad if it's helpful.

Carrie Clevenger said...

Gary, I'm trapped in using Open Office until Word Santa drops authorized software down my chimney. Do you have the Open Office equivalents?

Gary Corby said...

OpenOffice Writer is very similar, Carrie. Ctrl-F to get to the search dialog. Click More Options. Click the Regular Expressions checkbox. Now something like was [a-z,A-Z]*ing will work for you.

OpenOffice Writer has a much better help file than Word. Press F1 for the help window, click the Index tab, type regular expressions in the Search Term box and hit return. It gives you the complete list of regular expression operators and a brief explanation of each.

Loretta Ross said...

Thanks for the tip! I can see where this would come in handy.

Actually, your timing is funny. I could have really used this . . . day before yesterday! I was doing my final edit on revisions and I searched the ms to make sure I spelled the name Sonuvabitch the same way throughout. I couldn't search for the whole name because, obviously, that would miss any I'd misspelled. So I searched on "bitch" and just looked at every instance of the word. Fortunately, I don't have too much of a potty mouth. ;)

Amalia Dillin said...

Thanks Gary! I didn't know how to make Word do tricks like that!

Will it also tell you how many times a word appears? Or do I have to rely on wordle for that kind of a table?

I don't mind the 2007 redesign of Word anymore. It did drive me crazy at first, but now I've gotten pretty used to it.

Sarah Laurenson said...

Popped over from Janet's.

This is great. And I thought I was already a Word Wiz. Hah.

Thanks!

Sierra Godfrey said...

Thanks for this, it's very useful.
I also recently discovered Word's ability to pick and choose grammatical elements to highlight, one of which is the passive voice tool. You can set this under Spelling & Grammar and then Options.

Sage Ravenwood said...

Aha! You are my new favorite person. That is such a time saver. Thank you!

Joshua McCune said...

Gary, great rundown. Thanks for sharing.

TAWNA FENSKE said...

Holy cow, this is amazing! I followed a link here from Janet Reid's blog, and I'm so glad I did. This will save me oodles of time in the future. Thanks for the tip!

Tawna

Jane Lebak said...

Thanks for the tip! That's helpful for many reasons. One of my characters sometimes garbles words, and I don't always remember how I mixed up the spelling, making it hard to search on that term. But I could wildcard it and that wouldn't be an issue.

Thanks!

Sean Ferrell said...

You're a brilliant man, Mr. Corby, no matter what your agent says.

Deb Vlock said...

My brain has just exploded. I think I'll just read through the ms for the umpteenth time and do it manually.

Your Luddite friend,

Deb

Livia Blackburne said...

Hmm, I'm pleasantly surprised that Office offers regular expressions. Thanks

Welshcake said...

Came here from Janet's blog. That's brilliant! Thank you.

CKHB said...

WHOA. Cool.

Liana Brooks said...

I didn't know Word was that talented! Thanks!

steeleweed said...

Used Word2003 happily, but new Win7 PC came with Word2007 which is worse than Multimate. It tries to be all things to all people and ends up being useless for 99.9% of the users.
It has actually driven me to Open Office.

Tana said...

This is great. I just had an experience with find and replace, trying to downgrade from two spaces after a period to one (old school, I know). If anyone is wondering, it's find: . (space, space), replace: . (space). Of course you wouldn't type in the word space, you'd actually hit the space bar. Thanks Gary for furthering my tech savvy ;)
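The same two-spaces-to-one replacement can be sketched in Python, for anyone working on a plain text copy of a manuscript. The sample text and variable names are invented for illustration.

```python
import re

text = "First sentence.  Second sentence.  Third."

# Find: a period followed by two spaces. Replace: a period and one space.
fixed = re.sub(r"\.  ", ". ", text)

print(fixed)
```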

Gary Corby said...

Hi Amalia, the few times I needed to work out how many instances of a particular word, I did a global replace of the word with itself. Word tells you how many replacements it did. There's probably a better way but I don't know it.
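The replace-it-with-itself trick has a close cousin in Python, where re.subn returns both the new text and the number of replacements made. This is only a sketch: the sample text is invented, and the \b word boundaries are an added assumption so that words like wasp are not counted.

```python
import re

text = "It was late. She was tired, and the wasp was gone."

# Replace every whole-word "was" with itself. subn reports how many
# replacements it made, much like Word does after a global replace.
unchanged, count = re.subn(r"\bwas\b", lambda m: m.group(0), text)

print(count)
```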

Gary Corby said...

Hi Sierra! Welcome to the blog.

Yes, you're right. You can use Word's grammar search for some of this. I confess I've never entirely trusted it. I recall Bill Gates many years ago telling an audience he used to send the Word team emails pointing out the errors in Word's grammar checker.

Gary Corby said...

Hi steeleweed, I couldn't agree more about Word 2007 & later, but some people do actually think it's okay. Amalia, for example.

I tried using OpenOffice Writer with my second book and I was generally happy with it but for two things: it feels slower; and Writer's word count function is hideously wrong. The speed I could live with, but the wrong word count drove me back to Word. (sigh).

Gary Corby said...

To all the people who said thanks for the post: thanks for dropping in! It's lovely to meet you all.

Jemi Fraser said...

I'm another drop in from Janet's site. Wow! What a time saver - thank you. Really!!

Gary Corby said...

My pleasure Jemi.

Stuart Neville said...

My inner geek salutes you, sir.

Gary Corby said...

Hey Stuart! My not-so-inner geek thanks you.

And congratulations on The Ghosts of Belfast being optioned for film.

Deb Salisbury, Magic Seeker and Mantua-Maker said...

Wow! Great tips!!! Thank you! I'm bookmarking and linking back.

Janet Reid sent me your way.

Peter Rozovsky said...

Thanks. This makes me want to write a novel just so I can look for stuff to purge from it.
==============
Detectives Beyond Borders
"Because Murder Is More Fun Away From Home"
http://www.detectivesbeyondborders.blogspot.com/

Sarahlynn said...

Brilliant! Thank you.

Gary Corby said...

Thanks Deb & Sarahlynn!

Peter, I have a feeling someone of your outstanding abilities would make a brilliant author. As long as you get in lots of song name puns, of course.

s.w. vaughn said...

Also here from Janet's blog, and thanks for the FANTASTIC tip! I didn't know about the wild card or range specific search bits. That's going to be really handy.

I did, however, find a fun way to flag searches in a manuscript without having to click back and forth in the search box and hit "find next" every time - if you use Find & Replace, hit the More Options button on the Replace function, and replace your search key with a different color font, you don't have to keep the search box running and your key stands out scrolling through the manuscript.

With that and the wild card/range stuff, I am so set for editing. :-)

Joylene Nowell Butler said...

I'm curious. This didn't work for me. Is it because I'm using Word:Mac?

CKHB said...

(FYI: you're getting an award from me on my blog today. No pressure. Ignore it if you want!)

Bill Kirton said...

I bow before you in awe and gratitude. I've been using Word for two or three centuries and I knew nothing of this. Thanks, Gary.

Gary Corby said...

Hi Joylene, my first thought is, have you clicked the Use Wildcards checkbox? Because if you haven't, Word will treat the wildcards as normal text. I've never used Word:Mac but I just looked up the documentation, which says wildcards work.

S.W. yes that's a great tip! You can indeed use replace to change the color or font or just about anything else.

Thanks Carrie. :-) I'm due to do an awards post some time, I think.

Aha Bill, I'll get you writing code eventually.

Joylene Nowell Butler said...

Sorry, I should probably explain. After I clicked wildcards, it highlighted large passages, sometimes as many as 40 words.

Gary Corby said...

Joylene, I don't know the answer to that one. I had another read through the Word documentation for Macs but didn't see anything about auto-highlighting. You might need to ask a Mac Office guru, which isn't me.

I'd be willing to bet some of the people reading this blog are using Office on a Mac. Any hints, anyone?

Gary Corby said...

Joylene, I've worked it out!

The * is still matching the space character for you. So it's finding the first was followed by any amount of text, and ends with the first ---ing word. Then it highlights all of that as the selection. This could happen to others too. Here's the fix:

Instead of *, use an @ character, which means one or more of the previous character. This absolutely prevents space matching. So the search pattern is:

was [a-z,A-Z]@ing

Thank you so very much for telling me it failed on you. I've updated the post so it'll work for everyone.

Janine said...

I use the AutoCrit Editing Wizard. It finds all the overused words and phrases automatically without *me* having to be a tech-wizard :-)

Liz Czukas said...

This is a fabulous tip, thanks!

For the record, I'm of the mind that you can never have too many occurrences of the word defenestrate in any of its verb forms. Underused, for certain.

Cheers.

- Liz

Gary Corby said...

Thanks Liz. I'm totally with you on defenestration. There should be more of it.

Linda Lee Foltz said...

Gary,
So many blogs write about things that are of interest - but you make my life easier! As an author working on my second book, national speaker, child advocate, blogger, director of marketing, residential property manager, wife, mother of 6 (thank goodness they're grown), grandmother of 4 - every minute counts! Thanks for giving me more moments to enjoy! Linda Lee Foltz

Usurper said...

Say, Gary, that I would like to change every instance of a word that is entirely in capitals to sentence case, for example "APLOMB" to "aplomb" or "Aplomb". How would I put this into regular expressions?

Gary Corby said...

That's a good question, Usurper. But there's not enough room in a comment space. Let me answer that in another post...

Jane Finnis said...

Gary, I dropped in from DorothyL. Thanks for a very useful post, especially your lateral-thinking way of counting repetitions of a word! BTW not everyone knows that when doing a straight "Find" in Word, once you've typed in your keyword, you can press Escape and move between occurrences of the word by hitting Control with PageDown or PageUp. Often quicker than the button.

Gary Corby said...

Jane, how lovely to see you here!

That's a great point about escaping the search dialog.

I also just discovered you've started your own blog. I'm proud to be Follower #1.

Eric said...

Just came across this while looking for advanced wildcard tips... Good explanation and example Gary!

To Usurper's question about changing the case of every instance of a word: use Find's ability to find all instances (i.e. choose from the "Find In" pulldown button) then drop out of the dialog. All found instances will then be highlighted, so you can press Shift-F3 to toggle the case between lower, upper and initial cap.
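Outside Word, the same case change can be sketched with a regular expression in Python. This is a hypothetical helper, not something Word offers: the function name is made up, and it assumes that any run of two or more capital letters is a word to convert, which means acronyms would be changed as well.

```python
import re

def to_initial_cap(text: str) -> str:
    """Turn each all-capitals word (two or more letters) into Initial-cap form."""
    return re.sub(r"\b[A-Z]{2,}\b", lambda m: m.group(0).capitalize(), text)

print(to_initial_cap("He answered with APLOMB."))
```

Swapping capitalize() for lower() would give "aplomb" instead of "Aplomb".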