Cleaning Up Pasted Text

Cleaning up text pasted from emails or Web sites

The ease of copying and pasting text from Web sites and email greatly simplifies many tasks in Word, but problems often arise in making the pasted text conform to the style of the document into which it is pasted. One of the most common chores is getting rid of excess line breaks, which cause the text to wrap short of the right margin. There are several ways to work around this problem.

Assessing the problem text

The most efficient method of reformatting short lines of text depends on whether the breaks are line breaks or paragraph breaks. So the first line of attack must be to display nonprinting characters using the Show/Hide button on the Standard toolbar (in Word 2007/2010 this button is in the Paragraph group on the Home tab). (For more on nonprinting characters, see “What do all those funny marks, like the dots between the words in my document, and the square bullets in the left margin, mean?”) If each line ends in a pilcrow or paragraph mark (¶), then AutoFormat may be all you need. If each line ends in a bent arrow (signifying a line break), you will need to use a different approach.
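
For text that has been captured outside Word, the same diagnosis can be sketched in a few lines of Python. This is only an illustration, not a Word feature: it assumes the text is held in a string in which a manual line break is character code 11 and a paragraph mark is character code 13, and the function name `diagnose_breaks` is hypothetical.

```python
LINE_BREAK = "\x0b"  # manual line break (the bent arrow), assumed to be code 11
PARA_BREAK = "\r"    # paragraph mark (the pilcrow), assumed to be code 13


def diagnose_breaks(text: str) -> str:
    """Report which kind of break dominates in the text."""
    line_breaks = text.count(LINE_BREAK)
    para_breaks = text.count(PARA_BREAK)
    if line_breaks > para_breaks:
        return "line breaks"
    if para_breaks > line_breaks:
        return "paragraph breaks"
    return "mixed or none"
```

A result of "paragraph breaks" suggests the AutoFormat route; "line breaks" suggests the Replace route.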

Using AutoFormat

AutoFormat settings are found on the AutoFormat tab of Tools | AutoCorrect Options in Word 2002 and 2003 (Tools | AutoCorrect in Word 2000 and earlier). In Word 2007/2010, find this tab at Office Button | Word Options | Proofing: AutoCorrect Options. No matter what other AutoFormat options you have enabled here, when you select a block of text with a paragraph break at the end of each full line, AutoFormat will delete all the paragraph breaks but the last.

Note: To run AutoFormat in Word 2003 and earlier, use Format | AutoFormat: AutoFormat Now (there may be an AutoFormat button on the Formatting toolbar in some versions, or you can add one using Tools | Customize). In Word 2007 and 2010, you will have to add an AutoFormat button to the Quick Access Toolbar (QAT) from the “Commands Not in the Ribbon” section of the Customize the Quick Access Toolbar dialog (accessed via Office Button | Word Options | Customize in Word 2007 and File | Options | Quick Access Toolbar in Word 2010).

Unfortunately, text pasted from the Web or email nowadays rarely has lines ending in paragraph breaks. But you can force this format by using Paste Special and selecting “Unformatted Text” (in Word 2002 and above, if you have “Paste Options” enabled, you can just Paste and then select the “Keep Text Only” option). This pastes your selection with paragraph breaks instead of line breaks, and AutoFormat will then do the trick.

Using Find and Replace

Sometimes, however, you will not want to paste as unformatted text. In that case, what you will most likely get is text with a line break at the end of each line. Provided there is an empty line at the end of each paragraph, cleanup is still relatively simple. It takes just two Replace operations.

First pass

  1. Press Ctrl+H to open the Replace dialog.

  2. In the “Find what” box, type ^l^l (those are lowercase Ls, representing two line breaks).

  3. In the “Replace with” box, type ^p (the code for a paragraph break).

  4. Click Replace All. You will now have a paragraph break at the end of each true paragraph.

Second pass

  1. In the “Find what” box, type ^l.

  2. In the “Replace with” box, type a space. (If there is already a space at the end of each line, leave the box empty.)

  3. Click Replace All.

This removes the line breaks and allows text to wrap naturally.
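
The logic of the two passes can also be sketched as a small Python function, which may help if the text is being cleaned before it ever reaches Word. This is a minimal sketch under stated assumptions: the line break (^l) is taken to be character code 11 and the paragraph break (^p) character code 13, and the name `unwrap_lines` and its `add_space` parameter are hypothetical.

```python
LINE_BREAK = "\x0b"  # stands in for ^l, assumed to be code 11
PARA_BREAK = "\r"    # stands in for ^p, assumed to be code 13


def unwrap_lines(text: str, add_space: bool = True) -> str:
    """Apply the two Replace passes described above to a string."""
    # First pass: two consecutive line breaks mark the end of a true paragraph.
    text = text.replace(LINE_BREAK * 2, PARA_BREAK)
    # Second pass: each remaining line break becomes a space, or nothing
    # if every line already ends in a space.
    return text.replace(LINE_BREAK, " " if add_space else "")
```

For example, `unwrap_lines("one\x0btwo\x0b\x0bthree\x0bfour")` returns `"one two\rthree four"`. As in Word, the order matters: running the second pass first would destroy the double line breaks that mark the paragraph ends.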

Harder cases

If there is not an empty line between paragraphs, you will probably have to insert paragraph breaks by hand. If the amount of text is not large, you can scroll through and press Enter wherever a paragraph break is needed. Then use Replace, as above, to replace each line break with a space. This will leave an extra space at either the beginning or the end of each paragraph. You can use Replace again to replace <space>^p or ^p<space> (as appropriate) with ^p. (Note that “<space>” represents pressing the spacebar; you don't type “<space>”!)
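
The final tidy-up step amounts to two plain substitutions. The Python sketch below shows the idea, assuming the paragraph break is character code 13; the function name `trim_around_paragraphs` is hypothetical.

```python
PARA_BREAK = "\r"  # stands in for ^p, assumed to be code 13


def trim_around_paragraphs(text: str) -> str:
    """Drop one stray space before or after each paragraph break."""
    # Equivalent to replacing <space>^p with ^p ...
    text = text.replace(" " + PARA_BREAK, PARA_BREAK)
    # ... and ^p<space> with ^p.
    return text.replace(PARA_BREAK + " ", PARA_BREAK)
```

Like a single Replace All, this removes only one space on each side of a break; run it again if several spaces have piled up.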

An alternative approach is to press Shift+Enter to enter an extra line break at the end of each paragraph, then follow the instructions in the section above.

Even when the amount of text is very large, there is no really good alternative to manual editing. But if you Paste Special as Unformatted Text and run AutoFormat, you may find that Word is almost as clever as you are at finding where a paragraph ends.

Note that the methods described above are suitable only for simple text. If you have copied and pasted an entire Web page, with graphics, tables, and frames, much more work will be required to format it for use in a Word document.

Other non-printing characters worth replacing

  • Often when you paste from the web, and also from some other applications, characters come in that display as paragraph marks but don't behave as “proper” paragraph breaks should; they behave like manual line breaks! So you might find that when you center a “paragraph,” several other adjacent paragraphs also get centered. To cure this, use Replace: in the “Find what” box type ^013; in the “Replace with” box, type ^p and click Replace All.

  • When pasting from the web, nonbreaking spaces often come in, rather than ordinary spaces. You can get rid of them with a Replace operation; in the “Find what” box type ^s, in the “Replace with” box, insert a <space> character (press the spacebar), and click Replace All.

If you want to automate any of the above steps you can record them using the macro recorder and play them back as needed.
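
If the text can be cleaned before it is pasted, the nonbreaking-space replacement is also easy to sketch in Python. This assumes the nonbreaking space is character code 160 (U+00A0); the function name is hypothetical. The ^013 fix has no counterpart here, because a plain string does not distinguish a “proper” paragraph break from a bare code-13 character.

```python
NBSP = "\u00a0"  # nonbreaking space (^s in Word's Replace dialog), code 160


def replace_nonbreaking_spaces(text: str) -> str:
    """Swap every nonbreaking space for an ordinary space."""
    return text.replace(NBSP, " ")
```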

Neat tips

The following tip has appeared in Woody's Office Watch (WOW). When you cut and paste text from a Web site, there are often leading spaces at the beginning of each line. A very quick way to remove all these spaces is to select the text, center the selection (Ctrl+E), and then left-align the selection (Ctrl+L). All the extra spaces will have disappeared!
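
Outside Word, a comparable result can be had by stripping the spaces from the start of each line. The Python sketch below is an approximation of the effect, not a description of what Word does internally; the function name is hypothetical, and only ordinary spaces are removed.

```python
def strip_leading_spaces(text: str) -> str:
    """Remove ordinary spaces from the start of every line."""
    # keepends=True preserves each line's own break character, so the
    # text is rejoined exactly as it was apart from the leading spaces.
    lines = text.splitlines(keepends=True)
    return "".join(line.lstrip(" ") for line in lines)
```

For example, `strip_leading_spaces("  one\n    two\nthree")` returns `"one\ntwo\nthree"`.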

If the text you have pasted has "reply" characters, such as "greater than" symbols (>) at the beginnings of lines, you could use Replace repeatedly to search for this character followed by a space. An easier way to remove them, however, is to use column select (Alt+drag) to select just the leading characters. When they are selected, press Delete.
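
Where the text can be processed before pasting, the reply characters can also be removed with a regular expression. This is a minimal Python sketch, assuming lines are separated by newline characters; the function name `strip_reply_markers` is hypothetical.

```python
import re

# One or more ">" markers at the start of a line, each optionally
# followed by a single space (covers ">", ">>", and "> >" styles).
_REPLY_PREFIX = re.compile(r"^(?:> ?)+", flags=re.MULTILINE)


def strip_reply_markers(text: str) -> str:
    """Delete leading reply markers from every line."""
    return _REPLY_PREFIX.sub("", text)
```

Because the pattern is anchored to the start of each line, a “greater than” symbol in the middle of a sentence is left alone.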

This article copyright © 2001, 2009, 2011, 2012 by Suzanne S. Barnhill. Written in collaboration with Dave Rado and originally published at http://word.mvps.org.