If the weird name throws you, 'grep' comes from the ed editor command g/re/p, short for
'global regular expression print'. If that doesn't help, it's probably because you're
wondering what a regular expression ('re' or 'regex') is. Basically, it's a pattern used to describe
a string of characters, and if you want to know aaaaaaall about them, I highly
recommend reading Mastering Regular Expressions by Jeffrey Friedl,
published by Unix über-publisher O'Reilly & Associates.
Regexes (regices, regexen, ...the pluralization is a matter of debate) are an extremely
useful tool for any kind of text processing. Searching for patterns with grep is
most people's first exposure to them since, as the article says, you can use it to search
for a literal pattern within any number of text files on your computer. The cool thing is
that it doesn't have to be a literal pattern, but can be as complex as you'd like.
The key to this is understanding that certain characters are 'metacharacters', which have
special meaning for the regex-using program. For example, a plus character (+) tells the
program to match one or more instances of whatever immediately precedes it, while parentheses
serve to treat whatever is contained as a unit. Thus, 'ha+' matches 'ha', but it also matches
'haa' and 'haaaaaaaaaaa', but not 'hahaha'. If you want to match the word 'ha', you can use
'(ha)+' to match one or more instances of it, such as 'hahaha' and 'hahahahahahahahaha'.
Using a vertical bar allows alternate matching, so '(ha|ho)+' matches 'hohoho', 'hahaha', and
'hahohahohohohaha'. Etc.
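To try these out, here is a minimal sketch, assuming a Bourne-style shell (sh, bash, zsh) and a grep that accepts -E. The word list is made up for illustration. Two things to note: plain grep treats +, () and | as ordinary characters unless you add -E (or backslashes in some versions), and grep normally reports a line if the pattern matches anywhere in it, so -x is used here to insist that the whole line match.

```shell
# Sample words, one per line (made up for this sketch).
words='ha
haa
hahaha
hohoho
hahoha
hello'

# -E turns on extended regexes so +, () and | work without backslashes;
# -x only accepts a line if the pattern matches all of it.
plus=$(printf '%s\n' "$words" | grep -Ex 'ha+')
group=$(printf '%s\n' "$words" | grep -Ex '(ha)+')
either=$(printf '%s\n' "$words" | grep -Ex '(ha|ho)+')

printf 'ha+ matches:\n%s\n\n' "$plus"
printf '(ha)+ matches:\n%s\n\n' "$group"
printf '(ha|ho)+ matches:\n%s\n' "$either"
```

Without -x, 'ha+' would also report the line 'hahaha', because the first 'ha' in it is enough for a match.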
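There are many of these metacharacters to keep in mind. Inside brackets ([]), a caret (^)
as the first character means that you want any single character except the ones listed
inside the brackets. Magritte fans beware: '[^(a cigar)]' does not match any text that is
not 'a cigar'. Brackets always describe one character, so it matches a single character that
is not a parenthesis, a space, or one of the letters a, c, g, i or r. The rest of the time, the caret tells
the program to match only at the beginning of a line, while a dollar sign ($) matches only at
the end. Therefore, '^everything$' matches the word 'everything' only when it is on a line all
by itself, and '^[^(anything else)]' matches all lines whose first character is not one of the
characters between the brackets. To skip lines that begin with the phrase 'anything else',
invert the match instead with grep -v '^anything else'.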
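A small sketch of the anchors and a negated bracket, assuming a Bourne-style shell; the sample lines are invented. It is worth seeing that a bracket expression only ever stands for one character, so excluding a whole phrase is a job for grep's -v switch (which inverts the match) rather than for brackets.

```shell
# Sample lines, made up for this sketch.
lines='everything
everything else
anything else goes
nothing here'

# ^ and $ pin the pattern to the start and end of the line.
only=$(printf '%s\n' "$lines" | grep '^everything$')

# [^ae] is ONE character that is neither a nor e, so this keeps
# lines whose first character is something other than a or e.
not_ae=$(printf '%s\n' "$lines" | grep '^[^ae]')

# -v inverts the match: keep lines that do NOT start with the phrase.
not_phrase=$(printf '%s\n' "$lines" | grep -v '^anything else')

printf '%s\n---\n%s\n---\n%s\n' "$only" "$not_ae" "$not_phrase"
```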
The period (.) matches any character at all, and the asterisk (*) matches zero or more times.
Compare this to the plus, which matches one or more times -- a subtle but important
difference. A lot of regular expressions look for '.*', which is zero or more of anything
(that is, anything at all). This is useful when searching for two things that might or might
not have anything else (that you probably don't care about) between them: 'foo.*bar' will match
on 'foobar', 'foo bar' & 'foo boo a wop bop a lop bam boo bar'. Changing the previous example
to a plus, 'foo.+bar', requires that something come between foo and bar, but it doesn't matter
what, so 'foobar' doesn't match but the other two examples given do match.
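The difference is easy to check for yourself. A minimal sketch, assuming a Bourne-style shell and grep -E (so that + is treated as a metacharacter); the last sample line is an extra one added to show a non-match.

```shell
# The three examples from above, plus one line that should never match.
samples='foobar
foo bar
foo boo a wop bop a lop bam boo bar
barfoo'

# .* allows zero or more characters between foo and bar.
star=$(printf '%s\n' "$samples" | grep -E 'foo.*bar')

# .+ demands at least one character in between.
plus=$(printf '%s\n' "$samples" | grep -E 'foo.+bar')

printf 'foo.*bar:\n%s\n\nfoo.+bar:\n%s\n' "$star" "$plus"
```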
For details, try the man pages -- 'man grep'. There are a lot of different versions of the
program, so details may vary. All of this should be valid for OSX though.
Confusing? Maybe, but regular expressions aren't that bad when you get used to them, and
they can be a very useful tool to take advantage of if you know what you're doing. An example.
Let's say you have a website stored on your computer as a series of html documents.
As a cutting edge developer, you've seen the CSS light and want to delete all the <font>
tags wherever they're just saying e.g. face='sans-serif' &/or size='12', because the
stylesheet can now do that for you. On the other hand, it's possible that the patterns
face='sans-serif' or size='12' could show up in normal text (though admittedly
that's unlikely). In fact, what you really want to know is wherever those patterns show up in
a font tag, but you don't care about anywhere else that they might appear. Here's one way to
find that pattern:
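A sketch of what such a command could look like, under a few assumptions: a Bourne-style shell, a grep that accepts -E, and a throwaway directory with two made-up files so the example can run anywhere. The pattern is one possible spelling, not necessarily the only or the best one. Each '.' stands in for whichever quote character surrounds the attribute value, which avoids having to quote the quotes.

```shell
# Throwaway site so the sketch can run anywhere (file names are made up).
site=$(mktemp -d)
printf '%s\n' '<FONT face="sans-serif" size="12">Hello</FONT>' > "$site/index.html"
printf '%s\n' '<p>Use face="sans-serif" in the stylesheet.</p>' > "$site/about.htm"

# '<font', then any run of characters other than '>' (so the match
# stays inside the tag), then either of the two attributes.
pattern='<font[^>]*(face=.sans-serif.|size=.12.)'

# -i ignores case, -r recurses into directories, and -E enables the
# extended syntax so that ( | ) need no backslashes.
matches=$(cd "$site" && grep -irE "$pattern" *.htm*)
printf '%s\n' "$matches"
```

Only index.html is reported here: about.htm mentions the face attribute, but not inside a font tag.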
This does a number of things. The -i tells grep to ignore case (otherwise it's case sensitive,
and won't match 'FONT' if you're looking for 'font' or 'Font'). The -r tells it to recursively
descend through the directories from wherever the command starts -- in this case, all htm and
html files in the current directory. Everything in single quotes is the pattern we're matching.
We tell grep to match on any text that starts with '<font', followed by any characters other
than '>' (thus staying within the font tag), and then either the face or
size definition that we're interested in. The one glitch here is that line breaks can break
things, since grep works one line at a time, though there are various ways around that.
Finding them is left as the proverbial exercise for the reader. :)
The next question is, what do you want to do with this information you've come up with?
Presumably you want to edit those files in order to fix them, right? With that in mind, maybe
it would be useful to just make a list of matches. Grep normally outputs all the lines that
match the pattern, but if you just want the filenames, use the -l switch. If you want to save
the results into a file, redirect the output of the command accordingly. With those changes,
we now have:
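As a sketch of those two changes, assuming a Bourne-style shell, grep -E, and a scratch directory of made-up files: the -l switch is added to the options and the output is redirected into a file. The name font-files.txt is only an example.

```shell
# Throwaway site with two offenders and one clean file (names made up).
site=$(mktemp -d)
printf '%s\n' '<font size="12">Home</font>' > "$site/index.html"
printf '%s\n' '<FONT FACE="sans-serif">Links</FONT>' > "$site/links.htm"
printf '%s\n' '<p>No font tags here.</p>' > "$site/about.htm"

pattern='<font[^>]*(face=.sans-serif.|size=.12.)'

# -l prints each matching file name once instead of the matching lines;
# > sends that list to a file rather than the screen.
(cd "$site" && grep -irlE "$pattern" *.htm* > font-files.txt)

list=$(cat "$site/font-files.txt")
printf '%s\n' "$list"
```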
Great. But we can do better still. If you are comfortable with the vi editor, you can call vi
with that command directly. The trick is to wrap the command in backticks (`). This is a cool
little Unix trick that runs the contained command & returns the result for whatever you want
to do with it. Thus you can simply put this command:
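A sketch of the backtick trick, assuming a Bourne-style shell and the same kind of made-up scratch files as before. So that it can run unattended, echo stands in front of vi and simply prints the command line that vi would receive; drop the echo to open the files for real. One caveat: the shell splits the file list on whitespace, so file names containing spaces will trip this up.

```shell
# Throwaway site with two offenders and one clean file (names made up).
site=$(mktemp -d)
printf '%s\n' '<font size="12">Home</font>' > "$site/index.html"
printf '%s\n' '<FONT FACE="sans-serif">Links</FONT>' > "$site/links.htm"
printf '%s\n' '<p>No font tags here.</p>' > "$site/about.htm"

pattern='<font[^>]*(face=.sans-serif.|size=.12.)'

# The backticks run grep first and paste its output (the file names)
# into the command line. echo is a stand-in so nothing interactive starts.
opened=$(cd "$site" && echo vi `grep -irlE "$pattern" *.htm*`)
printf '%s\n' "$opened"
```

In Bourne-style shells, $(...) does the same job as backticks and nests more comfortably; tcsh only understands the backtick form.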
The result of this command, as far as your tcsh shell is concerned, is something along the lines
of 'vi' followed by the name of every file that matched. The beautiful thing here is that if you
quit vi & re-run the command later, it will effectively 'pick up where you left off', since files
you've already edited will presumably no longer match the grep command.
And if you want to get really ambitious, you can use these techniques in ways that
allow you to do all your editing directly from the command line, without having to go into an
interactive editor such as vi or emacs or whatever. If you make it this far in your experiments,
then the next step is to learn to filter the results of a match and process the filtered data
in some way, using tools such as sed, awk, and perl. Using these tools, you can find all
instances of the pattern in question, break it down however you like, substitute or shuffle the
parts around however you like, and then build it all back up again. This is fun stuff! By this
point, you're getting pretty heavily into Unix arcana, and the best book that I've seen about
these tricks is O'Reilly's Unix Power Tools, by various authors. If you really want to leverage
the power of the tools that all Unixes come with, including OSX, then this is a great place to
both start & end up. There's plenty of material in there to keep you busy for months & years...
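To give a taste of that kind of command-line editing, here is a deliberately crude sed sketch, assuming a Bourne-style shell. It strips every lowercase font tag from a made-up line of html, not just the tags that carry face or size, and it only prints the result rather than changing any file. Treat it as a starting point for experiments, not a finished cleanup tool.

```shell
# One made-up line of html to experiment on.
html='<p><font face="sans-serif" size="12">Hello</font> world</p>'

# First expression: delete '<font', anything up to the next '>', and the '>'.
# Second expression: delete the closing tag. g repeats along the whole line.
cleaned=$(printf '%s\n' "$html" | sed -e 's/<font[^>]*>//g' -e 's/<\/font>//g')

printf '%s\n' "$cleaned"
```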
Displaying the Search/Find Window Pane
When a PDF is opened in the Acrobat Reader (not in a browser), the search window pane may or may not be displayed. To display the search/find window pane, use 'Ctrl+F'.
When the Find window opens, follow these steps and refer to Figure 1 below:
- Click the small arrow on the right side of the box.
- Select the drop down item - 'Open Full Acrobat Search'.
Figure 1
Search Options
There are several ways to search for information within a PDF document. These include the following:
- Basic Search
- Advanced Search
Basic Search Options
To execute a basic search request complete the following steps:
- Type your search term(s) inside the 'text box' where you are asked: 'What word or phrase would you like to search for?'
- Click the 'Search' button to execute the search request.
Advanced Search Options
To get to the Advanced Search feature, click on 'Show More Options' at the bottom of the search window pane.
The options available in the advanced search are briefly explained below:
- Match Exact Word Or Phrase - Searches for the entire string of characters, including spaces, in the same order in which they appear in the text box.
- Match Any Of The Words - Searches for any instances of at least one of the words typed. For example, if you search for 'each of', the results include any instances in which one or both of the two words appear: each, of, each of, or of each.
- Match All Of The Words - Searches for instances that contain all your search words, but not necessarily in the order you type them. Available only for a search of multiple PDFs or index definition files.
- Boolean Query - Uses the Boolean operators that you type with the search words into the What Word Or Phrase Would You Like To Search For box. Available only for searching multiple PDFs or PDF indexes.
Note: You cannot run wildcard searches using asterisks (*) or question marks (?) when searching PDF indexes.
Click 'Use Advanced Search Options' near the bottom of the search window pane to display the advanced search information. To execute an advanced search request complete the following steps:
- Type your search term(s) inside the 'text box' where you are asked: 'What word or phrase would you like to search for?'
- Select an option from the drop down menu for 'Return results containing:'
- Click the 'Search' button to execute the search request.
Sample Search Request Using Advanced Search Options
For the purposes of this example, steps are provided to illustrate how to execute a search request for finding information about diazinon and kaolin in a PDF document. Assume that a PDF document is opened in the browser. If the search window pane is not displayed, please refer back to 'Displaying the Search/Find Window Pane' for assistance.
The Search Criteria
Below are the steps to be followed for completing a search request to find information about diazinon and kaolin. Refer to Figure 2.
- Click 'Show More Options' near the bottom of the search window pane. (Step 1)
- Select 'Match Any Of The Words' from the drop down menu for 'Return results containing:' (Step 2)
- Type 'diazinon kaolin' inside the 'text box' where you are asked: 'What word or phrase would you like to search for?' (Step 3)
- Click the 'Search' button to execute the search request. (Step 4)
Figure 2
The Search Results
In this example, the search results produced 10 items in the PDF document for information about diazinon and kaolin. See Figure 3 below.
Figure 3
Additional Information
Setting a Preference for Displaying the Advanced Search Option as the Default
- Select 'Edit' from the menu option at the top-left of the computer screen.
- Select 'Preferences'.
- The Preferences popup window is displayed.
- Under Categories: select 'Search'.
- Refer to the 'Search' section and check the box 'Always show more options in advanced search'.
- Click the 'OK' button to save.