It is possible to perform search and replace operations on strings using
regular expressions. How complicated this is naturally depends on how much flexibility
you need:
- if you want to replace instances of an expression in a string with a fixed
string, then you can use a simple call to String.replaceAll();
- if the replacement string isn't fixed, then you can use a loop with a
Pattern and Matcher, which gives you complete control over the
replacement string.
Replacing with a fixed string
If you just want to replace all instances of a given expression within a string
with another fixed string, then things are fairly straightforward. For example,
the following replaces every digit with the letter X:
str = str.replaceAll("[0-9]", "X");
The following replaces each run of two or more spaces with a single space:
str = str.replaceAll(" {2,}", " ");
We'll see in the following sections that we should be careful about passing
"raw" strings as the second parameter, since a couple of characters in this string
(the dollar sign and the backslash) actually have special meanings.
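As a minimal sketch, the two fixed-string replacements above can be wrapped in a small class to try them out. The class and method names (FixedReplaceDemo, maskDigits, collapseSpaces) and the sample strings are made up for illustration only:

```java
class FixedReplaceDemo {
    // Replace every decimal digit with the letter X.
    static String maskDigits(String s) {
        return s.replaceAll("[0-9]", "X");
    }

    // Collapse each run of two or more spaces into a single space.
    static String collapseSpaces(String s) {
        return s.replaceAll(" {2,}", " ");
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("Room 101, floor 3"));        // Room XXX, floor X
        System.out.println(collapseSpaces("too    many   spaces")); // too many spaces
    }
}
```

Note that strings are immutable in Java, so replaceAll() returns a new string rather than modifying the original; that is why the examples assign or return the result.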
Replacing with a sub-part of the matched portion
In the replacement string, we can refer to captured groups from the regular
expression. For example, the following expression removes instances of the
HTML 'bold' tag from a string, but leaves the text inside the tag intact:
str = str.replaceAll("<b>([^<]*)</b>", "$1");
In the expression <b>([^<]*)</b>, we capture the text between the open and
close tags as group 1. Then, in the replacement string, we can refer to the
text of group 1 with the expression $1. (The second group would be $2, etc.)
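The following sketch shows group references in a runnable form. The stripBold method uses the expression from above; swapPair is a hypothetical second example, added only to show $1 and $2 used together, and the class name is likewise an assumption:

```java
class GroupReplaceDemo {
    // Remove <b>...</b> tags but keep the text between them (group 1).
    static String stripBold(String html) {
        return html.replaceAll("<b>([^<]*)</b>", "$1");
    }

    // Swap the two sides of each "key=value" pair using groups 1 and 2.
    static String swapPair(String s) {
        return s.replaceAll("(\\w+)=(\\w+)", "$2=$1");
    }

    public static void main(String[] args) {
        System.out.println(stripBold("This is <b>bold</b> text")); // This is bold text
        System.out.println(swapPair("a=1, b=2"));                  // 1=a, 2=b
    }
}
```

Because the group is written as [^<]*, a bold tag that contains another tag inside it will not match and is left untouched.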
Including a dollar sign in the replacement string
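To actually include a dollar in the replacement string, we need to put a backslash
before the dollar symbol. Since the backslash must itself be escaped inside a Java
string literal, this is written as "\\$" in source code: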
str = str.replaceAll("USD", "\\$");
The static method Matcher.quoteReplacement() escapes any dollar signs and
backslashes in a given string so that they are treated as literal characters
when the string is used as a replacement:
str = str.replaceAll("USD", Matcher.quoteReplacement("$"));
In general:
- If there is a chance that the replacement string will include a dollar sign
or a backslash character, then you should wrap it in Matcher.quoteReplacement().
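One way to apply this rule is to route every literal replacement through a small helper, sketched below. The helper name replaceLiteral, the class name, and the sample strings are assumptions made for this example; only replaceAll() and Matcher.quoteReplacement() come from the standard library:

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // Replace every match of regex with the replacement text taken literally,
    // so that any '$' or '\' in it loses its special meaning.
    static String replaceLiteral(String input, String regex, String replacement) {
        return input.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // A dollar sign in the replacement is inserted as-is.
        System.out.println(replaceLiteral("Price: USD 5", "USD", "$"));
        // prints: Price: $ 5

        // Backslashes in the replacement are preserved too.
        System.out.println(replaceLiteral("path: HOME", "HOME", "C:\\Users\\me"));
        // prints: path: C:\Users\me
    }
}
```

Only the replacement is quoted here; the first argument is still interpreted as a regular expression.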
More flexible find and replacement operations
The replaceAll() method is suitable for cases where the replacement string is
fixed or of a fixed format. For more flexibility, the Matcher.find() method can
be used in a loop, computing the replacement for each match as it is found.
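As a rough sketch of what such a loop can look like, the example below doubles every number found in a string. It combines Matcher.find() with the standard appendReplacement() and appendTail() methods, which is one common way of building the result; the class name, the doubleNumbers method, and the doubling task itself are invented for illustration, and the code assumes each run of digits fits in a long:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindReplaceDemo {
    private static final Pattern NUMBER = Pattern.compile("[0-9]+");

    // Replace each run of digits with twice its numeric value.
    // Assumes every run of digits is small enough to parse as a long.
    static String doubleNumbers(String input) {
        Matcher m = NUMBER.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            long value = Long.parseLong(m.group());
            String replacement = Long.toString(value * 2);
            // Copies the text before the match, then the computed replacement.
            // quoteReplacement() guards against '$' or '\' in the replacement.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        // Copy whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 apples and 12 pears"));
        // prints: 6 apples and 24 pears
    }
}
```

The replacement here is computed from the matched text on each iteration, which is something a single replaceAll() call with a fixed-format replacement string cannot express.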