FAQs

Q: 

How do I start using the regex package?

Q: 

How do I find out whether a string contains a substring?

Q: 

How do I confirm the existence of the nth occurrence of a substring?

Q: 

How do I swap out the $ in I want to use a $ character so that the resulting string reads I want to use a \$ character?


Answers

A: 

Simply import the java.util.regex package, for example with import java.util.regex.*; at the top of your source file.
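
As a minimal sketch of what the import gives you, the following class uses the package's two main types, Pattern and Matcher. The class name RegexQuickStart, the helper firstMatch, and the sample strings are illustrative assumptions, not part of the original text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexQuickStart {
    // Returns the first part of input described by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("\\d+", "order 66 shipped")); // prints 66
        System.out.println(firstMatch("\\d+", "no digits here"));   // prints null
    }
}
```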

A: 

If you're really looking for an explicit substring, rather than a pattern description, use the String.indexOf method. However, if you need to confirm the existence of a pattern, you have two paths open to you. The first is to use the variation of the String.split method that takes a limit, passing a negative number as the second parameter:

String[] tokens = candidate.split(subStringPattern, -1);

and make sure the resulting array has more than a single element:

boolean isThere = tokens.length > 1;

The negative limit matters here. Without it, split discards trailing empty strings, so if the phrase you're looking for happens to be the last element in the candidate sentence, the size of the array will still be 1, which leads to a false conclusion. Try this with the candidate this is the phrase I want. and the phrase description want. (each with a period trailing the t character), calling split both with and without the -1.
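
The effect of the limit argument on a match at the very end of the candidate can be checked with a short sketch. The class name SplitLimitDemo and the helper pieces are hypothetical; a limit of 0 behaves the same as calling split with a single argument.

```java
class SplitLimitDemo {
    // Number of array elements produced by split with the given limit.
    static int pieces(String candidate, String regex, int limit) {
        return candidate.split(regex, limit).length;
    }

    public static void main(String[] args) {
        String candidate = "this is the phrase I want.";
        String regex = "want."; // the period matches any character, including '.'

        // Limit 0: the trailing empty string is discarded.
        System.out.println(pieces(candidate, regex, 0));  // prints 1
        // Negative limit: the trailing empty string is kept.
        System.out.println(pieces(candidate, regex, -1)); // prints 2
    }
}
```

Only the negative limit reports the match at the end of the candidate, which is why the array-length test depends on it.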

Your second option is to use a short method like the following, which will always work:

   /**
   * Confirms, or denies, the existence of the regex
   * as part of the candidate String.
   * Requires java.util.regex.Pattern and java.util.regex.Matcher.
   * @param candidate the <code>String</code> to search
   * @param subStringPattern the regex, as a <code>String</code>
   * @return <code>boolean</code> true if the regex
   * describes part of the candidate
   * @author M Habibi
   */
   public static boolean
   containsSubtring(String candidate, String subStringPattern)
   {
       boolean retval = false;
       //compile the pattern
       Pattern pattern = Pattern.compile(subStringPattern);
       //see if any part of the candidate contains the
       //description
       Matcher matcher = pattern.matcher(candidate);
       retval = matcher.find();

       return retval;
   }
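
A usage sketch may help show why find is the right call here. The wrapper class ContainsDemo and the sample strings are assumptions for illustration; the method body is a condensed form of the one above. Matcher.find looks for the pattern anywhere in the candidate, whereas String.matches succeeds only if the pattern describes the whole candidate.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContainsDemo {
    // Condensed form of the method above: true if the regex describes any part of the candidate.
    static boolean containsSubtring(String candidate, String subStringPattern) {
        Matcher matcher = Pattern.compile(subStringPattern).matcher(candidate);
        return matcher.find();
    }

    public static void main(String[] args) {
        // The match sits at the very end of the candidate; find still reports it.
        System.out.println(containsSubtring("this is the phrase I want.", "want.")); // prints true
        // Nothing follows the t, so the pattern want. has nothing to match.
        System.out.println(containsSubtring("this is the phrase I want", "want."));  // prints false
        // matches requires the whole candidate to fit the pattern.
        System.out.println("this is the phrase I want.".matches("want."));           // prints false
    }
}
```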

A: 

The solution here is similar to the one given previously, including the usage of the String.split method. The same limitations apply. As far as the method-based solution is concerned, you need to make only the following modifications to the method.

First, adjust the method signature so that it accepts a third parameter, the number of the occurrence you want to confirm, so that the signature looks like the following:

public static boolean containsSubtring(
  String candidate,
  String subStringPattern,
  int n
)

Second, wrap the call to matcher.find in a for loop:

boolean retval = false;
//compile the pattern
Pattern pattern = Pattern.compile(subStringPattern);

//see if the candidate contains n occurrences of the
//description
Matcher matcher = pattern.matcher(candidate);
for (int i = 0; i < n; i++)
{
  retval = matcher.find();
  if (!retval) break;
}
return retval;
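
Put together, the two modifications can be sketched as a complete class. The class name NthOccurrenceDemo and the sample strings are hypothetical. Note two consequences of the loop: for n less than 1 it never runs, so the method returns false, and successive calls to find do not overlap, so aaa contains only one occurrence of aa.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NthOccurrenceDemo {
    // True if the candidate contains at least n non-overlapping matches of the regex.
    static boolean containsSubtring(String candidate, String subStringPattern, int n) {
        boolean retval = false;
        Matcher matcher = Pattern.compile(subStringPattern).matcher(candidate);
        for (int i = 0; i < n; i++) {
            // Each call to find resumes where the previous match ended.
            retval = matcher.find();
            if (!retval) break;
        }
        return retval;
    }

    public static void main(String[] args) {
        System.out.println(containsSubtring("a-b-c", "-", 2)); // prints true
        System.out.println(containsSubtring("a-b-c", "-", 3)); // prints false
        System.out.println(containsSubtring("aaa", "aa", 2));  // prints false
    }
}
```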

A: 

For the candidate String

String candidate = "I want to use a $ character";

the solution is the somewhat counterintuitive call

String newString = candidate.replaceAll("\\$", "\\\\\\$");

The initial parameter, \\$, is clear enough. You want the dollar sign, which happens to be a regex metacharacter meaning end-of-line. Because you want the actual dollar sign character and not the end-of-line, you have to escape the dollar sign with a backslash, producing the pattern \$.

However, you also need to meet the needs of the Java compiler, which treats a \ inside a String literal as the start of a String escape sequence. Because \$ isn't a String escape sequence (it's a regex one), you need to tell the compiler to treat the \ as an ordinary character. Thus, you need to escape it once again, producing \\$.

This leads to the second part of the call: \\\\\\$. Here, the first \ escapes the second \, the third \ escapes the fourth \, and the fifth \ escapes the sixth. Thus, the String literal \\\\\\$ results in the four-character String \\\$.
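
To see what the regex engine actually receives once the compiler has processed each literal, a small sketch can print the resulting Strings and their lengths. The class name LiteralLayersDemo and the helper describe are hypothetical.

```java
class LiteralLayersDemo {
    // Shows the length and the contents of a String after the compiler has processed its literal.
    static String describe(String s) {
        return s.length() + " chars: " + s;
    }

    public static void main(String[] args) {
        // Literal with two backslashes and a dollar sign.
        System.out.println(describe("\\$"));       // prints 2 chars: \$
        // Literal with six backslashes and a dollar sign.
        System.out.println(describe("\\\\\\$"));   // prints 4 chars: \\\$
    }
}
```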

Internally, the method has to rip out the $ part of I want to use a $ character and replace it with something, but what is that something? The method has decomposed the original String you gave it into two parts: a substring consisting of I want to use a and a second substring consisting of character.

Normally, the Matcher.replaceAll method, which produces the same result as String.replaceAll, inserts whatever you give it between these two substrings, concatenates the result, and returns that. However, because what you gave it happens to contain the dollar symbol, there is an added wrinkle.

As the Matcher.replaceAll description in this chapter shows, the dollar sign has special significance in the replacement string. It's used to refer to a subgroup that has been captured by the pattern. Because you don't want it to have that significance, you need to escape it again. Hence the replacement \\\$, in which the first \ escapes the second \, and the third \ escapes the $, thus logically producing \$.
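
The hand-escaped call can be compared with two alternatives that let the library do the escaping. This is a sketch under assumptions: the class and method names are hypothetical, and Pattern.quote, Matcher.quoteReplacement, and the CharSequence form of String.replace were added in Java 5, so they may postdate the JDK this chapter targets.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarEscapeDemo {
    // The call from the text: both the pattern and the replacement are escaped by hand.
    static String byHand(String candidate) {
        return candidate.replaceAll("\\$", "\\\\\\$");
    }

    // Lets the library quote the pattern and the replacement (Java 5 and later).
    static String quoted(String candidate) {
        return candidate.replaceAll(Pattern.quote("$"), Matcher.quoteReplacement("\\$"));
    }

    // Literal replacement with no regex involved (Java 5 and later).
    static String literal(String candidate) {
        return candidate.replace("$", "\\$");
    }

    public static void main(String[] args) {
        String candidate = "I want to use a $ character";
        System.out.println(byHand(candidate));  // prints I want to use a \$ character
        System.out.println(quoted(candidate));  // prints I want to use a \$ character
        System.out.println(literal(candidate)); // prints I want to use a \$ character
    }
}
```

All three produce the same result; the quoted and literal forms only need the single level of escaping that the Java compiler requires.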

