In this example, I need a method that will validate a date format. The requirements are very explicit. Some sort of punctuation between the various date tokens is required, and a space isn't considered punctuation. The method should accept either two digits or four digits for the year, and either one or two digits for the day and month. I also need to make sure the date isn't in the future. I can expect the first date token to be the month, the second date token to be the day of that month, and the last date token to be the year. Thus, valid entries might be as follows:
11/30/2002
4/25/03
03-29/2003
11/30/1902
2/25-03
06#9/2003
Again, the first thing to do is search the Web. I find a few patterns that might work here. The first follows, and it's described as being very robust, dealing with leap years, and so on:
^(?:(?:(?:0?[13578]|1[02])(\/|-|\.)31)\1|(?:(?:0?[1,3-9]|1[0-2])(\/|-|\.)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:0?2(\/|-|\.)29\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|\.)(?:0?[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$
I decide to pass on it for now.
The second pattern I find is ^\d{1,2}\/\d{1,2}\/\d{4}$, which looks promising, but limits itself to four-digit years. It could work, but I would have to tweak it. Next, I come across ((\d{2})|(\d))\/((\d{2})|(\d))\/((\d{4})|(\d{2})). At first glance, it looks like I might have to adapt the second pattern along the lines of the third, so that it also accepts two-digit years.
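To see how the second and third patterns behave against the sample entries, a quick check along the following lines can help. This is only a sketch: the class and method names (CandidatePatternCheck, matchesSecond, matchesThird) are hypothetical, and the patterns are copied from the text with backslashes doubled for Java string literals.

```java
import java.util.regex.Pattern;

class CandidatePatternCheck {
    // The second and third patterns quoted in the text, escaped for Java.
    static final Pattern SECOND =
        Pattern.compile("^\\d{1,2}\\/\\d{1,2}\\/\\d{4}$");
    static final Pattern THIRD =
        Pattern.compile("((\\d{2})|(\\d))\\/((\\d{2})|(\\d))\\/((\\d{4})|(\\d{2}))");

    // True only if the whole candidate string matches the pattern.
    static boolean matchesSecond(String s) {
        return SECOND.matcher(s).matches();
    }

    static boolean matchesThird(String s) {
        return THIRD.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "11/30/2002", "4/25/03", "03-29/2003",
            "11/30/1902", "2/25-03", "06#9/2003"
        };
        for (String s : samples) {
            System.out.println(s + "  second=" + matchesSecond(s)
                + "  third=" + matchesThird(s));
        }
    }
}
```

Both patterns hard-code the forward slash, so neither accepts entries such as 03-29/2003 or 06#9/2003. Whichever one is chosen would still need work to meet the punctuation requirement.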
I don't like the third pattern as is, because it uses a lot of capturing groups that it doesn't really need. Noncapturing groups would do as well, and they would be more efficient. This immediately makes me a little suspicious. It's also more verbose than I had hoped for. Of course, it's possible that the verbose nature of the expression makes it more efficient, but I doubt that an author who was worried about efficiency would have left all of those useless capturing groups in there.
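The point about unnecessary capturing groups can be made concrete by counting them. The sketch below compares the third pattern with a noncapturing rewrite. The rewrite is my own assumption about what a trimmed-down version might look like, not something taken from the source of the pattern, and the class name is hypothetical.

```java
import java.util.regex.Pattern;

class GroupCountComparison {
    // The third pattern from the text, escaped for a Java string literal.
    static final String CAPTURING =
        "((\\d{2})|(\\d))\\/((\\d{2})|(\\d))\\/((\\d{4})|(\\d{2}))";

    // Hypothetical rewrite: same alternatives, but grouped with (?:...) so
    // the regex engine does not have to record any submatches.
    static final String NONCAPTURING =
        "(?:\\d{2}|\\d)\\/(?:\\d{2}|\\d)\\/(?:\\d{4}|\\d{2})";

    // Number of capturing groups the compiled pattern declares.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println("capturing groups:    " + groupCount(CAPTURING));
        System.out.println("noncapturing groups: " + groupCount(NONCAPTURING));
        System.out.println("both accept 4/25/03: "
            + ("4/25/03".matches(CAPTURING) && "4/25/03".matches(NONCAPTURING)));
    }
}
```

The original declares nine capturing groups and the rewrite declares none, while both accept the same sample entry.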
Whichever pattern I choose, I'll probably want to replace any and all punctuation within the candidate string with a character that's easy to work with. I would just get rid of all punctuation, but then I wouldn't know if a date such as 1111971 was referring to January 11, 1971, or November 1, 1971. Thus, I'm going to need a line of code like this:
String scrubbedDate = date.replaceAll("\\p{Punct}","@");
Here, I'll probably use the @ symbol as a replacement delimiter. It doesn't have any sort of special regex meaning, so it's easier to work with. Next, I'll need to write a pattern to capture the month, day, and year, and make sure that it constitutes a valid date.
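A rough sketch of where this approach would lead is shown here. The capture pattern, the class name, and the capture method are all hypothetical, invented for illustration. The sketch only extracts the three tokens; it does not yet check that they form a real calendar date.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScrubbedDateCapture {
    // Hypothetical pattern (not from the original text): month, day, and
    // year separated by the @ delimiter that the scrubbing step inserts.
    static final Pattern SCRUBBED =
        Pattern.compile("(\\d{1,2})@(\\d{1,2})@(\\d{2}|\\d{4})");

    /** Returns {month, day, year} as entered, or null if the shape is wrong. */
    static int[] capture(String date) {
        // Normalize every punctuation character to a single known delimiter.
        String scrubbedDate = date.replaceAll("\\p{Punct}", "@");
        Matcher m = SCRUBBED.matcher(scrubbedDate);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    public static void main(String[] args) {
        int[] parts = capture("06#9/2003");
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2]);
    }
}
```

Run as is, this prints 6 9 2003. An entry separated by spaces, such as 11 30 2002, yields null, because a space is not in the \p{Punct} class.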
Wait a minute—I wonder if there's an easier way. What if I used the String.split method around the punctuation and extracted the date from the remaining digits? Then I could just use straight Java code to validate the actual date. To do that, I'll need something like this:
String[] datetokens = date.split("\\p{Punct}");
This looks fairly easy, so I go with it.
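Before committing to it, it's worth checking what split actually returns for both good and bad input. The following sketch uses a hypothetical helper class; only the split call itself comes from the text.

```java
import java.util.Arrays;

class SplitOnPunctuation {
    // Same call as in the text: break the candidate at every punctuation mark.
    static String[] tokens(String date) {
        return date.split("\\p{Punct}");
    }

    public static void main(String[] args) {
        String[] samples = {
            "03-29/2003",  // mixed punctuation
            "06#9/2003",   // unusual punctuation
            "11//2002",    // missing day
            "11/30/",      // missing year
            "1111971"      // no punctuation at all
        };
        for (String s : samples) {
            System.out.println(s + " -> " + Arrays.toString(tokens(s)));
        }
    }
}
```

Well-formed entries yield three digit tokens. A doubled delimiter such as 11//2002 yields an empty middle token, a trailing delimiter yields only two tokens because split drops trailing empty strings, and an entry with no punctuation comes back as a single token. The token count alone therefore isn't a sufficient check, and it helps to confirm the overall shape of the string before parsing the tokens as numbers.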
My algorithm becomes the following: Split the date along punctuation marks, use it to create a Calendar object, compare that to today, and return true if the Calendar object is less than or equal to today. I can write the preliminary method signature as follows:
public static boolean isDateValid(String date)
Figure 5-1 shows the algorithm.
Listing 5-4 presents the full implementation.
import java.util.regex.*;
import java.io.*;
import java.util.logging.Logger;
import java.util.GregorianCalendar;
import java.util.Calendar;

/**
 * Matches dates.
 */
public class MatchDates {
    private static final String DATE_PATTERN = "date";
    private static final String PROP_FILE = "../regex.properties";
    private static Logger log = Logger.getAnonymousLogger();
    public static int LOWER_YEAR_LIMIT = -120;

    public static void main(String args[]) throws Exception {
        if (args != null && args.length == 1)
        {
            boolean b = isDateValid(args[0]);
            log.info("" + b);
        }
        else
        {
            System.out.println("usage: java MatchDates dd/dd/dddd");
        }
    }

    /**
     * Confirms that the given date format consists of one or two digits
     * followed by a punctuation mark, followed by one or two digits
     * followed by a punctuation mark, followed by two or four digits.
     * Further, it validates that the date is earlier than today, and
     * not more than <CODE>LOWER_YEAR_LIMIT</CODE> (120) years in
     * the past. This method even takes leap years and such into account.
     * @param date the <code>String</code> date to be considered
     * @return <code>boolean</code> true if the date is well formed,
     * not in the future, and not older than the lower limit
     *
     * @author M Habibi
     */
    public static boolean isDateValid(String date)
    {
        boolean retval = false;
        date = date.trim();
        //does the candidate have three digit sections of the
        //expected lengths? Otherwise the month, day, and year
        //extraction below could throw a number format exception.
        boolean hasThreeDigitSections =
            date.matches(
                "\\d{1,2}\\p{Punct}\\d{1,2}\\p{Punct}(\\d{2}|\\d{4})");
        if (hasThreeDigitSections)
        {
            String[] dateTokens = date.split("\\p{Punct}");
            if (dateTokens.length == 3)
            {
                //Java months are zero based, so subtract 1
                int month = Integer.parseInt(dateTokens[0]) - 1;
                int day = Integer.parseInt(dateTokens[1]);
                int year = Integer.parseInt(dateTokens[2]);
                //in case a 2 digit year was entered
                if (year < 100)
                    year += 2000;
                //get boundary years
                GregorianCalendar today = new GregorianCalendar();
                //get a lowerLimit that is LOWER_YEAR_LIMIT less than
                //today
                GregorianCalendar lowerLimit = new GregorianCalendar();
                lowerLimit.add(Calendar.YEAR, LOWER_YEAR_LIMIT);
                //create a candidate representing the proposed date.
                GregorianCalendar candidate =
                    new GregorianCalendar(year, month, day);
                //check the validity of the date
                if
                (
                    candidate.before(today)
                    &&
                    candidate.after(lowerLimit)
                    &&//month could be off, say the user entered 55
                    month == candidate.get(Calendar.MONTH)
                    &&//day could be off, say the user entered 55
                    day == candidate.get(Calendar.DAY_OF_MONTH)
                )
                {
                    retval = true;
                }
            }
        }
        return retval;
    }
}
The previous example deferred almost all of the heavy lifting to regular expressions. The code in Listing 5-4 uses regex for the split method. Otherwise, it's fairly conventional Java code. This doesn't mean the regex contribution is trivial—as a matter of fact, I would say it's critical. However, once the split method's regex contribution is assimilated, you're back in the comfortable world of Java.
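The final month and day comparisons in Listing 5-4 matter because a GregorianCalendar is lenient by default: out-of-range fields roll over into the next month or year instead of raising an error. The sketch below isolates that behavior. The class and method names are hypothetical, and the helper only mirrors the round-trip comparison rather than the whole validation.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class LenientRolloverDemo {
    // True if the calendar kept the month and day it was given, that is,
    // nothing rolled over into a neighboring month or year.
    static boolean survivesRoundTrip(int year, int zeroBasedMonth, int day) {
        GregorianCalendar c = new GregorianCalendar(year, zeroBasedMonth, day);
        return c.get(Calendar.MONTH) == zeroBasedMonth
            && c.get(Calendar.DAY_OF_MONTH) == day;
    }

    public static void main(String[] args) {
        // February 30, 2003 does not exist; the lenient calendar rolls forward.
        GregorianCalendar c = new GregorianCalendar(2003, Calendar.FEBRUARY, 30);
        System.out.println("month=" + c.get(Calendar.MONTH)
            + " day=" + c.get(Calendar.DAY_OF_MONTH));

        System.out.println(survivesRoundTrip(2003, Calendar.FEBRUARY, 30));
        System.out.println(survivesRoundTrip(2004, Calendar.FEBRUARY, 29));
    }
}
```

February 30, 2003 silently becomes March 2, 2003 (month index 2, day 2), so the round trip fails, while February 29 in the leap year 2004 survives intact. Comparing the fields the user entered against the fields the calendar reports is what lets ordinary Java code reject impossible dates without any further regex work.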