Team LiB

Confirming Date Formats

In this example, I need a method that will validate a date format. The requirements are very explicit. Some sort of punctuation between the various date tokens is required, and a space isn't considered punctuation. The method should accept either two digits or four digits for the year, and either one or two digits for the day and month. I also need to make sure the date isn't in the future. I can expect the first date token to be the month, the second date token to be the day of that month, and the last date token to be the year. Thus, valid entries might be as follows:

Again, the first thing to do is search the Web. I find a few patterns that might work here. The first follows, and it's described as being very robust, dealing with leap years, and so on:

^(?:(?:(?:0?[13578]|1[02])(\/|-|\.)31)\1|(?:(?:0?[1,3-9]|1[0-2])(\/|-|\.)(?:29|30)\2))(?:(?:1[6-9]|[2-9]\d)?\d{2})$|^(?:0?2(\/|-|\.)29\3(?:(?:(?:1[6-9]|[2-9]\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:(?:0?[1-9])|(?:1[0-2]))(\/|-|\.)(?:0?[1-9]|1\d|2[0-8])\4(?:(?:1[6-9]|[2-9]\d)?\d{2})$

I decide to pass on it for now.

The second pattern I find is ^\d{1,2}\/\d{1,2}\/\d{4}$, which looks promising, but limits itself to four-digit years. It could work, but I would have to tweak it. Next, I come across ((\d{2})|(\d))\/((\d{2})|(\d))\/((\d{4})|(\d{2})). At first glance, it looks like I might have to pull the second pattern toward the third.
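To see how the second and third patterns differ, they can be run against a few candidates. This is only a minimal sketch: the class name CandidatePatterns and the sample strings are chosen here for illustration. It assumes whole-string matching via Matcher.matches(), which matters for the third pattern because it carries no ^ or $ anchors of its own.

```java
import java.util.regex.Pattern;

class CandidatePatterns {
    // The second pattern from the text: slashes only, four-digit years only.
    static final Pattern SECOND =
        Pattern.compile("^\\d{1,2}\\/\\d{1,2}\\/\\d{4}$");

    // The third pattern from the text: slashes only, two- or four-digit years.
    static final Pattern THIRD =
        Pattern.compile("((\\d{2})|(\\d))\\/((\\d{2})|(\\d))\\/((\\d{4})|(\\d{2}))");

    public static void main(String[] args) {
        String[] samples = {"1/11/1971", "1/11/71", "1-11-1971"};
        for (String s : samples) {
            // matches() requires the entire input to fit the pattern.
            System.out.println(s
                + " second=" + SECOND.matcher(s).matches()
                + " third=" + THIRD.matcher(s).matches());
        }
    }
}
```

Neither pattern accepts a dash or a period as a separator, so both would need work to meet the punctuation requirement. Because the third pattern is unanchored, Matcher.find() would also report a hit inside a longer string such as 1/11/197, which matches() rejects.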

I don't like the third pattern as is, because it uses a lot of capturing groups that it doesn't really need. Noncapturing groups would do as well, and they would be more efficient. This immediately makes me a little suspicious. It's also more verbose than I had hoped for. Of course, it's possible that the verbose nature of the expression makes it more efficient, but I doubt that an author who was worried about efficiency would have left all of those useless capturing groups in there.
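The difference is easy to see by counting groups. The sketch below is an illustration only: the noncapturing rewrite is my own variant of the third pattern, not something taken from its source, and the class name is hypothetical. It makes no claim about measured speed; it only shows that the rewrite accepts the same input while recording no groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonCapturingDate {
    // The third pattern as found: every alternative is captured.
    static final Pattern CAPTURING =
        Pattern.compile("((\\d{2})|(\\d))/((\\d{2})|(\\d))/((\\d{4})|(\\d{2}))");

    // Assumed rewrite: same alternatives, grouped with (?: ) so nothing is stored.
    static final Pattern NON_CAPTURING =
        Pattern.compile("(?:\\d{2}|\\d)/(?:\\d{2}|\\d)/(?:\\d{4}|\\d{2})");

    public static void main(String[] args) {
        Matcher a = CAPTURING.matcher("1/11/71");
        Matcher b = NON_CAPTURING.matcher("1/11/71");
        // groupCount() reports the number of capturing groups in the pattern.
        System.out.println(a.matches() + " groups=" + a.groupCount());
        System.out.println(b.matches() + " groups=" + b.groupCount());
    }
}
```

The first line reports nine capturing groups and the second reports none, even though both patterns accept the same candidate.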

Whichever pattern I choose, I'll probably want to replace any and all punctuation within the candidate string with a character that's easy to work with. I would just get rid of all punctuation, but then I wouldn't know if a date such as 1111971 was referring to January 11, 1971, or November 1, 1971. Thus, I'm going to need a line of code like this:

String scrubbedDate = date.replaceAll("\\p{Punct}","@");
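As a quick illustration of what that line does, here is a small sketch. The class and method names are made up for this example; only the replaceAll call comes from the text. Note that a space is not in \p{Punct}, so a space-separated candidate passes through unchanged.

```java
class ScrubDate {
    // Replaces every punctuation character with '@'.
    static String scrub(String date) {
        return date.replaceAll("\\p{Punct}", "@");
    }

    public static void main(String[] args) {
        System.out.println(scrub("1/11/1971"));   // 1@11@1971
        System.out.println(scrub("11-1.1971"));   // 11@1@1971
        System.out.println(scrub("1 11 1971"));   // unchanged: no punctuation
    }
}
```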

Here, I'll probably use the @ symbol as a replacement delimiter. It doesn't have any sort of special regex meaning, so it's easier to work with. Next, I'll need to write a pattern to capture the month, day, and year, and make sure that it constitutes a valid date.

Wait a minute—I wonder if there's an easier way. What if I used the String.split method around the punctuation and extracted the date from the remaining digits? Then I could just use straight Java code to validate the actual date. To do that, I'll need something like this:

String[] datetokens = date.split("\\p{Punct}");

This looks fairly easy, so I go with it.
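A short sketch shows what the split produces for a few inputs. The class and method names are hypothetical. One detail is worth knowing: String.split with a single argument drops trailing empty strings, so a candidate with a stray trailing separator still yields three tokens. A token count alone is therefore not a complete format check.

```java
import java.util.Arrays;

class SplitDate {
    // Splits the candidate around any single punctuation character.
    static String[] tokens(String date) {
        return date.split("\\p{Punct}");
    }

    public static void main(String[] args) {
        String[] samples = {"1/11/1971", "1//11/1971", "1/11/1971/", "1 11 1971"};
        for (String s : samples) {
            String[] t = tokens(s);
            System.out.println(s + " -> " + t.length + " " + Arrays.toString(t));
        }
    }
}
```

The first sample gives three tokens, the doubled separator gives four (one of them empty), the trailing separator gives three again, and the space-separated sample stays in one piece.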

My algorithm becomes the following: Split the date along punctuation marks, use it to create a Calendar object, compare that to today, and return true if the Calendar object is less than or equal to today. I can write the preliminary method signature as follows:

  public static boolean isDateValid(String date)

Figure 5-1 shows the algorithm.

Figure 5-1: The algorithm for the isDateValid method

Listing 5-4 presents the full implementation.

Listing 5-4: Validating a Date
Start example
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.logging.Logger;

/**
 * Matches dates.
 */
public class MatchDates
{
   private static final Logger log = Logger.getAnonymousLogger();
   //how far back, in years relative to today, a date may lie
   public static final int LOWER_YEAR_LIMIT = -120;

   public static void main(String[] args)
   {
      if (args != null && args.length == 1)
      {
         boolean b = isDateValid(args[0]);
         log.info("" + b);
      }
      else
      {
         System.out.println("usage: java MatchDates mm/dd/yyyy");
      }
   }

   /**
    * Confirms that the given date consists of one or two digits,
    * followed by a punctuation mark, followed by one or two digits,
    * followed by a punctuation mark, followed by two or four digits.
    * Further, it validates that the date is not later than today and
    * not more than 120 years (<code>LOWER_YEAR_LIMIT</code>) in the
    * past. This method even takes leap years and such into account.
    *
    * @param date the <code>String</code> date to be considered
    * @return <code>boolean</code> true if the date is well formed
    *         and falls within the allowed range
    *
    * @author M Habibi
    */
   public static boolean isDateValid(String date)
   {
      boolean retval = false;
      date = date.trim();

      //does the candidate have three digit sections of the
      //expected lengths? Otherwise the month, day, and year
      //extraction below could throw a NumberFormatException.
      boolean hasThreeDigitSections = date.matches(
         "\\d{1,2}\\p{Punct}\\d{1,2}\\p{Punct}(?:\\d{2}|\\d{4})");

      if (hasThreeDigitSections)
      {
         String[] dateTokens = date.split("\\p{Punct}");

         if (dateTokens.length == 3)
         {
            //Java months are zero based, so subtract 1
            int month = Integer.parseInt(dateTokens[0]) - 1;
            int day = Integer.parseInt(dateTokens[1]);
            int year = Integer.parseInt(dateTokens[2]);

            //in case a 2 digit year was entered
            if (year < 100)
               year += 2000;

            //get boundary dates
            GregorianCalendar today = new GregorianCalendar();
            //get a lowerLimit that is LOWER_YEAR_LIMIT years
            //before today
            GregorianCalendar lowerLimit = new GregorianCalendar();
            lowerLimit.add(Calendar.YEAR, LOWER_YEAR_LIMIT);

            //create a candidate representing the proposed date
            GregorianCalendar candidate =
               new GregorianCalendar(year, month, day);

            //check the validity of the date
            if (candidate.before(today)
               && candidate.after(lowerLimit)
               //month could be off, say the user entered 55
               && month == candidate.get(Calendar.MONTH)
               //day could be off, say the user entered 55
               && day == candidate.get(Calendar.DAY_OF_MONTH))
            {
               retval = true;
            }
         }
      }
      return retval;
   }
}
End example
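The month and day comparisons at the end of isDateValid rely on a GregorianCalendar behavior that is easy to miss: by default the calendar is lenient, so an impossible date does not throw but rolls over into a neighboring month or year. Comparing the fields that come back against the fields that went in exposes the rollover. The sketch below isolates that one idea; the class and method names are my own, and the month parameter is 1-based, as a user would type it.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class LenientCalendarCheck {
    // True only if the calendar kept the month and day it was given.
    static boolean survivesRoundTrip(int year, int month, int day) {
        int zeroBasedMonth = month - 1; // Calendar months start at 0
        GregorianCalendar candidate =
            new GregorianCalendar(year, zeroBasedMonth, day);
        return candidate.get(Calendar.MONTH) == zeroBasedMonth
            && candidate.get(Calendar.DAY_OF_MONTH) == day;
    }

    public static void main(String[] args) {
        // February 30, 2001 silently rolls over to March 2, 2001.
        GregorianCalendar rolled =
            new GregorianCalendar(2001, Calendar.FEBRUARY, 30);
        System.out.println(rolled.get(Calendar.MONTH) == Calendar.MARCH);
        System.out.println(rolled.get(Calendar.DAY_OF_MONTH));

        System.out.println(survivesRoundTrip(2000, 2, 29)); // leap year
        System.out.println(survivesRoundTrip(1900, 2, 29)); // not a leap year
        System.out.println(survivesRoundTrip(2001, 13, 1)); // no 13th month
    }
}
```

This is also where the leap-year handling comes from: the calendar knows that 1900 had no February 29, so that candidate rolls to March 1 and fails the comparison.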

The previous example deferred almost all of the heavy lifting to regular expressions. The code in Listing 5-4 uses regex only for the format check and the split method. Otherwise, it's fairly conventional Java code. This doesn't mean the regex contribution is trivial; as a matter of fact, I would say it's critical. However, once the regex has confirmed the format and split the candidate into tokens, you're back in the comfortable world of Java.
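For readers on Java 8 or later, the same split-then-validate idea can be sketched with java.time instead of Calendar. This is an alternative under stated assumptions, not the listing's method: the class name, the MAX_AGE_YEARS constant, and the extra today parameter (passed in so that results are repeatable) are all introduced here for illustration. Two-digit years are mapped to 20xx, as in the listing. LocalDate.of rejects impossible dates with a DateTimeException, so no field comparison is needed.

```java
import java.time.DateTimeException;
import java.time.LocalDate;

class ModernDateCheck {
    // Assumed limit, chosen to mirror the 120 years used in the listing.
    static final int MAX_AGE_YEARS = 120;

    static boolean isDateValid(String date, LocalDate today) {
        String trimmed = date.trim();
        // Format check: 1-2 digits, punctuation, 1-2 digits, punctuation,
        // then a 2- or 4-digit year.
        if (!trimmed.matches(
                "\\d{1,2}\\p{Punct}\\d{1,2}\\p{Punct}(?:\\d{2}|\\d{4})")) {
            return false;
        }
        String[] tokens = trimmed.split("\\p{Punct}");
        int month = Integer.parseInt(tokens[0]); // 1-based in java.time
        int day = Integer.parseInt(tokens[1]);
        int year = Integer.parseInt(tokens[2]);
        if (tokens[2].length() == 2) {
            year += 2000; // two-digit years are read as 20xx
        }
        try {
            LocalDate candidate = LocalDate.of(year, month, day);
            return !candidate.isAfter(today)
                && candidate.isAfter(today.minusYears(MAX_AGE_YEARS));
        } catch (DateTimeException e) {
            return false; // for example month 13 or February 30
        }
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        System.out.println(isDateValid("1-11-1971", today));
        System.out.println(isDateValid("2/30/2001", today));
    }
}
```

The regex still does the same two jobs, confirming the format and splitting the tokens; only the plain-Java half changes.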

