The Classes

The current set of OROMatcher™ classes implements Perl5 regular expressions, but future releases will include classes for other regular expression grammars that users request. As a side note, you do not need to include the Util or Perl5Debug classes with software you write with OROMatcher™ if you do not use those classes in your code. This can reduce the size of your software distribution by a few kilobytes.

Perl5Pattern

Perl5Pattern implements the Pattern interface for Perl5 regular expressions. The only reason it is made visible to the programmer is for type safety when calling the Perl5Matcher(Perl5StreamInput, Perl5Pattern) method and for programmer accessibility when the class is made serializable in a future release incorporating 1.1 features. Currently we want the package to be usable with the 1.0.2 and 1.1.* JDKs, but we will release a 1.1 enhanced version of the package leveraging the 1.1 features, such as serializability, that our users want. At that point we will distribute both 1.0.2 and 1.1 versions of the classes.
Perl5Compiler

The Perl5Compiler class creates compiled regular expressions conforming to the Perl5 regular expression syntax. It generates Perl5Pattern instances upon compilation to be used in conjunction with a Perl5Matcher instance. Please refer to the Syntax section for more information on Perl5 regular expressions.

The Perl5Compiler compile() methods can take the following flags, which can be combined with bitwise OR to affect the nature of the compiled pattern:
Perl5Matcher

The Perl5Matcher class functions according to the PatternMatcher interface when used with Perl5Pattern instances. Perl5Matcher contains three methods that don't appear in the PatternMatcher interface:
PatternMatcherInput

The PatternMatcherInput class is used to preserve state across calls to the contains() methods of PatternMatcher instances. It is also used to specify that only a subregion of a string should be used as input when looking for a pattern match. All that is meant by preserving state is that the end offset of the last match is remembered, so that the next match is performed from the point where the last match left off. This offset can be read with the getCurrentOffset() method and set with the setCurrentOffset(int) method.

You would use a PatternMatcherInput object when you want to search for more than just the first occurrence of a pattern in a string, or when you only want to search a subregion of the string for a match. An example of its most common use is:
PatternMatcher matcher;
PatternCompiler compiler;
Pattern pattern;
PatternMatcherInput input;
MatchResult result;

compiler = new Perl5Compiler();
matcher = new Perl5Matcher();

try {
    pattern = compiler.compile(somePatternString);
} catch(MalformedPatternException e) {
    System.out.println("Bad pattern.");
    System.out.println(e.getMessage());
    return;
}

input = new PatternMatcherInput(someStringInput);

while(matcher.contains(input, pattern)) {
    result = matcher.getMatch();
    // Perform whatever processing on the result you want.
}

// Suppose we want to start searching from the beginning again with
// a different pattern.
// Just set the current offset to the begin offset.
input.setCurrentOffset(input.getBeginOffset());

// Second search omitted

// Suppose we're done with this input, but want to search another string.
// There's no need to create another PatternMatcherInput instance.
// We can just use the setInput() method.
input.setInput(aNewInputString);
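
To make the idea of "preserving state" concrete, here is a minimal sketch of the mechanism. It does not use OROMatcher. It is written against the standard java.util.regex package so that it runs on its own, and the class OffsetInputSketch and its nextMatch() and findAll() methods are hypothetical names invented for this illustration. Only getBeginOffset(), getCurrentOffset() and setCurrentOffset(int) mirror method names mentioned above. How PatternMatcherInput is actually implemented may well differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the idea behind PatternMatcherInput; not part of OROMatcher. */
public class OffsetInputSketch {
    private final String text;
    private final int beginOffset;   // start of the searchable subregion
    private final int endOffset;     // end (exclusive) of the searchable subregion
    private int currentOffset;       // where the next search starts

    public OffsetInputSketch(String text, int begin, int length) {
        this.text = text;
        this.beginOffset = begin;
        this.endOffset = begin + length;
        this.currentOffset = begin;
    }

    public int getBeginOffset() { return beginOffset; }
    public int getCurrentOffset() { return currentOffset; }
    public void setCurrentOffset(int offset) { currentOffset = offset; }

    /** Returns the next match at or after the current offset, or null if there is none. */
    public String nextMatch(Pattern pattern) {
        if (currentOffset > endOffset) {
            return null;
        }
        Matcher m = pattern.matcher(text);
        m.region(currentOffset, endOffset);   // search only the remaining subregion
        if (!m.find()) {
            return null;
        }
        // Remember where this match ended; step past empty matches to avoid looping forever.
        currentOffset = (m.end() > m.start()) ? m.end() : m.end() + 1;
        return m.group();
    }

    /** Collects every match inside the subregion [begin, begin + length). */
    public static List<String> findAll(String text, int begin, int length, String regex) {
        OffsetInputSketch input = new OffsetInputSketch(text, begin, length);
        Pattern pattern = Pattern.compile(regex);
        List<String> matches = new ArrayList<>();
        String match;
        while ((match = input.nextMatch(pattern)) != null) {
            matches.add(match);
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "a1 b22 c333 d4444";
        System.out.println(findAll(text, 0, text.length(), "\\d+")); // [1, 22, 333, 4444]
        System.out.println(findAll(text, 3, 8, "\\d+"));             // [22, 333]
    }
}
```

The second call in main() restricts the search to the eight characters starting at index 3, which corresponds to the subregion use described above. Resetting the current offset to the begin offset restarts the search from the start of that subregion.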
Perl5StreamInput

The Perl5StreamInput class is used to look for pattern matches in an InputStream in conjunction with the Perl5Matcher class. It is called Perl5StreamInput instead of Perl5InputStream to stress that it is a form of streamed input for the Perl5Matcher rather than a subclass of InputStream. Perl5StreamInput performs special internal buffering to accelerate pattern searches through a stream. You can determine the size of this buffer and how it grows by using the appropriate constructor. You should avoid using buffer increments smaller than 4096 bytes, as they will adversely affect performance.

If you want to perform line by line matches on an InputStream, you should use the DataInputStream or BufferedReader class (depending on whether you are using JDK 1.0.2 or 1.1) in conjunction with one of the PatternMatcher methods taking a String, char[], or PatternMatcherInput as an argument. The DataInputStream and BufferedReader readLine() methods are implemented as native methods and are therefore more efficient than supporting line by line searching within Perl5StreamInput.

In the future the programmer will be able to set this class to save all the input it sees so that it can be accessed later. This will avoid having to read a stream more than once.
For an example of how to use the Perl5StreamInput class, look at streamInputExample.java.
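
The line by line alternative described above can be sketched as follows. This is only an illustration of the read-a-line-then-match loop. To keep it runnable on its own, it uses java.util.regex in place of a PatternMatcher, and the class LineMatchSketch and its matchingLines() method are hypothetical names that do not exist in OROMatcher. With OROMatcher, the per-line test would instead be a call to one of the PatternMatcher methods that take a String.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical illustration of line by line matching; not part of OROMatcher. */
public class LineMatchSketch {

    /** Returns every line of the source that contains a match for the regex. */
    public static List<String> matchingLines(Reader source, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        try {
            BufferedReader reader = new BufferedReader(source);
            String line;
            // readLine() returns null at end of input and strips the line terminator.
            while ((line = reader.readLine()) != null) {
                if (pattern.matcher(line).find()) {
                    hits.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    public static void main(String[] args) {
        Reader source = new StringReader("alpha 1\nbeta\ngamma 22\n");
        for (String line : matchingLines(source, "\\d+")) {
            System.out.println(line);
        }
    }
}
```

To read from an InputStream rather than a Reader, the stream can be wrapped in a java.io.InputStreamReader before it is passed in.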
A grep method is not included for two reasons:
For an example of how to use the split and substitute methods, look at splitExample.java and substituteExample.java.
Copyright © 1997 ORO, Inc. All rights reserved. Original Reusable Objects, ORO, the ORO logo, and "Component software for the Internet" are trademarks or registered trademarks of ORO, Inc. in the United States and other countries. Java is a trademark of Sun Microsystems. All other trademarks are the property of their respective holders.