Readings: References, Tutorials and Articles
- Regular Expressions in Single UNIX Specification
- Regular Expressions in POSIX.1-2008
- Tutorial on Regular-Expressions.info
- Regexp Syntax Summary
- Regular Expression Flavor Comparison
Special Characters of ERE (Extended Regular Expression)
from Regular Expressions in Single UNIX Specification
An ERE special character has special properties in certain contexts. Outside those contexts, or when preceded by a backslash, such a character is an ERE that matches the special character itself. The extended regular expression special characters and the contexts in which they have their special meaning are:
- . \ [ (
  The period, left-bracket, backslash and left-parenthesis are special except when used in a bracket expression. Outside a bracket expression, a left-parenthesis immediately followed by a right-parenthesis produces undefined results.
- )
  The right-parenthesis is special when matched with a preceding left-parenthesis, both outside a bracket expression.
- * + ? {
  The asterisk, plus-sign, question-mark and left-brace are special except when used in a bracket expression (see RE Bracket Expression). Any of the following uses produce undefined results:
  - if these characters appear first in an ERE, or immediately following a vertical-line, circumflex or left-parenthesis.
  - if a left-brace is not part of a valid interval expression.
- |
  The vertical-line is special except when used in a bracket expression. A vertical-line appearing first or last in an ERE, or immediately following a vertical-line or a left-parenthesis, or immediately preceding a right-parenthesis, produces undefined results.
- ^
  The circumflex is special when used:
  - as an anchor
  - as the first character of a bracket expression
- $
  The dollar sign is special when used as an anchor.
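The rule that a backslash turns a special character into one that matches itself can be tried out in Java. This is only a sketch: java.util.regex implements its own dialect rather than POSIX ERE, so it illustrates the idea, not the specification. The class and method names are made up for this example.

```java
import java.util.regex.Pattern;

final class EreSpecialChars {
    // Unescaped, the period is special: it matches any single character.
    static boolean matchesAnyChar(String input) {
        return Pattern.matches("a.b", input);
    }

    // "\\." in Java source is the regex \. which matches a literal period only.
    static boolean matchesLiteralDot(String input) {
        return Pattern.matches("a\\.b", input);
    }

    // Pattern.quote escapes a whole string so every character matches itself.
    static boolean matchesQuoted(String literal, String input) {
        return Pattern.matches(Pattern.quote(literal), input);
    }

    public static void main(String[] args) {
        System.out.println(matchesAnyChar("axb"));              // true
        System.out.println(matchesLiteralDot("axb"));           // false
        System.out.println(matchesLiteralDot("a.b"));           // true
        System.out.println(matchesQuoted("1+1=2?", "1+1=2?"));  // true
    }
}
```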
Regex with Java
The most authoritative information on using regex with Java is the API documentation of the java.util.regex.Pattern class.
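As a quick orientation before reading the API documentation, here is a minimal sketch of the usual compile, match, and find cycle. The class name, method names, and sample pattern are illustrative assumptions, not taken from the documentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PatternBasics {
    // Compile once and reuse; Pattern instances are immutable.
    private static final Pattern NUMBER = Pattern.compile("[0-9]+");

    // find() scans for the next occurrence anywhere in the input.
    static List<String> findNumbers(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = NUMBER.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // matches() succeeds only if the entire input matches the pattern.
    static boolean isNumber(String text) {
        return NUMBER.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(findNumbers("a1b22c333")); // [1, 22, 333]
        System.out.println(isNumber("12345"));        // true
        System.out.println(isNumber("12a45"));        // false
    }
}
```

The difference between find() and matches() is a common source of confusion: the first searches inside the input, the second requires the whole input to match.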
Formal rules for bracket expression
Bracket expressions such as [0-9a-zA-Z], [^0-9a-zA-Z], or [0-9a-zA-Z.?*+-] behave differently from ordinary expressions. One of the most important differences is the treatment of metacharacters: most special characters lose their special meaning inside the brackets. A more formal and detailed description of bracket expressions, including this point, can be found in the following:
- RE Bracket Expression in Single UNIX Specification by X/Open group
- RE Bracket Expression in POSIX.1-2008
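The three sample bracket expressions can be exercised in Java as a rough check. Note the assumption: Java character classes are similar to, but not identical with, POSIX bracket expressions (for example, the backslash remains an escape character inside a Java class), so this sketch only covers the cases that behave the same way. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BracketExpressionDemo {
    // Inside the brackets . ? * + are ordinary characters; the trailing - is literal too.
    private static final Pattern ALLOWED = Pattern.compile("[0-9a-zA-Z.?*+-]+");

    // A leading ^ negates the set: any one character that is not a letter or digit.
    private static final Pattern NOT_ALNUM = Pattern.compile("[^0-9a-zA-Z]");

    static boolean onlyAllowed(String input) {
        return ALLOWED.matcher(input).matches();
    }

    // Returns the first non-alphanumeric character, or null if there is none.
    static String firstNonAlnum(String input) {
        Matcher m = NOT_ALNUM.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(onlyAllowed("a.b?c*d+e-f")); // true
        System.out.println(onlyAllowed("a b"));         // false
        System.out.println(firstNonAlnum("abc!def"));   // !
    }
}
```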
Capturing, Grouping and Backreferences
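A small Java sketch of the three ideas in this heading: parentheses group and capture, \1 refers back to the captured text inside the pattern, and $1, $2, $3 refer to the captures in a replacement string. The patterns and names here are illustrative assumptions chosen for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupsDemo {
    // (\w+) captures a word; \1 requires the same text to appear again.
    private static final Pattern REPEATED_WORD = Pattern.compile("\\b(\\w+) \\1\\b");

    // Three capturing groups: year, month, day.
    private static final Pattern ISO_DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Returns the first word that is immediately repeated, or null if none is found.
    static String firstRepeatedWord(String text) {
        Matcher m = REPEATED_WORD.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Reorders the captured groups in the replacement string.
    static String toDayMonthYear(String text) {
        return ISO_DATE.matcher(text).replaceAll("$3/$2/$1");
    }

    public static void main(String[] args) {
        System.out.println(firstRepeatedWord("this is is a test")); // is
        System.out.println(toDayMonthYear("2008-12-31"));           // 31/12/2008
    }
}
```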
NOT operator in Regex
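Regular expressions have no general NOT operator, but two constructs cover many cases: a negated bracket expression excludes single characters, and a negative lookahead (supported by java.util.regex, though not part of POSIX ERE) excludes a longer sequence. The sketch below assumes single-line input, since the period does not match line terminators by default; the names and the sample word "foo" are made up for illustration.

```java
import java.util.regex.Pattern;

final class NegationDemo {
    // Negated bracket expression: one character that is NOT a digit.
    private static final Pattern NON_DIGIT = Pattern.compile("[^0-9]");

    // Negative lookahead at the start: "foo" must not occur anywhere after it.
    private static final Pattern NO_FOO = Pattern.compile("(?!.*foo).*");

    static boolean hasNonDigit(String s) {
        return NON_DIGIT.matcher(s).find();
    }

    static boolean lacksFoo(String s) {
        return NO_FOO.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasNonDigit("12345"));  // false
        System.out.println(hasNonDigit("12a45"));  // true
        System.out.println(lacksFoo("bar baz"));   // true
        System.out.println(lacksFoo("a foo b"));   // false
    }
}
```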
Nested pairs search
- Can I use Perl regular expressions to match balanced text?
- .NET Regular Expressions: Regex and Balanced Matching
- Regex Recursion (Matching Nested Constructs)