REGEXP_LIKE
REGEXP_LIKE is similar to the LIKE condition, except REGEXP_LIKE performs regular expression matching instead of the simple pattern matching performed by LIKE. This condition evaluates strings using characters as defined by the input character set. This condition complies with the POSIX regular expression standard and the Unicode Regular Expression Guidelines. For more information, please refer to Appendix C, "Oracle Regular Expression Support".
regexp_like_condition::=

REGEXP_LIKE(source_string, pattern [, match_parameter])
source_string is a character expression that serves as the search value. It is commonly a character column and can be of any of the datatypes CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB.

pattern is the regular expression. It is usually a text literal and can be of any of the datatypes CHAR, VARCHAR2, NCHAR, or NVARCHAR2. It can contain up to 512 bytes. If the datatype of pattern is different from the datatype of source_string, Oracle converts pattern to the datatype of source_string. For a listing of the operators you can specify in pattern, please refer to Appendix C, "Oracle Regular Expression Support".

match_parameter is a text literal that lets you change the default matching behavior of the function. You can specify one or more of the following values for match_parameter:

- 'i' specifies case-insensitive matching.
- 'c' specifies case-sensitive matching.
- 'n' allows the period (.), which is the match-any-character wildcard character, to match the newline character. If you omit this parameter, the period does not match the newline character.
- 'm' treats the source string as multiple lines. Oracle interprets ^ and $ as the start and end, respectively, of any line anywhere in the source string, rather than only at the start or end of the entire source string. If you omit this parameter, Oracle treats the source string as a single line.

If you specify multiple contradictory values, Oracle uses the last value. For example, if you specify 'ic', then Oracle uses case-sensitive matching. If you specify a character other than those shown above, then Oracle returns an error.

If you omit match_parameter, then:

- The default case sensitivity is determined by the value of the NLS_SORT parameter.
- A period (.) does not match the newline character.
- The source string is treated as a single line.
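The match_parameter rules above can be sketched outside the database. The following Python snippet is only an illustration and is not how Oracle implements the condition: the helper name to_python_flags is hypothetical, and Python's re flags are used as rough stand-ins ('n' approximated by re.DOTALL, 'm' by re.MULTILINE). The NLS_SORT-dependent default is not modeled; the sketch assumes case-sensitive matching when neither 'i' nor 'c' is given.

```python
import re


def to_python_flags(match_parameter: str = "") -> int:
    """Hypothetical helper: map an Oracle-style match_parameter string
    to roughly equivalent Python re flags."""
    case_insensitive = False  # assumption: case-sensitive unless 'i' is given
    dot_matches_newline = False
    multiline = False
    for ch in match_parameter:
        if ch == "i":
            case_insensitive = True
        elif ch == "c":
            case_insensitive = False  # a later 'c' overrides an earlier 'i'
        elif ch == "n":
            dot_matches_newline = True
        elif ch == "m":
            multiline = True
        else:
            # Mirrors the rule that any other character is an error.
            raise ValueError(f"invalid match_parameter character: {ch!r}")
    flags = 0
    if case_insensitive:
        flags |= re.IGNORECASE
    if dot_matches_newline:
        flags |= re.DOTALL
    if multiline:
        flags |= re.MULTILINE
    return flags


# 'ic' ends in 'c', so the last value wins and matching is case-sensitive.
print(re.search("steven", "Steven", to_python_flags("ic")) is None)      # True
# 'ci' ends in 'i', so matching is case-insensitive.
print(re.search("steven", "Steven", to_python_flags("ci")) is not None)  # True
```

Processing the characters left to right is what makes the last of two contradictory values win.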
Examples

The following query returns the first and last names for those employees with a first name of Steven or Stephen (where first_name begins with Ste and ends with en and in between is either v or ph):

SELECT first_name, last_name
FROM employees
WHERE REGEXP_LIKE (first_name, '^Ste(v|ph)en$');

FIRST_NAME           LAST_NAME
-------------------- -------------------------
Steven               King
Steven               Markle
Stephen              Stiles
The following query returns the last name for those employees with a double vowel in their last name (where last_name contains two adjacent occurrences of either a, e, i, o, or u, regardless of case):
SELECT last_name
FROM employees
WHERE REGEXP_LIKE (last_name, '([aeiou])\1', 'i');

LAST_NAME
-------------------------
De Haan
Greenberg
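The two patterns used in these queries can also be tried outside the database. The sketch below uses Python's re module as a stand-in for REGEXP_LIKE; it is not Oracle, and the two engines are not guaranteed to agree on every pattern. The names come from the result sets shown above, except for the row marked hypothetical, which was added to show a non-match. re.search is used because REGEXP_LIKE is true when the pattern matches anywhere in the source string.

```python
import re

# (first_name, last_name) rows; the last row is a hypothetical non-match.
rows = [
    ("Steven", "King"),
    ("Steven", "Markle"),
    ("Stephen", "Stiles"),
    ("Stefan", "Smith"),  # hypothetical row, not from the source
]

# First name is Steven or Stephen: 'Ste', then 'v' or 'ph', then 'en'.
steves = [(first, last) for first, last in rows
          if re.search(r"^Ste(v|ph)en$", first)]

# Last names taken from the two result sets shown above.
last_names = ["King", "Markle", "Stiles", "De Haan", "Greenberg"]

# Double vowel: a vowel followed by the same vowel, ignoring case ('i').
double_vowels = [name for name in last_names
                 if re.search(r"([aeiou])\1", name, re.IGNORECASE)]

print(steves)         # [('Steven', 'King'), ('Steven', 'Markle'), ('Stephen', 'Stiles')]
print(double_vowels)  # ['De Haan', 'Greenberg']
```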
Copyright 1996, 2003 Oracle Corporation. All Rights Reserved.
Oracle Database Application Developer's Guide - Fundamentals 10g Release 1 (10.1) Part Number B10795-01
- What are Regular Expressions?
- Oracle Database Regular Expression Support
- Oracle Database SQL Functions for Regular Expressions
- Metacharacters Supported in Regular Expressions
- Constructing Regular Expressions

See Also:

- Oracle Database SQL Reference for additional details on Oracle Database SQL functions for regular expressions
- Oracle Database Globalization Support Guide for details on using SQL regular expression functions in a multilingual environment
- Mastering Regular Expressions published by O'Reilly & Associates, Inc.
For example, a regular expression such as a(b|c)d searches for the pattern: 'a', followed by either 'b' or 'c', then followed by 'd'. This regular expression matches both 'abd' and 'acd'. A regular expression is specified using two types of characters:

- Metacharacters--operators that specify algorithms for performing the search.
- Literals--the actual characters to search for.
Note: The interpretation of metacharacters differs between tools that support regular expressions in the industry. If you are porting regular expressions from another environment to Oracle Database, ensure that the regular expression syntax is supported and the behavior is what you expect.
The database provides a set of SQL functions that allow you to search and manipulate strings using regular expressions. You can use these functions on any datatype that holds character data such as CHAR, NCHAR, CLOB, NCLOB, NVARCHAR2, and VARCHAR2. A regular expression must be enclosed between single quotes. Doing so ensures that the entire expression is interpreted by the SQL function and can improve the readability of your code. Table 12-1 gives a brief description of each regular expression function.
Note: As with all text literals used in SQL functions, regular expressions must be enclosed or wrapped between single quotes. If your regular expression includes the single quote character, enter two single quotation marks to represent one single quotation mark within your expression.
SQL Function      Description
REGEXP_LIKE       This function searches a character column for a pattern. Use this function in the WHERE clause of a query to return rows matching the regular expression you specify. See the Oracle Database SQL Reference for syntax details on the REGEXP_LIKE function.
REGEXP_REPLACE    This function searches for a pattern in a character column and replaces each occurrence of that pattern with the pattern you specify. See the Oracle Database SQL Reference for syntax details on the REGEXP_REPLACE function.
REGEXP_INSTR      This function searches a string for a given occurrence of a regular expression pattern. You specify which occurrence you want to find and the start position to search from. This function returns an integer indicating the position in the string where the match is found. See the Oracle Database SQL Reference for syntax details on the REGEXP_INSTR function.
REGEXP_SUBSTR     This function returns the actual substring matching the regular expression pattern you specify. See the Oracle Database SQL Reference for syntax details on the REGEXP_SUBSTR function.
Metacharacter Syntax  Operator Name                          Description
.                     Any Character--Dot                     Matches any single character in the current character set
+                     One or More--Plus Quantifier           Matches one or more occurrences of the preceding subexpression
?                     Zero or One--Question Mark Quantifier  Matches zero or one occurrence of the preceding subexpression
*                     Zero or More--Star Quantifier          Matches zero or more occurrences of the preceding subexpression
{m}                   Interval--Exact Count                  Matches exactly m occurrences of the preceding subexpression
{m,}                  Interval--At Least Count               Matches at least m occurrences of the preceding subexpression
{m,n}                 Interval--Between Count                Matches at least m, but not more than n occurrences of the preceding subexpression
[ ... ]               Matching Character List                Matches any character in list ...
[^ ... ]              Non-Matching Character List            Matches any character not in list ...
|                     Or                                     Matches either alternative; for example, 'a|b' matches 'a' or 'b'
( ... )               Subexpression or Grouping              Treat expression ... as a unit. The subexpression can be a string of literals or a complex expression containing operators.
\n                    Backreference                          Matches the nth preceding subexpression, where n is an integer from 1 to 9.
\                     Escape Character                       Treat the subsequent metacharacter in the expression as a literal.
^                     Beginning of Line Anchor               Match the subsequent expression only when it occurs at the beginning of a line.
$                     End of Line Anchor                     Match the preceding expression only when it occurs at the end of a line.
[:class:]             POSIX Character Class                  Match any character belonging to the specified character class. Can be used inside any list expression.
[.element.]           POSIX Collating Sequence               Specifies a collating sequence to use in the regular expression. The element you use must be a defined collating sequence in the current locale.
[=character=]         POSIX Character Equivalence Class      Match characters having the same base character as the character you specify.
The dot operator '.' matches any single character in the current character set. For example, to find the sequence--'a', followed by any character, followed by 'c'--use the expression:
a.c
One or More--Plus

The one or more operator '+' matches one or more occurrences of the preceding expression. For example, to find one or more occurrences of the character 'a', you use the regular expression:
a+
Zero or One--Question Mark Operator

The question mark matches zero or one--and only one--occurrence of the preceding character or subexpression. You can think of this operator as specifying an expression that is optional in the source text. For example, to find--'a', optionally followed by 'b', then followed by 'c'--you use the following regular expression:
ab?c
Zero or More--Star

The zero or more operator '*' matches zero or more occurrences of the preceding character or subexpression. For example, to find--'a', followed by zero or more occurrences of 'b', then followed by 'c'--use the regular expression:
ab*c
Interval--Exact Count

The exact-count interval operator is specified with a single digit enclosed in braces. You use this operator to search for an exact number of occurrences of the preceding character or subexpression. For example, to find where 'a' occurs exactly 5 times, you specify the regular expression:
a{5}
Interval--At Least Count

You use the at-least-count interval operator to search for a specified number of occurrences, or more, of the preceding character or subexpression. For example, to find where 'a' occurs at least 3 times, you use the regular expression:
a{3,}
Interval--Between Count

You use the between-count interval operator to search for a number of occurrences within a specified range. For example, to find where 'a' occurs at least 3 times and no more than 5 times, you use the following regular expression:
a{3,5}
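The quantifier and interval operators described above can be checked with a short script. This is a minimal sketch using Python's re module as a stand-in, on the assumption that these basic operators behave the same way in both engines; it is not Oracle. The helper matches_whole is hypothetical and uses re.fullmatch so that each pattern must account for the entire test string, which isolates the occurrence counts. An unanchored search would also find a{5} inside a longer run of 'a' characters.

```python
import re


def matches_whole(pattern: str, text: str) -> bool:
    """Hypothetical helper: True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


# (pattern, text) pairs exercising each operator described above.
cases = [
    ("a+", "aaa"),         # one or more 'a'
    ("a+", ""),            # zero 'a' characters is not enough
    ("ab?c", "ac"),        # 'b' is optional
    ("ab?c", "abbc"),      # but at most one 'b'
    ("ab*c", "abbbc"),     # any number of 'b' characters
    ("a{5}", "aaaaa"),     # exactly five
    ("a{5}", "aaaa"),      # four is too few
    ("a{3,}", "aaaa"),     # three or more
    ("a{3,5}", "aaaa"),    # between three and five
    ("a{3,5}", "aaaaaa"),  # six is too many
]

results = [matches_whole(pattern, text) for pattern, text in cases]
for (pattern, text), ok in zip(cases, results):
    print(f"{pattern!r} on {text!r}: {ok}")
```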
Matching Character List

You use the matching character list to search for an occurrence of any character in a list. For example, to find either 'a', 'b', or 'c' use the following regular expression:
[abc]
This expression matches the first character in each of the following strings:
at bet cot
The following regular expression operators are allowed within the character list; any other metacharacters included in a character list lose their special meaning (are treated as literals):

- Range operator '-'
- POSIX character class [::]
- POSIX collating sequence [. .]
- POSIX character equivalence class [= =]
Non-Matching Character List

Use the non-matching character list to specify characters that you do not want to match. Characters that are not in the non-matching character list are returned as a match. For example, to exclude the characters 'a', 'b', and 'c' from your search results, use the following regular expression:
[^abc]
This expression matches characters 'd' and 'g' in the following strings:
abcdef ghi
As with the matching character list, the following regular expression operators are allowed within the non-matching character list (any other metacharacters included in a character list are treated as literals):

- Range operator '-'
- POSIX character class [::]
- POSIX collating sequence [. .]
- POSIX character equivalence class [= =]
For example, the following regular expression excludes any character between 'a' and 'i' from the search result:
[^a-i]
This expression matches the characters 'j' and 'l' in the following strings:
hijk lmn
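The character-list examples above can be reproduced with a few lines of Python. This is a sketch that uses the re module as a stand-in for the Oracle functions, assuming plain character lists and ranges behave identically in both engines; the helper first_match is hypothetical and returns the first matching substring, or None when nothing matches.

```python
import re


def first_match(pattern: str, text: str):
    """Hypothetical helper: return the first substring that matches, or None."""
    found = re.search(pattern, text)
    return found.group(0) if found else None


# Matching list: the first character of each string is in [abc].
print([first_match("[abc]", s) for s in ("at", "bet", "cot")])  # ['a', 'b', 'c']

# Non-matching list: the first character that is not 'a', 'b', or 'c'.
print([first_match("[^abc]", s) for s in ("abcdef", "ghi")])    # ['d', 'g']

# Non-matching list with a range: the first character outside 'a' through 'i'.
print([first_match("[^a-i]", s) for s in ("hijk", "lmn")])      # ['j', 'l']
```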
Or

Use the Or operator '|' to specify an alternate expression. For example, to match 'a' or 'b', use the following regular expression:
a|b
Subexpression

You can use the subexpression operator to group characters that you want to find as a string or to create a complex expression. For example, to find the optional string 'abc', followed by 'def', use the following regular expression:
(abc)?def
This expression matches strings 'abcdef' and 'def' in the following strings:
abcdefghi defghi
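A short script can show why the grouping matters. The sketch below uses Python's re module as a stand-in, on the assumption that alternation and grouping behave the same in both engines for these simple patterns. The helper first_match is hypothetical, and the string 'abdef' is an added test input that does not appear in the text above.

```python
import re


def first_match(pattern: str, text: str):
    """Hypothetical helper: return the first substring that matches, or None."""
    found = re.search(pattern, text)
    return found.group(0) if found else None


# Or: either alternative is accepted.
print(first_match("a|b", "xbz"))            # b

# Subexpression: the whole group 'abc' is optional.
print(first_match("(abc)?def", "abcdefghi"))  # abcdef
print(first_match("(abc)?def", "defghi"))     # def

# Without the parentheses, '?' applies only to the preceding 'c'.
print(first_match("abc?def", "abdef"))      # abdef
print(first_match("(abc)?def", "abdef"))    # def
```

The last two lines differ only in the parentheses: grouping changes what the question mark makes optional.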
Backreference

The backreference lets you search for a repeated expression. You specify a backreference with '\n', where n is an integer from 1 to 9 indicating the nth preceding subexpression in your regular expression. For example, to find a repeated occurrence of either string 'abc' or 'def', use the following regular expression:
(abc|def)\1
The backreference counts subexpressions from left to right starting with the opening parenthesis of each preceding subexpression. The backreference lets you search for a repeated string without knowing the actual string ahead of time. For example, the regular expression:
^(.*)\1$
matches a line consisting of two adjacent appearances of the same string.

Escape Character

Use the escape character '\' to search for a character that is normally treated as a metacharacter. For example, to search for the '+' character, use the following regular expression:
\+
This expression matches the plus character '+' in the following string:
abc+def
It does not match any character in the following string:
abcdef
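The backreference and escape examples can be tried in a few lines. This is a sketch with Python's re module standing in for the Oracle functions, assuming \1 and \+ carry the same meaning in both engines. The helper first_match is hypothetical, and the test strings other than 'abc+def' and 'abcdef' are made up for illustration.

```python
import re


def first_match(pattern: str, text: str):
    """Hypothetical helper: return the first substring that matches, or None."""
    found = re.search(pattern, text)
    return found.group(0) if found else None


# \1 must repeat whatever the first group actually matched.
print(first_match(r"(abc|def)\1", "xyzabcabc"))  # abcabc
print(first_match(r"(abc|def)\1", "abcdef"))     # None: 'abc' is not followed by 'abc'

# A whole line made of two adjacent copies of the same string.
print(first_match(r"^(.*)\1$", "hellohello"))    # hellohello
print(first_match(r"^(.*)\1$", "hello"))         # None

# The escape character turns the '+' metacharacter into a literal.
print(first_match(r"\+", "abc+def"))             # +
print(first_match(r"\+", "abcdef"))              # None
```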
Beginning of Line Anchor

Use the beginning of line anchor '^' to search for an expression that occurs only at the beginning of a line. For example, to find an occurrence of the string def at the beginning of a line, use the expression:
^def
End of Line Anchor

The end of line anchor metacharacter '$' lets you search for an expression that occurs only at the end of a line. For example, to find an occurrence of def that occurs at the end of a line, use the following expression:
def$
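Whether an anchor applies to each line or only to the whole string depends on the matching mode. The sketch below uses Python's re module, with re.MULTILINE as a rough stand-in for treating the source string as multiple lines; this correspondence is an assumption made for illustration, and the two engines may differ in details such as how a trailing newline is handled. The two-line test strings are made up.

```python
import re

starts_on_second_line = "abc\ndef"
ends_on_first_line = "def\nabc"

# Single-line treatment: '^' matches only at the start of the whole string.
single_start = re.search(r"^def", starts_on_second_line)
# Multiple-line treatment: '^' also matches at the start of each line.
multi_start = re.search(r"^def", starts_on_second_line, re.MULTILINE)

# Single-line treatment: '$' does not match before an interior newline.
single_end = re.search(r"def$", ends_on_first_line)
# Multiple-line treatment: '$' also matches at the end of each line.
multi_end = re.search(r"def$", ends_on_first_line, re.MULTILINE)

print(single_start)         # None
print(multi_start.start())  # 4
print(single_end)           # None
print(multi_end.start())    # 0
```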
POSIX Character Class

The POSIX character class operator lets you search for an expression within a character list that is a member of a specific POSIX Character Class. You can use this operator to search for characters with specific formatting such as uppercase characters, or you can search for special characters such as digits or punctuation characters. The full set of POSIX character classes is supported.
To use this operator, specify the expression using the syntax [:class:] where class is the name of the POSIX character class to search for. For example, to search for one or more consecutive uppercase characters, use the following regular expression:
[[:upper:]]+
The expression does not return a match for the following string:
abcdefghi
Note that the character class must occur within a character list, so the character class is always nested within the brackets for the character list in the regular expression.

See Also: Mastering Regular Expressions published by O'Reilly & Associates, Inc. for more information on POSIX character classes

POSIX Collating Sequence

The POSIX collating sequence element operator [. .] lets you use a collating sequence in your regular expression. The element you specify must be a defined collating sequence in the current locale. This operator lets you use a multicharacter collating sequence in your regular expression where only one character would otherwise be allowed. For example, you can use this operator to ensure that the collating sequence 'ch', when defined in a locale such as Spanish, is treated as one character in operations that depend on the ordering of characters.

To use the collating sequence operator, specify [.element.] where element is the collating sequence you want to find. You can use any collating sequence that is defined in the current locale including single-character elements as well as multicharacter elements. For example, to find the collating sequence 'ch', use the following regular expression:
[[.ch.]]
You can use the collating sequence operator in any regular expression where collation is needed. For example, to specify the range from 'a' to 'ch', you can use the following expression:
[a-[.ch.]]
POSIX Character Equivalence Class

Use the POSIX character equivalence class operator to search for characters in the current locale that are equivalent. For example, you can use it to find the Spanish character 'ñ' as well as 'n'. To use this operator, specify [=character=] to find all characters that are members of the same character equivalence class as the specified character. For example, the following regular expression could be used to search for characters equivalent to 'n' in a Spanish locale:
[[=n=]]
This expression matches both 'N' and 'ñ' in the following string:
El Niño
Note:

- The character equivalence class must occur within a character list, so the character equivalence class is always nested within the brackets for the character list in the regular expression.
- Usage of character equivalents depends on how canonical rules are defined for your database locale. See the Oracle Database Globalization Support Guide for more information on linguistic sorting and string searching.