December 30, 2017

Regular expression(Regex, Regexp)

❑ Regular expression(Regex, Regexp) : An expression that describes a pattern, used to search for strings that follow a particular rule.

❑ Format of regular expression : Delimiter(start)-Regex pattern-Delimiter(end)-Flag
* e.g. /<(h[1-6])>[a-\w\s']+<\/\1>/gi
* /^[0-9]{2,3}-[0-9]{3,4}-[0-9]{4}$/gi → 000-0000-0000(phone number).
* /^[a-zA-Z0-9]+$/gi → letters and digits only.
* Each programming language uses a slightly different format.
❍ Delimiter : By convention, "/" is used. JavaScript literals accept only "/", while PHP (PCRE) accepts any character that is not alphanumeric, whitespace, or "\".
❍ Regex pattern : The string that describes what to match. It consists of literal characters and meta characters.
❍ Flag
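- i : case Insensitive. Letters match regardless of upper or lower case.
- s : dotAll. The metacharacter "." also matches newline characters(\n).
- g : Global. Finds every match instead of stopping at the first one.
- m : Multiline. "^" and "$" match at the start and end of each line, not only of the whole string.
- y : stickY. Matches only at the exact position given by the lastIndex property, not anywhere after it.
- x : eXtended. Ignores whitespace characters inside the pattern. Supported by engines such as PCRE and Python (re.VERBOSE), but not by JavaScript.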
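
The format and flags above can be tried out with a short sketch. JavaScript is assumed here because the flags listed (y, lastIndex) follow its conventions; the sample strings and variable names are made up for illustration.

```javascript
// Patterns taken from the examples above (flags dropped where they change nothing).
const phone = /^[0-9]{2,3}-[0-9]{3,4}-[0-9]{4}$/;
const alnum = /^[a-zA-Z0-9]+$/;

const phoneOk = phone.test("010-1234-5678");   // true
const phoneBad = phone.test("010-12-5678");    // false: middle group is too short
const alnumOk = alnum.test("abc123");          // true
const alnumBad = alnum.test("abc 123");        // false: contains a space

// i: case-insensitive matching.
const iMatch = /hello/i.test("HELLO");         // true

// g: collect every match instead of only the first one.
const gMatches = "a1b22c333".match(/[0-9]+/g); // ["1", "22", "333"]

// m: ^ and $ match at the start and end of each line.
const text = "one\ntwo\nthree";
const mMatches = text.match(/^t\w+$/gm);       // ["two", "three"]

// s: "." also matches a newline.
const dotNoS = /a.b/.test("a\nb");             // false
const dotWithS = /a.b/s.test("a\nb");          // true

// y: the match must start exactly at lastIndex.
const sticky = /foo/y;
sticky.lastIndex = 3;
const yHit = sticky.test("barfoo");            // true: "foo" starts at index 3
sticky.lastIndex = 1;
const yMiss = sticky.test("barfoo");           // false: index 1 starts with "a"
```

The x flag is left out because JavaScript does not support it.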

❑ Meta characters : Characters that have a special meaning inside a regular expression pattern.
Meta character : Function
. : Matches any single character (a newline is excluded unless the s flag is set).
e.g. /.../ → abc, bcd, cde
^ : Matches only at the beginning of the string.
e.g. /^ab/ → abbb, abcd
$ : Matches only at the end of the string.
e.g. /ab$/ → ccab, ooab
+ : The preceding element is repeated 1 or more times.
e.g. /a+b/ → aaaaab, aab
* : The preceding element is repeated 0 or more times.
e.g. /a*b/ → b, aab
? : The preceding element is optional (0 or 1 time).
e.g. /a?b/ → b, ab
| : OR operation. Matches the pattern on the left or the pattern on the right.
e.g. /a|b/ → a, b, ab (in "ab", the "a" is found first)
{n,m} : The preceding element is repeated n to m times.
e.g. /a{0,2}b/ → b, aab
{n} : The preceding element is repeated exactly n times.
e.g. /fa{3}b/ → faaab
{n,} : The preceding element is repeated n times or more.
e.g. /a{0,}b/ → b, aaaaab
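[] : Character class. Matches any single character listed inside the brackets.
e.g. /[a-zA-Z]/ → a, k, A, W
[^] : Negated character class. A "^" at the start of the class means NOT: it matches any single character that is not listed.
e.g. /[^a]bc/ → bbc, cbc, dbc
\ : Escape. Makes the following special character literal(\\, \(, \^ etc.)
e.g. /\^bc*/ → ^bc, ^bca (the "^bc" part is matched)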
\b : Word boundary. Matches the position between a word character and a non-word character, or the start or end of the string.
e.g. str="good password"
/password\b/ → password
/pass\b/ → null
/\bgood/ → good
\B : Non-boundary. Matches a position that is not a word boundary.
e.g. str="good password"
/pass\B/ → pass
/\Bpass/ → null
\d : Matches a digit (0-9).
e.g. /\dusd/ → 5usd, 1usd
\D : Matches a non-digit character.
e.g. /aa\D/ → aak, aam
\s : Matches a whitespace character(Space, Tab, Carriage Return, New line, Vertical tab, Form feed).
e.g. /a\sbde/ → a bde
\S : Matches a non-whitespace character.
e.g. /a\Sbde/ → acbde, a1bde
\n : Matches a new line character.
\r : Matches a carriage return character.
\t : Matches a tab character.
\v : Matches a vertical tab character.
\w : Matches a word character: a letter, a digit, or an underscore ([A-Za-z0-9_] in JavaScript).
\W : Matches a non-word character: anything that is not a letter, a digit, or an underscore.
\x : \xhh matches the character with the hexadecimal code hh.
e.g. /\x61\x62c/ → abc
\u : \uhhhh matches the Unicode character with the hexadecimal code hhhh.
e.g. /\u0023\u0024/ → #$
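() : Capturing group. Stores the matched text so that it can be referenced later.
e.g. str="99092202"
/(9)\1/ → 99
(?:) : Non-capturing group. Groups the pattern without storing the matched text.
e.g. str="99092202"
/(?:9)\1/ → null (there is no group 1 to refer back to)
(?=) : Positive lookahead. Matches only if the given pattern follows; the text that follows is not part of the result.
e.g. str="99092202"
/09(?=220)/ → 09
(?!) : Negative lookahead. Matches only if the given pattern does not follow.
e.g. str="99092202"
/9909(?!2202)/ → null
\1, \2, ... : Back reference. Matches again the same text that the n-th capturing group matched.
e.g. str="990902202"
/(9)(90)\2/ → 99090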
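
Several of the quantifier, character class, boundary and escape examples from the table can be checked with a minimal sketch like the following. JavaScript is assumed; `first` is a hypothetical helper written for this illustration, not a built-in, and some of the input strings are made up.

```javascript
// Hypothetical helper: returns the first matched text, or null when nothing matches.
const first = (re, str) => {
  const m = str.match(re);
  return m ? m[0] : null;
};

// Quantifiers
const plus = first(/a+b/, "xaaab");        // "aaab"
const star = first(/a*b/, "b");            // "b"
const opt = first(/a?b/, "ab");            // "ab"
const range = first(/a{0,2}b/, "aaab");    // "aab": at most two "a" before "b"
const exact = first(/fa{3}b/, "faaab");    // "faaab"

// Character classes and escaping
const neg = first(/[^a]bc/, "abc dbc");    // "dbc": "abc" is skipped
const esc = first(/\^bc*/, "^bca");        // "^bc": "\^" is a literal caret

// Word boundaries, using the string from the table
const s = "good password";
const b1 = first(/password\b/, s);         // "password"
const b2 = first(/pass\b/, s);             // null: "pass" is followed by "w"
const b3 = first(/pass\B/, s);             // "pass"
const b4 = first(/\Bpass/, s);             // null: "pass" starts at a boundary

// Shorthand classes and code escapes
const d = first(/\dusd/, "5usd");          // "5usd"
const ws = first(/a\sbde/, "a bde");       // "a bde"
const nws = first(/a\Sbde/, "a1bde");      // "a1bde"
const hex = first(/\x61\x62c/, "abc");     // "abc"
const uni = first(/\u0023\u0024/, "#$");   // "#$"
```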

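Groups, lookaheads and back references are easier to follow when the captured parts are inspected directly. The sketch below assumes JavaScript and reuses the digit strings from the table; the "abab" input and the variable names are made up for illustration.

```javascript
const digits = "99092202";

// Capturing group: index 0 is the whole match, index 1 is group 1.
const captured = digits.match(/(9)\1/);
const whole = captured[0];                        // "99"
const group1 = captured[1];                       // "9"

// Non-capturing group: groups the pattern but stores no numbered capture.
const nonCap = "abab".match(/(?:ab)+/);
const nonCapWhole = nonCap[0];                    // "abab"
const nonCapLength = nonCap.length;               // 1: only the whole match

// Positive lookahead: "09" must be followed by "220", which stays out of the result.
const ahead = digits.match(/09(?=220)/);
const aheadText = ahead[0];                       // "09"
const aheadIndex = ahead.index;                   // 2

// Negative lookahead: fails because "9909" IS followed by "2202".
const notAhead = digits.match(/9909(?!2202)/);    // null
// It succeeds when the following text differs.
const notAhead2 = digits.match(/99(?!2202)/)[0];  // "99"

// Back reference to the second group.
const back = "990902202".match(/(9)(90)\2/);
const backWhole = back[0];                        // "99090"
const backGroup2 = back[2];                       // "90"
```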
※ Practice for regular expression(Link)