<p>Last time we <a href="https://www.sisense.com/blog/string-parsing-in-sql/">talked about matching strings in SQL</a>, we covered tactics that work well for simple strings. For strings with more complicated patterns, regular expressions are a handy tool to have in your belt.</p>



<p>Regular expressions are easy to get started with, so let’s jump right in. Postgres and Redshift follow the POSIX standard for regular expressions, so for this post we will focus on that.</p>



<h2 class="wp-block-heading"><strong>Comparison Operators</strong></h2>



<p>Comparison operators test whether the string on the left matches the regular expression on the right.</p>



<ul><li>~ matches a regular expression and is case sensitive. E.g. &#8216;a&#8217; ~ &#8216;A&#8217; will return false.</li><li>~* matches a regular expression and is case insensitive. E.g. &#8216;a&#8217; ~* &#8216;A&#8217; will return true.</li><li>!~ does not match a regular expression and is case sensitive. E.g. &#8216;a&#8217; !~ &#8216;A&#8217; will return true.</li><li>!~* does not match a regular expression and is case insensitive. E.g. &#8216;a&#8217; !~* &#8216;A&#8217; will return false.</li></ul>
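
<p>The four results above can be checked outside the database. The sketch below uses Python's re module as a stand-in, which is an assumption: it is not the POSIX engine Postgres and Redshift use, but it agrees with it for simple cases like these. The helper name matches is hypothetical.</p>

```python
import re

def matches(text: str, pattern: str, case_sensitive: bool = True) -> bool:
    """Rough stand-in for SQL's ~ (case_sensitive=True) and ~* (False)."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # re.search, like ~, succeeds if the pattern matches anywhere in the text.
    return re.search(pattern, text, flags) is not None

tilde = matches("a", "A")                                      # 'a' ~ 'A'
tilde_star = matches("a", "A", case_sensitive=False)           # 'a' ~* 'A'
not_tilde = not matches("a", "A")                              # 'a' !~ 'A'
not_tilde_star = not matches("a", "A", case_sensitive=False)   # 'a' !~* 'A'

print(tilde, tilde_star, not_tilde, not_tilde_star)
```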



<h2 class="wp-block-heading"><strong>OR Operator</strong></h2>



<p>The <strong>or</strong> operator is denoted by a pipe or vertical bar: |. This matches one of multiple alternatives, e.g. 1|2 matches the number 1 or the number 2.</p>
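
<p>As a quick illustration, here is the 1|2 pattern tried against a few strings. Python's re module is used as an assumed stand-in for the POSIX engine, and contains_match is a hypothetical helper. Because the pattern is not anchored, it also matches a longer string that merely contains a 1 or a 2.</p>

```python
import re

# The pattern from the text: the digit 1 or the digit 2.
pattern = r"1|2"

def contains_match(text: str) -> bool:
    # Unanchored search: true if either alternative appears anywhere.
    return re.search(pattern, text) is not None

for candidate in ["1", "2", "3", "x2y"]:
    print(candidate, contains_match(candidate))
```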



<h2 class="wp-block-heading"><strong>Groups</strong></h2>



<p>With regular expressions, we say that a string matches a particular pattern. While you can match exactly one item, or a simple set of alternatives, regular expressions are most useful when we combine multiple groups.</p>



<p>A group is specified with parentheses. These define the scope of the operators. For example, if we wanted to match 111 or 121, the pattern 1(1|2)1 would match that.</p>
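
<p>To see why the parentheses matter, the sketch below compares 1(1|2)1 with the ungrouped 11|21, which reads as 11 or 21. It runs in Python's re module as an assumed stand-in for the POSIX engine; the helper found is hypothetical.</p>

```python
import re

def found(pattern: str, text: str) -> bool:
    # Unanchored search, similar in spirit to the ~ operator.
    return re.search(pattern, text) is not None

grouped = r"1(1|2)1"   # a 1, then 1 or 2, then a 1
ungrouped = r"11|21"   # without parentheses: "11" or "21"

for text in ["111", "121", "131", "11", "21"]:
    print(text, found(grouped, text), found(ungrouped, text))
```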



<h2 class="wp-block-heading"><strong>Repetition</strong></h2>



<p>Being able to match very specific patterns is great, but often we want to match things that repeat. The operators below express how many times the preceding pattern should repeat.</p>



<ul><li>? matches zero or one occurrence</li><li>* matches zero or more occurrences</li><li>+ matches one or more occurrences</li><li>{n} matches exactly n occurrences of the preceding pattern</li><li>{n,} matches n or more occurrences</li><li>{n,m} matches between n and m occurrences</li></ul>
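
<p>Each repetition operator can be tried on a small pattern where only the letter b repeats. This is a sketch in Python's re module, assumed here as a stand-in for the POSIX engine. The hypothetical helper whole uses re.fullmatch, which requires the entire string to match so the counts are easy to see.</p>

```python
import re

def whole(pattern: str, text: str) -> bool:
    # fullmatch requires the entire string to match the pattern.
    return re.fullmatch(pattern, text) is not None

checks = [
    (r"ab?c", "ac"),       # ? allows zero b's
    (r"ab?c", "abbc"),     # ? does not allow two b's
    (r"ab*c", "abbbc"),    # * allows any number of b's
    (r"ab+c", "ac"),       # + needs at least one b
    (r"ab{2}c", "abbc"),   # {2} needs exactly two b's
    (r"ab{2,}c", "abbbc"), # {2,} needs two or more b's
    (r"ab{1,2}c", "abbbc") # {1,2} allows one or two b's, not three
]

for pattern, text in checks:
    print(pattern, text, whole(pattern, text))
```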



<h2 class="wp-block-heading"><strong>Position Operators</strong></h2>



<p>Position operators anchor a pattern to the beginning or end of the string being searched.</p>



<ul><li>^ matches the beginning of a string. E.g. ^a matches a as the first letter of a string.</li><li>$ matches the end of a string. E.g. a$ matches a as the last letter of a string.</li></ul>
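
<p>The anchors are easiest to see side by side. The sketch below assumes Python's re module as a stand-in for the POSIX engine and uses a hypothetical helper named found; the sample words are made up for illustration.</p>

```python
import re

def found(pattern: str, text: str) -> bool:
    return re.search(pattern, text) is not None

print(found(r"^a", "apple"))    # a is the first letter
print(found(r"^a", "banana"))   # a appears, but not first
print(found(r"a$", "banana"))   # a is the last letter
print(found(r"^a$", "a"))       # the whole string is exactly a
print(found(r"^a$", "aa"))      # too long for ^a$
```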



<h2 class="wp-block-heading"><strong>Other Operators</strong></h2>



<p>Our last set of operators rounds out our selection.</p>



<ul><li>. matches any single character</li><li>[ ] Square brackets match any single character contained in the brackets and can contain ranges. E.g. [a-z] matches any lowercase letter.</li><li>[^ ] matches any single character not contained in the set. E.g. [^123] matches any character that is not 1, 2 or 3.</li></ul>
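
<p>Here is a small sketch of these three operators, each wrapped in ^ and $ so that exactly one character is tested. It assumes Python's re module as a stand-in for the POSIX engine, and the helper found is hypothetical.</p>

```python
import re

def found(pattern: str, text: str) -> bool:
    return re.search(pattern, text) is not None

# . matches exactly one character of any kind
print(found(r"^.$", "x"), found(r"^.$", "xy"))

# [a-z] matches one lowercase letter
print(found(r"^[a-z]$", "q"), found(r"^[a-z]$", "Q"))

# [^123] matches one character that is not 1, 2 or 3
print(found(r"^[^123]$", "4"), found(r"^[^123]$", "2"))
```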



<h2 class="wp-block-heading"><strong>Example</strong></h2>



<p>Suppose we have a campaign column, and we want to find campaigns that do not start with a number and that contain a two-letter, uppercase country code somewhere in the campaign ID. To do this, we would use this pattern:</p>



<p>^[^0-9].*[A-Z]{2}.*</p>



<p>Let’s break it down:</p>



<ul><li>^[^0-9] anchors to the first character, and says that the first character cannot be a digit between 0 and 9.</li><li>.* matches any number of any character</li><li>[A-Z] matches any uppercase letter between A and Z</li><li>{2} says there are two uppercase letters in a row. This is because country codes are two characters.</li><li>The final .* matches whatever follows the country code.</li></ul>
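
<p>To try the pattern on some data, here is a minimal sketch in Python's re module, assumed as a stand-in for the POSIX engine. The campaign IDs are invented for illustration and do not come from a real table.</p>

```python
import re

# The pattern from the example above.
pattern = r"^[^0-9].*[A-Z]{2}.*"

# Hypothetical campaign IDs, made up for this sketch.
campaigns = [
    "summer_US_2024",   # starts with a letter, contains US
    "2024_US_summer",   # starts with a digit
    "summer_sale",      # no two uppercase letters in a row
    "brand_DE",         # starts with a letter, contains DE
    "fall-promo-de",    # country code is lowercase
]

matching = [c for c in campaigns if re.search(pattern, c)]
print(matching)
```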



<p>Note: there are usually many ways to match a string or a pattern with a regular expression, so as an exercise, practice matching things multiple ways. For example, our earlier expression of 1(1|2)1 would match the same set of strings as 1[12]{1}1 and 111|121.</p>
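
<p>One way to practice is to check equivalence by brute force. The sketch below assumes Python's re module as a stand-in for the POSIX engine and tests the three patterns against every string of one to four characters drawn from the digits 0 to 3. The helper matched_set is hypothetical, and re.fullmatch is used so that only whole-string matches count.</p>

```python
import re
from itertools import product

patterns = [r"1(1|2)1", r"1[12]{1}1", r"111|121"]

# Every string of length 1 to 4 over the digits 0-3.
candidates = [
    "".join(chars)
    for length in range(1, 5)
    for chars in product("0123", repeat=length)
]

def matched_set(pattern: str) -> set:
    # Collect the candidates that the pattern matches in full.
    return {s for s in candidates if re.fullmatch(pattern, s)}

sets = [matched_set(p) for p in patterns]
print(sorted(sets[0]))
print(sets[0] == sets[1] == sets[2])
```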



<p>Now it’s time to start matching!</p>

