<p>In this post, we will cover some of the more common MySQL functions that differ in syntax from Redshift, as well as various rules and tricks to keep in mind!</p>



<h2 class="wp-block-heading">General Differences Between MySQL And Redshift</h2>



<h3 class="wp-block-heading">Grouping</h3>



<p>One of the most common pitfalls when converting MySQL syntax to Redshift involves the group by requirements. Redshift is more stringent: every column or expression in the select list that is not wrapped in an aggregate function must also appear in the group by clause. MySQL, when its ONLY_FULL_GROUP_BY SQL mode is disabled, allows a query like this:</p>



<pre class="wp-block-code"><code>select
   [created_at:week],
   country,
   count(1)
from
   orders
group by
   1</code></pre>



<p>This same query would give an error in Redshift, as it would require the country column to be included in the group by clause.</p>
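


<p>A minimal sketch of a version Redshift should accept, assuming the same orders table. Here date_trunc('week', created_at) stands in for the [created_at:week] shorthand, which is an assumption about how that shorthand expands, and the column aliases are illustrative. The important change is that both non-aggregated columns are listed in the group by clause:</p>



<pre class="wp-block-code"><code>-- Redshift: every non-aggregated select column appears in group by
select
   date_trunc('week', created_at) as created_week,
   country,
   count(1) as order_count
from
   orders
group by
   1, 2</code></pre>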



<h3 class="wp-block-heading">Using Variables</h3>



<p>Unlike MySQL, Redshift does not allow you to define variables within a query. To compensate for this, Redshift includes window functions, which compute a value across an ordered set of rows in much the same way an iterating variable would in MySQL. Let’s take a look at a query that calculates the cumulative sum of a column.</p>



<pre class="wp-block-code"><code>-- MySQL: Add Cumulative Column
set @iterating_variable := 0;
select
   created_at
   , number_of_orders
   , (@iterating_variable := 
       @iterating_variable + number_of_orders)
from 
   orders
order by
   created_at</code></pre>



<p>To accomplish the same task in Redshift, we can use the sum() window function:</p>



<pre class="wp-block-code"><code>-- Redshift: Add Cumulative Column
select
   created_at
   , number_of_orders
   -- running total, accumulated in the same order as the result set
   , sum(number_of_orders) 
       over (order by created_at rows unbounded preceding)
from
   orders
order by
   created_at</code></pre>



<p>Now both of these queries display the number of orders by day as well as the running cumulative total of orders!</p>



<figure class="wp-block-image fancybox"><img decoding="async" src="https://cdn.sisense.com/wp-content/uploads/redshift-bar.png" alt="Redshift bar" class="wp-image-79962"/></figure>



<h3 class="wp-block-heading">Subqueries</h3>



<p>One downside to MySQL, particularly in versions before 8.0 (which added support for with clauses), is its reliance upon subqueries. If your main query needs several subqueries, it can quickly get overwhelming. Redshift allows you to use with clauses, also known as common table expressions (CTEs), to build temporary tables that only exist within the query.</p>



<p>While MySQL also has temporary tables, they can only be called once within a query. Redshift’s temporary tables, created through these with clauses, can be referenced multiple times in the query!</p>
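


<p>As a rough illustration, the sketch below defines one with clause and references it twice in the main query to compare each day with the previous one. It assumes the orders table has a created_at timestamp; the daily_orders name and the this_day and prior_day aliases are made up for the example:</p>



<pre class="wp-block-code"><code>-- Redshift: one with clause, referenced twice in the main query
with daily_orders as (
   select
      created_at::date as order_date
      , count(1) as number_of_orders
   from
      orders
   group by
      1
)
select
   this_day.order_date
   , this_day.number_of_orders
   , prior_day.number_of_orders as previous_day_orders
from
   daily_orders as this_day
   left join daily_orders as prior_day
      on prior_day.order_date = this_day.order_date - 1
order by
   this_day.order_date</code></pre>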



<h3 class="wp-block-heading">Generating Series</h3>



<p>Neither MySQL nor Redshift has a built-in function to generate a series of dates or values. However, they each have a couple of clever options to imitate this behavior. In fact, we already have a great blog post that covers these methods in detail!</p>
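


<p>One commonly used workaround in Redshift, sketched here and not necessarily the approach that post recommends, is to number the rows of an existing table and turn those numbers into dates. The example assumes the orders table has at least 30 rows; the numbers and series_date names are illustrative:</p>



<pre class="wp-block-code"><code>-- Redshift: the last 30 calendar dates, built from row numbers
with numbers as (
   select
      row_number() over (order by created_at) - 1 as n
   from
      orders
)
select
   dateadd(day, (-1 * n)::int, current_date)::date as series_date
from
   numbers
where
   n between 0 and 29
order by
   series_date</code></pre>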



<h3 class="wp-block-heading">Calling Names with Spaces or Reserved Words</h3>



<p>In terms of differences between the two SQL dialects, this is actually one of the more straightforward cases. MySQL uses backticks to “wrap” these names, while Redshift wraps them in double quotes. An example can be seen below where we call two tables: one with a space in its name, and one that is a reserved word.</p>



<pre class="wp-block-code"><code>-- MySQL
select *
from
   `Table One`,
   `Order`;
-- Redshift
select *
from
   "Table One",
   "Order";</code></pre>



<h3 class="wp-block-heading">Concatenating Strings</h3>



<p>MySQL’s concat() function lets you pass in multiple strings to concatenate together. Redshift’s concat() function only allows you to pass in two strings, so you would have to nest this function in order to concatenate more than two values.</p>



<pre class="wp-block-code"><code>-- MySQL
select concat('Sisense ', 'is ', 'great');
-- Redshift
select concat('Sisense ', concat('is ', 'great'));</code></pre>



<p>Redshift also has a shortcut for concatenation, using double-pipe notation in place of a function call:</p>



<pre class="wp-block-code"><code>select 'Sisense ' || 'is ' || 'great'</code></pre>



<h2 class="wp-block-heading">Date/Time Specific Functions</h2>



<h3 class="wp-block-heading">Now()</h3>



<p>In some of the later versions of Redshift, now() is a deprecated function. You would want to use getdate() or sysdate (which Redshift writes without parentheses) to return the current time based on the timezone of your database.</p>
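


<p>A short sketch of both replacements side by side. The column aliases are illustrative; note that sysdate reports the start time of the current transaction, so the two values can differ inside a long-running transaction:</p>



<pre class="wp-block-code"><code>-- Redshift: current timestamp without now()
select
   getdate() as current_timestamp_value
   , sysdate as transaction_start_time</code></pre>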



<h3 class="wp-block-heading">From_unixtime() and Unix_timestamp()</h3>



<p>Redshift does not have a default function equivalent to from_unixtime() that converts unix timestamps into date timestamps. Instead, we can employ some clever math shown in this <a href="https://blog.valkrysa.com/2017/03/14/amazon-redshift-converting-unix-epoch-time-into-timestamps/" target="_blank" rel="noreferrer noopener" aria-label=" (opens in a new tab)">Valkrysa blog post</a>:</p>



<pre class="wp-block-code"><code>select timestamp 'epoch' + your_time_column * interval '1 second'</code></pre>



<p>The first part of this select statement produces the timestamp that marks the start of unix time (1970-01-01 00:00:00). Since unix time measures the number of seconds that have elapsed since that moment, multiplying the column by a one-second interval turns the raw number into an interval of that many seconds. Adding that interval to 1970-01-01 00:00:00 gives the appropriate timestamp.</p>



<p>Converting from a timestamp value into unix time is simpler, and actually has two functions to support this. You can do:</p>



<pre class="wp-block-code"><code>select extract(epoch from your_time_column)</code></pre>



<p>or:</p>



<pre class="wp-block-code"><code>select date_part(epoch, your_time_column)</code></pre>



<p>In each of the two functions, Redshift is essentially calculating the number of seconds that have elapsed between 1970-01-01 00:00:00 and your timestamp column.</p>
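


<p>To make the round trip concrete, here is a small sketch using a literal value in place of a column. 86400 is the number of seconds in one day, so it should map to midnight on the second day of unix time and back again; the aliases are illustrative:</p>



<pre class="wp-block-code"><code>-- Redshift: unix time to timestamp, and timestamp back to unix time
select
   -- 1970-01-02 00:00:00
   timestamp 'epoch' + 86400 * interval '1 second' as one_day_after_epoch
   -- 86400
   , extract(epoch from timestamp '1970-01-02 00:00:00') as seconds_since_epoch</code></pre>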



<h2 class="wp-block-heading">Interval Literals</h2>



<p>In many ways, Redshift and MySQL are on the same page in terms of using interval literals with timestamps. The major differences are how strict the syntax is and whether or not the query is running on Redshift’s leader node. In Redshift, best practice is to wrap the literal value in single quotes, whereas MySQL writes it unquoted (interval 1 day). This would look similar to below:</p>



<pre class="wp-block-code"><code>select getdate() - interval '1 day'</code></pre>



<p>For queries that do not run on the leader node, Redshift does not allow interval values larger than a week. Attempting to add or subtract a ‘month’ or ‘year’ interval value can therefore throw an error if the timestamp column is evaluated across multiple nodes.</p>



<h2 class="wp-block-heading">Date_sub() and Date_add()</h2>



<p>These functions act similarly to the interval literals discussed above. Redshift combines both of these functions into a single dateadd() function. You’ll notice that Redshift’s version calls for three parameters instead of two.</p>



<p>In order to get the same behavior of MySQL’s date_sub(), you would want to pass in a negative interval to the dateadd() function. That means your queries may look something like this:</p>



<pre class="wp-block-code"><code>select dateadd(day, 1, getdate());  -- Returns Same Time Tomorrow
select dateadd(day, -1, getdate()); -- Returns Same Time Yesterday</code></pre>
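


<p>dateadd() also accepts larger date parts such as month and year, which gives a way to shift a timestamp by those units without writing an interval literal. A brief sketch, with the MySQL equivalents shown for comparison and illustrative aliases:</p>



<pre class="wp-block-code"><code>-- MySQL
select
   date_sub(now(), interval 1 month)
   , date_add(now(), interval 1 year);
-- Redshift
select
   dateadd(month, -1, getdate()) as one_month_ago
   , dateadd(year, 1, getdate()) as one_year_from_now;</code></pre>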



<h2 class="wp-block-heading">Date_format()</h2>



<p>This is an important function in MySQL for getting timestamps to display in the exact way that you would like. Redshift has two similar functions depending on your use case. to_char() takes a timestamp and a format string that controls how the result is displayed. to_date() goes in the other direction: it takes a string and uses the same format patterns as to_char() to describe how that string should be parsed into a date.</p>



<p>Let’s take a look at some of the different formats we can return! Running the following query:</p>



<pre class="wp-block-code"><code>select
   getdate() as format_one,
   to_char(getdate(), 'MON-DD-YYYY') as month_day_year,
   to_char(getdate(), 'Day, Month DD YYYY') as day_month_year,
   to_char(getdate(), 'YYYY-Q') as year_quarter</code></pre>



<p>Returns the following values:</p>



<figure class="wp-block-image fancybox"><img decoding="async" src="https://cdn.sisense.com/wp-content/uploads/redshift-table.png" alt="Redshift table" class="wp-image-79968"/></figure>
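


<p>For the opposite direction, a minimal sketch of to_date() parsing a string into a date. The input string and the parsed_date alias are made up for the example; the format pattern describes how the input is laid out, not how the result is displayed:</p>



<pre class="wp-block-code"><code>-- Redshift: parse a month-day-year string into a date (2017-03-14)
select to_date('03-14-2017', 'MM-DD-YYYY') as parsed_date</code></pre>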



<h2 class="wp-block-heading">Datediff()</h2>



<p>Redshift’s datediff() function is more robust and flexible than MySQL’s in terms of the level of specificity. MySQL’s datediff() is limited to returning a whole number of days between two dates. Redshift’s datediff() allows you to choose which unit to calculate the difference in (e.g. minutes, hours, days, weeks).</p>
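


<p>A small sketch of the same pair of timestamps measured in different units. The timestamps and aliases are illustrative; Redshift counts how many boundaries of the chosen unit are crossed between the two values:</p>



<pre class="wp-block-code"><code>-- Redshift: one pair of timestamps, three units
select
   -- 1800
   datediff(minute, timestamp '2017-03-14 00:00:00', timestamp '2017-03-15 06:00:00') as minutes_apart
   -- 30
   , datediff(hour, timestamp '2017-03-14 00:00:00', timestamp '2017-03-15 06:00:00') as hours_apart
   -- 1
   , datediff(day, timestamp '2017-03-14 00:00:00', timestamp '2017-03-15 06:00:00') as days_apart</code></pre>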



<p>One other potential misstep to watch out for is the order of parameters. MySQL’s version of the function takes two parameters, and returns a positive value if the first date is later than the second. Redshift reverses this: after the unit, the earlier date comes first and the later date second if you want a positive value. That means your queries would run like this:</p>



<pre class="wp-block-code"><code>-- MySQL: returns 1
select datediff(now(), now() - interval 1 day);
-- Redshift: returns 1
select datediff(day, getdate() - interval '1 day', getdate());</code></pre>



<h2 class="wp-block-heading">Day(), Week(), Month(), etc.</h2>



<p>Redshift uses the extract() function to pull the desired numeric value out of your timestamps. Rather than having a specific function for each date part, extract() allows you to pass in the exact date part you want to find.</p>



<pre class="wp-block-code"><code>-- MySQL
select day(now()), week(now()), month(now())
-- Redshift
select 
  extract(day from getdate())
  , extract(week from getdate())
  , extract(month from getdate())</code></pre>



<h2 class="wp-block-heading">Conclusion</h2>



<p>While MySQL is not quite as close to Redshift as Postgres is, you can see that the two share many similarities! For the majority of these functions, the key difference is something as small as an additional parameter, the ordering of parameters, or even just a different name.</p>


