Snowflake: Replace multiple patterns using single Regex_replace

Requirement: Want to achieve all below into one regex_replace

  1. Replace comma, tab and new line character to empty
  2. Replace double quotes into a single quote
  3. Replace pipe into a dash

Also would like to know if replace performance is faster than regex_replace

Below is what I tried out

 with c as (select ' abc \n def\n' as t union select 'abc\tdef' as t) select t,regexp_replace(t, '[,\t\n]','') from c; with c as (select ' abc de"f\n' as t union select 'abc\td|ef' as t) select t,replace(replace(replace(t, '\n',' '),'"','\''),'|','-') from c;
0

1 Answer

The regexp_replace function will only match one pattern and replace it with a literal value, so you can't make it selectively replace depending on what it matches. You can write a UDF to do that in one step, which will make your SQL a lot neater, particularly if you have to put it in a lot of statements. This one does all three replacements:

create or replace function CLEANUP_STRING(STR string)
returns string
language javascript as
$$ var s = STR.replace(/[,\t\n]/g, ""); // Replace comma, tab and new line character to empty s = s.replace(/"/g, "'"); // Replace double quote with single quote s = s.replace(/\|/g, "-"); // Replace pipe with dash return s;
$$;
-- This will display in black font, showing that the line break is gone.
select cleanup_string('This is a "double quoted string". This has a | pipe. This has a new \n line. This has a, comma.');
2

Your Answer

Sign up or log in

Sign up using Google Sign up using Facebook Sign up using Email and Password

Post as a guest

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge that you have read and understand our privacy policy and code of conduct.

You Might Also Like