Regular expressions (regex) in Looker Studio provide a powerful way to manipulate and extract data based on patterns within your dataset. Regex can be particularly useful when you need to clean, filter, or transform text fields in your reports. For example, you can extract specific parts of a string, validate input, or remove unwanted characters using regex in calculated fields.
Here’s a step-by-step guide on how to use regex in Looker Studio:
Step-by-Step Guide to Using Regex in Looker Studio
1. Open Your Report and Access the Data Panel
- Open the report where you want to apply regex.
- Ensure that your data source is connected and available in the Data Panel on the right-hand side of the screen.
2. Add a New Calculated Field
- In the Data Panel, click on + Add a Field at the bottom of the list to create a new calculated field.
- This is where you will define your regex expression to manipulate or extract data.
3. Write a Regex Expression
Looker Studio supports several key regex functions for working with text fields:
REGEXP_EXTRACT(text, pattern)
: Extracts matching substrings based on a regex pattern.REGEXP_MATCH(text, pattern)
: Checks if the text matches the regex pattern and returnsTRUE
orFALSE
.REGEXP_REPLACE(text, pattern, replacement)
: Replaces text that matches the regex pattern with a replacement string.
Example Use Cases for Regex in Looker Studio
Example 1: Extracting Specific Text Using REGEXP_EXTRACT()
You can use REGEXP_EXTRACT()
to extract a specific portion of a string. For instance, if you have an email field and want to extract the domain (e.g., @gmail.com
), you can use the following expression:
REGEXP_EXTRACT(Email, "@(.*)")
- Explanation:
- The regex pattern
"@(.*)"
searches for the "@" symbol followed by any characters (captured by.*
). - The result would be the domain part of the email (e.g.,
gmail.com
).
- The regex pattern
Example 2: Matching Patterns with REGEXP_MATCH()
You can use REGEXP_MATCH()
to check whether a string matches a specific pattern. For example, if you want to filter for email addresses from Gmail, you can create a calculated field with the following formula:
REGEXP_MATCH(Email, "@gmail.com$")
- Explanation:
- The regex pattern
"@gmail.com$"
checks if the email ends with@gmail.com
. - This will return
TRUE
for Gmail addresses andFALSE
for others.
- The regex pattern
Example 3: Replacing Text Using REGEXP_REPLACE()
You can use REGEXP_REPLACE()
to remove or replace certain characters or patterns. For example, if you want to remove any special characters from a product code and only keep alphanumeric characters, you can use:
REGEXP_REPLACE(ProductCode, "[^A-Za-z0-9]", "")
- Explanation:
- The pattern
"[^A-Za-z0-9]"
matches any non-alphanumeric character. - The
REGEXP_REPLACE()
function will replace those characters with an empty string, effectively removing them.
- The pattern
Example 4: Extracting URL Parameters
If you have a field containing URLs, and you want to extract a query parameter (e.g., utm_source
), you can use REGEXP_EXTRACT()
like this:
REGEXP_EXTRACT(Url, "utm_source=([^&]+)")
- Explanation:
- The pattern
"utm_source=([^&]+)"
looks forutm_source=
followed by one or more characters until it encounters an&
symbol. - This extracts the value of the
utm_source
parameter from the URL.
- The pattern
Common Regex Patterns and Use Cases
Extracting Digits: Use
\d
to match digits.REGEXP_EXTRACT(Field, "\d+")
This will extract the first set of digits from a field.
Extracting Words: Use
\w+
to match words (alphanumeric characters).REGEXP_EXTRACT(Field, "\w+")
This extracts the first word or alphanumeric sequence from the field.
Matching Specific Formats (e.g., Dates): Use patterns to match specific formats, such as dates.
REGEXP_MATCH(DateField, "\d{4}-\d{2}-\d{2}")
This matches a date in the format
YYYY-MM-DD
.Removing Whitespace: Use
\s
to match spaces, tabs, or other whitespace characters.REGEXP_REPLACE(Field, "\s", "")
This removes all whitespace from a string.
Using Regex for Advanced Filtering
You can also use regex in Looker Studio for advanced filtering scenarios. For instance, if you want to create a filter control that only shows rows where a field matches a specific pattern, you can:
- Create a Calculated Field: Use
REGEXP_MATCH()
to returnTRUE
orFALSE
based on whether a pattern is matched. - Apply a Filter: Use this calculated field as a filter in your report to show only rows that match the regex pattern.
Example: Filtering for Valid Phone Numbers
Let’s say you want to filter rows where the phone number is in the format 123-456-7890
:
- Create a calculated field with this formula:
REGEXP_MATCH(PhoneNumber, "\d{3}-\d{3}-\d{4}")
- Use this field in your filters to show only valid phone numbers.
Best Practices for Using Regex in Looker Studio
Test Your Patterns: Regex can be tricky, so make sure to test your expressions on small datasets to ensure they’re working as expected before applying them to larger reports.
Keep Patterns Simple: While regex is powerful, overly complex patterns can slow down report performance. Try to keep your expressions as simple as possible.
Combine with Other Functions: You can combine regex with other Looker Studio functions, such as
IF()
orCASE WHEN
, to create more dynamic calculated fields.Document Your Expressions: If you’re creating complex regex patterns, consider adding comments or documentation to your Looker Studio report to explain how the pattern works. This will help future users understand your calculations.
Conclusion
Regex in Looker Studio is a versatile tool that can help you clean, manipulate, and extract data based on patterns. Whether you’re extracting specific parts of text, validating input formats, or replacing unwanted characters, regex provides a robust solution for managing text fields in your reports. By combining regex with calculated fields, filters, and other Looker Studio functions, you can build more dynamic and powerful reports tailored to your needs.
For more customization options, explore how to use calculated fields in Looker Studio or add formula fields for enhanced data manipulation.
Published