There are many different ways to extract numbers, text, and punctuation from a cell in Google Sheets.
Learning all of the different functions that can be used to extract a substring from a string can be overwhelming at first, but it is definitely worth the time to understand these functions if you are a regular spreadsheet user.
In this tutorial, I will show you how to extract numbers or text from a string in Google Sheets.
Table of Contents
Functions for Extracting in Google Sheets
As we discussed earlier, there are many different functions for extraction in Google Sheets. Depending on what you are trying to extract from your string, some of these functions are better suited for the task than others.
Here are the functions I will be covering in this guide:
- LEFT
- RIGHT
- MID
- SPLIT
- REGEXEXTRACT
- REGEXREPLACE
LEFT Function
The LEFT function will return a substring from the beginning of a string according to the specified number of characters.
The syntax of the LEFT function is:
=LEFT(string, [number_of_characters])
- string – this is the string that you want to return the substring from
- number_of_characters – this is the number of characters that will be returned from the left side of the string. This argument is optional and will be set to 1 by default
Here are a few examples of this function is used:
You can see that the function just returns a substring that is the length of the second parameter of the function (number_of_characters) starting on the left of your string.
RIGHT Function
The RIGHT function is the opposite of the LEFT function. It will return a substring from the end of a string according to the specified number of characters.
The syntax of the RIGHT function is:
=RIGHT(string, [number_of_characters])
- string – this is the string that you want to return the substring from
- number_of_characters – this is the number of characters that will be returned from the right side of the string. This argument is optional and will be set to 1 by default
Here are a few examples of this is used:
You can see from the example above that each that this function will simply extract starting on the right-hand side of your string and extract the number of characters specified in the second argument (number_of_characters)
MID Function
The MID function will return a substring from the middle of a string. In this function, you specify which character you want to start your extraction from, and how many characters to extract.
The syntax of the MID function is:
=MID(string, starting_at, extract_length)
- string – this is the string that you want to extract the substring from
- starting_at – this is the index of the character that you want to start your extraction from. It starts on the left with the first character having an index of 1
- extract_length – this is the length of the string to extract. Here you will put in the number of characters you want to extract
Here are a few examples of this is used:
In the example above, look at how the function extracts the substring starting at the character used in the second (starting_at) parameter, and extracts the length of the third (extract_length) parameter.
So the first formula in my sheet =MID(A2,1,2) will take the string in A2, start at the first character, and return 2 characters.
SPLIT Function
The SPLIT function will take a string in a cell and put fragments of the string into different cells by dividing the string according to the character or string specified.
The syntax of the MID function is:
=SPLIT(text, delimiter, [split_by_each], [remove_empty_text])
- text – this is the text that you wish to split into different cells
- delimiter – this is the character or characters that the text will be split up by
- split_by_each – this argument controls whether you divide text around each character used as the delimiter or divide text using the entire delimiter. For example, if your delimiter is “abc’ then your text will be split around each character of “a”, “b”, and “c” if this argument is set to TRUE. If it set to FALSE your text will be split around “abc” together. This argument is optional and set to TRUE by default.
- remove_empty_text – this argument controls whether to remove empty text from the split cells. It is optional and set to TRUE by default. If set to FALSE empty cells will be added between consecutive delimiters
Here are a few examples of this is used:
In the above example, look at how my cell is split by the character used as my delimiter argument in the formula. In the 4th and 5th row of the example, pay attention to how changing the split_by_each argument from TRUE to FALSE affects how the cells are split. This should paint a more clear picture of how this function is used.
For more info on using the SPLIT function, check out my other resource on how to split a cell.
REGEXEXTRACT Function
REGEXEXTRACT is a powerful function for extracting data from a substring. What this function does is it will extract a substring of data that matches your specified regular expression.
The syntax of REGEXEXTRACT is:
=REGEXEXTRACT(text, regular_expression)
- text – the string to extract the regular expression from
- regular_expression – the part of the text that will be returned
With this function, you can return multiple results by using capture groups. These are part of patterns that are enclosed in parenthesis. If you do not use capture groups the function will return the entire match.
Google Sheets uses RE2 for regular expressions. You can find the full list of accepted values here. For a quick resource on the most common characters, check out this resource on the syntax for regular expressions.
An important thing to note is that when using regular expressions with this function if you place a plus sign (+) after your regular expression, your extracted value will display more than one character. If you do not use a plus sign, your formula will only return one character.
Here is an example of this function is used:
REGEXREPLACE Function
The REGEXREPLACE function will take a text string and replace part of it that matches a regular expression with a specified replacement text.
The syntax of this function is:
=REGEXREPLACE(text, regular_expression, replacement)
- text – the text that will be used and part of it replaced
- regular_expression – the part of the text that will be replaced
- replacement – the text that will replace the original text
The syntax for regular expressions used by REGEXREPLACE is the same as with REGEXTRACT, so please follow the links in the previous section to learn more about the syntax.
Here is an example of this function is used in a spreadsheet:
Extraction in Google Sheets Cheat Sheet
Now that we have covered the basics of the functions used for extractions, here are some formulas that you can copy and paste into your own sheets for some common extraction tasks:
Extract the Last Name
=RIGHT(A2,LEN(A2)-FIND(“*”,SUBSTITUTE(A2,” “,”*”,LEN(A2)-LEN(SUBSTITUTE(A2,” “,””)))))
Extract First Word/Name
=REGEXEXTRACT(A3,”[^[:space:]]+”)
Remove Punctuation
=REGEXREPLACE(A4,”[[:punct:]]”, “”)
Extract Text from String
=REGEXREPLACE(A5,”[^[:alpha:]]”, “”)
Extract Numbers from String
=VALUE(REGEXREPLACE(A6,”[^[:digit:]]”, “”))
Here is an example of each of these formulas in a spreadsheet:
Closing Thoughts
Understanding the different ways to parse strings and extract the desired characters is a powerful skill to develop in any spreadsheet program. As you work with the files with thousands and thousands of lines, being able to put together a formula to pull the data you need will greatly speed up productivity.
In this tutorial, I have covered some of the basics of these extraction functions. As you continue to develop your spreadsheet skills, you can combine some of these functions to create powerful parsing formulas.
More Google Sheets Tutorials:
If Cell Contains
How to Use the ISNUMBER Function
How to Remove the Last Character from a String
How to Use the LEN Function
How to Use the SUBSTITUTE Function