In this article, we'll show you how you can get the position of the matching string when using a regular expression.
Find the Position of the First Match Only
If you're simply looking for the index of the first match in a string based on a regular expression, then String.prototype.search() is a good choice, as it is built for exactly that. For example:
const regex = /foo*/;
const str = 'american football, foosball, football';
const index = str.search(regex);

console.log(index); // output: 9
console.log(str[index]); // output: 'f'
If a match is not found, search() returns -1.
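One detail worth knowing is that search() always scans from the start of the string: it ignores the g flag and the lastIndex of the regular expression. The following is a minimal sketch of that behavior. The helper name findFirstIndex is hypothetical and only wraps search() to make the "not found" case explicit:

```javascript
// Hypothetical helper (not part of any library): returns the index of the
// first match, or null instead of -1 when nothing matches.
function findFirstIndex(str, regex) {
  const index = str.search(regex);
  return index === -1 ? null : index;
}

const text = 'american football, foosball, football';

console.log(findFirstIndex(text, /foo*/)); // 9
console.log(findFirstIndex(text, /rugby/)); // null

// search() ignores the g flag and lastIndex, so it still reports the first match.
const globalRegex = /foo*/g;
globalRegex.lastIndex = 20;
console.log(findFirstIndex(text, globalRegex)); // 9
console.log(globalRegex.lastIndex); // 20 (left as it was)
```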
Get the Position and Array of the First Match Only
Using String.prototype.match():
If you want the position of the first match as well as an array containing the first complete match (along with the related capturing groups), then String.prototype.match() is a good choice. For example:
const regex = /(?:, )?([a-z]*\s?foo[a-z]*)/i;
const str = 'handball, american football, foosball';
const match = str.match(regex);

console.log(match?.index); // output: 8
console.log(match?.[0]); // output: ', american football'
console.log(match?.[1]); // output: 'american football'
console.log(match.length); // output: 2
As you can see from the example above, the first element of the returned array is the complete match, and the remaining elements hold the matches from capturing groups. If a match is not found, match() returns null.
To be able to use the index property, the regular expression must not use the g (global) flag, because with that flag match() returns an array of all matched strings, without the index property or the capturing groups.
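To make the difference concrete, here is a small sketch that runs the same pattern with and without the g flag (the variable names are only illustrative):

```javascript
const str = 'handball, american football, foosball';

// Without the g flag: a single match, with index and capturing groups.
const single = str.match(/(?:, )?([a-z]*\s?foo[a-z]*)/i);
console.log(single.index); // 8
console.log(single[1]); // 'american football'

// With the g flag: only the matched strings, no index and no groups.
const all = str.match(/(?:, )?([a-z]*\s?foo[a-z]*)/gi);
console.log(all); // [', american football', ', foosball']
console.log(all.index); // undefined
```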
Using RegExp.prototype.exec():
For a regular expression without the g or y flag, the exec() method looks only for the first match (similar to str.match(regexp)). For example:
const regex = /(?:, )?([a-z]*\s?foo[a-z]*)/i;
const str = 'handball, american football, foosball';
const match = regex.exec(str);

console.log(match?.index); // output: 8
console.log(match?.[0]); // output: ', american football'
console.log(match?.[1]); // output: 'american football'
console.log(match.length); // output: 2
As you can see, similar to match(), the first element of the resulting array is the complete match, and the remaining elements hold the matches from capturing groups. This method also returns null if a match is not found.
Unlike match(), the exec() method returns the same kind of result when the g flag is used. In fact, it can then be used to iterate over multiple matches in a string (including their positions and capture groups), as opposed to just getting the matched strings with match().
Get the Array Along With the Start and End Position of All Matches
Using RegExp.prototype.exec()
Earlier, we looked at how we can use exec() to find a single match. It can also be used to iterate over multiple matches found in a string by setting the g or y flag like so:
const regex = RegExp('foo*', 'g');
const str = 'american football, foosball, football';
const indexPairs = [];
let matchArr; // declared explicitly; an implicit global would throw in strict mode

while (null !== (matchArr = regex.exec(str))) {
  indexPairs.push([matchArr.index, regex.lastIndex]);
}

console.log(indexPairs); // output: [[9, 12], [19, 22], [29, 32]]
Let's consider another, slightly more complex, example:
const regex = /(?:, )?([a-z]*\s?foo[a-z]*)/gi;
const str = 'handball, american football, foosball';
const indexPairs = [];
let matchArr;

while (null !== (matchArr = regex.exec(str))) {
  // end position = start of the complete match + length of the complete match
  indexPairs.push([matchArr.index, matchArr.index + matchArr[0].length]);
}

console.log(indexPairs); // output: [[8, 27], [27, 37]]
The exec() method can also be used for starting the search from a specific position.
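One pitfall of the exec() loop shown above is a pattern that can match an empty string: lastIndex then does not advance and the loop never ends. The following sketch wraps the loop in a generator and steps past empty matches. The name matchRanges is hypothetical, and the manual step is simplified (it does not treat surrogate pairs specially):

```javascript
// Hypothetical helper: yields [start, end] for every match of a global regex.
function* matchRanges(regex, str) {
  if (!regex.global) {
    throw new TypeError('matchRanges expects a regex with the g flag');
  }
  // Work on a copy so the caller's lastIndex is not modified.
  const re = new RegExp(regex.source, regex.flags);
  let match;
  while ((match = re.exec(str)) !== null) {
    yield [match.index, re.lastIndex];
    // An empty match does not advance lastIndex; step forward manually
    // to avoid looping forever.
    if (match[0].length === 0) {
      re.lastIndex++;
    }
  }
}

const sample = 'american football, foosball, football';
console.log([...matchRanges(/foo*/g, sample)]); // [[9, 12], [19, 22], [29, 32]]
console.log([...matchRanges(/x*/g, 'axxb')]); // [[0, 0], [1, 3], [3, 3], [4, 4]]
```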
Using String.prototype.matchAll()
Introduced in ES2020 (ES11), the matchAll() method returns an iterator (an object that yields the matches one at a time and is itself iterable) that can be used with for...of, array spread, or Array.from(). For example:
const regex = RegExp('foo*', 'g');
const str = 'american football, foosball, football';
const matches = str.matchAll(regex);
const indexes = [];

for (const match of matches) {
  indexes.push(match.index);
}

console.log(indexes); // output: [9, 19, 29]
Please note that matchAll() internally makes a clone of the regular expression object. So, unlike with RegExp.prototype.exec(), the lastIndex property of the original regular expression does not change as the string is scanned.
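A quick way to see this difference is to compare lastIndex after calling each method. A minimal sketch (the variable names are only illustrative):

```javascript
const input = 'american football, foosball, football';

// exec() advances lastIndex on the regex it is called on.
const execRegex = /foo*/g;
execRegex.exec(input);
console.log(execRegex.lastIndex); // 12

// matchAll() iterates over an internal copy, so the original is untouched.
const matchAllRegex = /foo*/g;
const starts = Array.from(input.matchAll(matchAllRegex), (m) => m.index);
console.log(starts); // [9, 19, 29]
console.log(matchAllRegex.lastIndex); // 0
```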
Let's look at a slightly more complex example:
// ES2020+
const regex = /(?:, )?([a-z]*\s?foo[a-z]*)/gi;
const str = 'handball, american football, foosball';
const matches = str.matchAll(regex);
const indexPairs = [];

for (const match of matches) {
  // end position = start of the complete match + length of the complete match
  indexPairs.push([match.index, match.index + match[0].length]);
}

console.log(indexPairs); // output: [[8, 27], [27, 37]]
console.log([...matches]); // output: [] (the iterator is already exhausted)
As you may have noticed in the example above, the iterator returned by matchAll() is not restartable. This means that once it is exhausted (for example, after the for...of loop), we must call matchAll() again to create a new iterator if we need to iterate again. Alternatively, you can use the spread syntax or Array.from() to store the matches in an array so that you can reuse the result:
// ES2020+
const regex = /(?:, )?([a-z]*\s?foo[a-z]*)/gi;
const str = 'handball, american football, foosball';
const matches = [...str.matchAll(regex)]; // or, `Array.from(str.matchAll(regex));`

console.log(matches); // output: [ [', american football', 'american football'], [', foosball', 'foosball'] ]
console.log(matches?.[0]?.index); // output: 8
console.log(matches?.[1]?.index); // output: 27
As you can see from the example above, the first element in each sub-array is the complete match, and the remaining elements hold the matches from capturing groups.
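If you need the start and end of a capturing group rather than those of the complete match, the d flag (match indices, part of ES2022) may help, assuming your runtime supports it. With this flag, each match carries an indices array of [start, end] pairs, where indices[0] belongs to the complete match and indices[1] to the first capturing group. A minimal sketch:

```javascript
// Assumes a runtime with ES2022 match indices (the d flag).
const indexedRegex = /(?:, )?([a-z]*\s?foo[a-z]*)/dgi;
const source = 'handball, american football, foosball';

const ranges = Array.from(source.matchAll(indexedRegex), (match) => ({
  full: match.indices[0], // [start, end] of the complete match
  group: match.indices[1], // [start, end] of the first capturing group
}));

console.log(ranges);
// [
//   { full: [8, 27], group: [10, 27] },
//   { full: [27, 37], group: [29, 37] }
// ]
```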
Please note that matchAll() throws a TypeError if the g flag is not specified in the regular expression.
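The sketch below shows the exception and one possible way to guard against it. The helper safeMatchAll is hypothetical: it simply adds the g flag to a copy of the regular expression when it is missing.

```javascript
const haystack = 'american football, foosball, football';

// Hypothetical helper: calls matchAll() with a copy of the regex that is
// guaranteed to have the g flag.
function safeMatchAll(input, regex) {
  const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
  return input.matchAll(new RegExp(regex.source, flags));
}

let caught = null;
try {
  haystack.matchAll(/foo*/); // no g flag
} catch (error) {
  caught = error;
}
console.log(caught instanceof TypeError); // true

const positions = Array.from(safeMatchAll(haystack, /foo*/), (m) => m.index);
console.log(positions); // [9, 19, 29]
```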
Get the Array Along With the Start and End Position of Matches Starting From a Specific Position
Another benefit of using the RegExp.prototype.exec() method is that when it is used with the g (or y) flag, the search starts from the position stored in the regular expression's lastIndex property. After each successful match, lastIndex is set to the index immediately after the match; when no further match is found, it is reset to 0. The lastIndex property can also be set manually to start the search from a specific position, for example:
const regex = RegExp('foo*', 'g');
const str = 'american football, foosball, football';
regex.lastIndex = 12;

const indexPairs = [];
let matchArr; // declared explicitly; an implicit global would throw in strict mode

while (null !== (matchArr = regex.exec(str))) {
  indexPairs.push([matchArr.index, regex.lastIndex]);
}

console.log(indexPairs); // output: [[19, 22], [29, 32]]
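Because lastIndex is state stored on the regex object itself, it can be worth checking how it changes between calls. The sketch below traces it step by step and then shows a hypothetical helper, rangesFrom, that searches from a given position using a copy of the regex so that the caller's lastIndex is not affected:

```javascript
const statefulRegex = /foo*/g;
const subject = 'american football, foosball, football';

statefulRegex.lastIndex = 20;
const found = statefulRegex.exec(subject);
console.log(found.index); // 29
console.log(statefulRegex.lastIndex); // 32

const notFound = statefulRegex.exec(subject);
console.log(notFound); // null (no match at or after position 32)
console.log(statefulRegex.lastIndex); // 0 (reset after the failed search)

// Hypothetical helper: collects [start, end] pairs from `start` onwards
// using a copy of the regex.
function rangesFrom(regex, input, start) {
  const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
  const copy = new RegExp(regex.source, flags);
  copy.lastIndex = start;
  const ranges = [];
  let match;
  while ((match = copy.exec(input)) !== null) {
    ranges.push([match.index, copy.lastIndex]);
    if (match[0].length === 0) copy.lastIndex++; // step past empty matches
  }
  return ranges;
}

console.log(rangesFrom(statefulRegex, subject, 12)); // [[19, 22], [29, 32]]
console.log(statefulRegex.lastIndex); // 0 (unchanged by the helper)
```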
However, if we want a match that starts exactly at the given position (as opposed to anywhere at or after it), then we can use the y (sticky) flag like so:
const regex = RegExp('foo*', 'y');
const str = 'american football, foosball, football';

regex.lastIndex = 18;
const match1 = regex.exec(str);
console.log(match1); // output: null (nothing matches exactly at position 18)
console.log(regex.lastIndex); // output: 0 (reset after the failed match)

regex.lastIndex = 19;
const match2 = regex.exec(str);
console.log(match2?.index); // output: 19
console.log(regex.lastIndex); // output: 22 (lastIndex is a property of the regex, not of the match)
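The sticky behavior can be wrapped in a small utility that answers the question "does the pattern match exactly here?". The following is only a sketch, and the name matchAt is hypothetical:

```javascript
// Hypothetical helper: returns [start, end] if the pattern matches exactly at
// `position`, otherwise null. It works on a sticky copy, so the regex that is
// passed in is left untouched.
function matchAt(regex, input, position) {
  const flags = regex.flags.includes('y') ? regex.flags : regex.flags + 'y';
  const sticky = new RegExp(regex.source, flags);
  sticky.lastIndex = position;
  const match = sticky.exec(input);
  return match === null ? null : [match.index, sticky.lastIndex];
}

const line = 'american football, foosball, football';
console.log(matchAt(/foo*/, line, 18)); // null (position 18 is a space)
console.log(matchAt(/foo*/, line, 19)); // [19, 22]
console.log(matchAt(/foo*/, line, 9)); // [9, 12]
```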
This post was published by Daniyal Hamid. Daniyal currently works as the Head of Engineering in Germany and has 20+ years of experience in software engineering, design and marketing.