std::regex_iterator
1. What is std::regex_iterator?
std::regex_iterator is a read-only forward iterator that walks over every match of a regular expression in a character sequence.
Instead of stopping at the first match, it lets you loop through each match in turn.
```text
string + regex → multiple matches → iterate one by one
```
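```cpp
#include <regex>
#include <string>

std::string text = "...";
std::regex pattern("...");
std::sregex_iterator it(text.begin(), text.end(), pattern); // positioned at the first match
std::sregex_iterator end;                                   // default-constructed: end of sequence
```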
Example
```cpp
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string text = "abc 123 def 456";
    std::regex pattern("\\d+"); // match runs of digits

    std::sregex_iterator it(text.begin(), text.end(), pattern);
    std::sregex_iterator end; // default-constructed iterator marks the end

    for (; it != end; ++it)
        std::cout << it->str() << "\n"; // prints 123, then 456
}
```
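What Does the Iterator Hold?
Dereferencing the iterator yields a match object. For std::sregex_iterator its type is std::smatch, which is an alias for:
```cpp
std::match_results<std::string::const_iterator>
```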
✔️ Access Data
```cpp
it->str();      // full match
it->position(); // position in string
it->length();   // length of match
```
When to use it:
- log parsing
- extracting numbers
- parsing structured text
- simple DSL parsing
- config parsing
2. Capture Groups
Example
```cpp
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string text = "Name: Kang Age: 30";
    std::regex pattern("(\\w+):\\s*(\\w+)"); // group 1 = key, group 2 = value

    std::sregex_iterator it(text.begin(), text.end(), pattern);
    std::sregex_iterator end;

    for (; it != end; ++it)
    {
        std::cout << "Key: " << (*it)[1] << "\n";
        std::cout << "Value: " << (*it)[2] << "\n";
    }
}
```
```text
Key: Name
Value: Kang
Key: Age
Value: 30
```
```text
(\\w+):\\s*(\\w+)
1. (*it)[0]: the whole match, (\\w+):\\s*(\\w+)
2. (*it)[1]: first parentheses (\\w+)
3. (*it)[2]: second parentheses (\\w+)
```
| Type | Description |
|---|---|
| std::sregex_iterator | iterates over a std::string |
| std::cregex_iterator | iterates over a C-string (const char*) |
| std::regex_iterator | generic class template |
3. Regex is Expensive
- Uses a complex pattern-matching engine
- Backtracking is possible
- Not cache-friendly
| Method | Speed |
|---|---|
| manual parsing | 🔥 fastest |
| string functions | ⚡ fast |
| regex_iterator | 🐢 slow |
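3-1. Optimization Tips
Construct the std::regex once and reuse it; building the object compiles the pattern, which is costly.
```cpp
std::regex pattern("..."); // build once, outside any loop, and reuse
```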
4. regex_iterator vs regex_search
regex_search
```cpp
std::smatch match;
std::regex_search(text, match, pattern);
```
```cpp
std::string text = "abc 123 def 456";
std::regex pattern("\\d+");
std::smatch match;
if (std::regex_search(text, match, pattern)) // true when a match was found
    std::cout << match.str() << "\n";        // prints 123
```
Finds only the first match.
regex_iterator
```cpp
std::string text = "abc 123 def 456";
std::regex pattern("\\d+");
std::sregex_iterator it(text.begin(), text.end(), pattern);
std::sregex_iterator end;
for (; it != end; ++it)
    std::cout << it->str() << "\n"; // prints 123, then 456
```
Finds all matches.
5. Common Mistakes
❌ Recreating regex inside loop
```cpp
for (...)
{
    std::regex pattern("..."); // ❌ slow: the pattern is recompiled on every iteration
}
```
❌ Forgetting escape
```cpp
"\d+"  // ❌ wrong: \d is not a valid escape in a C++ string literal
"\\d+" // ✅ correct: the regex engine receives \d+
```
👉 Regex is not meant for high-frequency loops; prefer manual parsing or string functions there.
6. Example
```cpp
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string text = "Contact: test@mail.com or hello@world.com";
    std::regex pattern("\\w+@\\w+\\.\\w+"); // simplified address-like pattern

    std::sregex_iterator it(text.begin(), text.end(), pattern);
    std::sregex_iterator end;

    for (; it != end; ++it)
        std::cout << it->str() << "\n"; // prints test@mail.com, then hello@world.com
}
```