Hi again, and thank you for your awesome tutorial. Could you please help me with this? I don’t know what I’m doing wrong. I tried to capture ‘this’ in the multiline string below with the re.finditer and the re.findall methods. Even though both methods find the word ‘this’, they gave different results. Why is this? I find the re.finditer method more informative because it gives me the span, but it is not getting all the matches in this case.

Also, I would like to know if it is possible to pass more than one flag as a parameter to the re methods. I mean, if the first word in the multiline string changes to “This”, I would like to pass re.I and re.M (both flags) to the re methods.

multiline = '''
this is a really long string
the long string is long enough to prove that
this string in
this paper, work as expected'''

find = re.compile('this')
matches = find.finditer(multiline, re.M)
re.finditer(multiline, r'this', re.M)
for match in matches:
    print(match)

OUT>>> <re.Match object; span=(75,79), match='this'>
       <re.Match object; span=(75,79), match='this'>

re.findall('this', multiline, re.M)
OUT>>> ['this', 'this', 'this']

Thank you in advance.

raulfz on Jan. 30, 2021

Sorry, when I posted the previous message I hadn’t finished the video, so now I know how to pass multiple flags, and even better, how to apply them to specific parts of my search.

Thank you!
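
For anyone reading along, here is a minimal sketch of passing two flags at once by combining them with the bitwise OR operator. The sample string is adapted from the question above, with the first word capitalized as described there; the `^` anchor and the variable names are only assumptions for illustration.

```python
import re

# Sample text adapted from the question, with the first word capitalized.
multiline = '''This is a really long string
the long string is long enough to prove that
this string in
this paper, work as expected'''

# Flags are combined with the bitwise OR operator (|).
# re.I ignores case; re.M lets ^ match at the start of every line.
pattern = re.compile(r'^this', re.I | re.M)
matches = pattern.findall(multiline)
print(matches)
```

The same combined value can also be passed as the `flags` argument of the module-level functions, for example `re.findall(r'^this', multiline, re.I | re.M)`.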

raulfz on Feb. 1, 2021

Just in case someone runs into the same issue I reported before, this was the problem. When I used the .finditer() method, I first called re.compile(). It’s in that re.compile() call that I should have included the flags in order for it to work the same as the findall() method.
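
A short sketch of what appears to be happening in the original snippet: on a compiled pattern, the second positional argument of .finditer() is the start position (pos), not the flags. Since re.M has the integer value 8, the search starts at index 8 and skips the first ‘this’. The variable names `skipped`, `find_m`, and `all_spans` are made up for this illustration.

```python
import re

multiline = '''
this is a really long string
the long string is long enough to prove that
this string in
this paper, work as expected'''

find = re.compile('this')

# Pattern.finditer(string, pos, endpos): re.M is read as pos=8 here,
# so the first 'this' (which starts at index 1) is never searched.
skipped = [m.span() for m in find.finditer(multiline, re.M)]

# Flags belong in re.compile(); finditer() then takes only the string.
find_m = re.compile('this', re.M)
all_spans = [m.span() for m in find_m.finditer(multiline)]

print(int(re.M))   # 8
print(skipped)     # two spans, the first match is missing
print(all_spans)   # three spans
```

The module-level re.finditer(pattern, string, flags) does take flags as its third argument, which is why re.findall('this', multiline, re.M) in the question found all three matches.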

Roy Telles on March 16, 2021

I still don’t quite understand why the fourth “SPAM” at 10:49 didn’t match. The video and transcript say it is because the “s” can be lowercase or uppercase, but then it goes on to say “literal small ‘p-a-m’” but I think it meant to say “capital ‘P-A-M’”. Considering the group contains only s’s, it makes sense that the fourth doesn’t match because it would still be “sPAM”, which doesn’t match ‘(s)pam’ in the regex. Thank you!

Christopher Trudeau RP Team on March 16, 2021

Hi Roy,

I believe you are talking about this snippet:

>>> re.findall("(?i:s)pam", "Spam, spam, spam, SPAM")
['Spam', 'spam', 'spam']

When I said “literal p-a-m”, what I meant was the literal lowercase characters in the regex, not in the result. The “(?i:s)” portion of the regex makes the “s” case insensitive, but the flag only applies inside the parentheses. This means “Spam” and “spam” match because the “pam” is lowercase in those cases. The fourth “SPAM” doesn’t match because the “PAM” portion of the string doesn’t match the “pam” literal in the regex.

If you want to ignore case in the whole regex, you are better off using “re.IGNORECASE” on the whole thing rather than the “(?i:)” inline flag. If you did that, it would match the fourth “SPAM”.

>>> re.findall("spam", "Spam, spam, spam, SPAM", re.IGNORECASE)
['Spam', 'spam', 'spam', 'SPAM']

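Note the difference between that and the piece of code that follows immediately after in the video, where I’m using both the “(?-i:)” inline flag and the “re.IGNORECASE” global flag and get a different result. There, “(?-i:s)” turns case insensitivity off for the “s” only, so just the matches starting with a lowercase “s” are returned.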

>>> re.findall("(?-i:s)pam", "Spam, spam, spam, SPAM", re.IGNORECASE)
['spam', 'spam']

Hope that clears things up for you.
