[ Python: regex vs find(), strip() ]
I am learning Python, and need to format "From" fields received from IMAP. I tried it using str.find()
and str.strip()
, and also using regex. With find(), etc. my function runs quite a bit faster than with re (I timed it). So, when is it better to use re? Does anybody have any good links/articles related to that? Python documentation obviously doesn't mention that...
Answer 1
find
only matches an exact sequence of characters, while a regular expression matches a pattern. Naturally only looking an for exact sequence is faster (even if your regex pattern is also an exact sequence, there is still some overhead involved).
As a consequence of the above, you should use find
if you know the exact sequence, and a regular expression (or something else) when you don't. The exact approach you should use really depends on the complexity of the problem you face.
As a side note, the python re
module provides a compile
method that allows you to pre-compile a regex if you are going to be using it repeatedly. This can substantially improve speed if you are using the same pattern many times.
Answer 2
If you intend to do something complex you should use re
. It is more scalable than using string methods.
String methods are good for doing something simple and not worth bothering with regular expressions.
So, it depends on what are you doing, but usually you should use regular expressions since they are more powerful.