How Do I Check If A String Matches A Set Pattern In Python?
Solution 1:
import re
string = 'the apple is red'
re.search(r'^the (apple|orange|grape) is (red|orange|violet)', string)
Here's an example of it running:
In [20]: re.search(r'^the (apple|orange|grape) is (red|orange|violet)', string).groups()
Out[20]: ('apple', 'red')
If there are no matches, re.search() returns None, so calling .groups() on the result would raise an AttributeError.
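Because a failed search gives back None, it is usually worth checking the result before asking for its groups. Here is a minimal sketch of that check; the helper name extract_fruit_and_color is made up for illustration and is not part of the re module.

```python
import re

# Same pattern as above, compiled once so it can be reused.
PATTERN = re.compile(r'^the (apple|orange|grape) is (red|orange|violet)')

def extract_fruit_and_color(text):
    """Return (fruit, color) if text matches the pattern, else None.

    Hypothetical helper, named here only for illustration.
    """
    match = PATTERN.search(text)
    if match is None:
        # No match: return None instead of calling .groups() on None.
        return None
    return match.groups()

print(extract_fruit_and_color('the apple is red'))
print(extract_fruit_and_color('the lizard is chartreuse'))
```

Note that the pattern is only anchored at the start, so a string with extra trailing text still matches. If the whole string must fit the pattern, re.fullmatch() may be a better fit than re.search().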
You may know "next to nothing about regex" but you nearly wrote the pattern.
The sections within the parentheses can contain their own regex patterns, too. So you could match "apple" and "apples" with
r'the (apple[s]*|orange|grape)'
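One thing to be aware of: [s]* means "zero or more s characters", so it accepts more than just the plural. If only "apple" and "apples" should match, s? (zero or one) is a tighter choice. A short sketch comparing the two, using re.fullmatch() so trailing characters are not ignored; the variable names loose and strict are just labels for this example.

```python
import re

# [s]* allows any number of trailing "s"; s? allows at most one.
loose = re.compile(r'the (apple[s]*|orange|grape)')
strict = re.compile(r'the (apples?|orange|grape)')

for text in ('the apple', 'the apples', 'the applesss'):
    # fullmatch requires the entire string to fit the pattern.
    print(text, bool(loose.fullmatch(text)), bool(strict.fullmatch(text)))
```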
Solution 2:
The re based solutions for this kind of problem work great. But it would sure be nice if there were an easy way to pull data out of strings in Python without having to learn regex (or to learn it AGAIN, which is what I always end up having to do since my brain is broken).
Thankfully, someone took the time to write parse.
parse
parse is a nice package for this kind of thing. It uses regular expressions under the hood, but the API is based on the string format specification mini-language, which most Python users will already be familiar with.
For a format spec you will use over and over again, you'd use parse.compile. Here is an example:
>>> import parse
>>> theaisb_parser = parse.compile('the {} is {}')
>>> fruit, color = theaisb_parser.parse('the apple is red')
>>> print(fruit, color)
apple red
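To give a feel for what "regular expressions under the hood" can mean, here is a rough standard-library sketch that turns a format string with anonymous {} fields into a regex. This is NOT how parse is actually implemented; the function name format_to_regex is invented for this example, and it handles only bare {} fields with no names, types, or format specs.

```python
import re

def format_to_regex(fmt):
    """Turn a format string with anonymous '{}' fields into a compiled regex.

    Illustrative only: an assumption-laden sketch, not the parse package's code.
    """
    # Escape the literal text between fields so it is matched verbatim.
    parts = [re.escape(piece) for piece in fmt.split('{}')]
    # Each '{}' becomes a non-greedy capture group; anchor both ends.
    return re.compile('^' + '(.+?)'.join(parts) + '$')

matcher = format_to_regex('the {} is {}')
match = matcher.match('the apple is red')
print(match.groups())
```

For anything beyond a toy case (named fields, type conversions, alignment), the real package is the better option.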
parmatter
I have put a package I created for my own use on PyPI in case others find it useful. It makes things just a little bit nicer. It makes heavy use of parse. The idea is to combine the functionality of a string.Formatter and a parse.Parser into a single object, which I have called a parmatter (also the package name).
The package contains a number of useful custom parmatter types. StaticParmatter has a precompiled parsing specification (similar to the object from parse.compile above). Use it like this:
>>> from parmatter import StaticParmatter
>>> theaisb = StaticParmatter('the {} is {}')
>>> print(theaisb.format('lizard', 'chartreuse'))
the lizard is chartreuse
>>> fruit, color = theaisb.unformat('the homynym is ogive')
>>> print(fruit, color)
homynym ogive
Note that for "unformatting", the parse package uses the method name parse. However, my package uses unformat. The reason for this is that parmatter classes are subclassed from string.Formatter, and string.Formatter already has a .parse() method (which provides different functionality). Additionally, I think unformat is a more intuitive method name, anyway.
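To see why the name clash matters, here is a quick look at what the standard library's string.Formatter.parse() does. It does not extract values from a finished string; it breaks the format string itself into (literal_text, field_name, format_spec, conversion) tuples. The variable names below are just for this example.

```python
import string

formatter = string.Formatter()

# parse() walks the FORMAT STRING and yields one tuple per field:
# (literal_text, field_name, format_spec, conversion)
fields = list(formatter.parse('the {} is {}'))

for literal_text, field_name, format_spec, conversion in fields:
    print(repr(literal_text), repr(field_name), repr(format_spec), repr(conversion))
```

So a subclass that wanted "unformatting" behavior under the name parse would have to override this existing method, which is the reason given above for choosing a different name.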
EDIT: see also my previous answer to another question, which discusses these packages as well.