How To Retrieve Partial Matches From A List Of Strings

January 30, 2024

For approaches to retrieving partial matches in a numeric list, go to: "How to return a subset of a list that matches a condition?" and "Python: Find in list". But if you're looking fo

Solution 1:

- `startswith` and `in` both return a Boolean.
- The `in` operator is a test of membership; for strings, it tests whether one string occurs anywhere inside another.
- The matching can be performed with a list comprehension or with `filter`.
- Using a list comprehension with `in` is the fastest implementation tested.
- If case is not an issue, consider mapping all the words to lowercase:
  - l = list(map(str.lower, l))
- Tested with Python 3.10.0.
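The lowercase suggestion above can be sketched in two ways. This is a minimal illustration, not part of the original answer: the mixed-case sample words are made up, and the second option assumes you want to keep the original spelling in the result rather than overwrite the list.

```python
# Hypothetical sample data with mixed casing (not from the original post).
l = ['Ones', 'TWOS', 'Threes', 'threefold']
wanted = 'three'

# Option 1: normalize the whole list first, as suggested above.
lowered = list(map(str.lower, l))
result_lowered = [v for v in lowered if v.startswith(wanted)]

# Option 2: compare in lowercase but keep the original spelling.
result_original = [v for v in l if v.lower().startswith(wanted.lower())]

print(result_lowered)   # ['threes', 'threefold']
print(result_original)  # ['Threes', 'threefold']
```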

`filter`:

Using filter creates a filter object, so list() is used to show all the matching values in a list.

l = ['ones', 'twos', 'threes']
wanted = 'three'

# using startswith
result = list(filter(lambda x: x.startswith(wanted), l))

# using in
result = list(filter(lambda x: wanted in x, l))

print(result)
[out]:
['threes']
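Because a filter object is a lazy iterator, it can only be consumed once. The short sketch below reuses the list from the example above and shows why the result is usually wrapped in list() right away; the variable names are chosen only for illustration.

```python
l = ['ones', 'twos', 'threes']
wanted = 'three'

matches = filter(lambda x: wanted in x, l)  # nothing is evaluated yet

first_pass = list(matches)   # consumes the iterator
second_pass = list(matches)  # the iterator is now exhausted

print(first_pass)   # ['threes']
print(second_pass)  # []
```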

`list-comprehension`

l = ['ones', 'twos', 'threes']
wanted = 'three'

# using startswith
result = [v for v in l if v.startswith(wanted)]

# using in
result = [v for v in l if wanted in v]

print(result)
[out]:
['threes']
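Both tests give the same result for this particular list, but they are not equivalent: startswith only matches at the beginning of a string, while in matches anywhere. A small sketch with a made-up list (an assumption, not from the original answer) where the two disagree:

```python
# Hypothetical list where the two tests disagree (not from the original post).
l = ['threes', 'twothree', 'thirty-three', 'four']
wanted = 'three'

prefix_matches = [v for v in l if v.startswith(wanted)]
substring_matches = [v for v in l if wanted in v]

print(prefix_matches)     # ['threes']
print(substring_matches)  # ['threes', 'twothree', 'thirty-three']
```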

Which implementation is faster?

Tested in Jupyter Lab using the words corpus from nltk v3.6.5, which has 236736 words
Words with 'three'
- ['three', 'threefold', 'threefolded', 'threefoldedness', 'threefoldly', 'threefoldness', 'threeling', 'threeness', 'threepence', 'threepenny', 'threepennyworth', 'threescore', 'threesome']

from nltk.corpus import words

%timeit list(filter(lambda x: x.startswith(wanted), words.words()))
[out]:
64.8 ms ± 856 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit list(filter(lambda x: wanted in x, words.words()))
[out]:
54.8 ms ± 528 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit [v for v in words.words() if v.startswith(wanted)]
[out]:
57.5 ms ± 634 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit [v for v in words.words() if wanted in v]
[out]:
50.2 ms ± 791 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
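The %timeit magic is only available in IPython and Jupyter. As a rough sketch of how a similar comparison might be run in a plain script, the standard-library timeit module can be used instead. The word list here is synthetic (an assumption, to avoid the nltk dependency) and is built once, outside the timed calls, so the absolute numbers will differ from those above.

```python
import timeit

# Synthetic stand-in for the nltk corpus (assumption: any large list of
# strings is enough to compare the approaches relative to each other).
word_list = [f'word{i}' for i in range(50_000)] + ['three', 'threefold', 'threesome']
wanted = 'three'

candidates = {
    'filter + startswith': lambda: list(filter(lambda x: x.startswith(wanted), word_list)),
    'filter + in': lambda: list(filter(lambda x: wanted in x, word_list)),
    'comprehension + startswith': lambda: [v for v in word_list if v.startswith(wanted)],
    'comprehension + in': lambda: [v for v in word_list if wanted in v],
}

for name, func in candidates.items():
    # Best of 3 repeats, 5 calls each; timings vary by machine and Python version.
    best = min(timeit.repeat(func, number=5, repeat=3))
    print(f'{name}: {best / 5 * 1000:.2f} ms per call')
```

Taking the minimum of several repeats is a common way to reduce noise from other processes; it does not make results comparable across machines.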

Solution 2:

Instead of returning the result of the any() function, you can use a for-loop to look for the string:

def find_match(string_list, wanted):
    for string in string_list:
        if string.startswith(wanted):
            return string
    return None

>>> find_match(['ones', 'twos', 'threes'], "three")
'threes'

Solution 3:

A simple, direct answer:

test_list = ['one', 'two', 'threefour']
r = [s for s in test_list if s.startswith('three')]
print(r[0] if r else 'nomatch')

Result:

threefour

Not sure what you want to do in the non-matching case. r[0] is exactly what you asked for if there is a match, but it raises an IndexError if there is no match. The print deals with this by falling back to a default string, but you may want to handle it differently.
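How to handle the non-matching case depends on the caller. As one possible sketch (the function name first_match and its error message are hypothetical, not part of the answer above), the lookup can be wrapped so that a missing match raises a descriptive error:

```python
def first_match(strings, prefix):
    """Return the first string starting with prefix, or raise ValueError."""
    matches = [s for s in strings if s.startswith(prefix)]
    if not matches:
        # Hypothetical error message, chosen for this illustration.
        raise ValueError(f'no string starts with {prefix!r}')
    return matches[0]


test_list = ['one', 'two', 'threefour']
print(first_match(test_list, 'three'))  # threefour

try:
    first_match(test_list, 'five')
except ValueError as exc:
    print(exc)  # no string starts with 'five'
```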

Solution 4:

I'd say the most closely related solution would be to use next instead of any:

>>> next((s for s in l if s.startswith(wanted)), 'mydefault')
'threes'
>>> next((s for s in l if s.startswith('blarg')), 'mydefault')
'mydefault'

Just like any, it stops the search as soon as it finds a match, and it only takes O(1) space, unlike the list comprehension solutions, which always process the whole list and take O(n) space.
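To make the early-exit behaviour concrete, the sketch below counts how many elements each approach inspects. The helper starts_with_logged and the checked list are hypothetical, added only for instrumentation, and the list is reordered so that the match is not the last element.

```python
l = ['ones', 'threes', 'twos', 'fours']
wanted = 'three'
checked = []  # records every element the predicate is called on


def starts_with_logged(s):
    checked.append(s)
    return s.startswith(wanted)


# next() pulls items from the generator only until the first match.
first = next((s for s in l if starts_with_logged(s)), 'mydefault')
checked_by_next = list(checked)

checked.clear()
# A list comprehension always walks the whole list.
all_matches = [s for s in l if starts_with_logged(s)]
checked_by_comprehension = list(checked)

print(first, checked_by_next)                 # threes ['ones', 'threes']
print(all_matches, checked_by_comprehension)  # ['threes'] ['ones', 'threes', 'twos', 'fours']
```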

Ooh, alternatively just use any as is but remember the last checked element:

>>> if any((match := s).startswith(wanted) for s in l):
        print(match)

threes
>>> if any((match := s).startswith('blarg') for s in l):
        print(match)

>>>

Another variation, only assign the matching element:

>>> if any(s.startswith(wanted) and (match := s) for s in l):
        print(match)

threes

(You might want to include something like `or True` if a matching s could be the empty string, since an empty string is falsy and would otherwise be skipped.)
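The empty-string caveat can be illustrated as follows. This is a sketch under a contrived assumption (an empty prefix, which every string starts with) and needs Python 3.8 or later for the := operator. Note the extra parentheses around `(match := s) or True`: without them, `and` binds more tightly than `or` and the whole condition would always be true.

```python
l = ['', 'ones']
wanted = ''  # contrived: every string starts with the empty prefix

# Without the guard: the first match is '', which is falsy, so any() moves on.
if any(s.startswith(wanted) and (match := s) for s in l):
    without_guard = match
print(repr(without_guard))  # 'ones'

# With the guard: the assignment still happens, but the test stays truthy.
if any(s.startswith(wanted) and ((match := s) or True) for s in l):
    with_guard = match
print(repr(with_guard))  # ''
```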

Solution 5:

This seems simple to me, so I might have misread, but you could just run it through a for loop with an if statement:

l = ['ones', 'twos', 'threes']
wanted = 'three'

def run():
    for s in l:
        if (s.startswith(wanted)):
            return s

print(run())

output: threes

alezinhacris
