Nested Lambda Statements When Sorting Lists

January 31, 2024 Post a Comment

I wish to sort the below list first by the number, then by the text. lst = ['b-3', 'a-2', 'c-4', 'd-2'] # result: # ['a-2', 'd-2', 'b-3', 'c-4'] Attempt 1 res = sorted(lst, key=l

Solution 1:

There are 2 points to note:

One-line answers are not necessarily better. Using a named function is likely to make your code easier to read.
You are likely not looking for a nested lambda statement, as function composition is not part of the standard library (see Note #1). What you can do easily is have one lambda function return the result of anotherlambda function.

Therefore, the correct answer can found in Lambda inside lambda.

For your specific problem, you can use:

res = sorted(lst, key=lambda x: (lambda y: (int(y[1]), y[0]))(x.split('-')))

Remember that lambda is just a function. You can call it immediately after defining it, even on the same line.

Note #1: The 3rd party toolz library does allow composition:

from toolz import compose

res = sorted(lst, key=compose(lambda x: (int(x[1]), x[0]), lambda x: x.split('-')))

Note #2: As @chepner points out, the deficiency of this solution (repeated function calls) is one of the reasons why PEP-572 is ~~considered~~ implemented in Python 3.8.

Solution 2:

We can wrap the list returned by split('-') under another list and then we can use a loop to handle it:

# Using list-comprehension
>>> sorted(lst, key=lambda x: [(int(num), text) for text, num in [x.split('-')]])
['a-2', 'd-2', 'b-3', 'c-4']
# Using next()
>>> sorted(lst, key=lambda x: next((int(num), text) for text, num in [x.split('-')]))
['a-2', 'd-2', 'b-3', 'c-4']

Solution 3:

In almost all cases I would simply go with your second attempt. It's readable and concise (I would prefer three simple lines over one complicated line every time!) - even though the function name could be more descriptive. But if you use it as local function that's not going to matter much.

You also have to remember that Python uses a key function, not a cmp (compare) function. So to sort an iterable of length n the key function is called exactly n times, but sorting generally does O(n * log(n)) comparisons. So whenever your key-function has an algorithmic complexity of O(1) the key-function call overhead isn't going to matter (much). That's because:

O(n*log(n)) + O(n)   ==  O(n*log(n))

There's one exception and that's the best case for Pythons sort: In the best case the sort only does O(n) comparisons but that only happens if the iterable is already sorted (or almost sorted). If Python had a compare function (and in Python 2 there really was one) then the constant factors of the function would be much more significant because it would be called O(n * log(n)) times (called once for each comparison).

So don't bother about being more concise or making it much faster (except when you can reduce the big-O without introducing too big constant factors - then you should go for it!), the first concern should be readability. So you should really not do any nested lambdas or any other fancy constructs (except maybe as exercise).

Long story short, simply use your #2:

defsorter_func(x):
    text, num = x.split('-')
    returnint(num), text

res = sorted(lst, key=sorter_func)

By the way, it's also the fastest of all proposed approaches (although the difference isn't much):

Summary: It's readable and fast!

Code to reproduce the benchmark. It requires simple_benchmark to be installed for this to work (Disclaimer: It's my own library) but there are probably equivalent frameworks to do this kind of task, but I'm just familiar with it:

# My specs: Windows 10, Python 3.6.6 (conda)import toolz
import iteration_utilities as it

defapproach_jpp_1(lst):
    returnsorted(lst, key=lambda x: (int(x.split('-')[1]), x.split('-')[0]))

defapproach_jpp_2(lst):
    defsorter_func(x):
        text, num = x.split('-')
        returnint(num), text
    returnsorted(lst, key=sorter_func)

defjpp_nested_lambda(lst):
    returnsorted(lst, key=lambda x: (lambda y: (int(y[1]), y[0]))(x.split('-')))

deftoolz_compose(lst):
    returnsorted(lst, key=toolz.compose(lambda x: (int(x[1]), x[0]), lambda x: x.split('-')))

defAshwiniChaudhary_list_comprehension(lst):
    returnsorted(lst, key=lambda x: [(int(num), text) for text, num in [x.split('-')]])

defAshwiniChaudhary_next(lst):
    returnsorted(lst, key=lambda x: next((int(num), text) for text, num in [x.split('-')]))

defPaulCornelius(lst):
    returnsorted(lst, key=lambda x: tuple(f(a) for f, a inzip((int, str), reversed(x.split('-')))))

defJeanFrançoisFabre(lst):
    returnsorted(lst, key=lambda s : [x if i elseint(x) for i,x inenumerate(reversed(s.split("-")))])

defiteration_utilities_chained(lst):
    returnsorted(lst, key=it.chained(lambda x: x.split('-'), lambda x: (int(x[1]), x[0])))

from simple_benchmark import benchmark
import random
import string

funcs = [
    approach_jpp_1, approach_jpp_2, jpp_nested_lambda, toolz_compose, AshwiniChaudhary_list_comprehension,
    AshwiniChaudhary_next, PaulCornelius, JeanFrançoisFabre, iteration_utilities_chained
]

arguments = {2**i: ['-'.join([random.choice(string.ascii_lowercase),
                              str(random.randint(0, 2**(i-1)))]) 
                    for _ inrange(2**i)] 
             for i inrange(3, 15)}

b = benchmark(funcs, arguments, 'list size')

%matplotlib notebook
b.plot_difference_percentage(relative_to=approach_jpp_2)

I took the liberty to include a function composition approach of one of my own libraries iteration_utilities.chained:

from iteration_utilities import chained
sorted(lst, key=chained(lambda x: x.split('-'), lambda x: (int(x[1]), x[0])))

It's quite fast (2nd or 3rd place) but still slower than using your own function.

Note that the key overhead would be more significant if you used a function that had O(n) (or better) algorithmic complexity, for example min or max. Then the constant factors of the key-function would be more significant!

Solution 4:

lst = ['b-3', 'a-2', 'c-4', 'd-2']
res = sorted(lst, key=lambda x: tuple(f(a) for f, a inzip((int, str), reversed(x.split('-')))))
print(res)

['a-2', 'd-2', 'b-3', 'c-4']

Solution 5:

you could convert to integer only if the index of the item is 0 (when reversing the splitted list). The only object (besides the result of split) which is created is the 2-element list used for comparison. The rest are just iterators.

sorted(lst,key = lambda s : [x if i elseint(x) for i,x inenumerate(reversed(s.split("-")))])

As an aside, the - token isn't particularly great when numbers are involved, because it complicates the use of negative numbers (but can be solved with s.split("-",1)

alezinhacris