Duplicate strings in a list and add integer suffixes to the newly added ones

2022-01-10 00:00:00 python list string performance duplicates

Problem description

Suppose I have a list:

l = ['a', 'b', 'c']

And its suffix list:

l2 = ['a_1', 'b_2', 'c_3']

The desired output is:

out_l = ['a', 'a_1', 'b', 'b_2', 'c', 'c_3']

The result is the interleaved version of the two lists above.

I can write a regular for loop to get this done, but I'm wondering if there's a more Pythonic way (e.g., using a list comprehension or lambda) to do it.

I've tried something like this:

list(map(lambda x: x[1]+'_'+str(x[0]+1), enumerate(l)))
# this only returns ['a_1', 'b_2', 'c_3']

Furthermore, what changes would be needed for the general case, i.e., for two or more lists where l2 is not necessarily derived from l?

Solution

yield

You can use a generator for an elegant solution. At each iteration, yield twice: once with the original element, and once with the element plus its suffix.

The generator will need to be exhausted; that can be done by tacking on a list call at the end.

def transform(l):
    for i, x in enumerate(l, 1):
        yield x
        yield f'{x}_{i}'  # or '{}_{}'.format(x, i)

You can also re-write this using the yield from syntax for generator delegation:

def transform(l):
    for i, x in enumerate(l, 1):
        yield from (x, f'{x}_{i}')  # or (x, '{}_{}'.format(x, i))

out_l = list(transform(l))
print(out_l)
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']

If you're on a version older than Python 3.6, replace f'{x}_{i}' with '{}_{}'.format(x, i).

Generalising
Consider a general scenario where you have N lists of the form:

l1 = [v11, v12, ...]
l2 = [v21, v22, ...]
l3 = [v31, v32, ...]
...

Which you would like to interleave. These lists are not necessarily derived from each other.

To interleave these N lists, iterate over tuples of corresponding elements (one from each list) using zip:

def transformN(*args):
    for vals in zip(*args):
        yield from vals

out_l = list(transformN(l1, l2, l3, ...))
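
As a concrete illustration of the generator above, here is a minimal sketch with three short lists. The list names and values (letters, numbers, symbols) are made up for this example and do not come from the question. One thing worth keeping in mind: zip stops at the shortest input, so with lists of unequal length the extra elements are silently dropped.

```python
def transformN(*args):
    # zip groups the i-th element of every list into one tuple;
    # `yield from` then emits that tuple's elements one at a time.
    for vals in zip(*args):
        yield from vals

# Hypothetical sample data, chosen only for illustration.
letters = ['a', 'b', 'c']
numbers = [1, 2, 3]
symbols = ['x', 'y', 'z']

# The generator is lazy, so wrap it in list() to materialise the result.
out_l = list(transformN(letters, numbers, symbols))
print(out_l)  # ['a', 1, 'x', 'b', 2, 'y', 'c', 3, 'z']
```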


Sliced list.__setitem__

I'd recommend this from the perspective of performance. First preallocate a list of the final size (twice the length of l), then assign the items to their positions using sliced list assignment. The elements of l go into the even indexes, and the suffixed elements (l modified) go into the odd indexes.

out_l = [None] * (len(l) * 2)
out_l[::2] = l
out_l[1::2] = [f'{x}_{i}' for i, x in enumerate(l, 1)]  # or ['{}_{}'.format(x, i) ...]

print(out_l)
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']

This is consistently the fastest from my timings (below).

Generalising
To handle N lists, assign each list to its own slice, using the number of lists as the step. This assumes all lists have the same length.

list_of_lists = [l1, l2, ...]

n = len(list_of_lists)
out_l = [None] * (len(list_of_lists[0]) * n)
for i, l in enumerate(list_of_lists):
    out_l[i::n] = l
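
The slice-assignment idea carries over to N lists when N is used as the slice step, so that list number i fills positions i, i + N, i + 2N, and so on. The sketch below assumes all lists have the same length; the function name interleave_slices and the sample data are hypothetical and only serve to make the example runnable.

```python
def interleave_slices(list_of_lists):
    """Interleave equal-length lists using strided slice assignment."""
    n = len(list_of_lists)
    if n == 0:
        return []
    # Preallocate the output: one slot per element of every list.
    out = [None] * (len(list_of_lists[0]) * n)
    for i, lst in enumerate(list_of_lists):
        # List i occupies every n-th position, starting at offset i.
        out[i::n] = lst
    return out

# Hypothetical sample data: the original list plus two derived lists.
l = ['a', 'b', 'c']
suffixed = [f'{x}_{i}' for i, x in enumerate(l, 1)]
upper = [x.upper() for x in l]

print(interleave_slices([l, suffixed, upper]))
# ['a', 'a_1', 'A', 'b', 'b_2', 'B', 'c', 'c_3', 'C']
```

With only two lists the step is 2, which reduces to the even/odd assignment shown earlier.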


zip + chain.from_iterable

A functional approach, similar to @chrisz' solution. Construct pairs using zip and then flatten them using itertools.chain.

from itertools import chain
# or ['{}_{}'.format(x, i) ...]
out_l = list(chain.from_iterable(zip(l, [f'{x}_{i}' for i, x in enumerate(l, 1)])))

print(out_l)
['a', 'a_1', 'b', 'b_2', 'c', 'c_3']

itertools.chain is widely regarded as the Pythonic approach to flattening a list of lists.

Generalising
This is the simplest solution to generalise, and I suspect the most efficient for multiple lists when N is large.

list_of_lists = [l1, l2, ...]
out_l = list(chain.from_iterable(zip(*list_of_lists)))


Performance

Let's take a look at some perf-tests for the simple case of two lists (one list and its suffixed counterpart). General cases will not be tested, since the results vary widely with the data.

Benchmarking code, for reference.

Functions

from itertools import chain  # needed by cs3

def cs1(l):
    def _cs1(l):
        for i, x in enumerate(l, 1):
            yield x
            yield f'{x}_{i}'

    return list(_cs1(l))

def cs2(l):
    out_l = [None] * (len(l) * 2)
    out_l[::2] = l
    out_l[1::2] = [f'{x}_{i}' for i, x in enumerate(l, 1)]

    return out_l

def cs3(l):
    return list(chain.from_iterable(
        zip(l, [f'{x}_{i}' for i, x in enumerate(l, 1)])))

def ajax(l):
    return [
        i for b in [[a, '{}_{}'.format(a, i)] 
        for i, a in enumerate(l, start=1)] 
        for i in b
    ]

def ajax_cs0(l):
    # suggested improvement to ajax solution
    return [j for i, a in enumerate(l, 1) for j in [a, '{}_{}'.format(a, i)]]

def chrisz(l):
    return [
        val 
        for pair in zip(l, [f'{k}_{j+1}' for j, k in enumerate(l)]) 
        for val in pair
    ]
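
The timing harness itself is not reproduced above. As a rough sketch of how such functions could be compared, the snippet below uses the standard-library timeit module on three of the approaches. The helper name bench, the input size, and the repeat counts are arbitrary assumptions rather than the settings behind the original timings, and no particular results are claimed.

```python
import timeit
from itertools import chain

def cs1(l):
    def _cs1(l):
        for i, x in enumerate(l, 1):
            yield x
            yield f'{x}_{i}'
    return list(_cs1(l))

def cs2(l):
    out_l = [None] * (len(l) * 2)
    out_l[::2] = l
    out_l[1::2] = [f'{x}_{i}' for i, x in enumerate(l, 1)]
    return out_l

def cs3(l):
    return list(chain.from_iterable(
        zip(l, [f'{x}_{i}' for i, x in enumerate(l, 1)])))

def bench(funcs, data, number=20, repeat=3):
    """Hypothetical helper: best observed seconds per call for each function."""
    results = {}
    for f in funcs:
        # timeit.repeat returns one total time per repeat; keep the best run.
        times = timeit.repeat(lambda: f(data), number=number, repeat=repeat)
        results[f.__name__] = min(times) / number
    return results

# Arbitrary input size, chosen only for illustration.
data = [str(n) for n in range(10_000)]

# Sanity check: all approaches must agree before their timings mean anything.
assert cs1(data) == cs2(data) == cs3(data)

for name, secs in sorted(bench([cs1, cs2, cs3], data).items(), key=lambda kv: kv[1]):
    print(f'{name}: {secs * 1e6:.1f} microseconds per call')
```

Taking the minimum over several repeats reduces noise from other processes; absolute numbers will differ between machines and Python versions.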
