
Match Length Of Two Python Lists

I have two Python lists of different length. One may assume that one of the lists is multiple times larger than the other one. Both lists contain the same physical data but capture

Solution 1:

Here's a shortened version of the code (not necessarily better performance):

a = [1,2,3,4,5,6,7,8,9,10]
b = [1,4.5,6.9]
order = 0  # To determine a and b.
if len(b) > len(a):
    a, b = b, a  # swap the values so that 'a' is always larger.
    order = 1

div = len(a) // len(b)  # Floor division; a plain / only floors in Python 2.
a = a[::div][:len(b)]

if order:
    print b
    print a
else:
    print a
    print b

Since you're ultimately discarding some of the latter elements of the larger list, an explicit for loop may increase performance, as then you don't have to "jump" to the values which will be discarded:

new_a = []
step = len(a) // len(b)  # distance between the elements to keep
index = 0
for i in range(len(b)):
    new_a.append(a[index])
    index += step
a = new_a
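The slicing approach above can be wrapped in a small helper. This is only a minimal sketch for Python 3 (the original snippets target Python 2); the name match_length is hypothetical, and it assumes both lists are non-empty.

```python
def match_length(a, b):
    """Downsample the longer list by striding so both have equal length.

    Hypothetical helper, assuming both lists are non-empty.
    Returns the lists in their original (a, b) order.
    """
    if len(a) >= len(b):
        step = len(a) // len(b)  # floor division, valid in Python 2 and 3
        return a[::step][:len(b)], b
    step = len(b) // len(a)
    return a, b[::step][:len(a)]


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [1, 4.5, 6.9]
short_a, short_b = match_length(a, b)
print(short_a)  # [1, 4, 7]
print(short_b)  # [1, 4.5, 6.9]
```

Returning the pair in argument order removes the need for the order flag used above.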

Solution 2:

First off, for performance you should be using numpy. The question's been tagged with numpy, so maybe you already are and just didn't show it, but in any case the lists can be converted to numpy arrays with

import numpy as np
a = np.array(a)
b = np.array(b)

Indexing is the same. It's possible to use len on arrays, but array.shape is more general, giving the following (very similar) code.

a[::a.shape[0] // b.shape[0]]

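Performance-wise, this should give a large boost in speed for most data. Testing with much larger a and b arrays (1e7 and 1e6 elements respectively) shows that numpy can give a large increase in performance. Note that basic slicing of a numpy array returns a view rather than a copy, which accounts for much of the difference.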

a = np.ones(10000000)
b = np.ones(1000000)

%timeit a[::a.shape[0] // b.shape[0]]  # Numpy arrays
1000000 loops, best of 3: 348 ns per loop

a = list(a)
b = list(b)
%timeit a[::len(a) // len(b)]    # Plain old python lists
1000000 loops, best of 3: 29.5 ms per loop
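
Part of the gap in the timings above comes from the fact that basic slicing of a numpy array returns a view onto the same memory, whereas slicing a list copies the selected elements. A small sketch that checks this, assuming numpy is installed (the array sizes here are arbitrary and much smaller than in the benchmark):

```python
import numpy as np

a = np.arange(10)            # stand-in for the longer array
b = np.array([1, 4.5, 6.9])  # stand-in for the shorter array

step = a.shape[0] // b.shape[0]
a_small = a[::step][:b.shape[0]]  # strided view, trimmed to len(b)

print(a_small)                       # [0 3 6]
print(np.shares_memory(a, a_small))  # True: no data was copied

a_copy = a_small.copy()              # copy if an independent array is needed
print(np.shares_memory(a, a_copy))   # False
```

Because a_small is a view, writing to it also changes a; calling .copy() avoids that at the cost of the copy.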

Solution 3:

If you're iterating over the list, you could use a generator so you don't have to copy the whole thing to memory.

from __future__ import division

a = [1,2,3,4,5,6,7,8,9,10]
b = [1,4.5,6.9]

def zip_downsample(a, b):
    if len(a) > len(b):
        b, a = a, b  # make b the longer list
    for i in xrange(len(a)):
        yield a[i], b[i * len(b) // len(a)]

for z in zip_downsample(a, b):
    print z
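
For readers on Python 3, the same idea can be sketched with range in place of xrange and print as a function. The swap means each pair is yielded as (item from the shorter list, item from the longer list), whichever order the arguments arrive in.

```python
def zip_downsample(a, b):
    """Yield pairs, sampling the longer list at evenly spread indices."""
    if len(a) > len(b):
        a, b = b, a  # make b the longer list
    for i in range(len(a)):
        # i * len(b) // len(a) maps index i of the short list onto the long one
        yield a[i], b[i * len(b) // len(a)]


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [1, 4.5, 6.9]
pairs = list(zip_downsample(a, b))
print(pairs)  # [(1, 1), (4.5, 4), (6.9, 7)]
```

Unlike a fixed stride, this index spreads the samples across the whole longer list when the lengths are not exact multiples: lengths 11 and 4 give indices 0, 2, 5, 8 rather than 0, 2, 4, 6.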

Solution 4:

# a = [1,2,3,4,5,6,7,8,9,10]
# b = [1,4.5,6.9]

a, b = zip(*zip(a, b))

# a = (1, 2, 3)
# b = (1, 4.5, 6.9)

The inner zip combines the lists into pairs, discarding the excess items from the larger list, returning something like [(1, 1), (2, 4.5), (3, 6.9)]. The outer zip then performs the inverse of this (since we unpack with the * operator), but since we have discarded the excess with the first zip, the results are the same size. This returns a pair of tuples, which we then unpack to the respective variables (a, b = ...).
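
One thing worth noting is that this truncates rather than downsamples: the longer list keeps only its first items, and both results come back as tuples. A short sketch for Python 3 that shows this and converts back to lists; it assumes neither input is empty, since unpacking an empty zip raises a ValueError.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [1, 4.5, 6.9]

# zip stops at the shorter input; zip(*...) then transposes the pairs back
a_cut, b_cut = zip(*zip(a, b))
print(a_cut)  # (1, 2, 3)
print(b_cut)  # (1, 4.5, 6.9)

# Convert back to lists if list methods are needed later
a_cut, b_cut = list(a_cut), list(b_cut)
print(a_cut)  # [1, 2, 3]
```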

See https://www.programiz.com/python-programming/methods/built-in/zip for more info on zip and using it as its own inverse.
