Python: Remove Elements From The List Which Are Prefix Of Other
Solution 1:
If your list is sorted, every element is either a prefix of the next one, or not a prefix of any of them. Therefore, you can write:
ls.sort()
[ls[i] for i in range(len(ls))[:-1] if ls[i] != ls[i+1][:len(ls[i])]] + [ls[-1]]
This will be n log(n) for the sorting plus one pass through the list (n).
If your list is already sorted, as your current one is, it is quicker in practice as well, because only the linear pass remains; timeit gives 2.11 us.
A slightly quicker (though not asymptotically faster) and more Pythonic implementation uses zip:
[x for x, y in zip(ls[:-1], ls[1:]) if x != y[:len(x)]] + [ls[-1]]
timeit gives 1.77 us
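The one-liners above raise an IndexError on an empty list (because of ls[-1]) and sort the input in place. A minimal sketch of the same idea wrapped in a function follows; the name drop_prefixes and the sample list are assumptions made for illustration, not part of the original answer. Note that exact duplicates collapse to a single copy, since a string counts as a prefix of itself.

```python
def drop_prefixes(items):
    """Return the sorted items that are not a prefix of another item."""
    ordered = sorted(items)  # sorted copy, so the caller's list is left untouched
    if not ordered:
        return []  # the one-liner would raise IndexError on ls[-1] here
    # In sorted order, an item that is a prefix of anything is a prefix of its successor.
    kept = [x for x, y in zip(ordered, ordered[1:]) if not y.startswith(x)]
    kept.append(ordered[-1])  # the last item has no successor, so it always stays
    return kept


sample = ["DEF", "ABC", "JKLM", "AB", "DEFGHI", "EF"]  # hypothetical input
print(drop_prefixes(sample))  # ['ABC', 'DEFGHI', 'EF', 'JKLM']
```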
Solution 2:
List comprehension (ls is the name of your input list):
[x for x in ls if x not in [y[:len(x)] for y in ls if y != x]]
I doubt it is the quickest in terms of performance, but the idea is very straightforward: go through the list element by element and check whether each one is a prefix of any of the other elements. That compares every pair, so the running time grows quadratically with the length of the list.
timeit result: 11.9 us per loop (though the scaling is more important if you are going to use it for large lists)
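The same all-pairs check can be written with any() and startswith, which stops at the first match instead of building a list of slices for every element. This is only a sketch: the function name and sample data are hypothetical, and it is still quadratic in the worst case. Unlike the sorting approach, it keeps the original order of the input.

```python
def drop_prefixes_unsorted(items):
    """Keep input order; drop any item that is a prefix of a different item."""
    return [
        x for x in items
        # any() short-circuits as soon as one longer string starting with x is found
        if not any(y != x and y.startswith(x) for y in items)
    ]


sample = ["DEF", "ABC", "JKLM", "AB", "DEFGHI", "EF"]  # hypothetical input
print(drop_prefixes_unsorted(sample))  # ['ABC', 'JKLM', 'DEFGHI', 'EF']
```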
Solution 3:
Call ls.sort() first if your list is originally unordered, then use startswith:
In [71]: [i for i, j in zip(ls[:-1], ls[1:]) if not j.startswith(i)]+[ls[-1]]
Out[71]: ['ABCDEFG', 'BCD', 'DEFGHI', 'EF', 'GKL', 'JKLM']
or enumerate:
[v for i, v in enumerate(ls[:-1]) if not ls[i+1].startswith(v)]+[ls[-1]]
Compared with @sashkello's approach:
In [78]: timeit [v for i, v in enumerate(ls[:-1]) if not ls[i+1].startswith(v)]+[ls[-1]]
10000 loops, best of 3: 29.6 us per loop
In [79]: timeit [i for i, j in zip(ls[:-1], ls[1:]) if not j.startswith(i)]+[ls[-1]]
10000 loops, best of 3: 28.5 us per loop
In [80]: timeit [x for x in ls if x not in [y[:len(x)] for y in ls if y != x]]
1000 loops, best of 3: 1.77 ms per loop
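To see how the gap behaves on a larger list, the comparison can be reproduced outside IPython with the timeit module. The sketch below is an assumption-laden illustration: the function names and the randomly generated strings are made up here, and the absolute numbers will differ from machine to machine.

```python
import random
import timeit


def sorted_scan(items):
    # Sort, then compare each item only with its successor.
    ordered = sorted(items)
    if not ordered:
        return []
    return [x for x, y in zip(ordered, ordered[1:]) if not y.startswith(x)] + [ordered[-1]]


def all_pairs(items):
    # Compare every item with every other item (quadratic).
    return [x for x in items if x not in [y[:len(x)] for y in items if y != x]]


random.seed(0)  # fixed seed so repeated runs build the same set of strings
# Hypothetical data: distinct random strings over "ABCD", 1 to 6 characters long.
words = list({
    "".join(random.choices("ABCD", k=random.randint(1, 6)))
    for _ in range(500)
})

for func in (sorted_scan, all_pairs):
    seconds = timeit.timeit(lambda: func(words), number=5)
    print(f"{func.__name__}: {seconds / 5 * 1e3:.3f} ms per call on {len(words)} items")
```

On distinct inputs both functions keep the same items; only the order of the result differs, because sorted_scan returns them sorted.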