
Sort The Top Ten Results

I am getting a list in which I am saving the results in the following way:

City Percentage
Mumbai 98.30
London 23.23
Agra 12.22
.....

The list structure is [['Mumbai',98.30],['Lond

Solution 1:

If the list is fairly short then, as others have suggested, you can sort it and slice it. If the list is very large, you may be better off using heapq.nlargest():

>>> import heapq
>>> lis = [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
>>> heapq.nlargest(2, lis, key=lambda x: x[1])
[['Mumbai', 98.3], ['London', 23.23]]

The difference is that nlargest makes only a single pass through the data, and if you are reading from a file or another generated source, the records need not all be in memory at the same time.
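As a rough illustration of the single-pass point, the sketch below feeds nlargest from a generator instead of a list. The io.StringIO object is only a stand-in for a real file, and the helper name read_records is made up for this example.

```python
import heapq
import io

# Stand-in for a large file on disk, using the sample data from the question.
sample_file = io.StringIO(
    "City Percentage\n"
    "Mumbai 98.30\n"
    "London 23.23\n"
    "Agra 12.22\n"
)

def read_records(lines):
    """Yield (city, percentage) pairs one at a time, skipping the header."""
    next(lines)  # skip the "City Percentage" header row
    for line in lines:
        city, value = line.split()
        yield city, float(value)

# nlargest pulls records from the generator one by one, so only the
# current best candidates are kept, not the whole file.
top_two = heapq.nlargest(2, read_records(sample_file), key=lambda rec: rec[1])
print(top_two)  # [('Mumbai', 98.3), ('London', 23.23)]
```

With a real file you would pass the open file object to read_records in place of sample_file.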

You might also be interested in looking at the source of nlargest(), as it works in much the same way that you were trying to solve the problem: it keeps only the desired number of elements in a data structure known as a heap. Each new value is pushed onto the heap and then the smallest value is popped from it.
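The push-then-pop idea can be sketched roughly as follows. This is not the actual source of nlargest, only a simplified illustration; the function name top_n is invented here, and records are assumed to be [city, percentage] pairs as in the question.

```python
import heapq

def top_n(records, n):
    """Return the n records with the largest percentage, largest first."""
    heap = []  # min-heap of (percentage, index, record); heap[0] is the smallest kept
    for index, record in enumerate(records):
        # The unique index breaks ties, so two records are never compared directly.
        entry = (record[1], index, record)
        if len(heap) < n:
            heapq.heappush(heap, entry)
        else:
            # Push the new entry, then pop the smallest one,
            # so the heap never holds more than n entries.
            heapq.heappushpop(heap, entry)
    return [record for _, _, record in sorted(heap, reverse=True)]

lis = [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
print(top_n(lis, 2))  # [['Mumbai', 98.3], ['London', 23.23]]
```

Because the heap holds at most n entries, memory use depends on n rather than on the length of the input.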

Edit to show comparative timing:

>>> import random
>>> records = []
>>> for i in range(100000):
    value = random.random() * 100
    records.append(('city {:2.4f}'.format(value), value))


>>> import heapq
>>> heapq.nlargest(10, records, key=lambda x:x[1])
[('city 99.9995', 99.99948904248298), ('city 99.9974', 99.99738898315216), ('city 99.9964', 99.99642759230214), ('city 99.9935', 99.99345173704319), ('city 99.9916', 99.99162694442714), ('city 99.9908', 99.99075084123544), ('city 99.9887', 99.98865134685201), ('city 99.9879', 99.98792632193258), ('city 99.9872', 99.98724339718686), ('city 99.9854', 99.98540548350132)]
>>> import timeit
>>> timeit.timeit('sorted(records, key=lambda x:x[1])[:10]', setup='from __main__ import records', number=10)
1.388942152229788
>>> timeit.timeit('heapq.nlargest(10, records, key=lambda x:x[1])', setup='import heapq;from __main__ import records', number=10)
0.5476185073315492

On my system getting the top 10 from 100 records is fastest by sorting and slicing, but with 1,000 or more records it is faster to use nlargest.
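If you want to check where the crossover falls on your own machine, a small harness along these lines may help. The function names and the chosen sizes are arbitrary, and the timings will differ from system to system.

```python
import heapq
import random
import timeit

def by_sorting(records, n=10):
    """Top n by sorting everything in descending order and slicing."""
    return sorted(records, key=lambda x: x[1], reverse=True)[:n]

def by_heap(records, n=10):
    """Top n using heapq.nlargest."""
    return heapq.nlargest(n, records, key=lambda x: x[1])

def compare(size, repeats=10):
    """Time both approaches on `size` random records.

    Returns (sort_seconds, heap_seconds).
    """
    records = [('city {}'.format(i), random.random() * 100) for i in range(size)]
    # Sanity check: both approaches should agree on the top ten.
    assert by_sorting(records) == by_heap(records)
    sort_time = timeit.timeit(lambda: by_sorting(records), number=repeats)
    heap_time = timeit.timeit(lambda: by_heap(records), number=repeats)
    return sort_time, heap_time

for size in (100, 1000, 10000):
    print(size, compare(size))
```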

Solution 2:

Sort the list first and then slice it:

>>> lis = [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
>>> print(sorted(lis, key=lambda x: x[1], reverse=True)[:10])  # [:10] returns the first ten items
[['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]

To get data in list form from that file use this:

with open('abc') as f:
    next(f)  # skip the header line
    lis = [[city, float(val)] for city, val in (line.split() for line in f)]
    print(lis)
    # [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]

Update:

new_lis = sorted(sc_percentage, key = lambda x : x[1], reverse = True)[:10]
for item in new_lis:
    print(item)

sorted returns a new sorted list. Because we need to sort the list by the second item of each element, we pass the key parameter.

key=lambda x: x[1] means: use the value at index 1 of each item (i.e. 100.0, 75.0, etc.) for comparison.

reverse=True sorts in descending order, so the largest values come first.
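A small sketch of how key and reverse behave, using the sample data from the question in a shuffled order. operator.itemgetter(1) is used here only as an alternative spelling of lambda x: x[1]; the variable names are made up for the example.

```python
from operator import itemgetter

lis = [['London', 23.23], ['Mumbai', 98.3], ['Agra', 12.22]]

# itemgetter(1) builds a function that returns item[1], like lambda x: x[1].
by_percentage = itemgetter(1)

ascending = sorted(lis, key=by_percentage)
descending = sorted(lis, key=by_percentage, reverse=True)

print(ascending)   # [['Agra', 12.22], ['London', 23.23], ['Mumbai', 98.3]]
print(descending)  # [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
print(lis)         # unchanged: sorted() does not modify its input
```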

Solution 3:

You have to convert your input into something Python can handle easily:

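with open('input.txt') as inputFile:
    lines = inputFile.readlines()[1:]  # drop the "City Percentage" header line
records = [line.split() for line in lines]
# Put the percentage first so that a plain sort orders by percentage.
records = [[float(percentage), city] for city, percentage in records]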

Now records contains a list of entries like this:

[[98.3, 'Mumbai'], [23.23, 'London'], [12.22, 'Agra']]

You can sort that list in-place, highest percentage first:

records.sort(reverse=True)

You can then print the top ten by slicing:

print(records[0:10])

If you have a huge list (e.g. millions of entries) and only want the top ten of these in sorted order, there are better ways than sorting the whole list, which would be a waste of time in that case. heapq.nlargest(), shown in Solution 1, is one of them.
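One such alternative, sketched here with the [percentage, city] layout used in this solution, is heapq.nlargest() from the standard library. Because the percentage comes first in each record, no key function is needed.

```python
import heapq

# Records in the [percentage, city] order used by this solution.
records = [[12.22, 'Agra'], [98.3, 'Mumbai'], [23.23, 'London']]

# Lists compare element by element, so the percentage decides the order
# and the city name is only consulted when two percentages are equal.
top = heapq.nlargest(2, records)
print(top)  # [[98.3, 'Mumbai'], [23.23, 'London']]
```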

Solution 4:

To print only the names of the top 10 cities, sort the list first, slice it, and then keep the first element of each item:

>>> lis = [['Mumbai', 98.3], ['London', 23.23], ['Agra', 12.22]]
>>> [k[0] for k in sorted(lis, key=lambda x: x[1], reverse=True)[:10]]
['Mumbai', 'London', 'Agra']

For the given list

>>> lis = [("<ServiceCenter: DELHI-DLC>", 100.0), ("<ServiceCenter: DELHI-DLW>", 92.307692307692307), ("<ServiceCenter: DELHI-DLE>", 75.0), ("<ServiceCenter: DELHI-DLN>", 90.909090909090907), ("<ServiceCenter: DELHI-DLS>", 83.333333333333343)]
>>> t = [k[0] for k in sorted(lis, key=lambda x: x[1], reverse=True)[:10]]
>>> print(t)
['<ServiceCenter: DELHI-DLC>',
'<ServiceCenter: DELHI-DLW>',
'<ServiceCenter: DELHI-DLN>',
'<ServiceCenter: DELHI-DLS>',
'<ServiceCenter: DELHI-DLE>']

The sorted function returns a new sorted list; the key argument is a function that extracts the value to compare from each item.
