
Aggregate Sets According To Keys With Defaultdict Python

I have a bunch of lines of text with names and teams in this format: Team (year)|Surname1, Name1, e.g.

Yankees (1993)|Abbot, Jim
Yankees (1994)|Abbot, Jim
Yankees (1993)|Assenmache

Solution 1:

You can use a tuple as the dictionary key here, for example ('Yankees', '1994'):

from collections import defaultdict

# Python 2 code: maps (team, year) -> list of [surname, name] pairs
dic = defaultdict(list)
with open('abc') as f:
    for line in f:
        key, val = line.split('|')
        # "Yankees (1993)" -> ('Yankees', '1993')
        keys = tuple(x.strip('()') for x in key.split())
        # "Abbot, Jim\n" -> ['Abbot', 'Jim']
        vals = [x.strip() for x in val.split(', ')]
        dic[keys].append(vals)
print dic
for k, v in dic.iteritems():
    print "{}({})|{}".format(k[0], k[1], "|".join([", ".join(x) for x in v]))

Output:

defaultdict(<type 'list'>,
{('Yankees', '1994'): [['Abbot', 'Jim']],
 ('Yankees', '2000'): [['Buddies', 'Mike'], ['Canseco', 'Jose']],
 ('Yankees', '1993'): [['Abbot', 'Jim'], ['Assenmacher', 'Paul']]})

Yankees(1994)|Abbot, Jim
Yankees(2000)|Buddies, Mike|Canseco, Jose
Yankees(1993)|Abbot, Jim|Assenmacher, Paul
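
For readers on Python 3, the same grouping idea can be sketched roughly as follows. This is not the answer's original code: the function names group_players and format_groups and the sample list are hypothetical, and it assumes each line has exactly the form Team (year)|Surname, Name. It keeps each player as one string, sorts the keys for a stable order, and prints a space before the year to match the input format.

```python
from collections import defaultdict


def group_players(lines):
    """Group 'Team (year)|Surname, Name' lines by a (team, year) tuple key."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, player = line.partition('|')
        # rpartition splits on the last space, so multi-word team names survive
        team, _, year = key.rpartition(' ')
        groups[(team, year.strip('()'))].append(player.strip())
    return groups


def format_groups(groups):
    """Return one 'Team (year)|player|player...' string per key, sorted by key."""
    return ["{} ({})|{}".format(team, year, "|".join(players))
            for (team, year), players in sorted(groups.items())]


# Hypothetical sample data in the format described in the question
sample = [
    "Yankees (1993)|Abbot, Jim",
    "Yankees (1994)|Abbot, Jim",
    "Yankees (1993)|Assenmacher, Paul",
]

for row in format_groups(group_players(sample)):
    print(row)
```

To read from a file instead, an open file object can be passed directly, since group_players only iterates over its argument. If duplicate players per key should be collapsed, defaultdict(set) with .add() in place of .append() is one option, at the cost of losing insertion order.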
