Iteration Over The Dictionary And Extracting Values

I have a dictionary (result_dict) as follows. {'11333216@N05': {'person': {'can_buy_pro': 0, 'description': {'_content': ''}, 'has_stats': '1', 'iconfarm': 3, 'iconserv

Solution 1:

Iterate through the dictionary's keys, which in this case are the usernames, then use each one to access its top-level dict and from there drill down through the nested layers to the exact value you need: the mobileurl in your example.

Once you have these 2 variables, add them to your dataframe.

# Iterate through list of users
for user in result_dict.keys():

    # use each username to find the mobileurl you need within
    mobileurl = result_dict[user]["person"]["mobileurl"]["_content"]

    # Add the variables 'user' and 'mobileurl' to dataframe as you see fit
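
As a minimal sketch of that last step, the loop below collects each pair into a list of rows, a shape that pandas.DataFrame accepts directly. The two-entry sample_dict is a trimmed stand-in for result_dict, and the names sample_dict and rows and the column labels are assumptions made for illustration. It is written for Python 3.

```python
# Trimmed, hypothetical stand-in for result_dict (only the fields used here).
sample_dict = {
    "11333216@N05": {
        "person": {"mobileurl": {"_content": "https://m.flickr.com/photostream.gne?id=11327876"}},
        "stat": "ok",
    },
    "117692977@N08": {
        "person": {"mobileurl": {"_content": "https://m.flickr.com/photostream.gne?id=117600164"}},
        "stat": "ok",
    },
}

rows = []
for user in sample_dict:  # iterating a dict yields its keys
    mobileurl = sample_dict[user]["person"]["mobileurl"]["_content"]
    rows.append({"user": user, "mobileurl": mobileurl})

# If pandas is installed, pandas.DataFrame(rows) would turn these rows
# into a DataFrame with 'user' and 'mobileurl' columns.
print(rows)
```

Note that the direct indexing used here raises KeyError if any entry lacks one of the nested keys.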

Solution 2:

result_dict = {'11333216@N05': {'person': {'can_buy_pro': 0,
   'description': {'_content': ''},
   'has_stats': '1',
   'iconfarm': 3,
   'iconserver': '2214',
   'id': '11333216@N05',
   'ispro': 0,
   'location': {'_content': ''},
   'mbox_sha1sum': {'_content': '8eb2e248cbad94e2b4a5aae75eb653c7e061a90c'},
   'mobileurl': {'_content': 'https://m.flickr.com/photostream.gne?id=11327876'},
   'nsid': '11333216@N05',
   'path_alias': 'kishansamarasinghe',
   'photos': {'count': {'_content': 442},
    'firstdate': {'_content': '1193073180'},
    'firstdatetaken': {'_content': '2000-01-01 00:49:17'}},
   'photosurl': {'_content': 'https://www.flickr.com/photos/kishansamarasinghe/'},
   'profileurl': {'_content': 'https://www.flickr.com/people/kishansamarasinghe/'},
   'realname': {'_content': 'Kishan Samarasinghe'},
   'timezone': {'label': 'Sri Jayawardenepura',
    'offset': '+06:00',
    'timezone_id': 'Asia/Colombo'},
   'username': {'_content': 'Three Sixty Five Degrees'}},
  'stat': 'ok'},
 '117692977@N08': {'person': {'can_buy_pro': 0,
   'description': {'_content': ''},
   'has_stats': '0',
   'iconfarm': 1,
   'iconserver': '404',
   'id': '117692977@N08',
   'ispro': 0,
   'location': {'_content': 'Almere, The Nederlands'},
   'mobileurl': {'_content': 'https://m.flickr.com/photostream.gne?id=117600164'},
   'nsid': '117692977@N08',
   'path_alias': 'meijsvo',
   'photos': {'count': {'_content': 3237},
    'firstdate': {'_content': '1392469161'},
    'firstdatetaken': {'_content': '2013-06-23 14:39:30'}},
   'photosurl': {'_content': 'https://www.flickr.com/photos/meijsvo/'},
   'profileurl': {'_content': 'https://www.flickr.com/people/meijsvo/'},
   'realname': {'_content': 'Markéta Eijsvogelová'},
   'timezone': {'label': 'Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna',
    'offset': '+01:00',
    'timezone_id': 'Europe/Amsterdam'},
   'username': {'_content': 'meijsvo'}},
  'stat': 'ok'},
 '21539776@N02': {'person': {'can_buy_pro': 0,
   'description': {'_content': ''},
   'has_stats': '1',
   'iconfarm': 0,
   'iconserver': '0'}
}
}

For your use case, it is better to use the dictionary's iteritems() method (items() in Python 3):

for key, value in result_dict.iteritems():
    print value.get("person", {}).get("mobileurl", {}).get("_content")

OUTPUT

https://m.flickr.com/photostream.gne?id=117600164
https://m.flickr.com/photostream.gne?id=11327876
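
In Python 3, iteritems() no longer exists and print is a function, so the same idea might look like the sketch below. The helper name get_nested and the trimmed sample_dict are assumptions for illustration, not part of the original answer. The lookup returns None for an entry that has no mobileurl instead of raising KeyError.

```python
def get_nested(mapping, *keys, default=None):
    """Walk nested dicts key by key; return default if any key is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Trimmed, hypothetical stand-in for result_dict.
sample_dict = {
    "11333216@N05": {
        "person": {"mobileurl": {"_content": "https://m.flickr.com/photostream.gne?id=11327876"}},
    },
    # This entry has no mobileurl, like the truncated third entry above.
    "21539776@N02": {"person": {"iconfarm": 0}},
}

urls = {
    user: get_nested(entry, "person", "mobileurl", "_content")
    for user, entry in sample_dict.items()
}

for user, url in urls.items():
    print(user, url)
```

The chained .get() calls in the answer achieve the same effect; a helper like this only saves repeating the empty-dict defaults when the nesting gets deeper.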

Solution 3:

I think you could also do it in more of a pandas way instead of pure dictionary iteration. It's not necessarily the fastest, but given that you are new to Python and pandas, I think it's a good thing to know that pandas can handle this well.

I am assuming you are using a pandas DataFrame, not just a dictionary. You could easily achieve the same result without converting your JSON to a pandas DataFrame; i.e. the other answers will work even if you are not using a pandas DataFrame, since they are also valid Python dictionary syntax.

urls = result_dict[result_dict.index=='person'].apply(lambda x: x['mobileurl']['_content'])

Here we select all rows that have the index 'person' and then apply a function (lambda is the anonymous function we'll be using) to each person. In this case, we extract the URLs using the lambda function, and pandas converts the result back to a pandas DataFrame (or Series) for you to use.

Normally I would also care about how fast my iteration is.

(The following was done in IPython, a nice tool you can use to do many things in Python. %timeit is a magic function provided by IPython that measures how long your code takes to run.)

%timeit urls = result_dict[result_dict.index=='person'].apply(lambda x: x['mobileurl']['_content'])

1000 loops, best of 3: 133 us per loop (us = microsecond, 1e-6 s)
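
Outside IPython, the standard library's timeit module can make a comparable measurement. The sketch below times the plain-dictionary lookup on a small made-up dictionary. The names sample_dict and extract_urls, the example.com URLs, and the repetition counts are all assumptions, and the figure it prints will differ from machine to machine.

```python
import timeit

# Hypothetical sample: 100 users with the same nested layout as result_dict.
sample_dict = {
    "user%d" % i: {"person": {"mobileurl": {"_content": "https://example.com/%d" % i}}}
    for i in range(100)
}


def extract_urls(data):
    """Collect each user's mobile URL, or None when it is missing."""
    return [
        entry.get("person", {}).get("mobileurl", {}).get("_content")
        for entry in data.values()
    ]


# repeat() returns one total time per run; the minimum is the least noisy.
timings = timeit.repeat(lambda: extract_urls(sample_dict), number=1000, repeat=3)
best_per_call = min(timings) / 1000
print("best of 3: %.2f us per call" % (best_per_call * 1e6))
```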

I can tell you that @SamC provided the fastest solution here. But as I said, you don't need a DataFrame to use his solution; it will also work for a plain dictionary.
