Matching Values From One Csv File To Another And Replace Entire Column Using Pandas/python
Consider the following example: I have a MovieLens dataset, u.item.csv:
ID|MOVIE NAME (YEAR)|REL.DATE|NULL|IMDB LINK|A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|
1|Toy Story (1995)|01-
Solution 1:
I think you need map with a Series created by set_index:
print (df1.set_index('ID')['MOVIE NAME (YEAR)'])
ID
1     Toy Story (1995)
2     GoldenEye (1995)
3    Four Rooms (1995)
Name: MOVIE NAME (YEAR), dtype: object
df2['movie_id'] = df2['movie_id'].map(df1.set_index('ID')['MOVIE NAME (YEAR)'])
print (df2)
   user_id           movie_id  rating  unix_timestamp
0        1   Toy Story (1995)       5       874965758
1        1   GoldenEye (1995)       3       876893171
2        1  Four Rooms (1995)       4       878542960
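To reproduce the lookup end to end, here is a minimal self-contained sketch. It assumes the item file is pipe-delimited as in the question; the inline text is a hypothetical stand-in for u.item.csv that keeps only the first two columns, and the ratings frame is built by hand from the values shown in the output above.

```python
import io

import pandas as pd

# Hypothetical stand-in for u.item.csv (pipe-delimited, first two columns only).
items_text = (
    "ID|MOVIE NAME (YEAR)\n"
    "1|Toy Story (1995)\n"
    "2|GoldenEye (1995)\n"
    "3|Four Rooms (1995)\n"
)
df1 = pd.read_csv(io.StringIO(items_text), sep="|")

# Ratings frame with the values from the printed output above.
df2 = pd.DataFrame({
    "user_id": [1, 1, 1],
    "movie_id": [1, 2, 3],
    "rating": [5, 3, 4],
    "unix_timestamp": [874965758, 876893171, 878542960],
})

# Build the ID -> title lookup once, then map every movie_id through it.
lookup = df1.set_index("ID")["MOVIE NAME (YEAR)"]
df2["movie_id"] = df2["movie_id"].map(lookup)
print(df2)
```

With a real file, the io.StringIO object would be replaced by the file path, keeping sep="|".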
Or use replace:
df2['movie_id'] = df2['movie_id'].replace(df1.set_index('ID')['MOVIE NAME (YEAR)'])
print (df2)
   user_id           movie_id  rating  unix_timestamp
0        1   Toy Story (1995)       5       874965758
1        1   GoldenEye (1995)       3       876893171
2        1  Four Rooms (1995)       4       878542960
The difference shows up when a value has no match: map produces NaN, while replace keeps the original value:
print (df2)
   user_id  movie_id  rating  unix_timestamp
0        1         1       5       874965758
1        1         2       3       876893171
2        1         5       4       878542960  <- 5 has no match
df2['movie_id'] = df2['movie_id'].map(df1.set_index('ID')['MOVIE NAME (YEAR)'])
print (df2)
   user_id          movie_id  rating  unix_timestamp
0        1  Toy Story (1995)       5       874965758
1        1  GoldenEye (1995)       3       876893171
2        1               NaN       4       878542960
df2['movie_id'] = df2['movie_id'].replace(df1.set_index('ID')['MOVIE NAME (YEAR)'])
print (df2)
   user_id          movie_id  rating  unix_timestamp
0        1  Toy Story (1995)       5       874965758
1        1  GoldenEye (1995)       3       876893171
2        1                 5       4       878542960
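If unmatched IDs need to be handled explicitly, one possible approach is to map first and then inspect the NaN positions. The sketch below is only an illustration: the lookup Series is built by hand rather than from the CSV, and the "Unknown" label is a hypothetical placeholder, not something from the original data.

```python
import pandas as pd

# Hand-built ID -> title lookup (stands in for df1.set_index('ID')['MOVIE NAME (YEAR)']).
lookup = pd.Series(
    ["Toy Story (1995)", "GoldenEye (1995)", "Four Rooms (1995)"],
    index=pd.Index([1, 2, 3], name="ID"),
    name="MOVIE NAME (YEAR)",
)
movie_ids = pd.Series([1, 2, 5], name="movie_id")

# map leaves NaN wherever an ID is missing from the lookup.
mapped = movie_ids.map(lookup)

# The NaN positions identify the IDs that had no match.
unmatched = movie_ids[mapped.isna()]

# Fill the gaps with a placeholder label ("Unknown" is an assumption).
labeled = mapped.fillna("Unknown")

print(unmatched.tolist())
print(labeled.tolist())
```

Listing the unmatched IDs first makes it easier to decide whether NaN, the original ID, or a placeholder is the right outcome for the data at hand.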