Retrieving Matching Word Count On A Datacolumn Using Pandas In Python
I have a df,

Name   Description
Ram    Ram is one of the good cricketer
Sri    Sri is one of the member
Kumar  Kumar is a keeper

and a list, my_list=['one','good','ravi',
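To make the answers below reproducible, the question's data can be rebuilt roughly as follows. This is only a sketch: the list is cut off in the post, so it uses just the three entries that are visible, and the column names are taken from the table header.

```python
import pandas as pd

# Data copied from the question's table.
df = pd.DataFrame({
    "Name": ["Ram", "Sri", "Kumar"],
    "Description": [
        "Ram is one of the good cricketer",
        "Sri is one of the member",
        "Kumar is a keeper",
    ],
})

# Assumption: only the entries visible in the post; the original list is truncated.
my_list = ["one", "good", "ravi"]

print(df)
```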
Solution 1:
Use str.findall + str.join + str.len:
extracted = df['Description'].str.findall('(' + '|'.join(my_list) + ')')
df['keys'] = extracted.str.join(',')
df['count'] = extracted.str.len()
print(df)
  Name                       Description      keys  count
0  Ram  Ram is one of the good cricketer  one,good      2
1  Sri          Sri is one of the member       one      1
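One caveat with a plain alternation such as (one|good) is that it also matches inside longer words, for example the "one" in "someone". If whole-word matches are wanted, a variation like the one below may help. It is a sketch, not part of the original answer: it assumes the three visible list entries, wraps the alternation in \b word boundaries, and uses re.escape in case a keyword contains regex metacharacters.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ram", "Sri", "Kumar"],
    "Description": [
        "Ram is one of the good cricketer",
        "Sri is one of the member",
        "Kumar is a keeper",
    ],
})
my_list = ["one", "good", "ravi"]  # assumption: only the entries visible in the question

# \b restricts each alternative to whole words; (?:...) groups without capturing.
pattern = r"\b(?:" + "|".join(map(re.escape, my_list)) + r")\b"

extracted = df["Description"].str.findall(pattern)  # one list of matches per row
df["keys"] = extracted.str.join(",")                # matched words, comma-separated
df["count"] = extracted.str.len()                   # number of matches per row

print(df)
```

Rows without any match, such as Kumar's, get an empty keys string and a count of 0.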
EDIT: For case-insensitive matching, pass flags=re.IGNORECASE:
import re
my_list=["ONE","good"]
extracted = df['Description'].str.findall('(' + '|'.join(my_list) + ')', flags=re.IGNORECASE)
df['keys'] = extracted.str.join(',')
df['count'] = extracted.str.len()
print(df)
  Name                       Description      keys  count
0  Ram  Ram is one of the good cricketer  one,good      2
1  Sri          Sri is one of the member       one      1
Solution 2:
Took a shot at this with str.findall.
c = df.Description.str.findall('({})'.format('|'.join(my_list)))
df['keys'] = c.apply(','.join) # or c.str.join(',')
df['count'] = c.str.len()
df[df['count'] > 0]
  Name                       Description      keys  count
0  Ram  Ram is one of the good cricketer  one,good      2
1  Sri          Sri is one of the member       one      1
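If only the count is needed and not the matched words, str.count can produce it directly without building the intermediate lists. The following is a hedged sketch of that alternative, again assuming the three visible list entries; it is not taken from either answer.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "Name": ["Ram", "Sri", "Kumar"],
    "Description": [
        "Ram is one of the good cricketer",
        "Sri is one of the member",
        "Kumar is a keeper",
    ],
})
my_list = ["one", "good", "ravi"]  # assumption: only the entries visible in the question

# Alternation of the escaped keywords; str.count returns the number of regex matches per row.
pattern = "|".join(map(re.escape, my_list))
df["count"] = df["Description"].str.count(pattern)

# Keep only the rows that contain at least one keyword.
matched = df[df["count"] > 0]
print(matched)
```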