How To Compare Two Dataframes On A Column And Replace With Other Column Value
I am having two data frames that are df1 and df2 id first last size A 1978-01-01 1979-01-01 2 B 2000-01-01 2000-01-01 1 C 1998-01-01 2000-01-01 3 D
Solution 1:
Using Merge
and combine_first
:
df = df1.merge(df2,how='outer')
df['id'] = df['token'].combine_first(df['id'] )
df.drop('token',inplace=True,axis=1)
Another way is to use replace
with dictionary of df2.values
, here the df1 dataframe changes.:
df1.id.replace(dict(df2.values),inplace=True)
idfirstlastsize0ZA.001978-01-01 1979-01-01 21As.112000-01-01 2000-01-01 12SD.341998-01-01 2000-01-01 33D1998-01-01 1998-01-01 14E1999-01-01 2000-01-01 2
Solution 2:
Use map
and fillna
:
df1['id'] = df1['id'].map(df2.set_index('id')['token']).fillna(df1['id'])
df1
Output:
idfirstlastsize0ZA.001978-01-01 1979-01-01 21As.112000-01-01 2000-01-01 12SD.341998-01-01 2000-01-01 33D1998-01-01 1998-01-01 14E1999-01-01 2000-01-01 2
You can use map
with a series as an argument.
Solution 3:
If you do not wish to merge your DataFrame, you could use apply function to solve this. Change your small DataFrame to dictionary and map it to the other DataFrame.
from io import StringIO #used to get string to dfimport pandas as pd
id_ =list('ABC')
token = 'ZA.00 As.11 SD.34'.split()
dt = pd.DataFrame(list(zip(id_,token)),columns=['id','token'])
a ='''
id first last size
A 1978-01-01 1979-01-01 2
B 2000-01-01 2000-01-01 1
C 1998-01-01 2000-01-01 3
D 1998-01-01 1998-01-01 1
E 1999-01-01 2000-01-01 2
'''
df =pd.read_csv(StringIO(a), sep=' ')
# This last two lines are all you need
mp= {x:y for x,y inzip(dt.id.tolist(),dt.token.tolist())}
df.id.apply(lambda x: mp[x] if x in mp.keys() else x)
Post a Comment for "How To Compare Two Dataframes On A Column And Replace With Other Column Value"