How To Turn A Pandas Dataframe Row Into A Comma Separated String

March 31, 2024

I need to iterate over each row of a pandas DataFrame and turn each row into a comma-separated string. Example: df3 = DataFrame(np.random.randn(10, 5), columns=['a', 'b', 'c', '

Solution 1:

You could use pandas.DataFrame.to_string with some optional arguments set to False and then split on newline characters to get a list of your strings. This feels a little dirty though.

x = df3.to_string(header=False,
                  index=False,
                  index_names=False).split('\n')
vals = [','.join(ele.split()) for ele in x]
print(vals)

Outputs:

['1.221365,0.923175,-1.286149,-0.153414,-0.005078', '-0.231824,-1.131186,0.853728,0.160349,1.000170', '-0.147145,0.310587,-0.388535,0.957730,-0.185315', '-1.658463,-1.114204,0.760424,-1.504126,0.206909', '-0.734571,0.908569,-0.698583,-0.692417,-0.768087', '0.000029,0.204140,-0.483123,-1.064851,-0.835931', '-0.108869,0.426260,0.107286,-1.184402,0.434607', '-0.692160,-0.376433,0.567188,-0.171867,-0.822502', '-0.564726,-1.084698,-1.065283,-2.335092,-0.083357', '-1.429049,0.790535,-0.547701,-0.684346,2.048081']
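One reason this route can feel fragile is that to_string goes through pandas' display formatting, so the floats come out at display precision rather than full precision. A minimal sketch, assuming a small made-up frame called df_demo (not part of the original question), that at least makes the formatting explicit through the float_format argument of to_string:

```python
import pandas as pd

# Hypothetical demo frame; the values are made up for illustration.
df_demo = pd.DataFrame({'a': [1.123456789, 2.5], 'b': [3.0, 4.25]})

# float_format takes a one-argument callable that formats each float cell.
text = df_demo.to_string(header=False, index=False,
                         float_format=lambda v: f'{v:.3f}')

# One line per row; split() collapses the alignment padding between columns.
vals = [','.join(line.split()) for line in text.split('\n')]
print(vals)
```

As with the original snippet, splitting on whitespace assumes that no cell contains a space.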

Solution 2:

Use to_csv:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 5),
                  columns=['a', 'b', 'c', 'd', 'e'])
df.to_csv(header=False, index=False).splitlines()

['-1.60092768589,-0.746496859432,0.662527724304,-0.677984969682,1.70656657572',
 '-0.432306620615,-0.396499851892,0.564494290965,-1.01196068617,-0.630576490671',
 '-3.28916785414,0.627240166663,-0.359262938883,0.344156143177,-0.911269843378',
 '-0.272741450301,0.0594234886507,-2.72800253986,-0.821610087419,-0.0668212419497',
 '0.303490090149,-1.61344483051,0.117046351282,-1.46936429231,-0.66018613208',
 '-1.18157229705,-0.766519504863,0.386180129978,0.945274532852,-0.783459830884',
 '-1.27118723107,-1.12478330038,-0.625470220821,-0.453053132109,0.0641830786961',
 '-1.02657336234,-1.01556460318,0.445282883845,0.589873985417,-0.833648685855',
 '0.742343897524,-1.69644542886,-1.03886940911,0.511317569685,1.87084848086',
 '-0.159125435887,1.02522202275,0.254459603867,-0.487187861352,2.31900012693']

Note: this breaks if any cell contains a newline (\n), because that cell's text would be split across several list entries.
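One way to sidestep the newline problem is to build each row's string separately instead of splitting one large CSV string. The following is only a sketch under that assumption: it uses the standard-library csv module, and the helper name rows_to_csv_strings and the demo frame df_demo are hypothetical, not part of the original answer.

```python
import csv
import io

import pandas as pd


def rows_to_csv_strings(frame):
    """Return one CSV-formatted string per row, quoting cells where needed."""
    lines = []
    for row in frame.itertuples(index=False, name=None):
        buf = io.StringIO()
        # The default dialect quotes cells that contain commas, quotes or newlines.
        csv.writer(buf).writerow(row)
        # Drop only the line terminator that the writer appends after the row.
        lines.append(buf.getvalue().rstrip('\r\n'))
    return lines


# Hypothetical demo frame with a newline and a comma inside cells.
df_demo = pd.DataFrame({'a': [1, 2], 'b': ['x\ny', 'plain'], 'c': ['p,q', 'r']})
result = rows_to_csv_strings(df_demo)
print(result)
```

Each list entry still corresponds to exactly one row, and cells with embedded newlines or commas come back quoted rather than split.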

Solution 3:

Another solution is to flatten every value into a single list of strings:

df3.astype(str).values.flatten().tolist()

Output:

['1.1298859039670908', '-1.1777990747688836', '-0.6863185575934238', '0.5728124523079394', '-1.7233889745416526', '1.2666884675345114', '-1.3370517489515568', '-1.1573192462004067', '-0.290889463035692', '0.7013992501326347', '-0.09235695278417168', '1.3398108023557909', '0.9348249877283498', '-1.420127356751191', '-0.23280615612717087', '-1.513041006340331', '0.06922064806964501', '0.5021357843647933', '0.4959105452630504', '0.23892842483496426', '0.332581693920347', '-0.9182302226268196', '0.4043812352905833', '1.2214146329445081', '-1.875277093248708', '0.3102747423859147', '-0.12406718601423607', '0.5281816415364707', '-1.9067143330181668', '0.8256856659897251', '2.294853355922203', '0.43835574399588956', '-1.1421958903284741', '1.1281755826789093', '-1.6942129677694633', '2.0015273318589077', '0.22546177660127778', '0.8744192315520689', '0.9149788977962425', '0.03312768429116076', '-0.8790198630064502', '1.1123149455982901', '1.0360823000160735', '0.3897776338002864', '1.6653054797315376', '-0.7959569835943457', '0.48684356819991087', '-0.1753603906083526', '1.3546473604252465', '0.8654506220249256']

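If you need one string per row instead, join each row's values. Note that this example joins with a space; use ','.join(val) for comma-separated output: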

r = [' '.join(val) for val in df3.astype(str).values.tolist()]

Output:

['0.3453242505851785 0.8361952965566127 1.2140062332333457 -0.8449248124906361 -0.6596860872608944', '-1.9416389611147358 -0.4633998192182761 1.3156114084151638 0.31541640373981894 0.10017585641945598', '0.019222312957353865 -0.11572754659609137 -0.7475957688634534 1.732958781671217 0.8924926838936247', '1.2809958570913833 -0.5157436785751306 -0.2568307974248332 1.6223279831092197 1.4686281000013306', '0.2487576796276271 0.8129564817069422 0.8887583094926109 -0.8716446795448696 0.3920966638278787', '0.8033846996636256 -0.6320480733526924 0.17875269847270434 -0.5659865172511531 0.2259891796497471', '-1.6220463818040864 0.690201620286483 -0.7124446718694878 -0.271001366710889 1.1809699288238422', '1.800615079476972 0.04891756117369832 -1.1063732305386178 0.13042352385167277 0.5329078065025347', '0.00021395065919010197 -0.6429306637453445 -0.4281903648631154 0.2640659501478122 -0.3906892322707482', '-0.4159606749623029 0.7992377301053033 -0.8126018881734699 -1.2516267025391803 -0.17085205523095087']
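Since the question asks for comma-separated strings, the same idea can be sketched with a comma as the separator. The frame df_demo below is a made-up example with fixed values so the result is easy to check by eye:

```python
import pandas as pd

# Hypothetical demo frame with fixed values instead of random data.
df_demo = pd.DataFrame({'a': [1.5, 2.0], 'b': [3.25, -4.0]})

# astype(str) converts every cell to its string form;
# values.tolist() then yields one list of strings per row.
rows = [','.join(val) for val in df_demo.astype(str).values.tolist()]
print(rows)
```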

Solution 4:

Here is a one liner:

df3['Combo'] = df3.astype(str).apply(', '.join, axis=1)

This creates a new DataFrame column in which each row holds a comma-separated string built from the contents of all the other columns.

before:

import numpy as np
import pandas as pd

df3 = pd.DataFrame(np.random.randn(10, 5),
                   columns=['a', 'b', 'c', 'd', 'e'])

           a          b          c          d          e
0   0.870579  -1.356070  -0.169689  -0.148766  -1.520965
1   0.292316  -1.703772  -1.245149  -1.565364  -1.896858
2  -2.204210  -0.073636  -0.457303  -0.547852   0.876874
3   1.021075  -1.227874  -0.792560   1.628169  -0.685461
4   0.699579   0.736821   1.143053   1.189183   1.553324
5  -2.166749   1.011902  -0.605816   1.184308  -0.427205
6   1.965086  -0.053822   0.100614   1.045595  -0.464474
7   2.385780   0.540920   0.790506   1.148555  -1.139325
8  -0.581308  -0.575956   0.285963  -0.535575  -0.195980
9   1.535928   0.927238  -0.513897   0.711812   1.172479

the code:

df3['Combo'] = df3.astype(str).apply(', '.join, axis=1)

after:

           a          b          c          d          e  Combo
0   0.870579  -1.356070  -0.169689  -0.148766  -1.520965  0.8705793801134621, -1.356070467974009, -0.169...
1   0.292316  -1.703772  -1.245149  -1.565364  -1.896858  0.29231630010496074, -1.7037715557607054, -1.2...
2  -2.204210  -0.073636  -0.457303  -0.547852   0.876874  -2.2042103265194823, -0.07363572968327593, -0....
3   1.021075  -1.227874  -0.792560   1.628169  -0.685461  1.0210749768623664, -1.227874362438, -0.792560...
4   0.699579   0.736821   1.143053   1.189183   1.553324  0.6995791505249452, 0.7368206760352145, 1.1430...
5  -2.166749   1.011902  -0.605816   1.184308  -0.427205  -2.166749201299601, 1.0119015881974436, -0.605...
6   1.965086  -0.053822   0.100614   1.045595  -0.464474  1.9650863537016798, -0.05382210788746324, 0.10...
7   2.385780   0.540920   0.790506   1.148555  -1.139325  2.3857802491033384, 0.5409195922501099, 0.7905...
8  -0.581308  -0.575956   0.285963  -0.535575  -0.195980  -0.5813081184052638, -0.5759559119431503, 0.28...
9   1.535928   0.927238  -0.513897   0.711812   1.172479  1.5359276629230108, 0.927237601422893, -0.5138...
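One thing to watch with this approach: once the Combo column exists, it is itself part of df3.columns, so running the same line a second time would join the old Combo text in as well. A small sketch of one way around that, assuming you keep an explicit list of the value columns (the names df_demo, value_cols and add_combo are hypothetical):

```python
import pandas as pd

# Hypothetical demo frame with fixed values.
df_demo = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# Explicit list of the columns to combine, so 'Combo' is never included.
value_cols = ['a', 'b']


def add_combo(frame, cols):
    """Add (or overwrite) a 'Combo' column built only from `cols`."""
    frame['Combo'] = frame[cols].astype(str).apply(', '.join, axis=1)
    return frame


add_combo(df_demo, value_cols)
add_combo(df_demo, value_cols)  # running it again gives the same result
print(df_demo['Combo'].tolist())
```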

Solution 5:

You can convert the DataFrame to a numpy.array with .values and then generate the strings:

b = '\n'.join(','.join('%0.3f' % x for x in y) for y in df.values)
print(b)
-1.245,-0.397,-0.374,0.698,-0.057
-1.695,-1.593,0.992,-1.839,0.980
1.154,-0.322,-0.583,1.022,1.800
-1.705,0.148,-0.670,0.164,0.902
1.573,-1.082,-0.243,-1.190,0.832
2.535,-1.168,-0.258,-2.617,-0.766
1.990,0.607,-0.115,0.114,0.175
-0.652,0.245,-1.501,0.145,-0.079
-1.977,3.543,-0.454,1.697,-0.648
-0.756,0.561,-1.294,-0.747,-0.323

If you need the strings in a list:

b = [','.join('%0.3f' % x for x in y) for y in df.values]
print(b)
['-1.139,0.257,-1.132,-0.987,1.194', '0.799,-1.061,-1.073,-0.176,0.528', '0.527,0.333,-0.185,-0.496,0.115', '-1.567,0.268,-1.457,2.121,-0.065', '-0.854,-2.344,0.747,0.208,-0.403', '1.850,0.084,1.890,-1.458,0.427', '1.649,0.134,-2.314,1.618,0.658', '2.178,-0.823,-0.499,0.083,-0.269', '-0.781,-0.212,1.623,-0.053,0.436', '0.842,-0.167,1.914,-0.087,0.717']
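The same formatting idea can be wrapped in a small helper so the number of decimals and the separator are not hard-coded. This is only a sketch: the function name format_rows, its parameters and the demo frame are assumptions made for illustration, and it uses f-strings in place of the % operator.

```python
import pandas as pd


def format_rows(frame, precision=3, sep=','):
    """Return one string per row, each value formatted to `precision` decimals."""
    return [sep.join(f'{x:.{precision}f}' for x in row)
            for row in frame.to_numpy()]


# Hypothetical demo frame with fixed values.
df_demo = pd.DataFrame({'a': [1.23456, 0.5], 'b': [-2.0, 10.0]})

rows = format_rows(df_demo)
print(rows)             # list with one string per row
print('\n'.join(rows))  # or a single newline-separated block
```

Calling format_rows(df_demo, precision=1) would produce the same rows with one decimal place instead of three.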
