Spark: How To Transform Json String With Multiple Keys, From Data Frame Rows?
I'm looking for help on how to parse a JSON string with multiple keys into a JSON struct; see the required output. The answer below shows how to transform a JSON string with one Id: jstr1 = '{'i
Solution 1:
An explode of the map column will do the job:
import pyspark.sql.functions as F

# explode the map column: one output row per key, aliased as (id, json)
df.select(F.explode('json').alias('id', 'json')).show()
+----+--------------------+
| id| json|
+----+--------------------+
|id_1| [[1, 2], [3, 4]]|
|id_2| [[5, 6], [7, 8]]|
|id_3| [[9, 10], [11, 12]]|
|id_4|[[12, 14], [15, 16]]|
|id_5|[[17, 18], [19, 10]]|
+----+--------------------+
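The snippet above assumes df already has a json column of MapType. If you are starting from the raw JSON string with multiple keys (as in the question), passing a map schema to from_json gets you that column. A minimal sketch, assuming field names a/b and values matching the rows shown above (jstr here is a hypothetical two-key example, not the exact string from the question):
from pyspark.sql import SparkSession
import pyspark.sql.functions as F
from pyspark.sql.types import (
    MapType, StringType, ArrayType, StructType, StructField, LongType
)

spark = SparkSession.builder.getOrCreate()

# hypothetical input: one JSON string keyed by several ids
jstr = (
    '{"id_1": [{"a": 1, "b": 2}, {"a": 3, "b": 4}],'
    ' "id_2": [{"a": 5, "b": 6}, {"a": 7, "b": 8}]}'
)

# map from id -> array of {a, b} structs
schema = MapType(
    StringType(),
    ArrayType(StructType([
        StructField('a', LongType()),
        StructField('b', LongType()),
    ]))
)

df = spark.createDataFrame([(jstr,)], ['jstr']) \
    .select(F.from_json('jstr', schema).alias('json'))

df.select(F.explode('json').alias('id', 'json')).show(truncate=False)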
To achieve the other desired output from your previous question, you can explode one more time. This second explode operates on the array column that came from the map's value.
df.select(
    # first explode: map -> one row per (id, array) pair
    F.explode('json').alias('id', 'json')
).select(
    # second explode: array -> one row per struct element
    'id', F.explode('json').alias('json')
).select(
    # expand the struct fields a and b into top-level columns
    'id', 'json.*'
).show()
+----+---+---+
| id| a| b|
+----+---+---+
|id_1| 1| 2|
|id_1| 3| 4|
|id_2| 5| 6|
|id_2| 7| 8|
|id_3| 9| 10|
|id_3| 11| 12|
|id_4| 12| 14|
|id_4| 15| 16|
|id_5| 17| 18|
|id_5| 19| 10|
+----+---+---+
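In case it helps to see why the final select works: 'json.*' is plain struct-field expansion. A small check, assuming the same df with a map-of-arrays-of-structs column as above:
import pyspark.sql.functions as F

# after the two explodes the remaining 'json' column is a struct,
# so selecting 'json.*' expands its fields (a and b) into columns
nested = df.select(
    F.explode('json').alias('id', 'json')
).select(
    'id', F.explode('json').alias('json')
)
nested.printSchema()  # shows id plus a struct column with fields a and b
nested.select('id', 'json.*').show()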