Monday, February 19, 2018

Notepad for Spark (I): Adding a UUID to your entries

Create a DataFrame:

val in = spark.read.json("sample.json")

Define the UDF (user-defined function):

import java.util.UUID
import org.apache.spark.sql.functions.udf
// Each call returns a fresh random UUID as a string
val generateUUID = udf(() => UUID.randomUUID().toString)

Generate a new column:

val withUUID = in.withColumn("uuid", generateUUID())
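
For reference, here is a minimal end-to-end sketch of the same steps outside spark-shell. The object name UuidColumnSketch, the helper addUuid, the local[*] master, and the three in-memory rows standing in for sample.json are all assumptions made for illustration. One thing to keep in mind: the UDF runs again every time the plan is evaluated, so two actions on the same DataFrame can show different UUIDs. Calling cache() makes that less likely but is not a guarantee; writing the result out is the safer way to freeze the values.

```scala
import java.util.UUID
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.udf

object UuidColumnSketch {
  // Same UDF as above: each call yields a fresh random UUID string.
  val generateUUID = udf(() => UUID.randomUUID().toString)

  // Hypothetical helper: appends a "uuid" column to any DataFrame.
  def addUuid(df: DataFrame): DataFrame =
    df.withColumn("uuid", generateUUID())

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("uuid-sketch")
      .getOrCreate()
    import spark.implicits._

    // Assumed stand-in for spark.read.json("sample.json").
    val in = Seq("alice", "bob", "carol").toDF("name")

    // cache() so later actions tend to reuse the UUIDs computed the first time.
    val withUUID = addUuid(in).cache()

    withUUID.show(false) // false = do not truncate the 36-character UUIDs
    spark.stop()
  }
}
```

On Spark 2.3 or later, the UDF can also be marked with asNondeterministic() so the optimizer knows it must not assume repeated calls return the same value.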