Thursday, March 1, 2018

Spark full transformation cycle.

spark.read.format("csv").option("header","true").load("in/file.csv"); def p(row: Row) = Row(row.getString(0).toUpperCase,row.getString(1),row.getString(2)); rddData.map(p).foreach(println); var transform = inFile.map(row=>(row.getString(0),row.getString(1),row.getString(2),"XXXXXXXXXXXXXXXX")). transform.write.format("csv").option("header","false").save("out/names.csv");

Jenkins: trigger a job via remote request

We can use curl to trigger the job execution remotely (for example from a hook or with incron), hitting the job's build URL with its trigger token:

https://ci.tcpip.tech/view/Experimentos/job/CI-Template/build?token=Al-fa1fa_316
https://ci.tcpip.tech/view/Experimentos/job/CI-Template/buildWithParameters?token=Al-fa1fa_316
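A minimal sketch of the calls, assuming the job has the "Trigger builds remotely" token configured (PARAM below is a placeholder for a real job parameter; with CSRF protection enabled, Jenkins may additionally require user/API-token authentication):

# Trigger the job without parameters
curl -X POST "https://ci.tcpip.tech/view/Experimentos/job/CI-Template/build?token=Al-fa1fa_316"

# Trigger the job with parameters (PARAM is a placeholder)
curl -X POST "https://ci.tcpip.tech/view/Experimentos/job/CI-Template/buildWithParameters?token=Al-fa1fa_316&PARAM=value"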

Monday, February 19, 2018

Notepad for Spark (I): Adding a UUID to your entries

Create a DataFrame:

val in = spark.read.json("sample.json")

Define the UDF (User Defined Function):

import java.util.UUID
import org.apache.spark.sql.functions.udf

// UDF that produces a random UUID string for each row
val generateUUID = udf(() => UUID.randomUUID().toString)

Generate a new column:

val withUUID = in.withColumn("uuid", generateUUID())
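As a quick check, the new column can be inspected. Note that the UDF is re-evaluated if the DataFrame is recomputed, so different actions can see different UUIDs; caching the result (assumed below) is a common way to pin the values:

// Keep the computed UUIDs around for later actions
withUUID.cache()

// Show a few rows of the generated column, without truncating the values
withUUID.select("uuid").show(5, false)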