Thursday, March 30, 2017

PySpark Basic Commands


rddRead.first() : Return the first element of the RDD.
rddRead.take(5) : Return the first 5 elements of the RDD as a list (the interactive shell prints the list to the console).
rddRead.count() : Return the number of elements in the RDD (the number of lines, if the RDD was read from a text file).
table.columns : Return the list of column names of a DataFrame (table).
rdd.distinct().count() : Count the distinct records in the RDD.

TableName.columns : Return the list of column names of the DataFrame named TableName.
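The commands above can be tried together in a small script. This is only a sketch under a few assumptions: the helper names summarize_rdd and demo, the sample data, and the column names id and label are made up for illustration, and a local Spark installation (Spark 2.x or later, for SparkSession) is assumed.

```python
def summarize_rdd(rdd, n=5):
    """Run the basic RDD actions from the list above and collect the results."""
    return {
        "first": rdd.first(),                      # first element
        "take": rdd.take(n),                       # first n elements, as a list
        "count": rdd.count(),                      # total number of elements
        "distinct_count": rdd.distinct().count(),  # number of unique elements
    }


def demo():
    # Imported inside the function so that summarize_rdd can be reused
    # without importing pyspark at module load time.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("basic-commands-demo")  # hypothetical application name
        .getOrCreate()
    )
    try:
        # Hypothetical sample data standing in for the lines of a text file.
        rdd_read = spark.sparkContext.parallelize(["a", "b", "a", "c", "b", "a"])
        for name, value in summarize_rdd(rdd_read, n=3).items():
            print(name, value)

        # A tiny DataFrame with made-up column names, to show .columns.
        table = spark.createDataFrame([(1, "x"), (2, "y")], ["id", "label"])
        print(table.columns)
    finally:
        spark.stop()
```

Calling demo() in an environment where pyspark is installed prints each result in turn. In the pyspark shell a session already exists as spark (and a context as sc), so the commands can be typed directly against your own RDD or DataFrame.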



