Read CSV file to DataFrame

It is also easy to load a CSV file into a Spark DataFrame.

Let's load the titanic.csv file, which contains information about the passengers of the Titanic:

import org.apache.spark.sql.SparkSession

object ReadCSVExample {

  def main(args: Array[String]): Unit = {

    // Path to the CSV file, passed as the first program argument
    val inputFile = args(0)

    // Initialize SparkSession
    val sparkSession = SparkSession
      .builder()
      .appName("spark-read-csv")
      .master("local[*]")
      .getOrCreate()

    // Bring implicit encoders and conversions into scope
    import sparkSession.implicits._

    // Read CSV file into a DataFrame
    val passengers = sparkSession.read
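      .option("header", "true")      // first line holds the column names
      .option("delimiter", "\t")     // fields are tab-separated in this file
      .option("nullValue", "")       // read empty strings as null
      .option("treatEmptyValuesAsNulls", "true") // option from the older spark-csv package; nullValue covers it in the built-in reader
      .option("inferSchema", "true") // detect column types with an extra pass over the data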
      .csv(inputFile)

    // Print the first 100 rows, then the inferred schema
    passengers.show(100)
    passengers.printSchema()

    sparkSession.stop()
  }
}

As a result, you will see the first 100 rows of the DataFrame, followed by its inferred schema.

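With inferSchema enabled, Spark makes an additional pass over the file to guess the column types. If the layout is known in advance, the schema can be declared explicitly instead. The following is a minimal sketch under stated assumptions, not the layout of the real titanic.csv: the column names (PassengerId, Name, Age), the object name and the temporary sample file are all hypothetical and exist only to make the example self-contained.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{DoubleType, IntegerType, StringType, StructField, StructType}

object ReadCSVWithSchemaExample {

  // Hypothetical column subset; adjust names and types to the header of your file
  val passengerSchema: StructType = StructType(Seq(
    StructField("PassengerId", IntegerType),
    StructField("Name", StringType),
    StructField("Age", DoubleType)
  ))

  def main(args: Array[String]): Unit = {

    val sparkSession = SparkSession
      .builder()
      .appName("spark-read-csv-schema")
      .master("local[*]")
      .getOrCreate()

    // Write a tiny tab-delimited sample so the sketch runs without external data
    val sample = Files.createTempFile("passengers", ".csv")
    val content = "PassengerId\tName\tAge\n1\tAlice\t29.0\n2\tBob\t\n"
    Files.write(sample, content.getBytes(StandardCharsets.UTF_8))

    // Supply the schema explicitly instead of setting inferSchema
    val passengers = sparkSession.read
      .option("header", "true")
      .option("delimiter", "\t")
      .option("nullValue", "")
      .schema(passengerSchema)
      .csv(sample.toString)

    passengers.printSchema()
    passengers.show()

    sparkSession.stop()
  }
}
```

Declaring the schema avoids the inference pass and keeps column types stable between runs, at the cost of having to update the schema whenever the file layout changes.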