Read CSV file to DataFrame
It is also very easy to load a CSV file into a Spark DataFrame.
Let's load the titanic.csv file, which contains information about the passengers of the Titanic:
import org.apache.spark.sql.SparkSession

object ReadCSVExample {
  def main(args: Array[String]): Unit = {
    // Path to the CSV file, passed as the first program argument
    val inputFile = args(0)

    // Initialize SparkSession
    val sparkSession = SparkSession
      .builder()
      .appName("spark-read-csv")
      .master("local[*]")
      .getOrCreate()

    // Bring Spark's implicit conversions and encoders into scope
    import sparkSession.implicits._

    // Read CSV file to DF
    val passengers = sparkSession.read
      .option("header", "true")
      .option("delimiter", "\t")
      .option("nullValue", "")
      .option("treatEmptyValuesAsNulls", "true")
      .option("inferSchema", "true")
      .csv(inputFile)

    // Print the first 100 rows, then the inferred schema
    passengers.show(100)
    passengers.printSchema()
  }
}
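To see what the reader options in the example do without needing the real data set, the following is a minimal sketch under stated assumptions. The object name ReadCsvOptionsSketch, its two helper methods, and the three-column sample data are hypothetical and are not taken from titanic.csv. The sketch writes a small tab-separated file to a temporary location and reads it back with the same header, delimiter, nullValue and inferSchema options.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadCsvOptionsSketch {

  // Hypothetical sample data (NOT from titanic.csv): tab-separated,
  // one header row, and an empty Age field in the second record.
  def writeSample(): Path = {
    val sample = "Name\tAge\tFare\nAlice\t29\t7.25\nBob\t\t71.5\n"
    val path = Files.createTempFile("passengers-sample", ".csv")
    Files.write(path, sample.getBytes(StandardCharsets.UTF_8))
    path
  }

  // Reads a tab-separated file using the same options as ReadCSVExample.
  def readTabSeparated(spark: SparkSession, file: String): DataFrame =
    spark.read
      .option("header", "true")      // first line supplies the column names
      .option("delimiter", "\t")     // fields are separated by tabs
      .option("nullValue", "")       // empty fields are read as null
      .option("inferSchema", "true") // extra pass over the data to guess column types
      .csv(file)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("spark-read-csv-options")
      .master("local[*]")
      .getOrCreate()

    val path = writeSample()
    val df = readTabSeparated(spark, path.toString)

    // With inferSchema enabled, Age and Fare should be reported as
    // numeric types rather than string; the empty Age field shows as null.
    df.printSchema()
    df.show()

    spark.stop()
    Files.delete(path)
  }
}
```

Schema inference costs an additional pass over the input, so for large files a schema supplied by hand may be preferable. For the real data set, run ReadCSVExample with the path to titanic.csv as its first argument.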
Running ReadCSVExample prints the following DataFrame: