Practice 4: Spark MLlib - Logistic Regression

Input Dataset

The main goal of this practice is to train a Logistic Regression model on the diabets.csv data.

Load the diabets.csv data into a Spark DataFrame and assemble the feature columns into Feature Vectors (by using VectorAssembler)
Once the set of features is defined, split the data into training and testing datasets
Train a Logistic Regression model on the training portion of the data
Predict the outcome variable (diabet) on the testing dataset
Test the model by comparing the predictions with the actual outcomes