ML Evaluator
An important task in ML is model selection, or using data to find the best model or parameters for a given task. This is also called tuning. Tuning may be done for individual Estimators such as LogisticRegression, or for entire Pipelines which include multiple algorithms, featurization, and other steps. Users can tune an entire Pipeline at once, rather than tuning each element in the Pipeline separately.
- Evaluator is the contract in Spark MLlib for ML Pipeline components that can evaluate models for given parameters.
- An Evaluator computes a single metric from a model's predictions. CrossValidator and TrainValidationSplit use that metric to select the best model.
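
As a rough sketch of how an Evaluator plugs into tuning an entire Pipeline, the snippet below wires a BinaryClassificationEvaluator into a CrossValidator. The stages (Tokenizer, HashingTF), the column names (`text`, `words`, `features`) and the grid values are assumptions chosen for illustration and do not come from the text above. It is meant to be pasted into `spark-shell` or any Scala project with Spark MLlib on the classpath.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// Hypothetical three-stage Pipeline: featurization followed by an Estimator.
val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")
val lr = new LogisticRegression().setMaxIter(10)
val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))

// Candidate parameters span several stages, so the Pipeline is tuned as a whole.
val paramGrid = new ParamGridBuilder()
  .addGrid(hashingTF.numFeatures, Array(100, 1000))
  .addGrid(lr.regParam, Array(0.1, 0.01))
  .build()

// The Evaluator supplies the metric that CrossValidator compares across candidates.
val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEvaluator(new BinaryClassificationEvaluator())
  .setEstimatorParamMaps(paramGrid)
  .setNumFolds(3)
```

Calling `cv.fit(training)` on a DataFrame with `text` and `label` columns (again, assumed names) would train every combination in the grid and keep the model with the best evaluator metric.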

Examples of evaluators:

Evaluator | Description |
---|---|
BinaryClassificationEvaluator | Evaluator of binary classification models |
ClusteringEvaluator | Evaluator of clustering models |
MulticlassClassificationEvaluator | Evaluator of multiclass classification models |
RegressionEvaluator | Evaluator of regression models |
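
Example of configuring a MulticlassClassificationEvaluator:

    import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator

    // Compare the "prediction" column against the "CategoryIndex" label column
    val evaluator = new MulticlassClassificationEvaluator()
      .setLabelCol("CategoryIndex")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")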
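
To show what the evaluator configured above does once it has predictions, here is a minimal sketch that applies it to a tiny hand-made DataFrame. The four rows and the local SparkSession settings are invented for illustration. In a real job the DataFrame would come from `model.transform(testData)`.

```scala
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.sql.SparkSession

// Local session for the sketch; inside spark-shell this reuses the existing one.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("evaluator-sketch")
  .getOrCreate()
import spark.implicits._

// Made-up (label, prediction) pairs: three of the four predictions are correct.
val predictions = Seq(
  (0.0, 0.0),
  (1.0, 1.0),
  (2.0, 1.0),
  (2.0, 2.0)
).toDF("CategoryIndex", "prediction")

val evaluator = new MulticlassClassificationEvaluator()
  .setLabelCol("CategoryIndex")
  .setPredictionCol("prediction")
  .setMetricName("accuracy")

// evaluate returns a single Double: the fraction of rows where label == prediction.
val accuracy = evaluator.evaluate(predictions)
println(s"accuracy = $accuracy")
```

Other metric names, such as `f1`, can be passed to `setMetricName` to score the same predictions differently.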