ML Estimators

An estimator is an abstraction of a learning algorithm that fits a model on a dataset.

Technically, an Estimator produces a Model (i.e. a Transformer) for a given DataFrame and parameters (as a ParamMap). The fitted Model can then calculate predictions for any DataFrame-based input dataset.

  • Estimator is the contract in Spark MLlib for estimators that fit models to a dataset.
  • Estimator accepts parameters that you can set through dedicated setter methods when creating an Estimator. You can also pass extra parameters (as a ParamMap) when fitting a model.
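
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.sql.DataFrame

// Define parameters upon creating an Estimator
val lr = new LogisticRegression().
  setMaxIter(5).
  setRegParam(0.01)

// A DataFrame with the default "label" and "features" columns
val training: DataFrame = ...
// Fit a model using the parameters set on lr
val model1 = lr.fit(training)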
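
Fitting a model with extra parameters could look like the following minimal sketch. It assumes a local SparkSession and a tiny made-up training set (both are illustrative and not part of the original example); the names extraParams and model2 are likewise hypothetical. Parameters passed in the ParamMap at fit time take precedence over the ones set on the Estimator, which itself is left unchanged.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration (spark-shell already provides `spark`)
val spark = SparkSession.builder().
  master("local[*]").
  appName("estimator-sketch").
  getOrCreate()

// Assumption: a tiny made-up dataset with the default "label" and "features" columns
val training = spark.createDataFrame(Seq(
  (1.0, Vectors.dense(0.0, 1.1, 0.1)),
  (0.0, Vectors.dense(2.0, 1.0, -1.0)),
  (0.0, Vectors.dense(2.0, 1.3, 1.0)),
  (1.0, Vectors.dense(0.0, 1.2, -0.5))
)).toDF("label", "features")

// Parameters defined upon creating the Estimator
val lr = new LogisticRegression().
  setMaxIter(5).
  setRegParam(0.01)

// Extra parameters supplied at fit time override the values set on lr
val extraParams = ParamMap(lr.maxIter -> 10, lr.regParam -> 0.1)
val model2 = lr.fit(training, extraParams)

// The fitted Model is a Transformer, so it can add predictions to a DataFrame
val predictions = model2.transform(training)
predictions.select("features", "label", "prediction").show()
```

Passing a ParamMap to fit is convenient when you want to try several parameter settings against the same Estimator without mutating it between runs.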
