How to set cutoff while training the data in Random Forest in Spark
I am using Spark MLlib to train classification models with the random forest algorithm. MLlib provides a RandomForest class that has the trainClassifier method I need.

Can I set a threshold value while training the data set, similar to the cutoff option provided in R's randomForest package?

http://cran.r-project.org/web/packages/randomforest/randomforest.pdf

I found that the RandomForest class of MLlib provides options to pass the number of trees, the impurity measure, the number of classes, etc., but there is no threshold or cutoff option available. Can it be done in some way?
The short version is no. If you look at RandomForestClassifier.scala, you can see that it always selects the class with the maximum probability. You could override the predict function, although that is not super clean. I've added a JIRA to track adding this.
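Until such an option exists, one workaround is to apply the cutoff yourself to the per-class vote fractions (probabilities) the forest produces. Here is a minimal sketch in plain Python, not the Spark API; the `predict_with_cutoff` helper and its cutoff vector are assumptions that mirror R randomForest's documented semantics, where the winning class is the one with the maximum ratio of vote proportion to cutoff:

```python
# Sketch (not Spark API): emulate R randomForest's `cutoff` after prediction.
# `class_probs` stands in for the model's per-class vote fractions;
# `cutoff` is a hypothetical per-class threshold vector.

def predict_with_cutoff(class_probs, cutoff):
    """Pick the class with the largest probability-to-cutoff ratio,
    mirroring R randomForest's documented cutoff semantics."""
    if len(class_probs) != len(cutoff):
        raise ValueError("class_probs and cutoff must have the same length")
    ratios = [p / c for p, c in zip(class_probs, cutoff)]
    return max(range(len(ratios)), key=ratios.__getitem__)

# With an equal cutoff this reduces to a plain argmax:
print(predict_with_cutoff([0.6, 0.4], [0.5, 0.5]))  # 0
# A lower cutoff for class 1 lets it win despite fewer votes:
print(predict_with_cutoff([0.6, 0.4], [0.8, 0.2]))  # 1
```

In Spark you would apply this function to each row's probability vector after prediction, rather than changing the training step itself.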