LinearRegressionModel

class pyspark.mllib.regression.LinearRegressionModel(weights: pyspark.mllib.linalg.Vector, intercept: float)
A linear regression model derived from a least-squares fit.

New in version 0.9.0.

Examples

The examples assume that sc is an active SparkContext.

>>> import numpy as np
>>> from pyspark.mllib.linalg import SparseVector
>>> from pyspark.mllib.regression import (
...     LabeledPoint, LinearRegressionModel, LinearRegressionWithSGD)
>>> # Train on dense one-feature vectors.
>>> data = [
...     LabeledPoint(0.0, [0.0]),
...     LabeledPoint(1.0, [1.0]),
...     LabeledPoint(3.0, [2.0]),
...     LabeledPoint(2.0, [3.0])
... ]
>>> lrm = LinearRegressionWithSGD.train(sc.parallelize(data), iterations=10,
...     initialWeights=np.array([1.0]))
>>> abs(lrm.predict(np.array([0.0])) - 0) < 0.5
True
>>> abs(lrm.predict(np.array([1.0])) - 1) < 0.5
True
>>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
True
>>> abs(lrm.predict(sc.parallelize([[1.0]])).collect()[0] - 1) < 0.5
True
>>> # Save the model, load it back, and check that predictions still hold.
>>> import tempfile
>>> path = tempfile.mkdtemp()
>>> lrm.save(sc, path)
>>> sameModel = LinearRegressionModel.load(sc, path)
>>> abs(sameModel.predict(np.array([0.0])) - 0) < 0.5
True
>>> abs(sameModel.predict(np.array([1.0])) - 1) < 0.5
True
>>> abs(sameModel.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
True
>>> from shutil import rmtree
>>> try:
...     rmtree(path)
... except OSError:
...     pass
>>> # Train on the same points expressed as sparse vectors.
>>> data = [
...     LabeledPoint(0.0, SparseVector(1, {0: 0.0})),
...     LabeledPoint(1.0, SparseVector(1, {0: 1.0})),
...     LabeledPoint(3.0, SparseVector(1, {0: 2.0})),
...     LabeledPoint(2.0, SparseVector(1, {0: 3.0}))
... ]
>>> lrm = LinearRegressionWithSGD.train(sc.parallelize(data), iterations=10,
...     initialWeights=np.array([1.0]))
>>> abs(lrm.predict(np.array([0.0])) - 0) < 0.5
True
>>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
True
>>> # Train again with every optional parameter given explicitly.
>>> lrm = LinearRegressionWithSGD.train(sc.parallelize(data), iterations=10, step=1.0,
...     miniBatchFraction=1.0, initialWeights=np.array([1.0]), regParam=0.1, regType="l2",
...     intercept=True, validateData=True)
>>> abs(lrm.predict(np.array([0.0])) - 0) < 0.5
True
>>> abs(lrm.predict(SparseVector(1, {0: 1.0})) - 1) < 0.5
True

Methods

- load(sc, path): Load a LinearRegressionModel.
- predict(x): Predict the value of the dependent variable given a vector or an RDD of vectors containing values for the independent variables.
- save(sc, path): Save a LinearRegressionModel.

Attributes

- intercept: Intercept computed for this model.
- weights: Weights computed for every feature.

Methods Documentation

classmethod load(sc: pyspark.context.SparkContext, path: str) → pyspark.mllib.regression.LinearRegressionModel

Load a LinearRegressionModel from the given path.

New in version 1.4.0.

predict(x: Union[VectorLike, pyspark.rdd.RDD[VectorLike]]) → Union[float, pyspark.rdd.RDD[float]]

Predict the value of the dependent variable given a vector or an RDD of vectors containing values for the independent variables. A single vector yields a float; an RDD of vectors yields an RDD of floats.

New in version 0.9.0.

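As a rough illustration of what predict evaluates for a single vector, the sketch below computes the dot product of the weights with the features and adds the intercept, in plain Python without Spark. The helper name linear_predict and the dict form for sparse input are hypothetical, chosen only to mirror the dense and SparseVector inputs in the examples. It is not the library's implementation, and the weight and intercept values are made up for illustration.

```python
from collections.abc import Mapping
from typing import Dict, Sequence, Union

# A dense sequence of values, or a sparse {index: value} dict
# (a hypothetical stand-in for SparseVector).
Features = Union[Sequence[float], Dict[int, float]]


def linear_predict(weights: Sequence[float], intercept: float, x: Features) -> float:
    """Return dot(weights, x) + intercept for one feature vector."""
    if isinstance(x, Mapping):
        # Sparse input: only the stored entries contribute to the dot product.
        dot = sum(weights[i] * v for i, v in x.items())
    else:
        if len(x) != len(weights):
            raise ValueError("feature vector and weights differ in length")
        dot = sum(w * v for w, v in zip(weights, x))
    return dot + intercept


# Illustrative values only, not the output of any training run.
weights, intercept = [0.8], 0.3
dense_prediction = linear_predict(weights, intercept, [1.0])
sparse_prediction = linear_predict(weights, intercept, {0: 1.0})
```

Both calls describe the same feature vector, so they give the same prediction.
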
save(sc: pyspark.context.SparkContext, path: str) → None

Save a LinearRegressionModel to the given path.

New in version 1.4.0.

Attributes Documentation

intercept

Intercept computed for this model.

New in version 1.0.0.

weights

Weights computed for every feature.

New in version 1.0.0.
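
To make the two attributes concrete, the sketch below computes the exact least-squares weight and intercept for the four one-feature points used in the examples, in plain Python. It assumes a single feature and a fitted intercept term, and the function name fit_line is hypothetical. LinearRegressionWithSGD trains by iterative gradient descent, so the weights and intercept it produces will generally differ from these closed-form values.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Closed-form least-squares fit of y = weight * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of cross-deviations and squared deviations about the means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    weight = sxy / sxx
    intercept = y_mean - weight * x_mean
    return weight, intercept


# Features (xs) and labels (ys) of the four LabeledPoint rows in the examples.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 3.0, 2.0]
weight, intercept = fit_line(xs, ys)
```

For this data the closed-form fit gives a weight of 0.8 and an intercept of 0.3, up to floating-point rounding.
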
 
- 
classmethod