3.1.3.2 Multiple Regression: Including Multiple Factors

Consider a linear model that explains a variable z (the dependent variable) using two variables x and y:

$z = x \, c_1 + y \, c_2 + i + e$

Such a model can be viewed in 3D as fitting a plane to a cloud of (x, y, z) points.
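The plane-fitting view can be sketched with a small synthetic example. This is a minimal illustration, not part of the original notes: the data are randomly generated, the "true" coefficients (`c1_true`, `c2_true`, `i_true`) are made-up values, and plain NumPy least squares is used in place of a statistics package.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Synthetic point cloud; the "true" values below are assumptions for illustration
n = 200
x = rng.uniform(-5, 5, n)
y = rng.uniform(-5, 5, n)
c1_true, c2_true, i_true = 3.0, -1.5, 5.0
e = rng.normal(scale=0.5, size=n)            # noise term e
z = x * c1_true + y * c2_true + i_true + e   # z = x c1 + y c2 + i + e

# Design matrix with columns [x, y, 1]; the column of ones carries the intercept i
A = np.column_stack([x, y, np.ones(n)])

# Ordinary least squares: find the plane minimizing the squared residuals in z
coef, *_ = np.linalg.lstsq(A, z, rcond=None)
c1_hat, c2_hat, i_hat = coef
print(c1_hat, c2_hat, i_hat)
```

The estimated coefficients should land close to the values used to generate the data, with the gap shrinking as the noise decreases or the number of points grows.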

Figure 1

Example: the iris data (examples/iris.csv)

Sepal and petal sizes appear to be correlated: larger flowers are larger overall. But is there an additional systematic effect of species?

Figure 2

In [33]:

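    import pandas
    from statsmodels.formula.api import ols
    data = pandas.read_csv('examples/iris.csv')
    # Model sepal width as a function of species (categorical) and petal length
    model = ols('sepal_width ~ name + petal_length', data).fit()
    print(model.summary())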
                                OLS Regression Results
    ==============================================================================
    Dep. Variable:            sepal_width   R-squared:                       0.478
    Model:                            OLS   Adj. R-squared:                  0.468
    Method:                 Least Squares   F-statistic:                     44.63
    Date:                Thu, 19 Nov 2015   Prob (F-statistic):           1.58e-20
    Time:                        09:56:04   Log-Likelihood:                -38.185
    No. Observations:                 150   AIC:                             84.37
    Df Residuals:                     146   BIC:                             96.41
    Df Model:                           3
    Covariance Type:            nonrobust
    ======================================================================================
                             coef    std err          t      P>|t|      [95.0% Conf. Int.]
    --------------------------------------------------------------------------------------
    Intercept              2.9813      0.099     29.989      0.000         2.785     3.178
    name[T.versicolor]    -1.4821      0.181     -8.190      0.000        -1.840    -1.124
    name[T.virginica]     -1.6635      0.256     -6.502      0.000        -2.169    -1.158
    petal_length           0.2983      0.061      4.920      0.000         0.178     0.418
    ==============================================================================
    Omnibus:                        2.868   Durbin-Watson:                   1.753
    Prob(Omnibus):                  0.238   Jarque-Bera (JB):                2.885
    Skew:                          -0.082   Prob(JB):                        0.236
    Kurtosis:                       3.659   Cond. No.                         54.0
    ==============================================================================

    Warnings:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
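The rows `name[T.versicolor]` and `name[T.virginica]` in the summary arise because the formula expands the categorical column `name` into indicator (dummy) variables, with the first level, setosa, acting as the reference that is absorbed into the intercept. The sketch below imitates that expansion by hand with NumPy. It is only an illustration under stated assumptions: the six rows are hypothetical toy data (not the iris measurements), and the coefficients used to build `sepal_width` are made up so that the fit can recover them exactly.

```python
import numpy as np

# Hypothetical toy data, NOT the real iris measurements
name = np.array(["setosa", "setosa", "versicolor",
                 "versicolor", "virginica", "virginica"])
petal_length = np.array([1.0, 2.0, 4.0, 5.0, 5.0, 6.0])

# Made-up coefficients used to generate a noise-free response
intercept, d_versicolor, d_virginica, slope = 3.0, -1.5, -1.7, 0.3
sepal_width = (intercept
               + d_versicolor * (name == "versicolor")
               + d_virginica * (name == "virginica")
               + slope * petal_length)

# Hand-built design matrix: intercept, two species indicators, petal_length.
# setosa has no column of its own: it is the reference level.
X = np.column_stack([
    np.ones(len(name)),
    name == "versicolor",
    name == "virginica",
    petal_length,
]).astype(float)

# Least-squares fit recovers [intercept, versicolor offset, virginica offset, slope]
coef, *_ = np.linalg.lstsq(X, sepal_width, rcond=None)
print(coef)
```

Read this way, each species coefficient is an offset in sepal width relative to setosa at a fixed petal length, while the `petal_length` coefficient is a slope shared by all three species.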