Difference between revisions of "Tutorial: Playing with the Boston Housing Data"

[[Category:Tutorials]][[Category:TensorFlow]]
 

Latest revision as of 14:20, 11 August 2016

--D. Thiebaut (talk) 16:06, 8 August 2016 (EDT)


=Deep Neural-Network Regressor (DNNRegressor from TensorFlow)=


This tutorial uses SKFlow and TensorFlow, and follows very closely two other good tutorials, merging elements from both.


We use the Boston housing prices data for this tutorial.
The tutorial is best viewed as a Jupyter notebook (available in zipped form below), or as a static pdf (in which case you'll have to retype all the commands).

* [[Media:DNNRegressorBostonData.pdf | pdf]]
* [[Media:DNNRegressorBostonData.ipynb.zip | Jupyter Notebook]] (Zipped)
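The Boston data has 13 numeric features on very different scales, so the notebooks standardize them and hold out a test set before training. Below is a minimal numpy sketch of that preprocessing; the arrays are synthetic stand-ins for the actual Boston data, and the variable names are illustrative, not taken from the notebook:

```python
import numpy as np

# Synthetic stand-in for the Boston data: 506 samples, 13 features.
rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=5.0, size=(506, 13))
y = rng.normal(loc=22.0, scale=9.0, size=506)

# Standardize each feature to zero mean and unit variance
# before feeding it to a model.
mean, std = X.mean(axis=0), X.std(axis=0)
X_scaled = (X - mean) / std

# Hold out 20% of the samples for testing.
n_test = len(X) // 5
perm = rng.permutation(len(X))
test_idx, train_idx = perm[:n_test], perm[n_test:]
X_train, y_train = X_scaled[train_idx], y[train_idx]
X_test, y_test = X_scaled[test_idx], y[test_idx]
```

The same split is then reused across all the models in this tutorial, so their scores can be compared on identical test data.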



=SKLearn Linear Regression Model on the Boston Data=


This tutorial also uses SKFlow and follows very closely two other good tutorials, merging elements from both.


We also use the Boston housing prices data for this tutorial.
The tutorial is best viewed as a Jupyter notebook (available in zipped form below), or as a static pdf (in which case you'll have to retype all the commands).

* [[Media:SKLearnLinearRegression_BostonData.pdf | pdf]]
* [[Media:SKLearnLinearRegressionBostonData.ipynb.zip | Jupyter Notebook]] (Zipped)
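Under the hood, sklearn's LinearRegression solves an ordinary least-squares problem. A rough numpy sketch of the same computation on synthetic data (the variable names and values are illustrative, not from the notebook):

```python
import numpy as np

# Synthetic regression data with known coefficients and intercept.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_coef = np.array([2.0, -1.0, 0.5])
y = X @ true_coef + 4.0  # intercept of 4, no noise

# Ordinary least squares: append a column of ones for the
# intercept, then solve min ||Xb - y||^2 with lstsq.
X1 = np.column_stack([X, np.ones(len(X))])
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
coef, intercept = beta[:3], beta[3]
```

Since the synthetic data is noise-free, the fitted coefficients recover the true ones up to numerical precision; on the real Boston data the fit is only approximate, which is what the residual plots visualize.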


=TensorFlow NN with Hidden Layers: Regression on Boston Data=


Here we take the same approach, but use the TensorFlow library to solve the problem of predicting the housing prices using the 13 features present in the Boston data. The code is longer, but offers insight into the "behind the scenes" aspects of sklearn.
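The network described below feeds the 13 inputs through hidden layers of sizes 52, 39, 26, and 13 before a single output neuron. A numpy sketch of that forward pass with randomly initialized weights (the ReLU activation is an assumption for illustration; the notebook's actual activation and initialization may differ):

```python
import numpy as np

rng = np.random.default_rng(2)
sizes = [13, 52, 39, 26, 13, 1]  # 13 inputs -> 52x39x26x13 hidden -> 1 output

# Random weights and zero biases for each layer.
weights = [rng.normal(scale=0.1, size=(m, n))
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """Forward pass: ReLU on the hidden layers, linear output layer."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ W + b, 0.0)
    return x @ weights[-1] + biases[-1]

batch = rng.normal(size=(32, 13))  # a batch of 32 standardized samples
preds = forward(batch)             # one predicted price per sample
```

In the notebook, TensorFlow builds the equivalent graph and learns the weights by gradient descent instead of leaving them random.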

The result is quite good, as illustrated in the figures below, showing prediction versus test data, and residuals. The R<sup>2</sup> coefficient of determination obtained for a network taking 13 features and feeding them into a 52x39x26x13 architecture of layers is R<sup>2</sup> = 0.8150.
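The R<sup>2</sup> score above is computed from the test-set predictions as one minus the ratio of the residual sum of squares to the total sum of squares. A quick numpy sketch, with made-up arrays standing in for the notebook's actual predictions:

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)             # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)      # total sum of squares
    return 1.0 - ss_res / ss_tot

# Illustrative values only: ss_res = 1, ss_tot = 5, so R^2 = 0.8.
score = r2_score([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

An R<sup>2</sup> of 1 would mean perfect prediction, while 0 means the model does no better than predicting the mean price for every house.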



* [[Media:TFLinearRegression_BostonData.pdf | pdf]]
* [[Media:TFLearnLinearRegressionBostonData.ipynb.zip | Jupyter Notebook]] (Zipped)

[[Image:TensorFlowNNBostonHousingPredVsReal.png|350px]]
[[Image:TensorFlowNNBostonHousingResiduals.png|350px]]
[[Image:TensorFlowNNBostonHousingCostVsEpochs.png|350px]]


=TensorFlow NN with programmable number of Hidden Layers, Batch Mode, and Dropout=


Here we take the previous Jupyter notebook and add two features: batch training, in which the training set is given to the NN in batches of a size set by the user, and dropout, in which a user-set proportion of neurons is randomly excluded from the network at every training step.

* [[Media:TFRegressionBatchParameterizedDropout_BostonData.pdf | pdf]]
* [[Media:TFLearnRegressionBatchParameterizedDropout_BostonData.ipynb.zip | Jupyter Notebook]] (Zipped)
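A compact numpy sketch of those two mechanisms, mini-batch iteration and (inverted) dropout masks; the batch size, keep probability, and array names are illustrative assumptions, not values from the notebook:

```python
import numpy as np

rng = np.random.default_rng(3)

def iterate_batches(X, y, batch_size):
    """Yield shuffled (X, y) mini-batches covering the whole training set."""
    perm = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        yield X[idx], y[idx]

def dropout(activations, keep_prob):
    """Inverted dropout: zero each neuron with probability 1 - keep_prob,
    scaling the survivors so the expected activation is unchanged."""
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

# A training split roughly the size of the Boston data (506 samples, 20% held out).
X = rng.normal(size=(405, 13))
y = rng.normal(size=405)
batches = list(iterate_batches(X, y, batch_size=64))
```

Each epoch reshuffles the data, and the last batch is simply smaller when the set does not divide evenly; at test time dropout is disabled (keep_prob = 1), which leaves the activations untouched.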