Regularization addresses the overfitting problem by penalizing complex
hypotheses – the learning algorithm selects one rich hypothesis class and adds a regularizer function to the empirical error, so that overly complex hypotheses are discouraged.
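A minimal sketch of this idea, assuming ridge (L2) regression as the concrete instance: we minimize the empirical squared error plus lam * ||w||^2, which has the closed-form solution w = (X^T X + lam * I)^{-1} X^T y. The function name and toy data below are illustrative, not from the notes.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Fit linear weights by minimizing squared error + lam * ||w||^2.

    Closed form: w = (X^T X + lam * I)^{-1} X^T y.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Toy data: y is roughly 2*x plus noise. A larger regularization
# strength pulls the learned weight toward zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 1))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=50)

w_small = ridge_fit(X, y, lam=0.01)   # close to the unregularized fit
w_large = ridge_fit(X, y, lam=100.0)  # heavily shrunk toward zero
print(w_small, w_large)
```

Increasing the penalty trades a little training error for a simpler (smaller-norm) hypothesis, which is exactly the mechanism that combats overfitting.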
Our goal is to choose the best hypothesis among the many that such a
hypothesis space allows – for example, the regularization parameter in ridge regression can be chosen by validation.
If we have a large dataset, we can split it into three parts:
Training dataset – where we fit our models (for example, one model per candidate regularization parameter).
Development or validation dataset – the held-out dataset on which we tune the parameters and select the best model.
Test dataset – used to evaluate the best model developed on the training dataset.
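The three-way split above can be sketched as follows, again assuming ridge regression as the model family; the 60/20/20 split ratio and the lambda grid are illustrative assumptions, not prescribed by the notes.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge closed form: w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def mse(w, X, y):
    """Mean squared error of linear predictor w on (X, y)."""
    return np.mean((X @ w - y) ** 2)

# Synthetic data: y = X @ w_true + noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=300)

# Split: 60% training, 20% development/validation, 20% test.
X_tr, X_dev, X_te = X[:180], X[180:240], X[240:]
y_tr, y_dev, y_te = y[:180], y[180:240], y[240:]

# Train one model per candidate lambda on the training set, then pick
# the lambda with the lowest error on the held-out development set.
candidates = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam = min(
    candidates,
    key=lambda lam: mse(ridge_fit(X_tr, y_tr, lam), X_dev, y_dev),
)

# Only the finally selected model touches the test set.
w_best = ridge_fit(X_tr, y_tr, best_lam)
print(best_lam, mse(w_best, X_te, y_te))
```

The test set is used exactly once, after model selection is complete, so the reported test error is an unbiased estimate of generalization performance.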