Machine Learning (6): Evaluating Models

Published: January 18, 2024

Evaluating a model

1 test set

  1. split the data set into a training set and a test set
  2. the test set is used to evaluate the model
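A minimal sketch of such a split in plain Python; the 30% test ratio and the toy data are just illustrative choices:

```python
import random

def train_test_split(X, y, test_ratio=0.3, seed=0):
    """Shuffle the indices, then carve off the first test_ratio of them as the test set."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(X) * test_ratio)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, y_train, X_test, y_test

# toy data: 10 one-feature examples with y = 2x
X = [[float(i)] for i in range(10)]
y = [2.0 * x[0] for x in X]
X_train, y_train, X_test, y_test = train_test_split(X, y, test_ratio=0.3)
print(len(X_train), len(X_test))
```

The model is fit only on `X_train`; `X_test` is held out until the final evaluation.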

1. linear regression

compute test error

$$J_{test}(\vec w, b) = \frac{1}{2m_{test}}\sum_{i=1}^{m_{test}} \left[ \left( f(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2 \right]$$
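A minimal sketch of this test error in Python; the fitted model f(x) = 2x + 1 and the three test points are assumed for illustration:

```python
def j_test_linear(f, X_test, y_test):
    """Squared-error test cost: J_test = 1/(2 m_test) * sum of (f(x) - y)^2."""
    m = len(X_test)
    return sum((f(x) - y) ** 2 for x, y in zip(X_test, y_test)) / (2 * m)

f = lambda x: 2 * x + 1          # hypothetical fitted model
X_test = [1.0, 2.0, 3.0]
y_test = [3.0, 5.0, 7.5]         # last point misses by 0.5
print(j_test_linear(f, X_test, y_test))
```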

2. logistic regression (classification)

compute test error

$$J_{test}(\vec w, b) = -\frac{1}{m_{test}}\sum_{i=1}^{m_{test}} \left[ y_{test}^{(i)}\log\left(f(x_{test}^{(i)})\right) + (1 - y_{test}^{(i)})\log\left(1 - f(x_{test}^{(i)})\right) \right]$$
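The logistic test error can be computed the same way; the model below (a sigmoid applied to 2x) and the two test points are assumed examples:

```python
import math

def j_test_logistic(f, X_test, y_test):
    """Cross-entropy test cost, averaged over the test set."""
    m = len(X_test)
    total = 0.0
    for x, y in zip(X_test, y_test):
        p = f(x)  # predicted probability that y = 1
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / m

sigmoid = lambda z: 1 / (1 + math.exp(-z))
f = lambda x: sigmoid(2 * x)     # hypothetical fitted model
X_test = [-1.0, 1.0]
y_test = [0, 1]
print(j_test_logistic(f, X_test, y_test))
```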

2 cross-validation set

  1. split the data set into a training set, a cross-validation set, and a test set
  2. the cross-validation set is used to automatically choose the best model, and the test set is used to evaluate the chosen model
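The model-selection step above can be sketched as follows; the candidate models (polynomial degrees) and their cross-validation errors are made-up numbers for illustration:

```python
def select_model(candidates, j_cv):
    """Pick the candidate with the lowest cross-validation error."""
    return min(candidates, key=j_cv)

# toy example: candidates are polynomial degrees, with hypothetical CV errors
cv_error = {1: 0.40, 2: 0.12, 3: 0.15, 4: 0.30}
best = select_model(list(cv_error), lambda d: cv_error[d])
print(best)  # the degree with the lowest CV error
```

Only after this choice is made is the test set used, once, to report the final error of `best`.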

3 bias and variance

  1. high bias: $J_{train}$ and $J_{cv}$ are both high
  2. high variance: $J_{train}$ is low, but $J_{cv}$ is high


  1. with high bias, getting more training data does not help
  2. with high variance, getting more training data does help
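The two rules of thumb above can be turned into a toy diagnostic; the 0.05 thresholds and the zero baseline are arbitrary values chosen for illustration, not part of the original notes:

```python
def diagnose(j_train, j_cv, baseline=0.0, tol=0.05):
    """Rough bias/variance check: training error far above the baseline
    suggests high bias; a large gap between CV and training error
    suggests high variance."""
    issues = []
    if j_train - baseline > tol:
        issues.append("high bias")
    if j_cv - j_train > tol:
        issues.append("high variance")
    return issues or ["looks fine"]

print(diagnose(0.30, 0.32))  # both errors high
print(diagnose(0.02, 0.25))  # low training error, big gap
```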

4 regularization

  1. if $\lambda$ is too small, it will lead to overfitting (high variance)
  2. if $\lambda$ is too large, it will lead to underfitting (high bias)
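As a concrete sketch of how $\lambda$ enters the cost, here is an L2-regularized squared-error cost; the model `f`, its weights `w`, and the toy data are all hypothetical:

```python
def j_regularized(f, w, X, y, lam):
    """Squared-error cost plus the L2 penalty lam/(2m) * sum of w_j^2."""
    m = len(X)
    mse = sum((f(x) - t) ** 2 for x, t in zip(X, y)) / (2 * m)
    penalty = lam * sum(wj ** 2 for wj in w) / (2 * m)
    return mse + penalty

f = lambda x: 3 * x              # hypothetical model with weight w = 3
w = [3.0]
X, y = [1.0, 2.0], [3.0, 6.0]    # f fits this data exactly
print(j_regularized(f, w, X, y, lam=0.0))  # no penalty
print(j_regularized(f, w, X, y, lam=1.0))  # penalty term added
```

Raising `lam` pushes the optimizer toward smaller weights (a simpler model), which trades variance for bias.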


5 method

  1. fix high variance:
    • get more training data
    • try a smaller set of features
    • remove some of the higher-order terms
    • increase $\lambda$
  2. fix high bias:
    • get more additional features
    • add polynomial features
    • decrease $\lambda$

6 neural network and bias variance

  1. a bigger network means a more complex model, so it helps with high bias
  2. more data helps with high variance


  1. it turns out that a bigger (possibly overfitting) but well-regularized neural network usually performs better than a small neural network
Source: https://blog.csdn.net/m0_65591847/article/details/135641692