QDA model
In [1]: from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis as QDA
In [2]: from sklearn.metrics import accuracy_score,confusion_matrix
In [3]: import pandas as pd
In [4]: data = pd.read_csv('Smarket.csv',index_col=0)
In [5]: data.shape  # dimensions of the data
Out[5]: (1250, 9)
In [6]: predictors = ['Lag1','Lag2']
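In [7]: train = data[data['Year'] < 2005]  # training set: years before 2005
In [8]: test = data[data['Year'] == 2005]  # test set: year 2005
In [9]: X_train = train[predictors]  # predictors for the training observations
In [10]: X_test = test[predictors]  # predictors for the test observations
In [11]: y_train = train['Direction']  # response values in the training set
In [12]: y_test = test['Direction']  # response values in the test set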
In [13]: qda = QDA()
In [14]: qda.fit(X_train,y_train)
Out[14]:
QuadraticDiscriminantAnalysis(priors=None, reg_param=0.0,
store_covariance=False, tol=0.0001)
In [15]: print('Mean for class 0 is - ',qda.means_[0])
Mean for class 0 is - [0.04279022 0.03389409]
In [16]: print('Mean for class 1 is - ',qda.means_[1])
Mean for class 1 is - [-0.03954635 -0.03132544]
In [17]: print('Prior probabilities - ',qda.priors_)
Prior probabilities - [0.49198397 0.50801603]
In [18]: pred = qda.predict(X_test)
In [19]: cm = confusion_matrix(y_test,pred)
In [20]: print(cm)
[[ 30 81]
[ 20 121]]
In [21]: print('Accuracy using QDA is ',accuracy_score(y_test,pred))
Accuracy using QDA is 0.5992063492063492
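The accuracy reported above can be recomputed by hand from the confusion matrix. The following is a minimal sketch that assumes scikit-learn's convention (rows are true classes, columns are predicted classes) and assumes the label order is `['Down', 'Up']`; the session does not print the label values, so that ordering is an assumption.

```python
# Confusion matrix copied from the session output.
# Assumed layout: rows = true class, columns = predicted class,
# with the (assumed) label order ['Down', 'Up'].
cm = [[30, 81],
      [20, 121]]

total = sum(sum(row) for row in cm)           # number of test observations
accuracy = (cm[0][0] + cm[1][1]) / total      # diagonal = correct predictions

# Per-class recall: share of each true class that was predicted correctly.
recall_down = cm[0][0] / sum(cm[0])
recall_up = cm[1][1] / sum(cm[1])

# Precision for 'Up': share of 'Up' predictions that were correct.
precision_up = cm[1][1] / (cm[0][1] + cm[1][1])

print('total =', total)
print('accuracy =', accuracy)
print('recall (Down, Up) =', recall_down, recall_up)
print('precision (Up) =', precision_up)
```

Under that assumed layout, the overall accuracy hides a strong asymmetry: most observations are predicted as the second class, so its recall is much higher than that of the first class.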
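The QDA model refers to quadratic discriminant analysis (Quadratic Discriminant Analysis), a statistical method for pattern recognition and classification. It assumes that the features of each class follow a multivariate normal distribution and that each class has its own covariance matrix. By estimating a probability density function for each class, QDA can classify a new sample by assigning it to the class with the highest posterior probability. Note that QDA is similar to linear discriminant analysis (LDA); the difference is that QDA lets each class have its own covariance matrix, whereas LDA assumes that all classes share the same covariance matrix. The session above is a simple example.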
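To make the per-class covariance idea concrete, here is a minimal pure-Python sketch of the QDA decision rule for two features. It is not scikit-learn's implementation: the helper names `fit_qda`, `discriminant` and `predict`, the toy data, and the choice of the unbiased (n - 1) covariance estimate are all assumptions made for illustration.

```python
import math

def fit_qda(X, y):
    """Estimate prior, mean and 2x2 covariance for each class (two features only)."""
    params = {}
    n = len(y)
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        m = len(rows)
        mu = [sum(r[j] for r in rows) / m for j in range(2)]
        # Unbiased sample covariance, estimated separately for each class.
        s = [[sum((r[a] - mu[a]) * (r[b] - mu[b]) for r in rows) / (m - 1)
              for b in range(2)] for a in range(2)]
        params[label] = (m / n, mu, s)
    return params

def discriminant(x, prior, mu, s):
    """delta_k(x) = -0.5*log|S_k| - 0.5*(x-mu_k)^T S_k^-1 (x-mu_k) + log(pi_k)."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form via the closed-form inverse of a 2x2 matrix.
    quad = (s[1][1] * d0 * d0 - (s[0][1] + s[1][0]) * d0 * d1
            + s[0][0] * d1 * d1) / det
    return -0.5 * math.log(det) - 0.5 * quad + math.log(prior)

def predict(params, x):
    """Return the class with the largest discriminant score."""
    return max(params, key=lambda k: discriminant(x, *params[k]))

# Hypothetical toy data: both classes are centred at the origin,
# but class 'A' is tightly clustered and class 'B' is widely spread.
X = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1),
     (2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]
y = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']

params = fit_qda(X, y)
print(predict(params, (0.05, 0.05)))
print(predict(params, (1.5, 1.5)))
```

In this toy data the two class means coincide, so the classes differ only in their covariance matrices. A rule built on one shared covariance matrix would have no basis for separating them, while the class-specific covariance terms give a point near the origin and a point far from it different scores.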