machine learning - Why is multi-label classification (binary relevance) acting up?
I'm new to multi-label classification using binary relevance, and I'm having trouble interpreting the result.
The result is: [[ 0. 0.] [ 2. 2.]]
Does this mean the first case is classified as [0,0] and the second as [2,2]? That does not look right at all. Or am I missing something else?
After the gentlemen's answers, I'm getting the following error. It is caused by the zero in the y_train label [2,0,3,4]:
    Traceback (most recent call last):
      File "driver.py", line 22, in <module>
        clf_dict[i] = clf.fit(x_train, y_tmp)
      File "c:\users\baderex\anaconda22\lib\site-packages\sklearn\linear_model\logistic.py", line 1154, in fit
        self.max_iter, self.tol, self.random_state)
      File "c:\users\baderex\anaconda22\lib\site-packages\sklearn\svm\base.py", line 885, in _fit_liblinear
        " class: %r" % classes_[0])
    ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 1
Updated code:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import *

    numer_classes = 5
    x_train = np.array([[1,2,3,4],[0,1,2,1],[1,2,0,3]])
    y_train = [[0],[1,0,3],[2,0,3,4]]
    x_test = np.array([[1,2,3,4],[0,1,2,1],[1,2,0,3]])
    y_test = [[0],[1,0,3],[2,0,3,4]]

    clf_dict = {}
    for i in range(numer_classes):
        y_tmp = []
        for j in range(len(y_train)):
            if i in y_train[j]:
                y_tmp.append(1)
            else:
                y_tmp.append(0)
        clf = LogisticRegression()
        clf_dict[i] = clf.fit(x_train, y_tmp)

    prediction_matrix = np.zeros((len(x_test),numer_classes))
    for i in range(numer_classes):
        prediction = clf_dict[i].predict(x_test)
        prediction_matrix[:,i] = prediction
    print('predicted')
    print(prediction_matrix)
Thanks
I think you made a mistake in your implementation. With binary relevance, you need a separate classifier for each of the labels. If there are 3 labels, there should be 3 classifiers. Each classifier tells whether an instance belongs to its class or not. For example, the classifier that corresponds to class 1 (clf[1]) tells whether the instance belongs to class 1 or not.
Thus, if you want to implement binary relevance manually, the label should be binarized inside the loop that creates the classifiers:
    for i in range(numer_classes):
        y_tmp = []
        for j in range(len(y_train)):
            if i in y_train[j]:
                y_tmp.append(1)
            else:
                y_tmp.append(0)
        clf = LogisticRegression()
        clf_dict[i] = clf.fit(x_train, y_tmp)
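One caveat with the loop above: if a label occurs in every training sample (as label 0 does in the question's y_train), the binarized target contains a single class and LogisticRegression refuses to fit it. A minimal sketch of one possible workaround, assuming it is acceptable to predict a constant for such labels, is to fall back to sklearn's DummyClassifier. The helper name fit_binary_relevance is hypothetical and not part of the original code:

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression

# Data taken from the question.
x_train = np.array([[1, 2, 3, 4], [0, 1, 2, 1], [1, 2, 0, 3]])
y_train = [[0], [1, 0, 3], [2, 0, 3, 4]]
numer_classes = 5


def fit_binary_relevance(x, y, n_classes):
    """Hypothetical helper: fit one binary classifier per label."""
    clf_dict = {}
    for i in range(n_classes):
        # 1 if sample j carries label i, else 0.
        y_tmp = [1 if i in labels else 0 for labels in y]
        if len(set(y_tmp)) < 2:
            # Only one class present (label always or never occurs):
            # use a constant predictor instead of LogisticRegression.
            clf = DummyClassifier(strategy="constant", constant=y_tmp[0])
        else:
            clf = LogisticRegression()
        clf_dict[i] = clf.fit(x, y_tmp)
    return clf_dict


clf_dict = fit_binary_relevance(x_train, y_train, numer_classes)

# One column per label, one row per sample.
prediction_matrix = np.zeros((len(x_train), numer_classes), dtype=int)
for i in range(numer_classes):
    prediction_matrix[:, i] = clf_dict[i].predict(x_train)
print(prediction_matrix)
```

With this data, the column for label 0 is always 1, because that label was present in every training sample. Whether a constant prediction is the right behaviour depends on the data set, so treat this as an illustration rather than a recommendation.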
However, if you use sklearn, things are more convenient:

    from sklearn.linear_model import LogisticRegression
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.preprocessing import MultiLabelBinarizer

    binarizer = MultiLabelBinarizer()
    y_train_binarized = binarizer.fit_transform(y_train)
    # Reuse the label mapping learned on the training set; do not refit on the test set.
    y_test_binarized = binarizer.transform(y_test)
    cls = OneVsRestClassifier(estimator=LogisticRegression())
    cls.fit(x_train, y_train_binarized)
    y_predict = cls.predict(x_test)
The results look like: [[1 0 1] [0 1 1]], which means the first case is predicted as [0,2] and the second case is predicted as [1,2].
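Reading the label sets off the indicator matrix by hand gets tedious. As a small sketch, MultiLabelBinarizer.inverse_transform can do the decoding. The y_train values below are made up for illustration (any training labels covering classes 0, 1 and 2 would do), and y_predict is the example matrix from above rather than real classifier output:

```python
import numpy as np
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical training labels covering the three classes 0, 1 and 2.
y_train = [[0, 2], [1, 2], [0, 1]]

binarizer = MultiLabelBinarizer()
binarizer.fit(y_train)

# Example indicator matrix: column k corresponds to binarizer.classes_[k].
y_predict = np.array([[1, 0, 1],
                      [0, 1, 1]])

# inverse_transform returns one tuple of labels per row;
# convert to plain Python ints for readable printing.
decoded = [[int(label) for label in labels]
           for labels in binarizer.inverse_transform(y_predict)]
print(decoded)  # [[0, 2], [1, 2]]
```

The column order follows binarizer.classes_, which is why the binarizer fitted on the training labels should be the one used for decoding.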