sklearn MultinomialNB bad input shape in Python

I am new to sklearn and am running into a problem when fitting x_train and y_train. Here is the code that produces the error:
def naive_bayes(x_train, y_train):
    clf = MultinomialNB()
    clf.fit(x_train, y_train)
    joblib.dump(clf, '%s/NB/naive_bayes.pkl' % model_direc)

if __name__ == "__main__":
    train_df = pd.read_json('%s/train.json' % data_direc, orient='index')
    y_train = train_df[['*', '**', '***']].astype(np.float64)
    x_train = pd.read_json('%s/features.json' % feature_direc, orient='columns')
    x_train = x_train.sort_index()
    print x_train.shape
    print y_train.shape
    naive_bayes(x_train, y_train)
And here is the output:
(80, 1500)
(80, 3)
Traceback (most recent call last):
  File "src/NBtrain.py", line 50, in <module>
    naive_bayes(x_train, y_train)
  File "src/NBtrain.py", line 37, in naive_bayes
    clf.fit(x_train, y_train)
  File "/Library/Python/2.7/site-packages/sklearn/naive_bayes.py", line 474, in fit
    X, y = check_X_y(X, y, 'csr')
  File "/Library/Python/2.7/site-packages/sklearn/utils/validation.py", line 444, in check_X_y
    y = column_or_1d(y, warn=True)
  File "/Library/Python/2.7/site-packages/sklearn/utils/validation.py", line 480, in column_or_1d
    raise ValueError("bad input shape {0}".format(shape))
ValueError: bad input shape (80, 3)
features.json is generated using tf-idf.
The shape of x_train is (80, 1500). The shape of y_train is (80, 3).
I don't understand why the shape of y_train is considered bad.
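
For context, the traceback points at column_or_1d, which suggests that MultinomialNB.fit wants y as a 1-D array of class labels with shape (n_samples,), not a 2-D array with one column per class. Below is a minimal sketch that reproduces the failure on synthetic data and shows one possible workaround. The random data is a stand-in for the real files, and the workaround assumes the three columns are one-hot class indicators (exactly one column set per row). If the columns are scores or ratings instead, argmax simply picks the largest column, which may not be what is intended.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.RandomState(0)

# Synthetic stand-ins for the real data (assumption, not the original files):
# non-negative features similar to tf-idf, and a one-hot target matrix.
x_train = rng.rand(80, 1500)
labels = rng.randint(0, 3, size=80)
y_train = np.eye(3)[labels]  # shape (80, 3), one column per class

clf = MultinomialNB()

# Fitting with a 2-D target raises ValueError, as in the traceback above.
try:
    clf.fit(x_train, y_train)
    fit_failed = False
except ValueError as exc:
    fit_failed = True
    print("fit failed: %s" % exc)

# Collapse the one-hot columns into a single 1-D vector of class indices.
y_1d = y_train.argmax(axis=1)  # shape (80,)

clf.fit(x_train, y_1d)
pred = clf.predict(x_train)
print(pred.shape)
```

With a pandas DataFrame target, the same idea would be y_train.values.argmax(axis=1). If several columns can be active for the same row (a multilabel problem), collapsing with argmax loses information, and wrapping the classifier in something like sklearn.multiclass.OneVsRestClassifier may be a better fit.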