Validation loss and accuracy remain constant


12

I am trying to implement this paper on a set of medical images. I am doing it in Keras. The network essentially consists of 4 conv and max-pool layers, followed by a fully connected layer and a softmax classifier.

To the best of my knowledge, I have followed the architecture mentioned in the paper. However, the validation loss and accuracy remain flat throughout. The accuracy seems to be fixed at ~57.5%.

Any help on where I might be going wrong would be greatly appreciated.

My code:

from keras.models import Sequential
from keras.layers import Activation, Dropout, Dense, Flatten  
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
from PIL import Image
import numpy as np
from sklearn.utils import shuffle
from sklearn.cross_validation import train_test_split
import theano
import os
import glob as glob
import cv2
from matplotlib import pyplot as plt

nb_classes = 2
img_rows, img_cols = 100,100
img_channels = 3


#################### DATA DIRECTORY SETTING######################

data = '/home/raghuram/Desktop/data'
os.chdir(data)
file_list = os.listdir(data)
##################################################################

## Test lines
#I = cv2.imread(file_list[1000])
#print np.shape(I)
####
non_responder_file_list = glob.glob('0_*FLAIR_*.png')
responder_file_list = glob.glob('1_*FLAIR_*.png')
print len(non_responder_file_list),len(responder_file_list)

labels = np.ones((len(file_list)),dtype = int)
labels[0:len(non_responder_file_list)] = 0
immatrix = np.array([np.array(cv2.imread(data+'/'+image)).flatten() for image in file_list])
#img = immatrix[1000].reshape(100,100,3)
#plt.imshow(img,cmap = 'gray')


data,Label = shuffle(immatrix,labels, random_state=2)
train_data = [data,Label]
X,y = (train_data[0],train_data[1])
# Also need to look at how to preserve spatial extent in the conv network
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=4)
X_train = X_train.reshape(X_train.shape[0], 3, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 3, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

X_train /= 255
X_test /= 255

Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

model = Sequential()

## First conv layer and its activation followed by the max-pool layer#
model.add(Convolution2D(16,5,5, border_mode = 'valid', subsample = (1,1), init = 'glorot_normal',input_shape = (3,100,100))) # Glorot normal is similar to Xavier initialization
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (2,2),strides = None))
# Output is 48x48

print 'First layer setup'
###########################Second conv layer#################################
model.add(Convolution2D(32,3,3,border_mode = 'same', subsample = (1,1),init = 'glorot_normal'))
model.add(Activation('relu'))
model.add(Dropout(0.6))
model.add(MaxPooling2D(pool_size = (2,2),strides = None))
#############################################################################

print ' Second layer setup'
# Output is 24x24

##########################Third conv layer###################################
model.add(Convolution2D(64,3,3, border_mode = 'same', subsample = (1,1), init = 'glorot_normal'))
model.add(Activation('relu'))
model.add(Dropout(0.6))
model.add(MaxPooling2D(pool_size = (2,2),strides = None))
#############################################################################
# Output is 12x12

print ' Third layer setup'
###############################Fourth conv layer#############################
model.add(Convolution2D(128,3,3, border_mode = 'same', subsample = (1,1), init = 'glorot_normal'))
model.add(Activation('relu'))
model.add(Dropout(0.6))
model.add(MaxPooling2D(pool_size = (2,2),strides = None))
############################################################################# 

print 'Fourth layer setup'

# Output is 6x6x128
# Create the FC layer of size 128x6x6#
model.add(Flatten()) 
model.add(Dense(2,init = 'glorot_normal',input_dim = 128*6*6))
model.add(Dropout(0.6))
model.add(Activation('softmax'))

print 'Setting up fully connected layer'
print 'Now compiling the network'
sgd = SGD(lr=0.01, decay=1e-4, momentum=0.6, nesterov=True)
model.compile(loss = 'mse',optimizer = 'sgd', metrics=['accuracy'])

# Fit the network to the data#
print 'Network setup successfully. Now fitting the network to the data'
model.fit(X_train,Y_train,batch_size = 100, nb_epoch = 20, validation_split = None,verbose = 1)
print 'Testing'
loss,accuracy = model.evaluate(X_test,Y_test,batch_size = 32,verbose = 1)
print "Test fraction correct (Accuracy) = {:.2f}".format(accuracy)

Is the training loss decreasing?
Jan van der Vegt '16

No, the training loss also stays constant throughout.
Raghuram

You haven't set any validation data or a validation_split in the fit call, so what data is it validating on? Or do you mean the test set?
Jan van der Vegt '16

That was after some experimentation. I had validation_split = 0.2 before setting it to None, and tried that as well.
Raghuram

2
Can you fit for a lot of batches to see whether the training loss can be brought down?
Jan van der Vegt '16
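
To make the checks suggested in these comments concrete, here is a minimal sketch (assuming the question's Keras 1.x-style API and its model, X_train, and Y_train) of re-enabling validation_split and of an overfitting sanity check on a small subset:

# Re-enable the held-out validation split so val_loss/val_acc are reported per epoch.
model.fit(X_train, Y_train, batch_size=100, nb_epoch=20,
          validation_split=0.2, verbose=1)

# Sanity check: a network that learns at all should be able to drive the
# training loss down on a tiny subset; if even this stays flat, the problem
# lies in the loss/optimizer setup rather than the amount of data.
model.fit(X_train[:100], Y_train[:100], batch_size=100, nb_epoch=200, verbose=1)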

Answers:


4

It looks like you are using MSE as the loss function, while from a glance at the paper it seems they use NLL (cross-entropy). MSE is known to be sensitive to data imbalance, among other problems, and that could be the cause of the issue you are experiencing. I would try training with a categorical_crossentropy loss in your case. Moreover, a learning rate of 0.01 seems too large; I would play around with it and try 0.001 or even 0.0001.
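
For concreteness, a minimal sketch of that change against the question's compile step (Keras 1.x API; the 0.001 learning rate is this answer's suggestion, not a value from the paper). Note that the question's code builds an SGD object but then passes the string 'sgd' to compile, which silently falls back to the default optimizer settings:

from keras.optimizers import SGD

sgd = SGD(lr=0.001, decay=1e-4, momentum=0.6, nesterov=True)
model.compile(loss='categorical_crossentropy',  # cross-entropy instead of MSE
              optimizer=sgd,                    # pass the object, not the string 'sgd'
              metrics=['accuracy'])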


2

Although I'm a little late to the party, I'd like to put in my two cents, since it helped me solve a similar problem recently. What came to my rescue, apart from a categorical cross-entropy loss, was scaling the features to the (0, 1) range. Nevertheless, it's worth mentioning that feature scaling only helps if the features belong to different metrics and have much more variation (orders of magnitude) relative to each other, as was the case for me. Also, scaling can be very useful if one uses hinge loss, since maximum-margin classifiers are generally sensitive to the distances between feature values. Hope this helps some future visitors!
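
As a minimal sketch of that scaling step (X here is a small hypothetical feature matrix, not the image data from the question), per-feature min-max rescaling to the (0, 1) range looks like this; sklearn.preprocessing.MinMaxScaler does the same thing:

import numpy as np

# Hypothetical feature matrix: three features on very different scales.
X = np.array([[1.0, 200.0, 0.003],
              [2.0, 800.0, 0.009],
              [3.0, 500.0, 0.001]])

# Min-max scaling: map each column to [0, 1].
X_min = X.min(axis=0)
X_max = X.max(axis=0)
X_scaled = (X - X_min) / (X_max - X_min)
print X_scaled  # every column now lies in [0, 1]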
