Answers:
Use sparse categorical crossentropy when your classes are mutually exclusive (i.e. each sample belongs to exactly one class), and categorical crossentropy when one sample can belong to multiple classes, or when the labels are soft probabilities (like [0.5, 0.3, 0.2]).
The formula for categorical crossentropy ($S$ - samples, $C$ - classes, $s \in c$ - sample belongs to class $c$) is:

$$-\frac{1}{N} \sum_{s \in S} \sum_{c \in C} 1_{s \in c} \log p(s \in c)$$

For the case when classes are mutually exclusive, you do not need to sum over them: for each sample, the only non-zero term is $-\log p(s \in c)$ for the true class $c$.
This saves time and memory. Consider the case of 10,000 mutually exclusive classes: you compute just 1 log instead of summing 10,000 terms for each sample, and store just one integer per label instead of 10,000 floats.
The formula is the same in both cases, so there should be no impact on accuracy.
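A minimal sketch of this equivalence, assuming TensorFlow/Keras (the probability values below are made up for illustration):

    import numpy as np
    import tensorflow as tf

    # Model outputs (already softmax probabilities) for 2 samples, 3 classes
    probs = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.8, 0.1]], dtype=np.float32)

    onehot = np.array([[1, 0, 0],
                       [0, 1, 0]], dtype=np.float32)  # one-hot targets
    ints = np.array([0, 1])                           # same targets as integers

    cce = tf.keras.losses.categorical_crossentropy(onehot, probs)
    scce = tf.keras.losses.sparse_categorical_crossentropy(ints, probs)

    print(cce.numpy())   # [0.35667497 0.22314353] == [-log 0.7, -log 0.8]
    print(scce.numpy())  # identical values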
Comment: My network has three outputs, o1, o2, and o3, with 167, 11, and 7 classes respectively. I've read your answer that it makes no difference, but is there any difference if I use sparse_categorical_crossentropy or not? Can I go with categorical_crossentropy for the last two and sparse_categorical_crossentropy for the first one, since the first output has 167 classes?
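For what it's worth, Keras does allow mixing the two losses across the outputs of a multi-output model, as long as each output's targets match its loss. A hedged sketch (only the class counts 167/11/7 come from the comment; the input shape, hidden size, and output names are assumptions):

    import tensorflow as tf
    from tensorflow.keras import layers, Model

    inp = layers.Input(shape=(32,))
    h = layers.Dense(64, activation="relu")(inp)
    o1 = layers.Dense(167, activation="softmax", name="o1")(h)
    o2 = layers.Dense(11, activation="softmax", name="o2")(h)
    o3 = layers.Dense(7, activation="softmax", name="o3")(h)

    model = Model(inp, [o1, o2, o3])
    model.compile(
        optimizer="adam",
        loss={
            "o1": "sparse_categorical_crossentropy",  # targets: (N,) integer labels
            "o2": "categorical_crossentropy",         # targets: (N, 11) one-hot
            "o3": "categorical_crossentropy",         # targets: (N, 7) one-hot
        },
    )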
The Answer, In a Nutshell
If your targets are one-hot encoded, use categorical_crossentropy. Examples of one-hot encodings:
[1,0,0]
[0,1,0]
[0,0,1]
But if your targets are integers, use sparse_categorical_crossentropy. Examples of integer encodings (for the sake of completeness; a conversion sketch follows the list):
1
2
3
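If your targets are integers but you want to use categorical_crossentropy anyway, you can convert them with tf.keras.utils.to_categorical. A small sketch (the 3-class count is an assumption matching the example above; note that Keras expects 0-based integer labels):

    import numpy as np
    import tensorflow as tf

    labels = np.array([0, 1, 2])  # integer targets
    onehot = tf.keras.utils.to_categorical(labels, num_classes=3)
    print(onehot)
    # [[1. 0. 0.]
    #  [0. 1. 0.]
    #  [0. 0. 1.]]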
Question: when should I use categorical_crossentropy versus sparse_categorical_crossentropy? And what does the from_logits argument mean?
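On the from_logits question: in Keras it tells the loss whether the model outputs raw, unnormalized scores (logits, from_logits=True) or softmax probabilities (from_logits=False, the default). With from_logits=True the loss applies the softmax internally, which is more numerically stable. A minimal sketch with made-up logits:

    import tensorflow as tf

    logits = tf.constant([[2.0, 1.0, 0.1]])  # raw scores, no softmax applied
    labels = tf.constant([0])

    # from_logits=True: the loss applies softmax internally
    loss_from_logits = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    print(loss_from_logits(labels, logits).numpy())  # ~0.4170

    # Equivalent: apply softmax yourself and use the default from_logits=False
    probs = tf.nn.softmax(logits)
    loss_from_probs = tf.keras.losses.SparseCategoricalCrossentropy()
    print(loss_from_probs(labels, probs).numpy())    # same value, up to float error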