I also think the first answer is incorrect, for the reasons @noob333 explained.
But BERT cannot be used directly as a language model either. BERT gives you p(word | context (both left and right)), whereas what you want is to compute p(word | previous tokens (left context only)). The BERT author explains why it cannot be used as a language model here: https://github.com/google-research/bert/issues/35
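The difference matters for perplexity, which is defined from the left-to-right factorization of the sentence probability (written here in the same notation as above; this is just the standard chain-rule identity, not something specific to the linked issue):

p(w_1, ..., w_n) = p(w_1) * p(w_2 | w_1) * ... * p(w_n | w_1, ..., w_{n-1})
perplexity = p(w_1, ..., w_n)^(-1/n)

A left-to-right model like GPT gives you exactly these conditionals, while multiplying BERT's bidirectional p(w_i | left and right context) terms does not yield a valid joint probability.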
However, you can adapt BERT and use it as a language model, as explained here: https://arxiv.org/pdf/1902.04094.pdf
Alternatively, you can use the OpenAI GPT or GPT-2 pre-trained models from the same repo (https://github.com/huggingface/pytorch-pretrained-BERT).
Here is how you can compute the perplexity of a sentence with the GPT model (https://github.com/huggingface/pytorch-pretrained-BERT/issues/473):
import math
import torch
from pytorch_pretrained_bert import OpenAIGPTTokenizer, OpenAIGPTLMHeadModel

# Load pre-trained model (weights) and set it to evaluation mode
model = OpenAIGPTLMHeadModel.from_pretrained('openai-gpt')
model.eval()

# Load pre-trained model tokenizer (vocabulary)
tokenizer = OpenAIGPTTokenizer.from_pretrained('openai-gpt')

def score(sentence):
    # Tokenize the sentence and map tokens to vocabulary ids
    tokenize_input = tokenizer.tokenize(sentence)
    tensor_input = torch.tensor([tokenizer.convert_tokens_to_ids(tokenize_input)])
    # Passing lm_labels makes the model return the average cross-entropy loss;
    # exponentiating that loss gives the perplexity of the sentence
    with torch.no_grad():
        loss = model(tensor_input, lm_labels=tensor_input)
    return math.exp(loss.item())

a = ['there is a book on the desk',
     'there is a plane on the desk',
     'there is a book in the desk']
print([score(i) for i in a])
# [21.31652459381952, 61.45907380241148, 26.24923942649312]
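If you would rather score with GPT-2, the same repo also provides GPT2Tokenizer and GPT2LMHeadModel. Below is a minimal sketch, under the assumption that GPT2LMHeadModel follows the same lm_labels convention as OpenAIGPTLMHeadModel and that 'gpt2' names the smallest released checkpoint (it reuses math, torch, and the list a from above):

from pytorch_pretrained_bert import GPT2Tokenizer, GPT2LMHeadModel

# Assumption: passing lm_labels returns the average cross-entropy loss,
# as with the GPT model above
gpt2_model = GPT2LMHeadModel.from_pretrained('gpt2')
gpt2_model.eval()
gpt2_tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

def gpt2_score(sentence):
    # Same recipe: tokenize, map to ids, exponentiate the LM loss
    tokenize_input = gpt2_tokenizer.tokenize(sentence)
    tensor_input = torch.tensor([gpt2_tokenizer.convert_tokens_to_ids(tokenize_input)])
    with torch.no_grad():
        loss = gpt2_model(tensor_input, lm_labels=tensor_input)
    return math.exp(loss.item())

print([gpt2_score(i) for i in a])

Either way, lower perplexity means the model finds the sentence more plausible, which is what the GPT output above shows: 'there is a plane on the desk' gets the worst score of the three.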