
sentence-order prediction #42

Open
qiunlp opened this issue Feb 21, 2020 · 4 comments

Comments


qiunlp commented Feb 21, 2020

BERT's NextSentencePrediction (NSP) task is too easy. ALBERT therefore proposes a new task, sentence-order prediction (SOP), which keeps only the coherence objective and removes the influence of topic prediction.
Question: which part of your code implements this task?

@lonePatient
Owner

@yzgdjqwh It is described in my blog post, where you can also find the corresponding code: https://lonepatient.top/2019/10/20/ALBERT.html

@qiunlp
Author

qiunlp commented Feb 21, 2020

I saw the following in your blog post. Which module of your code does it belong to? I'm a beginner, so please bear with me.

NSP: next-sentence prediction. true = two adjacent sentences; false = two randomly chosen sentences.

SOP: sentence-order (coherence) prediction. true = two adjacent sentences in their original order; false = the same two adjacent sentences with their order swapped.

if random.random() < 0.5:  # swap tokens_a and tokens_b
    is_random_next = True
    tokens_a, tokens_b = tokens_b, tokens_a
else:
    is_random_next = False
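The quoted logic differs from NSP only in how the negative example is built: SOP swaps the two adjacent segments, whereas NSP substitutes a segment from another document. A minimal sketch of both constructions (`make_nsp_pair` and `make_sop_pair` are hypothetical helper names for illustration, not functions from this repo):

```python
import random

def make_nsp_pair(tokens_a, tokens_b, random_segment, rng):
    # NSP negative: second segment comes from a random document,
    # so the model can often cheat via topic mismatch alone.
    if rng.random() < 0.5:
        return tokens_a, random_segment, True   # is_random_next
    return tokens_a, tokens_b, False

def make_sop_pair(tokens_a, tokens_b, rng):
    # SOP negative: the same two adjacent segments with order swapped,
    # so only coherence (not topic) distinguishes the two labels.
    if rng.random() < 0.5:
        return tokens_b, tokens_a, True         # out of order
    return tokens_a, tokens_b, False
```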

@lonePatient
Owner

@yzgdjqwh It's in prepare_lm_data_ngram.py.

@qiunlp
Author

qiunlp commented Feb 22, 2020

Thank you for promptly answering my beginner questions.
I want to call your pretrained model directly for a downstream task of judging English sentence order. Which link should I download?
I used the code below with the model you provide online, but its predictions disagree with human judgment. Please advise.

import numpy as np
import torch
from pytorch_pretrained_bert import BertTokenizer
from model.modeling_albert import AlbertConfig, AlbertForNextSentencePrediction

# Load the vocabulary and build the standard BERT-style model input
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
input_ids = convert_examples_to_feature(text_a, text_b, tokenizer)
tokens_tensor = torch.tensor([input_ids])

# Load the model
config = AlbertConfig.from_pretrained("./prev_trained_model/albert_base/config.json")
model = AlbertForNextSentencePrediction.from_pretrained("./prev_trained_model/albert_base/pytorch_model.bin", config=config)
model.eval()

# Predict
with torch.no_grad():
    out = model(tokens_tensor)
seq_relationship_scores = out[0]
pred = np.argmax(seq_relationship_scores.numpy(), axis=1)
print(pred)
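For reference, the next-sentence head outputs a [batch, 2] logit tensor; in the original BERT convention, index 0 means "text_b follows text_a" and index 1 means it does not. A minimal sketch of interpreting such scores, using a made-up logits tensor (the values are illustrative, not actual model output):

```python
import torch
import torch.nn.functional as F

# Hypothetical logits from the next-sentence / sentence-order head:
# shape [batch, 2]; index 0 = coherent order, index 1 = not.
seq_relationship_scores = torch.tensor([[2.0, -1.0]])

# Softmax turns the logits into probabilities over the two labels.
probs = F.softmax(seq_relationship_scores, dim=1)
pred = torch.argmax(probs, dim=1)  # 0 -> coherent, 1 -> not
```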
