tag-based-multi-span-extraction
Code: https://github.com/eladsegal/tag-based-multi-span-extraction
Paper: A Simple and Effective Model for Answering Multi-span Questions
- Set up proxy environment variables
scp -r zhaoxiaofeng@219.216.64.175:~/.proxychains ./
Edit ~/.bashrc and append the command aliases:
alias proxy=/data0/zhaoxiaofeng/usr/bin/proxychains4 # on .77, .175 and .206, add only this line
alias aliasproxy=/home/zhaoxiaofeng/usr/bin/proxychains4 # on .154, add only this line
- Clone the code:
git clone https://github.com/eladsegal/tag-based-multi-span-extraction
- Set up the environment
proxy conda create -n allennlp python=3.6.9
proxy pip install -r requirements.txt
proxy conda install pytorch torchvision torchaudio cudatoolkit=10.1 -c pytorch
pip install en_core_web_sm-2.1.0.tar.gz
Load the pretrained models from local files (see "Pretrained models: replacing remote URLs with local files" below).
- Train the model
Use nohup + & to train in the background.
tail -f nohup.out follows the log in real time.
nohup command >> nohup.out 2>&1 &
- 2>&1 redirects standard error (2) into the same file as standard output (1).
RoBERTa TASE_IO + SSE
allennlp train configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet -s training_directory -f --include-package src
Run on the server:
nohup allennlp train configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet -s training_directory_base -f --include-package src >> base_log.out 2>&1 &
Or:
allennlp train download_data/config.json -s training_directory --include-package src
BERT_large TASE_BIO + SSE
-f: clears the serialization directory and trains from scratch
-r: resumes training from the previous state
allennlp train configs/drop/bert/drop_bert_large_TASE_BIO_SSE.jsonnet -s training_directory_bert -f --include-package src
Run on the server:
nohup allennlp train configs/drop/bert/drop_bert_large_TASE_BIO_SSE.jsonnet -s training_directory_bert -f --include-package src >> bertlog.out 2>&1 &
- Predict with the model
--cuda-device accepts only a single GPU.
Details below.
- Evaluate the model
Details below.
Pretrained models: replacing remote URLs with local files
Locating files that contain a given string:
grep -ril "STRING" PATH
Downloading a file:
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt
Batch-download trick:
The proxy alias is not available from inside Python; invoking it fails with
sh: 1: proxy: not found
so the downloads have to be run manually. The script below generates and prints the wget commands; copy the printed lines into a shell to download them in bulk.
# Paste the original archive map from the transformers source here
# (the values must be the remote URLs, not the local replacement paths):
BERT_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin",
    'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin",
    'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-pytorch_model.bin",
    'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-pytorch_model.bin",
}
for url in BERT_PRETRAINED_MODEL_ARCHIVE_MAP.values():
    print('proxy wget ' + url)
Sample output (here from running the script on BERT_PRETRAINED_CONFIG_ARCHIVE_MAP):
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-config.json
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-config.json
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-config.json
Files to modify:
/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_roberta.py
/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_roberta.py
/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_roberta.py
/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_utils.py
/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_bert.py
/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_bert.py
/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_bert.py
- Paths on each server
Server | Path |
---|---|
202.199.6.77 | /data0/maqi |
219.216.64.206 | /data0/maqi |
219.216.64.175 | |
219.216.64.154 | |
Sync the modified files to another server (same path on both ends), e.g. for 202.199.6.77:
for f in tokenization_roberta.py modeling_roberta.py configuration_roberta.py tokenization_utils.py tokenization_bert.py configuration_bert.py modeling_bert.py; do
  scp /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/$f maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/$f
done
- tokenization_roberta.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_roberta.py
Original:
PRETRAINED_VOCAB_FILES_MAP = {
'vocab_file':
{
'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-vocab.json",
'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json",
'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-vocab.json",
'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-vocab.json",
'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-vocab.json",
'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json",
},
'merges_file':
{
'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-merges.txt",
'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt",
'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-merges.txt",
'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-merges.txt",
'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-merges.txt",
'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt",
},
}
Replacement:
PRETRAINED_VOCAB_FILES_MAP = {
'vocab_file':
{
'roberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-vocab.json",
'roberta-large': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-vocab.json",
'roberta-large-mnli': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-mnli-vocab.json",
'distilroberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/distilroberta-base-vocab.json",
'roberta-base-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-vocab.json",
'roberta-large-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-vocab.json",
},
'merges_file':
{
'roberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-merges.txt",
'roberta-large': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-merges.txt",
'roberta-large-mnli': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-mnli-merges.txt",
'distilroberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/distilroberta-base-merges.txt",
'roberta-base-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-merges.txt",
'roberta-large-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-merges.txt",
},
}
- modeling_roberta.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_roberta.py
Original:
ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-pytorch_model.bin",
    'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-pytorch_model.bin",
    'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-pytorch_model.bin",
    'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-pytorch_model.bin",
    'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-openai-detector-pytorch_model.bin",
    'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-openai-detector-pytorch_model.bin",
}
Replacement:
ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'roberta-base': "/data0/maqi/pretrained_model/modeling_roberta/roberta-base-pytorch_model.bin",
    'roberta-large': "/data0/maqi/pretrained_model/modeling_roberta/roberta-large-pytorch_model.bin",
    'roberta-large-mnli': "/data0/maqi/pretrained_model/modeling_roberta/roberta-large-mnli-pytorch_model.bin",
    'distilroberta-base': "/data0/maqi/pretrained_model/modeling_roberta/distilroberta-base-pytorch_model.bin",
    'roberta-base-openai-detector': "/data0/maqi/pretrained_model/modeling_roberta/roberta-base-openai-detector-pytorch_model.bin",
    'roberta-large-openai-detector': "/data0/maqi/pretrained_model/modeling_roberta/roberta-large-openai-detector-pytorch_model.bin",
}
- configuration_roberta.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_roberta.py
Original:
ROBERTA_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-config.json",
    'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json",
    'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-config.json",
    'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-config.json",
    'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-openai-detector-config.json",
    'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-openai-detector-config.json",
}
Replacement:
ROBERTA_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    'roberta-base': "/data0/maqi/pretrained_model/configuration_roberta/roberta-base-config.json",
    'roberta-large': "/data0/maqi/pretrained_model/configuration_roberta/roberta-large-config.json",
    'roberta-large-mnli': "/data0/maqi/pretrained_model/configuration_roberta/roberta-large-mnli-config.json",
    'distilroberta-base': "/data0/maqi/pretrained_model/configuration_roberta/distilroberta-base-config.json",
    'roberta-base-openai-detector': "/data0/maqi/pretrained_model/configuration_roberta/roberta-base-openai-detector-config.json",
    'roberta-large-openai-detector': "/data0/maqi/pretrained_model/configuration_roberta/roberta-large-openai-detector-config.json",
}
- tokenization_bert.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_bert.py
Original:
PRETRAINED_VOCAB_FILES_MAP = {
    'vocab_file': {
        'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt",
        'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-vocab.txt",
        'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-vocab.txt",
        'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-vocab.txt",
        'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-vocab.txt",
        'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-vocab.txt",
        'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-vocab.txt",
        'bert-base-german-cased': "https://int-deepset-models-bert.s3.eu-central-1.amazonaws.com/pytorch/bert-base-german-cased-vocab.txt",
        'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-vocab.txt",
        'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-vocab.txt",
        'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-vocab.txt",
        'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-vocab.txt",
        'bert-base-cased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-vocab.txt",
        'bert-base-german-dbmdz-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-vocab.txt",
        'bert-base-german-dbmdz-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-vocab.txt",
        'bert-base-finnish-cased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-cased-v1/vocab.txt",
        'bert-base-finnish-uncased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-uncased-v1/vocab.txt",
    }
}
Replacement:
Local directory: /data0/maqi/pretrained_model/tokenization_bert
PRETRAINED_VOCAB_FILES_MAP = {
    'vocab_file': {
        'bert-base-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-uncased-vocab.txt",
        'bert-large-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-uncased-vocab.txt",
        'bert-base-cased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-cased-vocab.txt",
        'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-vocab.txt",
        'bert-base-multilingual-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-multilingual-uncased-vocab.txt",
        'bert-base-multilingual-cased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-multilingual-cased-vocab.txt",
        'bert-base-chinese': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-chinese-vocab.txt",
        'bert-base-german-cased': "https://int-deepset-models-bert.s3.eu-central-1.amazonaws.com/pytorch/bert-base-german-cased-vocab.txt",
        'bert-large-uncased-whole-word-masking': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-uncased-whole-word-masking-vocab.txt",
        'bert-large-cased-whole-word-masking': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-cased-whole-word-masking-vocab.txt",
        'bert-large-uncased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-uncased-whole-word-masking-finetuned-squad-vocab.txt",
        'bert-large-cased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-cased-whole-word-masking-finetuned-squad-vocab.txt",
        'bert-base-cased-finetuned-mrpc': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-cased-finetuned-mrpc-vocab.txt",
        'bert-base-german-dbmdz-cased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-german-dbmdz-cased-vocab.txt",
        'bert-base-german-dbmdz-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-german-dbmdz-uncased-vocab.txt",
        'bert-base-finnish-cased-v1': "/data0/maqi/pretrained_model/tokenization_bert/TurkuNLP/bert-base-finnish-cased-v1/vocab.txt",
        'bert-base-finnish-uncased-v1': "/data0/maqi/pretrained_model/tokenization_bert/TurkuNLP/bert-base-finnish-uncased-v1/vocab.txt",
    }
}
- configuration_bert.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_bert.py
Original:
BERT_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json",
    'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-config.json",
    'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json",
    'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-config.json",
    'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-config.json",
    'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-config.json",
    'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-config.json",
    'bert-base-german-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-cased-config.json",
    'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-config.json",
    'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-config.json",
    'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-config.json",
    'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-config.json",
    'bert-base-cased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-config.json",
    'bert-base-german-dbmdz-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-config.json",
    'bert-base-german-dbmdz-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-config.json",
    'bert-base-japanese': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-config.json",
    'bert-base-japanese-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-whole-word-masking-config.json",
    'bert-base-japanese-char': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-config.json",
    'bert-base-japanese-char-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-config.json",
    'bert-base-finnish-cased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-cased-v1/config.json",
    'bert-base-finnish-uncased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-uncased-v1/config.json",
}
Replacement:
Local directory: /data0/maqi/pretrained_model/configuration_bert
BERT_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    'bert-base-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-uncased-config.json",
    'bert-large-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-large-uncased-config.json",
    'bert-base-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-cased-config.json",
    'bert-large-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-large-cased-config.json",
    'bert-base-multilingual-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-multilingual-uncased-config.json",
    'bert-base-multilingual-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-multilingual-cased-config.json",
    'bert-base-chinese': "/data0/maqi/pretrained_model/configuration_bert/bert-base-chinese-config.json",
    'bert-base-german-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-german-cased-config.json",
    'bert-large-uncased-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/bert-large-uncased-whole-word-masking-config.json",
    'bert-large-cased-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/bert-large-cased-whole-word-masking-config.json",
    'bert-large-uncased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/configuration_bert/bert-large-uncased-whole-word-masking-finetuned-squad-config.json",
    'bert-large-cased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/configuration_bert/bert-large-cased-whole-word-masking-finetuned-squad-config.json",
    'bert-base-cased-finetuned-mrpc': "/data0/maqi/pretrained_model/configuration_bert/bert-base-cased-finetuned-mrpc-config.json",
    'bert-base-german-dbmdz-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-german-dbmdz-cased-config.json",
    'bert-base-german-dbmdz-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-german-dbmdz-uncased-config.json",
    'bert-base-japanese': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-config.json",
    'bert-base-japanese-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-whole-word-masking-config.json",
    'bert-base-japanese-char': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-char-config.json",
    'bert-base-japanese-char-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-config.json",
    'bert-base-finnish-cased-v1': "/data0/maqi/pretrained_model/configuration_bert/TurkuNLP/bert-base-finnish-cased-v1/config.json",
    'bert-base-finnish-uncased-v1': "/data0/maqi/pretrained_model/configuration_bert/TurkuNLP/bert-base-finnish-uncased-v1/config.json",
}
- modeling_bert.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_bert.py
Original:
BERT_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin",
    'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin",
    'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-pytorch_model.bin",
    'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-pytorch_model.bin",
    'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-pytorch_model.bin",
    'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-pytorch_model.bin",
    'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-pytorch_model.bin",
    'bert-base-german-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-cased-pytorch_model.bin",
    'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin",
    'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-pytorch_model.bin",
    'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin",
    'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin",
    'bert-base-cased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-pytorch_model.bin",
    'bert-base-german-dbmdz-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-pytorch_model.bin",
    'bert-base-german-dbmdz-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-pytorch_model.bin",
    'bert-base-japanese': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-pytorch_model.bin",
    'bert-base-japanese-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-whole-word-masking-pytorch_model.bin",
    'bert-base-japanese-char': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-pytorch_model.bin",
    'bert-base-japanese-char-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-pytorch_model.bin",
    'bert-base-finnish-cased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-cased-v1/pytorch_model.bin",
    'bert-base-finnish-uncased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-uncased-v1/pytorch_model.bin",
}
Replacement:
Local directory: /data0/maqi/pretrained_model/modeling_bert
BERT_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'bert-base-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-uncased-pytorch_model.bin",
    'bert-large-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-large-uncased-pytorch_model.bin",
    'bert-base-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-cased-pytorch_model.bin",
    'bert-large-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-large-cased-pytorch_model.bin",
    'bert-base-multilingual-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-multilingual-uncased-pytorch_model.bin",
    'bert-base-multilingual-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-multilingual-cased-pytorch_model.bin",
    'bert-base-chinese': "/data0/maqi/pretrained_model/modeling_bert/bert-base-chinese-pytorch_model.bin",
    'bert-base-german-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-german-cased-pytorch_model.bin",
    'bert-large-uncased-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/bert-large-uncased-whole-word-masking-pytorch_model.bin",
    'bert-large-cased-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/bert-large-cased-whole-word-masking-pytorch_model.bin",
    'bert-large-uncased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/modeling_bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin",
    'bert-large-cased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/modeling_bert/bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin",
    'bert-base-cased-finetuned-mrpc': "/data0/maqi/pretrained_model/modeling_bert/bert-base-cased-finetuned-mrpc-pytorch_model.bin",
    'bert-base-german-dbmdz-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-german-dbmdz-cased-pytorch_model.bin",
    'bert-base-german-dbmdz-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-german-dbmdz-uncased-pytorch_model.bin",
    'bert-base-japanese': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-pytorch_model.bin",
    'bert-base-japanese-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-whole-word-masking-pytorch_model.bin",
    'bert-base-japanese-char': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-char-pytorch_model.bin",
    'bert-base-japanese-char-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-pytorch_model.bin",
    'bert-base-finnish-cased-v1': "/data0/maqi/pretrained_model/modeling_bert/TurkuNLP/bert-base-finnish-cased-v1/pytorch_model.bin",
    'bert-base-finnish-uncased-v1': "/data0/maqi/pretrained_model/modeling_bert/TurkuNLP/bert-base-finnish-uncased-v1/pytorch_model.bin",
}
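A quick sanity check after all the edits: loading by shortcut name should now resolve to the local files and succeed without network access. A minimal sketch, assuming the local files listed above actually exist:

# With the archive maps edited, from_pretrained resolves shortcut names
# to the local paths instead of downloading from S3.
from transformers import BertConfig, BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")  # via PRETRAINED_VOCAB_FILES_MAP
config = BertConfig.from_pretrained("bert-base-uncased")        # via BERT_PRETRAINED_CONFIG_ARCHIVE_MAP
model = BertModel.from_pretrained("bert-base-uncased")          # via BERT_PRETRAINED_MODEL_ARCHIVE_MAP

print(tokenizer.tokenize("sanity check"))
print(config.hidden_size, sum(p.numel() for p in model.parameters()))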
Execution walkthrough
An analysis of tag-based-multi-span-extraction/configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet.
- dataset_reader
"is_training": true puts the reader in training mode.
"dataset_reader": { "type": "tbmse_drop",//选择src/data/dataset_readers/drop/drop_reader.py "answer_field_generators": { "arithmetic_answer": { "type": "arithmetic_answer_generator",//选择src/data/dataset_readers/answer_field_generators/arithmetic_answer_generator.py "special_numbers": [ 100, 1 ] }, "count_answer": { "type": "count_answer_generator"//选择src/data/dataset_readers/answer_field_generators/count_answer_generator.py }, "passage_span_answer": { "type": "span_answer_generator",//选择src/data/dataset_readers/answer_field_generators/span_answer_generator.py "text_type": "passage"//参数 }, "question_span_answer": { "type": "span_answer_generator",//选择src/data/dataset_readers/answer_field_generators/span_answer_generator.py "text_type": "question"//参数 }, "tagged_answer": { "type": "tagged_answer_generator",//选择src/data/dataset_readers/answer_field_generators/tagged_answer_generator.py "ignore_question": false, "labels": { "I": 1, "O": 0 } } }, "answer_generator_names_per_type": {//drop_reader.py的参数 "date": [ "arithmetic_answer", "passage_span_answer", "question_span_answer", "tagged_answer" ], "multiple_span": [ "tagged_answer" ], "number": [ "arithmetic_answer", "count_answer", "passage_span_answer", "question_span_answer", "tagged_answer" ], "single_span": [ "tagged_answer", "passage_span_answer", "question_span_answer" ] }, "is_training": true, "lazy": true, "old_reader_behavior": true, "pickle": { "action": "load", "file_name": "all_heads_IO_roberta-large", "path": "../pickle/drop" }, "tokenizer": { "type": "huggingface_transformers",//选择src/data/tokenizers/huggingface_transformers_tokenizer.py "pretrained_model": "roberta-large"//参数 } },
- model
"model": { "type": "multi_head",//选择src/models/multi_head_model.py "dataset_name": "drop", "head_predictor": { "activations": [ "relu", "linear" ], "dropout": [ 0.1, 0 ], "hidden_dims": [ 1024, 5 ], "input_dim": 2048, "num_layers": 2 }, "heads": { "arithmetic": { "type": "arithmetic_head",//选择src/modules/heads/arithmetic_head.py "output_layer": { "activations": [ "relu", "linear" ], "dropout": [ 0.1, 0 ], "hidden_dims": [ 1024, 3 ], "input_dim": 2048, "num_layers": 2 }, "special_embedding_dim": 1024, "special_numbers": [ 100, 1 ], "training_style": "soft_em" }, "count": { "type": "count_head",//选择src/modules/heads/count_head.py "max_count": 10, "output_layer": { "activations": [ "relu", "linear" ], "dropout": [ 0.1, 0 ], "hidden_dims": [ 1024, 11 ], "input_dim": 1024, "num_layers": 2 } }, "multi_span": { "type": "multi_span_head",//选择src/modules/heads/multi_span_head.py "decoding_style": "at_least_one", "ignore_question": false, "labels": { "I": 1, "O": 0 }, "output_layer": { "activations": [ "relu", "linear" ], "dropout": [ 0.1, 0 ], "hidden_dims": [ 1024, 2 ], "input_dim": 1024, "num_layers": 2 }, "prediction_method": "viterbi", "training_style": "soft_em" }, "passage_span": {//继承了src/modules/heads/single_span_head.py "type": "passage_span_head",//选择src/modules/heads/passage_span_head.py "end_output_layer": { "activations": "linear", "hidden_dims": 1, "input_dim": 1024, "num_layers": 1 }, "start_output_layer": { "activations": "linear", "hidden_dims": 1, "input_dim": 1024, "num_layers": 1 }, "training_style": "soft_em" }, "question_span": {//继承了src/modules/heads/single_span_head.py "type": "question_span_head",//选择src/modules/heads/question_span_head.py "end_output_layer": { "activations": [ "relu", "linear" ], "dropout": [ 0.1, 0 ], "hidden_dims": [ 1024, 1 ], "input_dim": 2048, "num_layers": 2 }, "start_output_layer": { "activations": [ "relu", "linear" ], "dropout": [ 0.1, 0 ], "hidden_dims": [ 1024, 1 ], "input_dim": 2048, "num_layers": 2 }, "training_style": "soft_em" } }, "passage_summary_vector_module": { "activations": "linear", "hidden_dims": 1, "input_dim": 1024, "num_layers": 1 }, "pretrained_model": "roberta-large", "question_summary_vector_module": { "activations": "linear", "hidden_dims": 1, "input_dim": 1024, "num_layers": 1 } },
- Datasets
"train_data_path": "drop_data/drop_dataset_train.json",
"validation_data_path": "drop_data/drop_dataset_dev.json",
- trainer
"cuda_device": -1 would run on the CPU.
"trainer": { "cuda_device": 0, "keep_serialized_model_every_num_seconds": 3600, "num_epochs": 35, "num_steps_to_accumulate": 6, "optimizer": { "type": "adamw", "lr": 5e-06 }, "patience": 10, "summary_interval": 100, "validation_metric": "+f1" },
- validation_dataset_reader
"is_training": false puts the reader in evaluation mode.
"validation_dataset_reader": { "type": "tbmse_drop",//选择src/data/dataset_readers/drop/drop_reader.py "answer_field_generators": { "arithmetic_answer": { "type": "arithmetic_answer_generator", "special_numbers": [ 100, 1 ] }, "count_answer": { "type": "count_answer_generator" }, "passage_span_answer": { "type": "span_answer_generator", "text_type": "passage" }, "question_span_answer": { "type": "span_answer_generator", "text_type": "question" }, "tagged_answer": { "type": "tagged_answer_generator", "ignore_question": false, "labels": { "I": 1, "O": 0 } } }, "answer_generator_names_per_type": { "date": [ "arithmetic_answer", "passage_span_answer", "question_span_answer", "tagged_answer" ], "multiple_span": [ "tagged_answer" ], "number": [ "arithmetic_answer", "count_answer", "passage_span_answer", "question_span_answer", "tagged_answer" ], "single_span": [ "tagged_answer", "passage_span_answer", "question_span_answer" ] }, "is_training": false,//设置为评估 "lazy": true, "old_reader_behavior": true, "pickle": { "action": "load", "file_name": "all_heads_IO_roberta-large", "path": "../pickle/drop" }, "tokenizer": { "type": "huggingface_transformers", "pretrained_model": "roberta-large" } }
Prediction
Training packages the trained model as model.tar.gz.
allennlp predict training_directory/model.tar.gz drop_data/drop_dataset_dev.json --predictor machine-comprehension --cuda-device 0 --output-file predictions.jsonl --use-dataset-reader --include-package src
Predictions are written to predictions.jsonl in the repo root.
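--output-file predictions.jsonl is JSON Lines: one JSON object per prediction. A minimal way to inspect it (the available keys depend on what the model's forward() returns, so the sketch only prints them rather than assuming any):

import json

with open("predictions.jsonl") as f:
    for line in f:
        prediction = json.loads(line)
        print(sorted(prediction.keys()))  # inspect the fields first
        break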
Evaluation
RoBERTa
allennlp evaluate training_directory/model.tar.gz drop_data/drop_dataset_dev.json --cuda-device 3 --output-file eval.json --include-package src
BERT
allennlp evaluate training_directory_bert/model.tar.gz drop_data/drop_dataset_dev.json --cuda-device 1 --output-file eval_bert.json --include-package src
Evaluation results are written to eval.json (eval_bert.json for the BERT run) in the repo root.
Evaluation results on DROP
TASE_IO+SSE
em_all_spans | f1_all_spans | em_multi_span | f1_multi_span | em_span | f1_span |
---|---|---|---|---|---|
80.6 | 87.8 | 60.8 | 82.6 | 84.2 | 89.0 |
TASE_IO+SSE(BLOCK)
em_all_spans | f1_all_spans | em_multi_span | f1_multi_span | em_span | f1_span |
---|---|---|---|---|---|
55.3 | 62.8 | 0 | 0 | 56.5 | 64.2 |
TASE_IO+SSE(BERT_large)
em_all_spans | f1_all_spans | em_multi_span | f1_multi_span | em_span | f1_span |
---|---|---|---|---|---|
76.4 | 83.9 | 54.5 | 80.1 | 80.7 | 85.2 |
TASE_IO+SSE (IO tagging only on sentences that contain an answer)
em_all_spans | f1_all_spans | em_multi_span | f1_multi_span | em_span | f1_span |
---|---|---|---|---|---|
57.8 | 64.5 | 16.7 | 23.3 | 58.1 | 64.2 |
Results reported in the paper: