[Code Implementation] tag-based-multi-span-extraction


Code: https://github.com/eladsegal/tag-based-multi-span-extraction

Paper: A Simple and Effective Model for Answering Multi-span Questions

  • Set up the proxy (copy the proxychains config over):
 scp -r zhaoxiaofeng@219.216.64.175:~/.proxychains ./

Edit ~/.bashrc and append an alias for the proxy command:

alias proxy=/data0/zhaoxiaofeng/usr/bin/proxychains4 # add only this line on 77, 175, 206
alias proxy=/home/zhaoxiaofeng/usr/bin/proxychains4  # add only this line on 154
  • Download the code:
git clone https://github.com/eladsegal/tag-based-multi-span-extraction
  • Set up the environment
proxy conda create -n allennlp python=3.6.9
proxy pip install -r requirements.txt
proxy conda install pytorch torchvision torchaudio cudatoolkit=10.1 -c pytorch
pip install en_core_web_sm-2.1.0.tar.gz
Load the pretrained models from local files (see "Pretrained models: replacing with local files" below). A quick spaCy check follows.
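To confirm the environment, the spaCy model installed above can be loaded directly; a minimal check (the sample sentence is just an illustration):

import spacy

# Load the locally installed en_core_web_sm model and run it on a sample sentence.
nlp = spacy.load("en_core_web_sm")
doc = nlp("Washington scored 28 points in the first half.")
print([(ent.text, ent.label_) for ent in doc.ents])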
  • Train the model

    Use nohup together with & to train in the background.

    tail -f nohup.out streams the training log in real time.

    nohup command >> nohup.out 2>&1 &

    • 2>&1 redirects standard error (fd 2) into the same file as standard output (fd 1), so both streams land in one log. (A Python equivalent is sketched below.)
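The same background launch can be approximated from Python's standard library; a minimal sketch (not part of the repo; the command is the training invocation from the next step):

import subprocess

# A Python equivalent of `nohup command >> nohup.out 2>&1 &`: append both
# stdout and stderr to one log file and detach the child from the session.
with open("nohup.out", "ab") as log:
    proc = subprocess.Popen(
        ["allennlp", "train",
         "configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet",
         "-s", "training_directory", "-f", "--include-package", "src"],
        stdout=log,
        stderr=subprocess.STDOUT,    # the analogue of 2>&1
        start_new_session=True,      # the analogue of nohup: ignore SIGHUP
    )
print("training started, pid =", proc.pid)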

RoBERTa TASE_IO + SSE

allennlp train configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet -s training_directory -f --include-package src

Run on the server:

nohup allennlp train configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet -s training_directory_base -f --include-package src >> base_log.out 2>&1 &

Or:

allennlp train download_data/config.json -s training_directory --include-package src

BERT_large TASE_BIO + SSE

-f: wipes the training directory and trains from scratch

-r: resumes from a previous training state

allennlp train configs/drop/bert/drop_bert_large_TASE_BIO_SSE.jsonnet -s training_directory_bert -f --include-package src

Run on the server:

nohup allennlp train configs/drop/bert/drop_bert_large_TASE_BIO_SSE.jsonnet -s training_directory_bert -f --include-package src >> bertlog.out 2>&1 &
  • Prediction

    --cuda-device accepts a single GPU only

    Covered in detail below.

  • Evaluation

    Covered in detail below.

Pretrained models: replacing with local files

Tip for locating the files to patch:

grep -ril "<string>" <path>   # lists files under <path> whose contents match, case-insensitively
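The same search can be scripted when grep is inconvenient; a rough Python equivalent of grep -ril (the root directory and search string here are only illustrations):

import pathlib

# List every file under `root` whose contents mention `needle`,
# case-insensitively; unreadable content is skipped via errors="ignore".
def files_containing(root, needle):
    needle = needle.lower()
    for path in pathlib.Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        if needle in text.lower():
            yield path

for hit in files_containing(
        "/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers",
        "models.huggingface.co"):
    print(hit)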

Download the files:

proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt

Tip for batch-downloading files:

Because proxy is a shell alias, it cannot be invoked from within Python (subprocess calls fail with sh: 1: proxy: not found), so the downloads must be launched by hand. The script below generates the wget commands: print them, copy the printed output into a shell, and run it to download everything in one batch.

BERT_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin",
    'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin",
    'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-pytorch_model.bin",
    'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-pytorch_model.bin"
}
# Print one download command per URL; paste the output into a shell to run it.
for url in BERT_PRETRAINED_MODEL_ARCHIVE_MAP.values():
    print('proxy wget ' + url)

Output:

proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-pytorch_model.bin
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-pytorch_model.bin

Files to patch:

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_roberta.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_roberta.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_roberta.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_utils.py 

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_bert.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_bert.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_bert.py
  • Paths on each server
Server            Path
202.199.6.77      /data0/maqi
219.216.64.206    /data0/maqi
219.216.64.175    (path not recorded)
219.216.64.154    (path not recorded)
scp  /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_roberta.py  maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_roberta.py

scp  /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_roberta.py  maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_roberta.py 

scp  /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_roberta.py  maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_roberta.py 

scp  /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_utils.py  maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_utils.py

scp  /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_bert.py  maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_bert.py

scp  /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_bert.py  maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_bert.py


scp  /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_bert.py  maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_bert.py
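Because the same seven files go to every server, the scp commands above can be generated instead of typed by hand; a small sketch (host and paths as in this note):

# Generate the scp commands for all patched transformers files.
SITE_PACKAGES = "/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers"
FILES = [
    "tokenization_roberta.py", "modeling_roberta.py", "configuration_roberta.py",
    "tokenization_utils.py", "tokenization_bert.py", "configuration_bert.py",
    "modeling_bert.py",
]
for name in FILES:
    src = "{}/{}".format(SITE_PACKAGES, name)
    print("scp {} maqi@202.199.6.77:{}".format(src, src))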

  • tokenization_roberta.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_roberta.py

Original:

PRETRAINED_VOCAB_FILES_MAP = {
    'vocab_file':
    {
        'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-vocab.json",
        'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json",
        'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-vocab.json",
        'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-vocab.json",
        'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-vocab.json",
        'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json",
    },
    'merges_file':
    {
        'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-merges.txt",
        'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt",
        'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-merges.txt",
        'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-merges.txt",
        'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-merges.txt",
        'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt",
    },
}

Replacement:

PRETRAINED_VOCAB_FILES_MAP = {
    'vocab_file':
    {
        'roberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-vocab.json",
        'roberta-large': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-vocab.json",
        'roberta-large-mnli': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-mnli-vocab.json",
        'distilroberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/distilroberta-base-vocab.json",
        'roberta-base-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-vocab.json",
        'roberta-large-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-vocab.json",
    },
    'merges_file':
    {
        'roberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-merges.txt",
        'roberta-large': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-merges.txt",
        'roberta-large-mnli': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-mnli-merges.txt",
        'distilroberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/distilroberta-base-merges.txt",
        'roberta-base-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-merges.txt",
        'roberta-large-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-merges.txt",
    },
}
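Before training, it is worth verifying that every local file substituted into the map exists; a minimal sketch, shown only for the roberta-large entries:

import os

# Verify the substituted local files exist (subset of the map above).
PRETRAINED_VOCAB_FILES_MAP = {
    'vocab_file': {
        'roberta-large': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-vocab.json",
    },
    'merges_file': {
        'roberta-large': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-merges.txt",
    },
}
for kind, by_model in PRETRAINED_VOCAB_FILES_MAP.items():
    for model, path in by_model.items():
        status = "ok" if os.path.isfile(path) else "MISSING"
        print(kind, model, status, path)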
  • modeling_roberta.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_roberta.py

Original:

ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-pytorch_model.bin",
    'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-pytorch_model.bin",
    'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-pytorch_model.bin",
    'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-pytorch_model.bin",
    'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-openai-detector-pytorch_model.bin",
    'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-openai-detector-pytorch_model.bin",
}

Replacement:

ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'roberta-base': "/data0/maqi/pretrained_model/modeling_roberta/roberta-base-pytorch_model.bin",
    'roberta-large': "/data0/maqi/pretrained_model/modeling_roberta/roberta-large-pytorch_model.bin",
    'roberta-large-mnli': "/data0/maqi/pretrained_model/modeling_roberta/roberta-large-mnli-pytorch_model.bin",
    'distilroberta-base': "/data0/maqi/pretrained_model/modeling_roberta/distilroberta-base-pytorch_model.bin",
    'roberta-base-openai-detector': "/data0/maqi/pretrained_model/modeling_roberta/roberta-base-openai-detector-pytorch_model.bin",
    'roberta-large-openai-detector': "/data0/maqi/pretrained_model/modeling_roberta/roberta-large-openai-detector-pytorch_model.bin",
}
  • configuration_roberta.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_roberta.py

Original:

ROBERTA_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-config.json",
    'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json",
    'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-config.json",
    'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-config.json",
    'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-openai-detector-config.json",
    'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-openai-detector-config.json",
}

Replacement:

ROBERTA_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    'roberta-base': "/data0/maqi/pretrained_model/configuration_roberta/roberta-base-config.json",
    'roberta-large': "/data0/maqi/pretrained_model/configuration_roberta/roberta-large-config.json",
    'roberta-large-mnli': "/data0/maqi/pretrained_model/configuration_roberta/roberta-large-mnli-config.json",
    'distilroberta-base': "/data0/maqi/pretrained_model/configuration_roberta/distilroberta-base-config.json",
    'roberta-base-openai-detector': "/data0/maqi/pretrained_model/configuration_roberta/roberta-base-openai-detector-config.json",
    'roberta-large-openai-detector': "/data0/maqi/pretrained_model/configuration_roberta/roberta-large-openai-detector-config.json",
}
  • tokenization_bert.py

vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_bert.py

Original:

PRETRAINED_VOCAB_FILES_MAP = {
    'vocab_file': {
        'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt",
        'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-vocab.txt",
        'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-vocab.txt",
        'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-vocab.txt",
        'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-vocab.txt",
        'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-vocab.txt",
        'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-vocab.txt",
        'bert-base-german-cased': "https://int-deepset-models-bert.s3.eu-central-1.amazonaws.com/pytorch/bert-base-german-cased-vocab.txt",
        'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-vocab.txt",
        'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-vocab.txt",
        'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-vocab.txt",
        'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-vocab.txt",
        'bert-base-cased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-vocab.txt",
        'bert-base-german-dbmdz-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-vocab.txt",
        'bert-base-german-dbmdz-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-vocab.txt",
        'bert-base-finnish-cased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-cased-v1/vocab.txt",
        'bert-base-finnish-uncased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-uncased-v1/vocab.txt",
    }
}

Replacement:

Local file path: /data0/maqi/pretrained_model/tokenization_bert

PRETRAINED_VOCAB_FILES_MAP = {
    'vocab_file': {
        'bert-base-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-uncased-vocab.txt",
        'bert-large-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-uncased-vocab.txt",
        'bert-base-cased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-cased-vocab.txt",
        'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-vocab.txt",
        'bert-base-multilingual-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-multilingual-uncased-vocab.txt",
        'bert-base-multilingual-cased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-multilingual-cased-vocab.txt",
        'bert-base-chinese': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-chinese-vocab.txt",
        'bert-base-german-cased': "https://int-deepset-models-bert.s3.eu-central-1.amazonaws.com/pytorch/bert-base-german-cased-vocab.txt",
        'bert-large-uncased-whole-word-masking': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-uncased-whole-word-masking-vocab.txt",
        'bert-large-cased-whole-word-masking': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-cased-whole-word-masking-vocab.txt",
        'bert-large-uncased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-uncased-whole-word-masking-finetuned-squad-vocab.txt",
        'bert-large-cased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-cased-whole-word-masking-finetuned-squad-vocab.txt",
        'bert-base-cased-finetuned-mrpc': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-cased-finetuned-mrpc-vocab.txt",
        'bert-base-german-dbmdz-cased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-german-dbmdz-cased-vocab.txt",
        'bert-base-german-dbmdz-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-german-dbmdz-uncased-vocab.txt",
        'bert-base-finnish-cased-v1': "/data0/maqi/pretrained_model/tokenization_bert/TurkuNLP/bert-base-finnish-cased-v1/vocab.txt",
        'bert-base-finnish-uncased-v1': "/data0/maqi/pretrained_model/tokenization_bert/TurkuNLP/bert-base-finnish-uncased-v1/vocab.txt",
    }
}
  • configuration_bert.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_bert.py

Original:

BERT_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json",
    'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-config.json",
    'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json",
    'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-config.json",
    'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-config.json",
    'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-config.json",
    'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-config.json",
    'bert-base-german-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-cased-config.json",
    'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-config.json",
    'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-config.json",
    'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-config.json",
    'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-config.json",
    'bert-base-cased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-config.json",
    'bert-base-german-dbmdz-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-config.json",
    'bert-base-german-dbmdz-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-config.json",
    'bert-base-japanese': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-config.json",
    'bert-base-japanese-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-whole-word-masking-config.json",
    'bert-base-japanese-char': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-config.json",
    'bert-base-japanese-char-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-config.json",
    'bert-base-finnish-cased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-cased-v1/config.json",
    'bert-base-finnish-uncased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-uncased-v1/config.json",
}

Replacement:

Local files: /data0/maqi/pretrained_model/configuration_bert

BERT_PRETRAINED_CONFIG_ARCHIVE_MAP = {
    'bert-base-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-uncased-config.json",
    'bert-large-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-large-uncased-config.json",
    'bert-base-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-cased-config.json",
    'bert-large-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-large-cased-config.json",
    'bert-base-multilingual-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-multilingual-uncased-config.json",
    'bert-base-multilingual-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-multilingual-cased-config.json",
    'bert-base-chinese': "/data0/maqi/pretrained_model/configuration_bert/bert-base-chinese-config.json",
    'bert-base-german-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-german-cased-config.json",
    'bert-large-uncased-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/bert-large-uncased-whole-word-masking-config.json",
    'bert-large-cased-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/bert-large-cased-whole-word-masking-config.json",
    'bert-large-uncased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/configuration_bert/bert-large-uncased-whole-word-masking-finetuned-squad-config.json",
    'bert-large-cased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/configuration_bert/bert-large-cased-whole-word-masking-finetuned-squad-config.json",
    'bert-base-cased-finetuned-mrpc': "/data0/maqi/pretrained_model/configuration_bert/bert-base-cased-finetuned-mrpc-config.json",
    'bert-base-german-dbmdz-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-german-dbmdz-cased-config.json",
    'bert-base-german-dbmdz-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-german-dbmdz-uncased-config.json",
    'bert-base-japanese': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-config.json",
    'bert-base-japanese-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-whole-word-masking-config.json",
    'bert-base-japanese-char': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-char-config.json",
    'bert-base-japanese-char-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-config.json",
    'bert-base-finnish-cased-v1': "/data0/maqi/pretrained_model/configuration_bert/TurkuNLP/bert-base-finnish-cased-v1/config.json",
    'bert-base-finnish-uncased-v1': "/data0/maqi/pretrained_model/configuration_bert/TurkuNLP/bert-base-finnish-uncased-v1/config.json",
}
  • modeling_bert.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_bert.py

Original:

BERT_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin",
    'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin",
    'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-pytorch_model.bin",
    'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-pytorch_model.bin",
    'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-pytorch_model.bin",
    'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-pytorch_model.bin",
    'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-pytorch_model.bin",
    'bert-base-german-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-cased-pytorch_model.bin",
    'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin",
    'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-pytorch_model.bin",
    'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin",
    'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin",
    'bert-base-cased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-pytorch_model.bin",
    'bert-base-german-dbmdz-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-pytorch_model.bin",
    'bert-base-german-dbmdz-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-pytorch_model.bin",
    'bert-base-japanese': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-pytorch_model.bin",
    'bert-base-japanese-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-whole-word-masking-pytorch_model.bin",
    'bert-base-japanese-char': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-pytorch_model.bin",
    'bert-base-japanese-char-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-pytorch_model.bin",
    'bert-base-finnish-cased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-cased-v1/pytorch_model.bin",
    'bert-base-finnish-uncased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-uncased-v1/pytorch_model.bin",
}

Replacement:

Local path: /data0/maqi/pretrained_model/modeling_bert

BERT_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'bert-base-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-uncased-pytorch_model.bin",
    'bert-large-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-large-uncased-pytorch_model.bin",
    'bert-base-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-cased-pytorch_model.bin",
    'bert-large-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-large-cased-pytorch_model.bin",
    'bert-base-multilingual-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-multilingual-uncased-pytorch_model.bin",
    'bert-base-multilingual-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-multilingual-cased-pytorch_model.bin",
    'bert-base-chinese': "/data0/maqi/pretrained_model/modeling_bert/bert-base-chinese-pytorch_model.bin",
    'bert-base-german-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-german-cased-pytorch_model.bin",
    'bert-large-uncased-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/bert-large-uncased-whole-word-masking-pytorch_model.bin",
    'bert-large-cased-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/bert-large-cased-whole-word-masking-pytorch_model.bin",
    'bert-large-uncased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/modeling_bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin",
    'bert-large-cased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/modeling_bert/bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin",
    'bert-base-cased-finetuned-mrpc': "/data0/maqi/pretrained_model/modeling_bert/bert-base-cased-finetuned-mrpc-pytorch_model.bin",
    'bert-base-german-dbmdz-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-german-dbmdz-cased-pytorch_model.bin",
    'bert-base-german-dbmdz-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-german-dbmdz-uncased-pytorch_model.bin",
    'bert-base-japanese': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-pytorch_model.bin",
    'bert-base-japanese-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-whole-word-masking-pytorch_model.bin",
    'bert-base-japanese-char': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-char-pytorch_model.bin",
    'bert-base-japanese-char-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-pytorch_model.bin",
    'bert-base-finnish-cased-v1': "/data0/maqi/pretrained_model/modeling_bert/TurkuNLP/bert-base-finnish-cased-v1/pytorch_model.bin",
    'bert-base-finnish-uncased-v1': "/data0/maqi/pretrained_model/modeling_bert/TurkuNLP/bert-base-finnish-uncased-v1/pytorch_model.bin",
}
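Once all the maps are patched, loading a model by name should touch only local files. A quick smoke test, assuming the patched transformers installation is importable:

from transformers import BertConfig, BertModel, BertTokenizer

# With the patched archive maps, these resolve to /data0/maqi/pretrained_model/...
tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
config = BertConfig.from_pretrained("bert-large-uncased")
model = BertModel.from_pretrained("bert-large-uncased")
print(config.hidden_size, sum(p.numel() for p in model.parameters()))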

Execution flow

Analysis of tag-based-multi-span-extraction/configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet:

  • dataset_reader

    "is_training": true puts the reader in training mode. (A toy sketch of the generator dispatch follows the config below.)

    "dataset_reader": {        "type": "tbmse_drop",//选择src/data/dataset_readers/drop/drop_reader.py        "answer_field_generators": {            "arithmetic_answer": {                "type": "arithmetic_answer_generator",//选择src/data/dataset_readers/answer_field_generators/arithmetic_answer_generator.py                "special_numbers": [                    100,                    1                ]            },            "count_answer": {                "type": "count_answer_generator"//选择src/data/dataset_readers/answer_field_generators/count_answer_generator.py            },            "passage_span_answer": {                "type": "span_answer_generator",//选择src/data/dataset_readers/answer_field_generators/span_answer_generator.py                "text_type": "passage"//参数            },            "question_span_answer": {                "type": "span_answer_generator",//选择src/data/dataset_readers/answer_field_generators/span_answer_generator.py                "text_type": "question"//参数            },            "tagged_answer": {                "type": "tagged_answer_generator",//选择src/data/dataset_readers/answer_field_generators/tagged_answer_generator.py                "ignore_question": false,                "labels": {                    "I": 1,                    "O": 0                }            }        },        "answer_generator_names_per_type": {//drop_reader.py的参数            "date": [                "arithmetic_answer",                "passage_span_answer",                "question_span_answer",                "tagged_answer"            ],            "multiple_span": [                "tagged_answer"            ],            "number": [                "arithmetic_answer",                "count_answer",                "passage_span_answer",                "question_span_answer",                "tagged_answer"            ],            "single_span": [                "tagged_answer",                "passage_span_answer",                "question_span_answer"            ]        },        "is_training": true,        "lazy": true,        "old_reader_behavior": true,        "pickle": {            "action": "load",            "file_name": "all_heads_IO_roberta-large",            "path": "../pickle/drop"        },        "tokenizer": {            "type": "huggingface_transformers",//选择src/data/tokenizers/huggingface_transformers_tokenizer.py            "pretrained_model": "roberta-large"//参数        }    },
  • model

"model": {
    "type": "multi_head",  // src/models/multi_head_model.py
    "dataset_name": "drop",
    "head_predictor": {
        "activations": ["relu", "linear"],
        "dropout": [0.1, 0],
        "hidden_dims": [1024, 5],
        "input_dim": 2048,
        "num_layers": 2
    },
    "heads": {
        "arithmetic": {
            "type": "arithmetic_head",  // src/modules/heads/arithmetic_head.py
            "output_layer": {
                "activations": ["relu", "linear"],
                "dropout": [0.1, 0],
                "hidden_dims": [1024, 3],
                "input_dim": 2048,
                "num_layers": 2
            },
            "special_embedding_dim": 1024,
            "special_numbers": [100, 1],
            "training_style": "soft_em"
        },
        "count": {
            "type": "count_head",  // src/modules/heads/count_head.py
            "max_count": 10,
            "output_layer": {
                "activations": ["relu", "linear"],
                "dropout": [0.1, 0],
                "hidden_dims": [1024, 11],
                "input_dim": 1024,
                "num_layers": 2
            }
        },
        "multi_span": {
            "type": "multi_span_head",  // src/modules/heads/multi_span_head.py
            "decoding_style": "at_least_one",
            "ignore_question": false,
            "labels": {"I": 1, "O": 0},
            "output_layer": {
                "activations": ["relu", "linear"],
                "dropout": [0.1, 0],
                "hidden_dims": [1024, 2],
                "input_dim": 1024,
                "num_layers": 2
            },
            "prediction_method": "viterbi",
            "training_style": "soft_em"
        },
        "passage_span": {  // inherits src/modules/heads/single_span_head.py
            "type": "passage_span_head",  // src/modules/heads/passage_span_head.py
            "end_output_layer": {
                "activations": "linear",
                "hidden_dims": 1,
                "input_dim": 1024,
                "num_layers": 1
            },
            "start_output_layer": {
                "activations": "linear",
                "hidden_dims": 1,
                "input_dim": 1024,
                "num_layers": 1
            },
            "training_style": "soft_em"
        },
        "question_span": {  // inherits src/modules/heads/single_span_head.py
            "type": "question_span_head",  // src/modules/heads/question_span_head.py
            "end_output_layer": {
                "activations": ["relu", "linear"],
                "dropout": [0.1, 0],
                "hidden_dims": [1024, 1],
                "input_dim": 2048,
                "num_layers": 2
            },
            "start_output_layer": {
                "activations": ["relu", "linear"],
                "dropout": [0.1, 0],
                "hidden_dims": [1024, 1],
                "input_dim": 2048,
                "num_layers": 2
            },
            "training_style": "soft_em"
        }
    },
    "passage_summary_vector_module": {
        "activations": "linear",
        "hidden_dims": 1,
        "input_dim": 1024,
        "num_layers": 1
    },
    "pretrained_model": "roberta-large",
    "question_summary_vector_module": {
        "activations": "linear",
        "hidden_dims": 1,
        "input_dim": 1024,
        "num_layers": 1
    }
},
  • Datasets

"train_data_path": "drop_data/drop_dataset_train.json",
"validation_data_path": "drop_data/drop_dataset_dev.json",
  • trainer

    "cuda_device": -1 means run on the CPU (the config below uses GPU 0). "num_steps_to_accumulate": 6 is explained after the block.

    "trainer": {        "cuda_device": 0,        "keep_serialized_model_every_num_seconds": 3600,        "num_epochs": 35,        "num_steps_to_accumulate": 6,        "optimizer": {            "type": "adamw",            "lr": 5e-06        },        "patience": 10,        "summary_interval": 100,        "validation_metric": "+f1"    },
  • validation_dataset_reader

    "is_training": false puts the reader in evaluation mode.

    "validation_dataset_reader": {        "type": "tbmse_drop",//选择src/data/dataset_readers/drop/drop_reader.py        "answer_field_generators": {            "arithmetic_answer": {                "type": "arithmetic_answer_generator",                "special_numbers": [                    100,                    1                ]            },            "count_answer": {                "type": "count_answer_generator"            },            "passage_span_answer": {                "type": "span_answer_generator",                "text_type": "passage"            },            "question_span_answer": {                "type": "span_answer_generator",                "text_type": "question"            },            "tagged_answer": {                "type": "tagged_answer_generator",                "ignore_question": false,                "labels": {                    "I": 1,                    "O": 0                }            }        },        "answer_generator_names_per_type": {            "date": [                "arithmetic_answer",                "passage_span_answer",                "question_span_answer",                "tagged_answer"            ],            "multiple_span": [                "tagged_answer"            ],            "number": [                "arithmetic_answer",                "count_answer",                "passage_span_answer",                "question_span_answer",                "tagged_answer"            ],            "single_span": [                "tagged_answer",                "passage_span_answer",                "question_span_answer"            ]        },        "is_training": false,//设置为评估        "lazy": true,        "old_reader_behavior": true,        "pickle": {            "action": "load",            "file_name": "all_heads_IO_roberta-large",            "path": "../pickle/drop"        },        "tokenizer": {            "type": "huggingface_transformers",            "pretrained_model": "roberta-large"        }    }

Prediction

Training packages the trained model as model.tar.gz in the training directory:

allennlp predict training_directory/model.tar.gz drop_data/drop_dataset_dev.json --predictor machine-comprehension --cuda-device 0 --output-file predictions.jsonl --use-dataset-reader --include-package src

Predictions are saved to predictions.jsonl in the repo root; a minimal reader sketch follows.
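predictions.jsonl contains one JSON object per line; the field names depend on the predictor, so a reasonable first step is to inspect the keys of the first record:

import json

# Read the first prediction record and list its fields.
with open("predictions.jsonl") as f:
    first = json.loads(next(f))
print(sorted(first.keys()))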

Evaluation

RoBERTa

allennlp evaluate training_directory/model.tar.gz drop_data/drop_dataset_dev.json --cuda-device 3 --output-file eval.json --include-package src

BERT_large

allennlp evaluate training_directory_bert/model.tar.gz drop_data/drop_dataset_dev.json --cuda-device 1 --output-file eval_bert.json --include-package src

Evaluation results are saved to eval.json (or eval_bert.json) in the repo root.
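eval.json is a flat JSON dictionary of metric names to values; a quick way to print it (the metric names are whatever allennlp evaluate emitted):

import json

# Print every metric in the evaluation output, sorted by name.
with open("eval.json") as f:
    metrics = json.load(f)
for name, value in sorted(metrics.items()):
    print("{}: {}".format(name, value))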

Evaluation results on DROP

TASE_IO+SSE

em_all_spans  f1_all_spans  em_multi_span  f1_multi_span  em_span  f1_span
80.6          87.8          60.8           82.6           84.2     89.0

TASE_IO+SSE (BLOCK)

em_all_spans  f1_all_spans  em_multi_span  f1_multi_span  em_span  f1_span
55.3          62.8          0              0              56.5     64.2

TASE_IO+SSE (BERT_large)

em_all_spans  f1_all_spans  em_multi_span  f1_multi_span  em_span  f1_span
76.4          83.9          54.5           80.1           80.7     85.2

TASE_IO+SSE (IO tagging only on sentences containing an answer)

em_all_spans  f1_all_spans  em_multi_span  f1_multi_span  em_span  f1_span
57.8          64.5          16.7           23.3           58.1     64.2

Results reported in the paper: (screenshot of the paper's results table, not reproduced here)
