Trying out the various pipelines provided by the transformers library
First, install transformers:
pip install transformers
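A pipeline also needs a deep-learning backend installed; judging by the TFGPT2LMHeadModel messages later in this walkthrough, this environment runs TensorFlow, though PyTorch works equally well:
pip install tensorflow  # or: pip install torch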
import warnings
warnings.filterwarnings('ignore')
from transformers import pipeline
1–Sentiment Analysis (Sequence Classification)
classifier = pipeline('sentiment-analysis')
result = classifier("I hate you")[0]
print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
classifier("I love cyx")[0]
label: NEGATIVE, with score: 0.9991
{'label': 'POSITIVE', 'score': 0.9996902942657471}
classifier('这部电影真的很垃圾,浪费我的时间!!!')  # "This movie is really trash, a waste of my time!!!"
[{'label': 'NEGATIVE', 'score': 0.6208301782608032}]
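The low 0.62 confidence on the Chinese review is expected: with no model argument, pipeline('sentiment-analysis') downloads a default English checkpoint. To make runs reproducible, or to swap in a multilingual model, the model can be pinned explicitly; the checkpoint named below is, to my knowledge, the usual English default, but treat it as an assumption:
# Pin an explicit checkpoint instead of relying on the pipeline default.
classifier = pipeline('sentiment-analysis',
                      model='distilbert-base-uncased-finetuned-sst-2-english')
print(classifier("I love cyx")[0])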
2–Fill-Mask (Masked Language Modeling)
unmask = pipeline('fill-mask')
Feed in a sentence with a blank and check whether the filled-in words match reality:
from pprint import pprint
results1 = unmask(f'{unmask.tokenizer.mask_token} is the most beautiful woman in Harry Potter!')
results2 = unmask(f'{unmask.tokenizer.mask_token} is the best player in the NBA!')
pprint(results1)
pprint(results2)
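Each entry returned by the fill-mask pipeline is a dict with 'score', 'token', 'token_str', and 'sequence' keys; a minimal sketch for inspecting just the top candidates (the slice size here is arbitrary):
# Print the three highest-scoring fillers and the completed sentences.
for r in results1[:3]:
    print(f"{r['token_str']!r} (score: {round(r['score'], 4)}) -> {r['sequence']}")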
3–Text Generation
from transformers import pipeline, set_seed
generator = pipeline('text-generation', model='gpt2')
set_seed(42)
All model checkpoint layers were used when initializing TFGPT2LMHeadModel.
All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.
generator("Xiao ming loves Xiao Hong secretly,",max_length=100, num_return_sequences=5)
Setting pad_token_id to 50256 (first eos_token_id) to generate sequence
[{'generated_text': 'Xiao ming loves Xiao Hong secretly, so why shouldn\'t she talk to Xiao Ming? That said, they have no idea what Xiao Hong is thinking now as she is not in any way a master of such a secret or anything!\n\n"Don\'t be stubborn. That is not your intention……" A little shocked, then, Xiao Ming began to stir, "I told you this before, that you must not try to convince me any more. Why can\'t you just go with the'},
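The "Setting pad_token_id" message appears because GPT-2 defines no padding token, so the pipeline falls back to eos_token_id. A hedged sketch of passing it explicitly and steering the sampling (do_sample, top_k, and temperature are standard generate() arguments forwarded by the pipeline; the values here are illustrative):
# Silence the pad_token_id warning and tune the sampling behaviour.
outputs = generator(
    "Xiao ming loves Xiao Hong secretly,",
    max_length=100,
    num_return_sequences=5,
    do_sample=True,           # sample instead of greedy decoding
    top_k=50,                 # restrict sampling to the 50 most likely tokens
    temperature=0.9,          # below 1.0 makes the output more conservative
    pad_token_id=generator.tokenizer.eos_token_id,
)
for o in outputs:
    print(o['generated_text'][:80], '...')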
4–Extractive Question Answering
The task of extracting an answer from a passage of text, given a question.
question_answerer = pipeline("question-answering")
# Extractive Question Answering is the task of extracting an answer from a text given a question.
# An example of a question answering dataset is the SQuAD dataset, which is entirely based on that task.
# If you would like to fine-tune a model on the SQuAD task, you may leverage the
# examples/pytorch/question-answering/run_squad.py script.
context = r"""Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
a model on a SQuAD task, you may leverage the examples/pytorch/question-answering/run_squad.py script."""
result = question_answerer(question="What is extractive question answering?", context=context)
print(f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}")
Answer: 'the task of extracting an answer from a text given a question', score: 0.6177, start: 33, end: 94
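Since 'start' and 'end' are character offsets into the context, slicing should reproduce the answer; the follow-up question below is my own, not from the original, reusing the same context:
# 'start'/'end' index into the original context string.
print(context[result['start']:result['end']])  # should equal result['answer']
result2 = question_answerer(question="Which script can be used for fine-tuning on SQuAD?", context=context)
print(f"Answer: '{result2['answer']}', score: {round(result2['score'], 4)}")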
5–Translation
The WMT English-to-German dataset takes English sentences as input and the corresponding German sentences as targets.
translator = pipeline("translation_en_to_de")
print(translator("Hugging Face is a technology company based in New York and Paris", max_length=40))
[{'translation_text': 'Hugging Face ist ein Technologieunternehmen mit Sitz in New York und Paris.'}]
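Other language pairs go through the generic "translation" task with an explicit model; the Helsinki-NLP checkpoint below is an assumption on my part (any MarianMT pair model from the Hub should work the same way):
# English -> Chinese with an explicitly chosen MarianMT checkpoint (assumed model name).
translator_zh = pipeline("translation", model="Helsinki-NLP/opus-mt-en-zh")
print(translator_zh("Hugging Face is a technology company based in New York and Paris", max_length=40))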
For more information, see the official transformers documentation: https://huggingface.co/transformers/task_summary.html