
Python Transformers Library (NLP Processing Library) Explained with Example Code

Here is a comprehensive explanation of the transformers library, covering basic knowledge, advanced usage, example code, and learning paths. The content is organized by stage and suitable for learners at different levels.

1. Basic knowledge

1. Introduction to Transformers Library

  • Purpose: provides pre-trained models (such as BERT, GPT, RoBERTa) and tools for NLP tasks (text classification, translation, generation, etc.).
  • Core components
    • Tokenizer: text tokenization and encoding
    • Model: neural network model architecture
    • Pipeline: a high-level wrapper for fast inference
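
To see how these three components relate, here is a minimal sketch (the checkpoint name is just an example) that first runs the Tokenizer and Model by hand, then wraps the same pair in a Pipeline:

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
import torch

checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)                   # Tokenizer: text -> token ids
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)  # Model: token ids -> logits

inputs = tokenizer("I love this library!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(dim=-1).item()])  # e.g. POSITIVE

# Pipeline: the same tokenizer + model + post-processing behind one call
clf = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
print(clf("I love this library!"))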

2. Installation and Environment Configuration

pip install transformers torch datasets

3. Quick Start Example

from transformers import pipeline
# Use the sentiment-analysis pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("I love programming with Transformers!")
print(result)  # [{'label': 'POSITIVE', 'score': 0.9998}]

2. Core Modules in Detail

1. Tokenizer (tokenization)

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text = "Hello, world!"
encoded = tokenizer(text, 
                    padding=True, 
                    truncation=True, 
                    return_tensors="pt")  # Return PyTorch tensors
print(encoded)
# {'input_ids': tensor([[101, 7592, 1010, 2088, 999, 102]]), 
#  'attention_mask': tensor([[1, 1, 1, 1, 1, 1]])}

2. Model (model loading)

from transformers import AutoModel
model = AutoModel.from_pretrained("bert-base-uncased")
outputs = model(**encoded)  # Forward pass
last_hidden_states = outputs.last_hidden_state

3. Advanced usage

1. Custom model training (PyTorch example)

from transformers import BertTokenizerFast, BertForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset
# Load the dataset
dataset = load_dataset("imdb")
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
tokenized_datasets = dataset.map(
    lambda x: tokenizer(x["text"], padding=True, truncation=True),
    batched=True
)
# Define the model
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
# Training parameter configuration
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    evaluation_strategy="epoch"
)
# Trainer configuration
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"]
)
# Start training
trainer.train()

2. Model saving and loading

model.save_pretrained("./my_model")
tokenizer.save_pretrained("./my_model")
# Load the custom model
new_model = AutoModel.from_pretrained("./my_model")

4. Going Deeper

1. Attention mechanism visualization

from transformers import BertModel, BertTokenizer
import torch
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
outputs = model(**inputs)
# Extract attention weights from layer 0 (first sample in the batch)
attention = outputs.attentions[0][0]
print(attention.shape)  # [num_heads, seq_len, seq_len]

2. Mixed precision training

from transformers import TrainingArguments
training_args = TrainingArguments(
    fp16=True,  # Enable mixed precision
    ...
)

5. Complete case: Named entity recognition (NER)

from transformers import pipeline
# Load the NER pipeline
ner_pipeline = pipeline("ner", model="dslim/bert-base-NER")
text = "Apple was founded by Steve Jobs in Cupertino."
results = ner_pipeline(text)
# Visualize the results
for entity in results:
    print(f"{entity['word']} -> {entity['entity']} (confidence: {entity['score']:.2f})")

6. Learning path suggestions

Beginner stage

  • Official documentation: /docs/transformers
  • Learn to use pipeline and the basic models

Intermediate stage

  • Master the custom training process
  • Understand model architectures (Transformer and BERT principles)

Advanced stage

  • Model distillation and quantization
  • Custom model architecture development
  • Large-model fine-tuning techniques

7. Resource recommendations

Must-read papers

  • "Attention Is All You Need" (the original Transformer paper)
  • "BERT: Pre-training of Deep Bidirectional Transformers"

Practical Projects

  • Text summarization
  • Multilingual translation system
  • Chatbot development

Community Resources

  • Hugging Face Model Hub
  • Kaggle NLP Competition Cases

8. Advanced training skills

1. Learning rate scheduling and gradient clipping

Dynamically adjust the learning rate during training and clip gradients to prevent gradient explosion:

from transformers import TrainingArguments
training_args = TrainingArguments(
    output_dir="./results",
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_steps=500,               # Number of learning-rate warmup steps
    gradient_accumulation_steps=2,  # Gradient accumulation (saves GPU memory)
    max_grad_norm=1.0,              # Gradient clipping threshold
    ...
)
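
For reference, here is what the Trainer does under the hood, as a minimal manual training loop. This is a sketch: model, dataloader, and num_epochs are assumed to be defined as in the training example above.

import torch
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
num_training_steps = num_epochs * len(dataloader)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=500, num_training_steps=num_training_steps
)

for epoch in range(num_epochs):
    for batch in dataloader:
        loss = model(**batch).loss
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping
        optimizer.step()
        scheduler.step()      # update the learning rate every step
        optimizer.zero_grad()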

2. Custom loss function (PyTorch example)

import torch
from transformers import BertForSequenceClassification
class CustomModel(BertForSequenceClassification):
    def __init__(self, config):
        super().__init__(config)
    def forward(self, input_ids, attention_mask, labels=None):
        outputs = super().forward(input_ids, attention_mask)
        logits = outputs.logits
        if labels is not None:
            # Weighted cross-entropy (per-class weights)
            loss_fct = torch.nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0]))
            loss = loss_fct(logits.view(-1, 2), labels.view(-1))
            return {"loss": loss, "logits": logits}
        return outputs

9. Complex Tasks in Practice

1. Text generation (GPT-2 example)

from transformers import GPT2LMHeadModel, GPT2Tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
prompt = "In a world where AI dominates,"
input_ids = tokenizer.encode(prompt, return_tensors="pt")
# Generate text (configure generation parameters)
output = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,         # Sampling is required for temperature/top_k and multiple sequences
    temperature=0.7,        # Control randomness (lower values are more deterministic)
    top_k=50,               # Limit the number of candidate tokens
    num_return_sequences=3  # Generate 3 different sequences
)
for seq in output:
    print(tokenizer.decode(seq, skip_special_tokens=True))

2. Question and Answer System (BERT-based)

from transformers import pipeline
qa_pipeline = pipeline("question-answering", model="deepset/roberta-base-squad2")
context = """
Hugging Face is a company based in New York City. 
Its Transformers library is widely used in NLP.
"""
question = "Where is Hugging Face located?"
result = qa_pipeline(question=question, context=context)
print(f"Answer: {result['answer']} (score: {result['score']:.2f})")
# Answer: New York City (score: 0.92)

10. Model optimization and deployment

1. Model quantization (reduce inference latency)

from transformers import BertModel, AutoTokenizer
import torch
model = BertModel.from_pretrained("bert-base-uncased")
quantized_model = torch.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear},   # Quantize all linear layers
    dtype=torch.qint8
)
# After quantization, inference is roughly 2-4x faster and the model is about 75% smaller
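
As a quick sanity check, you can compare the CPU latency of the two models. This is a rough sketch: model and quantized_model are the objects created above, and the numbers vary with hardware and sequence length.

import time
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["This is a latency test."] * 8, return_tensors="pt", padding=True)

def avg_latency(m, n_runs=20):
    m.eval()
    with torch.no_grad():  # torch is imported in the block above
        start = time.perf_counter()
        for _ in range(n_runs):
            m(**batch)
    return (time.perf_counter() - start) / n_runs

print(f"fp32 model: {avg_latency(model) * 1000:.1f} ms/batch")
print(f"int8 model: {avg_latency(quantized_model) * 1000:.1f} ms/batch")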

2. ONNX format export (production deployment)

from transformers import BertTokenizer, BertForSequenceClassification
from torch.onnx import export
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Example input
dummy_input = tokenizer("This is a test", return_tensors="pt")
# Export to ONNX ("model.onnx" is an example file name)
export(
    model,
    (dummy_input["input_ids"], dummy_input["attention_mask"]),
    "model.onnx",
    opset_version=13,
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch"}, "attention_mask": {0: "batch"}}
)
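
To verify the exported file, you can load it with ONNX Runtime. This is a minimal sketch: onnxruntime must be installed, and the file name matches the example above.

import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
onnx_inputs = {
    "input_ids": dummy_input["input_ids"].numpy(),
    "attention_mask": dummy_input["attention_mask"].numpy(),
}
logits = session.run(["logits"], onnx_inputs)[0]
print(logits.shape)  # (batch_size, num_labels)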

11. Debugging and Performance Analysis

1. Check GPU memory usage

import torch
# Insert GPU memory monitoring inside the training loop
print(f"Allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
print(f"Cached: {torch.cuda.memory_reserved() / 1e9:.2f} GB")

2. Use PyTorch Profiler

from torch.profiler import profile, record_function, ProfilerActivity
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA], record_shapes=True) as prof:
    outputs = model(**inputs)
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))

12. Multilingual and cross-modal

1. Multilingual Translation (mBART)

from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
# Chinese to English
tokenizer.src_lang = "zh_CN"
text = "欢迎使用 Transformers 库"  # Chinese source sentence
encoded = tokenizer(text, return_tensors="pt")
generated_tokens = model.generate(**encoded, forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"])
print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True))
# ['Welcome to the Transformers library']

2. Image-text multimodal (CLIP)

from PIL import Image
from transformers import CLIPProcessor, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
image = Image.open("cat.jpg")  # example image path
text = ["a photo of a cat", "a photo of a dog"]
inputs = processor(text=text, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
# Compute image-text similarity
logits_per_image = outputs.logits_per_image
probs = logits_per_image.softmax(dim=1)  # Probability distribution

13. Learning path supplement

1. Understand the Transformer architecture

Implement a simplified version of Transformer:

import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d_model=512, nhead=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead)
        self.ffn = nn.Linear(d_model, d_model)
        self.norm = nn.LayerNorm(d_model)
    def forward(self, x):
        # Self-attention with a residual connection
        attn_output, _ = self.attn(x, x, x)
        x = x + attn_output
        x = self.norm(x)
        # Feed-forward with a residual connection
        x = x + self.ffn(x)
        return x
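
A quick usage check of the block above (note that nn.MultiheadAttention defaults to inputs of shape [seq_len, batch, d_model]):

import torch

block = TransformerBlock(d_model=512, nhead=8)
x = torch.randn(10, 2, 512)   # 10 tokens, batch size 2
out = block(x)
print(out.shape)              # torch.Size([10, 2, 512])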

2. Participate in open source projects

  • Contribute to the Hugging Face code library
  • Reproduce the latest paper models (such as LLaMA, BLOOM)

14. Frequently Asked Questions

1. Handling OOM (out-of-GPU-memory) errors

Solutions (see the sketch below)

  • Reduce batch_size
  • Enable gradient accumulation (gradient_accumulation_steps)
  • Use mixed precision (fp16=True)
  • Clear the cache: torch.cuda.empty_cache()
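
A minimal sketch combining these options:

import torch
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,   # reduce the batch size
    gradient_accumulation_steps=4,   # effective batch size = 4 * 4
    fp16=True,                       # mixed precision
)
torch.cuda.empty_cache()             # free cached GPU memory between runs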

2. Special handling for Chinese tokenization

from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
# Manually add special vocabulary
tokenizer.add_tokens(["【Special Word】"])
# Resize the model's embedding layer to match the new vocabulary
model.resize_token_embeddings(len(tokenizer))

The following sections expand on deeper applications of the transformers library, covering more practical scenarios, cutting-edge techniques, and industrial-grade solutions.

15. Frontier Techniques in Practice

1. Large Language Model (LLM) fine-tuning (LLaMA example)

from transformers import LlamaForCausalLM, LlamaTokenizer, TrainingArguments
# Load the model and tokenizer (access permission required)
model = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
# Low-Rank Adaptation (LoRA) fine-tuning
from peft import get_peft_model, LoraConfig
lora_config = LoraConfig(
    r=8,  # Low-rank dimension
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # Only fine-tune selected modules
    lora_dropout=0.05,
    bias="none"
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # Show the proportion of trainable parameters (usually <1%)
# Continue with the training configuration...

2. Reinforcement Learning from Human Feedback (RLHF)

# Use the TRL library for RLHF training
from trl import PPOTrainer, AutoModelForCausalLMWithValueHead
model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
ppo_trainer = PPOTrainer(
    model=model,
    config=training_args,  # should be a trl PPOConfig instance
    dataset=dataset,
    tokenizer=tokenizer
)
# Training loop with a custom reward model
for epoch in range(3):
    for batch in ppo_trainer.dataloader:
        # Generate responses
        response_tensors = ppo_trainer.generate(batch["input_ids"])
        # Compute rewards (requires a custom reward function)
        rewards = calculate_rewards(response_tensors, batch)
        # PPO optimization step (queries, responses, rewards)
        ppo_trainer.step(
            batch["input_ids"],
            response_tensors,
            rewards
        )

16. Industrial application solutions

1. Distributed training (multiple GPU/TPU)

from transformers import TrainingArguments
# Configure distributed training
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    fp16=True,
    tpu_num_cores=8,  # Number of cores when training on TPU
    dataloader_num_workers=4,
    deepspeed="./configs/deepspeed_config.json"  # Optimize with DeepSpeed
)
# DeepSpeed configuration file example (deepspeed_config.json); "stage": 3 enables ZeRO-3 optimization
{
  "fp16": {
    "enabled": true
  },
  "optimizer": {
    "type": "AdamW",
    "params": {
      "lr": 3e-5
    }
  },
  "zero_optimization": {
    "stage": 3
  }
}

2. Streaming Inference Service (FastAPI + Transformers)

from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline
app = FastAPI()
generator = pipeline("text-generation", model="gpt2")
class Request(BaseModel):
    text: str
    max_length: int = 100
@app.post("/generate")
async def generate_text(request: Request):
    result = generator(request.text, max_length=request.max_length)
    return {"generated_text": result[0]["generated_text"]}
# Start the service: uvicorn main:app --port 8000
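
A client can then call the endpoint over HTTP. This is a minimal sketch assuming the service above is running locally on port 8000.

import requests

resp = requests.post(
    "http://localhost:8000/generate",
    json={"text": "In a world where AI dominates,", "max_length": 50},
)
print(resp.json()["generated_text"])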

17. Special Scenarios

1. Long text processing (sliding window)

from transformers import AutoTokenizer, AutoModelForQuestionAnswering
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
import torch

def process_long_text(context, question, max_length=384, stride=128):
    # Split the long text into overlapping chunks
    inputs = tokenizer(
        question,
        context,
        max_length=max_length,
        truncation="only_second",
        stride=stride,
        return_overflowing_tokens=True,
        return_offsets_mapping=True
    )
    # Run inference on each chunk and keep the best-scoring answer
    best_score = 0
    best_answer = ""
    model_keys = ["input_ids", "attention_mask"]  # only pass keys the model accepts
    for i in range(len(inputs["input_ids"])):
        outputs = model(**{k: torch.tensor([inputs[k][i]]) for k in model_keys})
        answer_start = torch.argmax(outputs.start_logits)
        answer_end = torch.argmax(outputs.end_logits) + 1
        score = (outputs.start_logits[0, answer_start] + outputs.end_logits[0, answer_end - 1]).item()
        if score > best_score:
            best_score = score
            best_answer = tokenizer.decode(inputs["input_ids"][i][answer_start:answer_end])
    return best_answer

2. Low-resource language processing

# Cross-lingual transfer using XLM-RoBERTa
from transformers import XLMRobertaTokenizer, XLMRobertaForSequenceClassification
tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model = XLMRobertaForSequenceClassification.from_pretrained("xlm-roberta-base")
# Fine-tune with a small number of samples (the code is similar to the BERT training example above)

18. Model Interpretability

1. Feature importance analysis (using Captum)

import torch
from captum.attr import LayerIntegratedGradients
from transformers import BertForSequenceClassification
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
def forward_func(input_ids, attention_mask):
    return model(input_ids, attention_mask).logits
# Attribute predictions to the embedding layer
lig = LayerIntegratedGradients(forward_func, model.bert.embeddings)
# Compute the importance of the input tokens
# (input_ids, attention_mask, and tokens are assumed to be prepared with the tokenizer)
attributions, delta = lig.attribute(
    inputs=input_ids,
    baselines=tokenizer.pad_token_id * torch.ones_like(input_ids),
    additional_forward_args=attention_mask,
    return_convergence_delta=True
)
# Visualize the results (sum the attribution over the hidden dimension per token)
import matplotlib.pyplot as plt
plt.bar(range(len(attributions[0])), attributions[0].sum(dim=-1).detach().numpy())
plt.xticks(ticks=range(len(tokens)), labels=tokens, rotation=90)
plt.show()

19. Ecosystem Integration

1. Integrate with spaCy

import spacy
from spacy_transformers import TransformersLanguage, TransformersWordPiecer
# Create a spaCy pipeline
nlp = TransformersLanguage(trf_name="bert-base-uncased")
# Custom component
@spacy.registry.architectures("CustomClassifier.v1")
def create_classifier(transformer, tok2vec, n_classes):
    return TransformersTextCategorizer(transformer, tok2vec, n_classes)
# Use the Transformer model directly in spaCy
doc = nlp("This is a text to analyze.")
print(doc._.trf_last_hidden_state.shape)  # [seq_len, hidden_dim]

2. Quickly build a demo interface with Gradio

import gradio as gr
from transformers import pipeline
ner_pipeline = pipeline("ner")
def extract_entities(text):
    results = ner_pipeline(text)
    return {"text": text, "entities": [
        {"entity": res["entity"], "start": res["start"], "end": res["end"]}
        for res in results
    ]}
gr.Interface(
    fn=extract_entities,
    inputs=gr.Textbox(lines=5),
    outputs=gr.HighlightedText()
).launch()

20. Continuous learning suggestions

Track the latest progress

  • Follow Hugging Face blogs and papers (such as T5, BLOOM, Stable Diffusion)
  • Participate in community activities (Hugging Face's Discord and forums)

Advanced practical projects

  • Build an end-to-end NLP system (data cleaning → model training → deployment monitoring)
  • Participate in Kaggle competitions (such as CommonLit Readability Prize)

System optimization direction

  • Model quantization and pruning
  • Server-side optimization (TensorRT acceleration, model parallelism)
  • Edge device deployment (ONNX Runtime, Core ML)

The following sections continue the ultimate practical guide to the transformers library, covering production-level optimization, cutting-edge model architectures, domain-specific solutions, and ethical considerations.

21. Production-level model optimization

1. Model pruning and knowledge distillation

# Use nn_pruning for structured pruning
from transformers import BertForSequenceClassification
from nn_pruning import ModelPruning
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
pruner = ModelPruning(
    model,
    target_sparsity=0.5,    # Prune 50% of the attention heads
    pattern="block_sparse"  # Structured pruning pattern
)
# Run pruning and fine-tuning (method name assumed here; check the nn_pruning API)
pruned_model = pruner.prune()
pruned_model.save_pretrained("./pruned_bert")

# Knowledge distillation (teacher → student model)
from transformers import DistilBertForSequenceClassification, DistilBertTokenizer
teacher = BertForSequenceClassification.from_pretrained("bert-base-uncased")
student = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")
# Distillation trainer (note: these classes are not part of the core transformers API;
# they sketch the interface used in Hugging Face's distillation research examples)
from transformers import DistillationTrainingArguments, DistillationTrainer
training_args = DistillationTrainingArguments(
    output_dir="./distilled",
    temperature=2.0,  # Soften the probability distribution
    alpha_ce=0.5,     # Cross-entropy loss weight
    alpha_mse=0.5     # Hidden-layer MSE loss weight
)
trainer = DistillationTrainer(
    teacher=teacher,
    student=student,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    tokenizer=tokenizer
)
trainer.train()

2. TensorRT-accelerated inference

# Convert the ONNX model to a TensorRT engine (shell command; file names are examples):
# trtexec --onnx=model.onnx --saveEngine=model.engine --fp16

# Load and run the TensorRT engine from Python
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # initializes the CUDA context
runtime = trt.Runtime(trt.Logger(trt.Logger.WARNING))
with open("model.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()
# Bind the input and output buffers and run inference
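
The buffer-binding step itself is not shown above; a minimal sketch using the classic bindings API with pycuda might look like the following (shapes, dtypes, and binding order are illustrative assumptions that must match how the engine was built):

import numpy as np

# Illustrative fixed-shape inputs and output (must match the engine's bindings)
input_ids = np.ones((1, 128), dtype=np.int32)
attention_mask = np.ones((1, 128), dtype=np.int32)
output = np.empty((1, 2), dtype=np.float32)   # e.g. classification logits

# Allocate device buffers and copy the inputs to the GPU
d_input_ids = cuda.mem_alloc(input_ids.nbytes)
d_attention_mask = cuda.mem_alloc(attention_mask.nbytes)
d_output = cuda.mem_alloc(output.nbytes)
cuda.memcpy_htod(d_input_ids, input_ids)
cuda.memcpy_htod(d_attention_mask, attention_mask)

# Run inference and copy the result back to the host
context.execute_v2(bindings=[int(d_input_ids), int(d_attention_mask), int(d_output)])
cuda.memcpy_dtoh(output, d_output)
print(output)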

22. Domain-specific model

1. Biomedical NLP (BioBERT)

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-v1.1")
model = AutoModelForTokenClassification.from_pretrained("dmis-lab/biobert-v1.1")
text = "The patient exhibited EGFR mutations and responded to osimertinib."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs).logits
# Extract gene entities
predictions = torch.argmax(outputs, dim=2)
print([tokenizer.decode([token]) for token in inputs.input_ids[0]])
print(predictions.tolist())  # BIO-tagged results

2. Legal Document Analysis (Legal-BERT)

# Contract clause classification
from transformers import BertTokenizer, BertForSequenceClassification
tokenizer = BertTokenizer.from_pretrained("nlpaueb/legal-bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("nlpaueb/legal-bert-base-uncased")
clause = "The Parties hereby agree to arbitrate all disputes in accordance with ICC rules."
inputs = tokenizer(clause, return_tensors="pt", truncation=True, padding=True)
outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()  # 0: arbitration clause, 1: confidentiality clause, etc.

23. Edge device deployment

1. Core ML Conversion (iOS Deployment)

import torch
import coremltools as ct
from transformers import BertForSequenceClassification, BertTokenizer
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", torchscript=True)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Example input for tracing
example = tokenizer("This is a test", return_tensors="pt")
input_ids, attention_mask = example["input_ids"], example["attention_mask"]
# Trace and convert the model
traced_model = torch.jit.trace(model, (input_ids, attention_mask))
mlmodel = ct.convert(
    traced_model,
    inputs=[
        ct.TensorType(name="input_ids", shape=input_ids.shape),
        ct.TensorType(name="attention_mask", shape=attention_mask.shape)
    ]
)
mlmodel.save("bert.mlpackage")  # example output file name

2. TensorFlow Lite Quantization (Android Deployment)

from transformers import TFBertForSequenceClassification
import tensorflow as tf
model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased")
# Convert to TFLite with dynamic-range quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)

24. Ethics and Security

1. Bias detection and mitigation

from transformers import pipeline
from fairness_metrics import demographic_parity
# Detect model bias
classifier = pipeline("text-classification", model="bert-base-uncased")
protected_groups = {
    "gender": ["she", "he"],
    "race": ["African", "European"]
}
bias_scores = {}
for category, terms in protected_groups.items():
    texts = [f"{term} is qualified for this position" for term in terms]
    results = classifier(texts)
    bias_scores[category] = demographic_parity(results)

2. Adversarial example defense

from textattack import Attacker, AttackArgs
from textattack.attack_recipes import BAEGarg2019
from textattack.models.wrappers import HuggingFaceModelWrapper
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)
attack = BAEGarg2019.build(model_wrapper)  # BAE attack recipe
# Generate adversarial examples
attack_args = AttackArgs(num_examples=5)
attacker = Attacker(attack, dataset, attack_args)
attack_results = attacker.attack_dataset()

25. Cutting-edge Architectures

1. Sparse Transformer (processing ultra-long sequences)

from transformers import AutoTokenizer, LongformerModel
tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")
inputs = tokenizer("This is a very long document..." * 1000, truncation=True, return_tensors="pt")
outputs = model(**inputs)  # Supports sequences of up to 4096 tokens

2. Mixture-of-Experts Model (MoE)

# Use Switch Transformers (a Mixture-of-Experts model)
from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration
tokenizer = AutoTokenizer.from_pretrained("google/switch-base-8")
model = SwitchTransformersForConditionalGeneration.from_pretrained("google/switch-base-8")
input_ids = tokenizer("translate English to German: Hello", return_tensors="pt").input_ids
outputs = model(
    input_ids=input_ids,
    decoder_input_ids=input_ids,
    output_router_logits=True  # Track expert routing
)
print(outputs.encoder_router_logits)  # Router logits show which expert each token is routed to

26. End-to-end Project Template

"""
 End-to-end text classification system architecture:
 1. Data acquisition → 2. Cleaning → 3. Annotation → 4. Model training → 5. Evaluation → 6. Deployment → 7. Monitoring
 """
# Step 4: enhanced training process
from transformers import TrainerCallback
class CustomCallback(TrainerCallback):
    def on_log(self, args, state, control, logs=None, **kwargs):
        # Log metrics to Prometheus in real time
        prometheus_logger.log_metrics(logs)

# Step 7: drift detection
from alibi_detect.cd import MMDDrift
detector = MMDDrift(
    X_train,
    backend="tensorflow",
    p_val=0.05
)
drift_preds = detector.predict(X_prod)

27. Lifelong learning advice

Technical tracking

  • Subscribe to the relevant arXiv categories
  • Participate in Hugging Face Community Weekly

Skill extension

  • Study model quantization theory ("Efficient Machine Learning")
  • Master the basics of CUDA programming

Cross-border integration

  • Explore the combination of LLM and knowledge graphs
  • Research multimodal large models (such as Flamingo, DALL·E 3)

Ethical Practice

  • Regular model fairness audits
  • Participate in the AI for Social Good project

This concludes this comprehensive explanation of the Python Transformers library (NLP processing library). For more on the Python Transformers library, please search my previous articles or continue browsing the related articles below. I hope you will continue to support me!