Here is a comprehensive explanation of the transformers library, covering basic knowledge, advanced usage, example code, and learning paths. The content is organized for learners at different stages.
1. Basic knowledge
1. Introduction to Transformers Library
- Purpose: provides pre-trained models (such as BERT, GPT, RoBERTa) and tools for NLP tasks (text classification, translation, generation, etc.).
- Core components:
  - Tokenizer: text tokenization and encoding
  - Model: neural network model architectures
  - Pipeline: a high-level wrapper for quick inference
2. Installation and Environment Configuration
pip install transformers torch datasets
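A quick sanity check that the installation works (the printed version will vary with your environment):

import torch
import transformers

print(transformers.__version__)   # e.g. 4.x
print(torch.cuda.is_available())  # True if a GPU is usable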
3. Quick Start Example
from transformers import pipeline

# Use the sentiment-analysis pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("I love programming with Transformers!")
print(result)  # [{'label': 'POSITIVE', 'score': 0.9998}]
2. Core Modules in Detail
1. Tokenizer (tokenization)
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text = "Hello, world!"
encoded = tokenizer(text, padding=True, truncation=True, return_tensors="pt")  # Return PyTorch tensors
print(encoded)
# {'input_ids': tensor([[101, 7592, 1010, 2088, 999, 102]]),
#  'attention_mask': tensor([[1, 1, 1, 1, 1, 1]])}
2. Model (model loading)
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
outputs = model(**encoded)  # Forward pass
last_hidden_states = outputs.last_hidden_state
3. Advanced usage
1. Custom model training (PyTorch example)
from transformers import BertForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("imdb")
tokenized_datasets = dataset.map(
    lambda x: tokenizer(x["text"], padding=True, truncation=True),  # tokenizer from the previous section
    batched=True
)

# Define the model
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    evaluation_strategy="epoch"
)

# Trainer configuration
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"]
)

# Start training
trainer.train()
2. Model saving and loading
model.save_pretrained("./my_model")
tokenizer.save_pretrained("./my_model")

# Load the custom model
new_model = AutoModel.from_pretrained("./my_model")
4. Going Deeper
1. Attention mechanism visualization
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
outputs = model(**inputs)

# Extract attention weights from layer 0 (first example in the batch)
attention = outputs.attentions[0][0]
print(attention.shape)  # [num_heads, seq_len, seq_len]
2. Mixed precision training
from transformers import TrainingArguments

training_args = TrainingArguments(
    fp16=True,  # Enable mixed-precision training
    ...
)
5. Complete Example: Named Entity Recognition (NER)
from transformers import pipeline

# Load the NER pipeline
ner_pipeline = pipeline("ner", model="dslim/bert-base-NER")

text = "Apple was founded by Steve Jobs in Cupertino."
results = ner_pipeline(text)

# Display the results
for entity in results:
    print(f"{entity['word']} -> {entity['entity']} (confidence: {entity['score']:.2f})")
6. Learning path suggestions
Beginner stage:
- Official documentation: /docs/transformers
- Learn the pipeline API and basic model usage
Intermediate stage:
- Master the custom training process
- Understand model architecture (Transformer, BERT principle)
Advanced stage:
- Model distillation and quantization
- Custom model architecture development
- Fine-tuning techniques for large models
7. Resource recommendations
Must-read papers:
- Attention Is All You Need (the original Transformer paper)
- BERT: Pre-training of Deep Bidirectional Transformers
Practical Projects:
- Text summarization
- Multilingual translation system
- Chatbot development
Community Resources:
- Hugging Face Model Hub
- Kaggle NLP Competition Cases
8. Advanced Training Techniques
1. Learning rate scheduling and gradient clipping
Adjust the learning rate dynamically during training and clip gradients to prevent gradient explosion:
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_steps=500,                # Number of learning-rate warm-up steps
    gradient_accumulation_steps=2,   # Gradient accumulation (saves GPU memory)
    max_grad_norm=1.0,               # Gradient clipping threshold
    ...
)
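For a manual PyTorch training loop (outside of Trainer), the same ideas can be sketched with get_linear_schedule_with_warmup and clip_grad_norm_. This is a minimal sketch: it assumes a model and a train_dataloader whose batches include labels, and the step counts are placeholder values.

import torch
from transformers import get_linear_schedule_with_warmup

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=500, num_training_steps=10000  # placeholder totals
)

for batch in train_dataloader:
    loss = model(**batch).loss
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()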
2. Custom loss function (PyTorch example)
import torch
import torch.nn as nn
from transformers import BertForSequenceClassification

class CustomModel(BertForSequenceClassification):
    def __init__(self, config):
        super().__init__(config)

    def forward(self, input_ids, attention_mask, labels=None):
        outputs = super().forward(input_ids, attention_mask)
        logits = outputs.logits
        if labels is not None:
            # Class-weighted cross-entropy loss
            loss_fct = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0]))
            loss = loss_fct(logits.view(-1, 2), labels.view(-1))
            return {"loss": loss, "logits": logits}
        return outputs
9. Complex Tasks in Practice
1. Text generation (GPT-2 example)
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "In a world where AI dominates,"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Generate text (configure generation parameters)
output = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,          # Sampling is required to get multiple distinct sequences
    temperature=0.7,         # Controls randomness (lower values are more deterministic)
    top_k=50,                # Limits the number of candidate tokens
    num_return_sequences=3   # Generate 3 different results
)

for seq in output:
    print(tokenizer.decode(seq, skip_special_tokens=True))
2. Question Answering System (BERT-based)
from transformers import pipeline

qa_pipeline = pipeline("question-answering", model="deepset/roberta-base-squad2")

context = """
Hugging Face is a company based in New York City.
Its Transformers library is widely used in NLP.
"""
question = "Where is Hugging Face located?"

result = qa_pipeline(question=question, context=context)
print(f"Answer: {result['answer']} (score: {result['score']:.2f})")
# Answer: New York City (score: 0.92)
10. Model optimization and deployment
1. Model quantization (reducing inference latency)
import torch
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased")
quantized_model = torch.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear},  # Quantize all linear layers
    dtype=torch.qint8
)
# After quantization, inference is roughly 2-4x faster and the model is about 75% smaller
2. ONNX format export (production deployment)
from torch.onnx import export
from transformers import BertTokenizer, BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Example input
dummy_input = tokenizer("This is a test", return_tensors="pt")

# Export to ONNX
export(
    model,
    (dummy_input["input_ids"], dummy_input["attention_mask"]),
    "model.onnx",
    opset_version=13,
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch"}, "attention_mask": {0: "batch"}}
)
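Once exported, the model can be served without PyTorch via ONNX Runtime. A minimal sketch, assuming the file was saved as model.onnx as above and that the onnxruntime package is installed:

import onnxruntime as ort
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
session = ort.InferenceSession("model.onnx")

inputs = tokenizer("This is a test", return_tensors="np")
logits = session.run(
    ["logits"],
    {"input_ids": inputs["input_ids"], "attention_mask": inputs["attention_mask"]},
)[0]
print(logits.shape)  # (batch, num_labels)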
11. Debugging and Performance Analysis
1. Checking GPU memory usage
import torch

# Insert GPU memory monitoring inside the training loop
print(f"Allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
print(f"Cached: {torch.cuda.memory_reserved() / 1e9:.2f} GB")
2. Use PyTorch Profiler
from torch.profiler import profile, record_function, ProfilerActivity

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA], record_shapes=True) as prof:
    outputs = model(**inputs)

print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
12. Multilingual and Cross-Modal Models
1. Multilingual Translation (mBART)
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")

# Chinese to English
tokenizer.src_lang = "zh_CN"
text = "欢迎使用 Transformers 库"
encoded = tokenizer(text, return_tensors="pt")
generated_tokens = model.generate(**encoded, forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"])
print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True))
# ['Welcome to the Transformers library']
2. Image-Text Multimodality (CLIP)
from PIL import Image
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("image.jpg")  # path to a local image
text = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=text, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# Compute image-text similarity
logits_per_image = outputs.logits_per_image
probs = logits_per_image.softmax(dim=1)  # Probability distribution over the captions
13. Learning path supplement
1. Understand the Transformer architecture
Implement a simplified Transformer block:
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d_model=512, nhead=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead)
        self.ffn = nn.Linear(d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        # Self-attention with a residual connection
        attn_output, _ = self.attn(x, x, x)
        x = x + attn_output
        x = self.norm(x)
        # Feed-forward sublayer with a residual connection
        x = x + self.ffn(x)
        return x
2. Participate in open source projects
- Contribute to the Hugging Face code library
- Reproduce the latest paper models (such as LLaMA, BLOOM)
14. Frequently Asked Questions
1. Handling OOM (out-of-memory) errors
Solutions (see the sketch below):
- Reduce batch_size
- Enable gradient accumulation (gradient_accumulation_steps)
- Use mixed precision (fp16=True)
- Clear the cache: torch.cuda.empty_cache()
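A minimal sketch of how these remedies map onto the training configuration (the values are illustrative):

import torch
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,    # smaller batch size
    gradient_accumulation_steps=4,    # effective batch size stays at 16
    fp16=True,                        # mixed precision
)

# Between experiments, release cached GPU memory
torch.cuda.empty_cache()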
2. Special handling for Chinese tokenization
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")

# Add special vocabulary manually
tokenizer.add_tokens(["【Special Word】"])

# Resize the model's embedding layer to match the new vocabulary
model.resize_token_embeddings(len(tokenizer))
The following sections expand on the transformers library's more advanced applications, covering additional practical scenarios, cutting-edge techniques, and industrial-grade solutions.
15. Cutting-Edge Techniques in Practice
1. Large Language Model (LLM) fine-tuning (LLaMA example)
from transformers import LlamaForCausalLM, LlamaTokenizer, TrainingArguments

# Load the model and tokenizer (access permission required)
model = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")

# Low-rank adaptation (LoRA) fine-tuning
from peft import get_peft_model, LoraConfig

lora_config = LoraConfig(
    r=8,                                  # Low-rank dimension
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # Only fine-tune selected modules
    lora_dropout=0.05,
    bias="none"
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # Shows the fraction of trainable parameters (usually <1%)

# Continue to configure training parameters...
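As a rough sketch of that last step, the PEFT-wrapped model can be trained with the usual Trainer API. The dataset and hyperparameters below are placeholders, not recommended settings:

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./llama-lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    num_train_epochs=1,
    learning_rate=1e-4,
    fp16=True,
)

trainer = Trainer(
    model=model,                                  # the PEFT-wrapped model from above
    args=training_args,
    train_dataset=tokenized_datasets["train"],    # assumes an already tokenized dataset
)
trainer.train()

model.save_pretrained("./llama-lora-adapter")     # saves only the LoRA adapter weights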
2. Reinforcement Learning from Human Feedback (RLHF)
# Use the TRL library for RLHF training
from trl import PPOTrainer, AutoModelForCausalLMWithValueHead

model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
ppo_trainer = PPOTrainer(
    model=model,
    config=training_args,
    dataset=dataset,
    tokenizer=tokenizer
)

# Training loop with a custom reward model
for epoch in range(3):
    for batch in ppo_trainer.dataloader:
        # Generate responses
        response_tensors = ppo_trainer.generate(batch["input_ids"])
        # Compute rewards (requires a custom reward function)
        rewards = calculate_rewards(response_tensors, batch)
        # PPO optimization step (queries, responses, rewards)
        ppo_trainer.step(
            batch["input_ids"],
            response_tensors,
            rewards
        )
16. Industrial application solutions
1. Distributed training (multiple GPU/TPU)
from transformers import TrainingArguments

# Configure distributed training
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    fp16=True,
    tpu_num_cores=8,            # Number of cores when training on TPU
    dataloader_num_workers=4,
    deepspeed="./configs/deepspeed_config.json"  # Optimize with DeepSpeed
)

DeepSpeed configuration file example (ds_config.json), with ZeRO-3 optimization enabled:

{
  "fp16": { "enabled": true },
  "optimizer": {
    "type": "AdamW",
    "params": { "lr": 3e-5 }
  },
  "zero_optimization": {
    "stage": 3
  }
}
2. Streaming Inference Service (FastAPI + Transformers)
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
generator = pipeline("text-generation", model="gpt2")

class Request(BaseModel):
    text: str
    max_length: int = 100

@app.post("/generate")
async def generate_text(request: Request):
    result = generator(request.text, max_length=request.max_length)
    return {"generated_text": result[0]["generated_text"]}

# Start the service: uvicorn main:app --port 8000
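Once the service is running, a client can call it with the requests library (assuming the default localhost and the port from the uvicorn command above):

import requests

response = requests.post(
    "http://localhost:8000/generate",
    json={"text": "Once upon a time", "max_length": 50},
)
print(response.json()["generated_text"])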
17. Handling Special Scenarios
1. Long text processing (sliding window)
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")
model = AutoModelForQuestionAnswering.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")

def process_long_text(context, question, max_length=384, stride=128):
    # Split the long text into overlapping chunks
    inputs = tokenizer(
        question,
        context,
        max_length=max_length,
        truncation="only_second",
        stride=stride,
        return_overflowing_tokens=True,
        return_offsets_mapping=True
    )

    # Run inference on each chunk and keep the best-scoring answer
    best_score = 0
    best_answer = ""
    for i in range(len(inputs["input_ids"])):
        chunk = {
            k: torch.tensor([inputs[k][i]])
            for k in ("input_ids", "attention_mask")
        }
        outputs = model(**chunk)
        answer_start = torch.argmax(outputs.start_logits)
        answer_end = torch.argmax(outputs.end_logits) + 1
        score = (outputs.start_logits[0, answer_start] + outputs.end_logits[0, answer_end - 1]).item()
        if score > best_score:
            best_score = score
            best_answer = tokenizer.decode(inputs["input_ids"][i][answer_start:answer_end])
    return best_answer
2. Low resource language processing
# Cross-lingual transfer using XLM-RoBERTa
from transformers import XLMRobertaTokenizer, XLMRobertaForSequenceClassification

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
model = XLMRobertaForSequenceClassification.from_pretrained("xlm-roberta-base")

# Fine-tune with a small number of samples (the code mirrors the BERT training example above)
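As the comment notes, the fine-tuning itself follows the earlier BERT example. A minimal sketch with a tiny placeholder dataset (the example texts, labels, and hyperparameters are illustrative only):

from datasets import Dataset
from transformers import Trainer, TrainingArguments

# A tiny illustrative labelled dataset in the target low-resource language
small_dataset = Dataset.from_dict({
    "text": ["example sentence 1", "example sentence 2"],  # placeholder samples
    "label": [0, 1],
})
tokenized = small_dataset.map(
    lambda x: tokenizer(x["text"], padding="max_length", truncation=True, max_length=128),
    batched=True,
)

trainer = Trainer(
    model=model,  # the XLM-R model loaded above
    args=TrainingArguments(output_dir="./xlmr-lowres", num_train_epochs=5),
    train_dataset=tokenized,
)
trainer.train()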
18. Model Interpretability
1. Feature importance analysis (using Captum)
import torch
from captum.attr import LayerIntegratedGradients
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

def forward_func(input_ids, attention_mask):
    return model(input_ids, attention_mask).logits

lig = LayerIntegratedGradients(forward_func, model.bert.embeddings)

# Compute the importance of each input token
attributions, delta = lig.attribute(
    inputs=input_ids,
    baselines=tokenizer.pad_token_id * torch.ones_like(input_ids),
    additional_forward_args=attention_mask,
    return_convergence_delta=True
)

# Visualize the results (sum over the embedding dimension to get one score per token)
import matplotlib.pyplot as plt

token_scores = attributions.sum(dim=-1)[0].detach().numpy()
plt.bar(range(len(token_scores)), token_scores)
plt.xticks(ticks=range(len(tokens)), labels=tokens, rotation=90)
plt.show()
19. Ecosystem Integration
1. Integrate with spaCy
# Note: this uses the older spacy-transformers (v0.x) API
import spacy
from spacy_transformers import TransformersLanguage, TransformersWordPiecer

# Create a spaCy pipeline backed by a Transformer
nlp = TransformersLanguage(trf_name="bert-base-uncased")

# Custom component registered as "CustomClassifier.v1"
# (the exact registration decorator depends on the spacy-transformers version)
def create_classifier(transformer, tok2vec, n_classes):
    return TransformersTextCategorizer(transformer, tok2vec, n_classes)

# Use the Transformer model directly inside spaCy
doc = nlp("This is a text to analyze.")
print(doc._.trf_last_hidden_state.shape)  # [seq_len, hidden_dim]
2. Quickly build a demo interface with Gradio
import gradio as gr
from transformers import pipeline

ner_pipeline = pipeline("ner")

def extract_entities(text):
    results = ner_pipeline(text)
    return {"text": text, "entities": [
        {"entity": res["entity"], "start": res["start"], "end": res["end"]}
        for res in results
    ]}

gr.Interface(
    fn=extract_entities,
    inputs=gr.Textbox(lines=5),
    outputs=gr.HighlightedText()
).launch()
20. Continuous learning suggestions
Track the latest progress:
- Follow Hugging Face blogs and papers (such as T5, BLOOM, Stable Diffusion)
- Participate in community activities (Hugging Face's Discord and forums)
Advanced practical projects:
- Build an end-to-end NLP system (data cleaning → model training → deployment monitoring)
- Participate in Kaggle competitions (such as CommonLit Readability Prize)
System optimization direction:
- Model quantization and pruning
- Server-side optimization (TensorRT acceleration, model parallelism)
- Edge device deployment (ONNX Runtime, Core ML)
The final sections take the transformers guide further, covering production-level optimization, cutting-edge model architectures, domain-specific solutions, and ethical considerations.
21. Production-level model optimization
1. Model pruning and knowledge distillation
# Structured pruning with the nn_pruning library
from transformers import BertForSequenceClassification
from nn_pruning import ModelPruning

model = BertForSequenceClassification.from_pretrained("bert-base-uncased")
pruner = ModelPruning(
    model,
    target_sparsity=0.5,      # Prune 50% of the attention heads
    pattern="block_sparse"    # Structured pruning pattern
)

# Run pruning and fine-tuning (see the nn_pruning documentation for the exact call)
pruned_model = pruner.prune()
pruned_model.save_pretrained("./pruned_bert")

# Knowledge distillation (teacher -> student model)
from transformers import DistilBertForSequenceClassification

teacher = BertForSequenceClassification.from_pretrained("bert-base-uncased")
student = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")

# Distillation trainer (these distillation classes come from the transformers
# example/distillation scripts; the import path may differ from the core library)
from transformers import DistillationTrainingArguments, DistillationTrainer

training_args = DistillationTrainingArguments(
    output_dir="./distilled",
    temperature=2.0,   # Softens the probability distribution
    alpha_ce=0.5,      # Cross-entropy loss weight
    alpha_mse=0.5      # Hidden-state MSE loss weight
)
trainer = DistillationTrainer(
    teacher=teacher,
    student=student,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    tokenizer=tokenizer
)
trainer.train()
2. TensorRT-accelerated inference
# Convert the ONNX model to a TensorRT engine
trtexec --onnx=model.onnx --saveEngine=model.trt --fp16
# Run the TensorRT engine from Python
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit

runtime = trt.Runtime(trt.Logger(trt.Logger.WARNING))
with open("model.trt", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()
# Bind the input/output buffers and run inference
22. Domain-Specific Models
1. Biomedical NLP (BioBERT)
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-v1.1")
model = AutoModelForTokenClassification.from_pretrained("dmis-lab/biobert-v1.1")

text = "The patient exhibited EGFR mutations and responded to osimertinib."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs).logits

# Extract gene/entity predictions
predictions = torch.argmax(outputs, dim=2)
print([tokenizer.decode([token]) for token in inputs.input_ids[0]])
print(predictions.tolist())  # BIO-tagged results
2. Legal Document Analysis (Legal-BERT)
# Contract clause classification
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("nlpaueb/legal-bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("nlpaueb/legal-bert-base-uncased")

clause = "The Parties hereby agree to arbitrate all disputes in accordance with ICC rules."
inputs = tokenizer(clause, return_tensors="pt", truncation=True, padding=True)
outputs = model(**inputs)
predicted_class = outputs.logits.argmax().item()  # e.g. 0: arbitration clause, 1: confidentiality clause, ...
23. Edge device deployment
1. Core ML Conversion (iOS Deployment)
import torch
import coremltools as ct
from transformers import BertForSequenceClassification, BertTokenizer

# torchscript=True makes the model traceable
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", torchscript=True)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Example input for tracing
encoded = tokenizer("This is a test", return_tensors="pt")
input_ids, attention_mask = encoded["input_ids"], encoded["attention_mask"]

# Trace and convert the model
traced_model = torch.jit.trace(model, (input_ids, attention_mask))
mlmodel = ct.convert(
    traced_model,
    inputs=[
        ct.TensorType(name="input_ids", shape=input_ids.shape),
        ct.TensorType(name="attention_mask", shape=attention_mask.shape)
    ]
)
mlmodel.save("model.mlmodel")
2. TensorFlow Lite quantization (Android deployment)
import tensorflow as tf
from transformers import TFBertForSequenceClassification

model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased")

# Convert to TFLite with dynamic-range quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)
24. Ethics and Security
1. Bias detection and mitigation
from transformers import pipeline
from fairness_metrics import demographic_parity  # custom/third-party fairness metric module

# Detect model bias
classifier = pipeline("text-classification", model="bert-base-uncased")

protected_groups = {
    "gender": ["she", "he"],
    "race": ["African", "European"]
}

bias_scores = {}
for category, terms in protected_groups.items():
    texts = [f"{term} is qualified for this position" for term in terms]
    results = classifier(texts)
    bias_scores[category] = demographic_parity(results)
2. Adversarial example defense
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import BAEGarg2019
from textattack.models.wrappers import HuggingFaceModelWrapper

model_wrapper = HuggingFaceModelWrapper(model, tokenizer)
attack = BAEGarg2019.build(model_wrapper)  # BAE attack recipe

# Generate adversarial examples
attack_args = AttackArgs(num_examples=5)
attacker = Attacker(attack, dataset, attack_args)
attack_results = attacker.attack_dataset()
25. Cutting-Edge Architectures
1. Sparse Transformer (processing ultra-long sequences)
from transformers import LongformerModel, LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

inputs = tokenizer("This is a very long document..." * 1000, return_tensors="pt",
                   truncation=True, max_length=4096)
outputs = model(**inputs)  # Supports sequences of up to 4096 tokens
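Longformer combines windowed local attention with a few globally attending tokens. A common pattern, shown here as an illustrative sketch reusing the model and tokenizer above, is to give the first ([CLS]) token global attention:

import torch

inputs = tokenizer("This is a very long document..." * 1000, return_tensors="pt",
                   truncation=True, max_length=4096)
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1  # global attention on the first token

outputs = model(**inputs, global_attention_mask=global_attention_mask)
print(outputs.last_hidden_state.shape)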
2. Mixture-of-Experts (MoE) models
# Use Switch Transformers (a Mixture-of-Experts architecture)
from transformers import SwitchTransformersForConditionalGeneration

model = SwitchTransformersForConditionalGeneration.from_pretrained("google/switch-base-8")

# output_router_logits exposes the per-token expert-routing decisions
# (exact output field names may vary between transformers versions)
outputs = model(input_ids=input_ids, decoder_input_ids=input_ids, output_router_logits=True)
print(outputs.encoder_router_logits)  # Routing information showing which experts each token is sent to
26. End-to-End Project Template
""" End-to-end text classification system architecture: 1. Data acquisition → 2. Cleaning → 3. Annotation → 4. Model training → 5. Evaluation → 6. Deployment → 7. Monitoring """ # Enhanced training process in step 4from transformers import TrainerCallback class CustomCallback(TrainerCallback): def on_log(self, args, state, control, logs=None, **kwargs): # Record metrics in real time to Prometheus prometheus_logger.log_metrics(logs) #Drift detection in step 7from alibi_detect.cd import MMDDrift detector = MMDDrift( X_train, backend="tensorflow", p_val=0.05 ) drift_preds = (X_prod)
27. Lifelong learning advice
Technical tracking:
- Subscribe to the relevant arXiv categories
- Participate in Hugging Face Community Weekly
Skill extension:
- Learn model quantization theory ("Efficient Machine Learning")
- Master the basics of CUDA programming
Cross-domain integration:
- Explore the combination of LLM and knowledge graphs
- Research multimodal large models (such as Flamingo, DALL·E 3)
Ethical Practice:
- Regular model fairness audits
- Participate in the AI for Social Good project
This concludes the comprehensive guide to the Python Transformers NLP library. For more on the Transformers library, see my previous articles or continue with the related articles below. I hope you find it useful and will keep supporting my work!