A comprehensive comparison of Python packaging methods (setup.py vs. pyproject.toml), with a practical walkthrough

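1. The difference between setup.py and pyproject.toml

1. setup.py (Traditional method)

setup.py is the traditional way to package Python projects. It uses setuptools or distutils (the latter is deprecated and was removed from the standard library in Python 3.12) to define the metadata and dependencies of the package. A typical example: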

from setuptools import setup

setup(
    name='mypackage',
    version='0.1',
    packages=['mypackage'],
    install_requires=['requests']
)

How to use:

python setup.py sdist bdist_wheel
pip install .

2. pyproject.toml (Modern way)

Since the introduction of PEP 518, pyproject.toml has become the recommended configuration method. It separates the build-system configuration from the package metadata and supports a variety of build tools (such as setuptools, Poetry, and others). Example:

[build-system]
requires = ["setuptools>=42", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "mypackage"
version = "0.1"
dependencies = ["requests"]

How to use:

pip install .

2. Why is pyproject.toml recommended?

Standardization and compatibility: It follows the current packaging standards and therefore works well with a wide range of tools.
Simplified configuration: Separating the build system from the metadata keeps the configuration clear.
Support for multiple build systems: It works with a variety of build tools, which provides greater flexibility.
Security: Metadata is declared statically, which reduces the reliance on custom scripts and the risks they carry.

Necessity in actual scenarios

Suppose you are developing a complex machine learning library that involves multiple dependencies and complex build steps. With pyproject.toml, these requirements can be defined in one place, which helps ensure consistency across different development and deployment environments. In addition, many modern tools (such as CI/CD systems) have built-in support for pyproject.toml, which simplifies automation.

Best practices for building Python packages

Use pyproject.toml for new projects: For new projects, pyproject.toml is recommended, because it meets modern packaging standards and improves compatibility.
Migrate old projects gradually: If you maintain a project that already uses setup.py, you can continue to use it, but migrating to pyproject.toml is recommended.
Use both together: In some cases setup.py and pyproject.toml can be used at the same time, for example with pyproject.toml handling most of the configuration and a minimal setup.py kept for specific tasks (such as building C extensions).
Use setup.cfg: If you want a more declarative format but still use setup.py, consider setup.cfg: put the metadata in the configuration file and keep the logic in setup.py.
Leverage build tools: Tools such as Poetry or Flit can simplify dependency management and packaging, and they automatically manage the creation of pyproject.toml and related files.
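
For the declarative option mentioned in the list above, setup.cfg uses INI syntax, so its contents can be inspected with the standard-library configparser module. The following is a minimal sketch; the file contents are a hypothetical example that reuses the mypackage metadata from earlier, not a file taken from a real project.

```python
import configparser

# Hypothetical setup.cfg contents: the metadata is declared here, while
# setup.py would shrink to a bare `setup()` call.
SETUP_CFG = """
[metadata]
name = mypackage
version = 0.1

[options]
packages = mypackage
install_requires =
    requests
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)

name = parser["metadata"]["name"]
version = parser["metadata"]["version"]
# Multi-line values come back as one string; split them into a list.
install_requires = parser["options"]["install_requires"].split()

print(name, version)     # mypackage 0.1
print(install_requires)  # ['requests']
```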

3. Practical example: Build and release a machine learning package

The following example uses a real machine learning project to show how to build, test, and publish a Python package with pyproject.toml.

Project Overview

We will build a package named mlpredictor. This package:

Contains a simple classifier model using scikit-learn.
Provides functions for training the model and making predictions.
Structured for publishing to PyPI and GitHub.

Step-by-step walkthrough

1. Create a project structure

mlpredictor/
│
├── mlpredictor/
│   ├── __init__.py
│   ├── model.py
│
├── tests/
│   ├── test_model.py
│
├── LICENSE
├── pyproject.toml
├── README.md
└── .gitignore

2. Write code

mlpredictor/model.py

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import pickle


class MLPredictor:
    def __init__(self):
        self.model = None

    def train(self):
        # Train a random forest on 80% of the Iris dataset.
        iris = load_iris()
        X_train, X_test, y_train, y_test = train_test_split(
            iris.data, iris.target, test_size=0.2, random_state=42
        )
        self.model = RandomForestClassifier()
        self.model.fit(X_train, y_train)

    def predict(self, data):
        # `data` is a single sample: a list of four feature values.
        if self.model is None:
            raise Exception("Model is not trained yet!")
        return self.model.predict([data])

    def save_model(self, path=""):
        with open(path, "wb") as f:
            pickle.dump(self.model, f)

    def load_model(self, path=""):
        with open(path, "rb") as f:
            self.model = pickle.load(f)
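
The save_model and load_model methods rely on pickle, which serializes the whole model object to a file and restores it later. The sketch below illustrates the same round trip with a stand-in object so that it runs without scikit-learn; the dictionary used as a "model" and the model.pkl file name are assumptions made purely for illustration. Keep in mind that pickle can execute arbitrary code while loading, so only load files from sources you trust.

```python
import os
import pickle
import tempfile

# Stand-in for a trained estimator (hypothetical; a real run would pickle
# the fitted RandomForestClassifier instead).
model = {"kind": "stand-in model", "classes": [0, 1, 2]}


def save_model(obj, path):
    # Serialize the object to a binary file.
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_model(path):
    # Restore the object from a binary file.
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    save_model(model, path)
    restored = load_model(path)

print(restored == model)  # True
```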

mlpredictor/__init__.py

from .model import MLPredictor

__all__ = ["MLPredictor"]

3. Create the pyproject.toml file

[build-system]
requires = ["setuptools>=42", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "mlpredictor"
version = "0.1.0"
description = "A simple machine learning package using scikit-learn"
authors = [
    {name = "Ebrahim", email = "ebimsv0501@"}
]
license = {text = "MIT"}
readme = "README.md"
requires-python = ">=3.7"
dependencies = [
    "scikit-learn>=1.0",
]

[project.urls]
"Homepage" = "/xxx_your_account/mlpredictor"

[build-system]: Specifies the build system requirements; setuptools and wheel are used here.
[project]: Contains the metadata of the package, such as name, version, description, author, license, dependencies, etc.
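
To make the role of build-backend more concrete: PEP 517 describes its value as an import path, optionally followed by a colon and an object path, which a build frontend such as pip imports to obtain the backend. The sketch below mimics that lookup. resolve_backend is a hypothetical helper name, and the demonstration uses standard-library modules as stand-ins so that it runs even where setuptools is not installed.

```python
import importlib


def resolve_backend(spec):
    """Import the object named by a 'module.path[:object.path]' string."""
    module_path, _, object_path = spec.partition(":")
    backend = importlib.import_module(module_path)
    # Walk the optional object path attribute by attribute.
    for attr in filter(None, object_path.split(".")):
        backend = getattr(backend, attr)
    return backend


# Stand-ins for a real backend string such as "setuptools.build_meta".
json_decoder = resolve_backend("json.decoder")
ordered_dict = resolve_backend("collections:OrderedDict")

print(json_decoder.__name__)  # json.decoder
print(ordered_dict.__name__)  # OrderedDict
```

With a real backend, the frontend would then call hooks such as build_sdist and build_wheel on the resolved object.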

4. Write tests

Add tests using pytest.

tests/test_model.py

import pytest
from mlpredictor import MLPredictor

def test_train_and_predict():
    model = MLPredictor()
    model.train()
    result = model.predict([5.1, 3.5, 1.4, 0.2])
    assert len(result) == 1

if __name__ == "__main__":
    pytest.main()

5. Add README, License, and .gitignore

README.md

# MLPredictor

MLPredictor is a simple machine learning package that uses scikit-learn to train a RandomForest model and lets users make predictions. This package is designed to demonstrate how to package Python machine learning projects for distribution.

## Features
- Trains a RandomForestClassifier on the Iris dataset.
- Makes predictions on new data after training.
- Saves and loads trained models.

## Install
You can install this package from **PyPI** or from **source code**.

### Install via PyPI
```bash
pip install mlpredictor
```

### Install via source code (GitHub)
```bash
git clone /xxx_your_account/
cd mlpredictor
pip install .
```

## How to use

After installation, you can use `MLPredictor` to train the model and make predictions.

### Example: Training and Prediction
```python
from mlpredictor import MLPredictor

# Initialize the predictor
predictor = MLPredictor()

# Train the model on the Iris dataset
predictor.train()

# Predict the class of a sample input
sample_input = [5.1, 3.5, 1.4, 0.2]
prediction = predictor.predict(sample_input)

print(f"Predicted class: {prediction}")
```

LICENSE

Choose a suitable open source license, such as the MIT License.

.gitignore

*.pyc
__pycache__/
*.pkl
dist/
build/

6. Test the package locally

Install the package with the following command:

pip install .

After installation, run the tests to make sure everything is working:

pytest tests

Note:

If setup.py is used, pip reads that file to collect the package metadata and installation information, then resolves and installs the specified dependencies.
If pyproject.toml is used, pip reads that file, which may specify build system requirements and configuration. After running the command above, the following directories are usually created:
- Distribution directories: these may be build/, dist/, or .eggs/, depending on the installation process and whether the install is from source or from a wheel.
- build/: Created during the build process; contains temporary files used to create the package.
- dist/: Contains the built distribution files (such as wheel files) generated from the package.
- egg-info/ or .egg-info/: Contains metadata about the installed package, including its dependencies and version number.
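
The metadata recorded at install time can also be read back from Python through the standard-library importlib.metadata module (available since Python 3.8). A minimal sketch follows; describe_distribution is a hypothetical helper name, and what it prints depends on what is installed in your environment.

```python
from importlib import metadata


def describe_distribution(name):
    """Return (version, requirements) for an installed distribution, or None."""
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        return None
    # requires() returns None when the distribution declares no dependencies.
    requirements = metadata.requires(name) or []
    return version, requirements


info = describe_distribution("mlpredictor")
if info is None:
    print("mlpredictor is not installed in this environment")
else:
    version, requirements = info
    print(f"mlpredictor {version} requires {requirements}")
```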

After ensuring that the project is working properly, continue to the next steps.

7. Push to GitHub

Initialize the Git repository

git init
git add .
git commit -m "Initial commit"

Create a GitHub repository

Go to GitHub and create a new repository named mlpredictor.
Push code to GitHub

git remote add origin /xxx_your_account/
git branch -M main
git push -u origin main

Note: Replace xxx_your_account with your GitHub username.

8. Publish to PyPI

Now that the project has been set up and pushed to GitHub, it can be published to PyPI.

Install the necessary tools

pip install twine build

Build package

python -m build

This creates a .tar.gz file and a .whl file in the dist/ directory. Check the dist/ directory and make sure it contains files similar to the following:

mlpredictor-0.1.0.tar.gz
mlpredictor-0.1.0-py3-none-any.whl
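
The name of a wheel file encodes compatibility information. As an illustration, a name that follows the standard wheel convention (distribution, version, optional build tag, Python tag, ABI tag, platform tag) can be split as in the sketch below. parse_wheel_name is a hypothetical helper, and the file name passed to it is an assumed example of what a pure-Python build of version 0.1.0 produces.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel name: {filename}")
    # The optional build tag, when present, sits between version and Python tag.
    build = parts[2] if len(parts) == 6 else None
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_name("mlpredictor-0.1.0-py3-none-any.whl")
print(info["distribution"], info["version"])          # mlpredictor 0.1.0
print(info["python"], info["abi"], info["platform"])  # py3 none any
```

The py3-none-any tags indicate a pure-Python wheel that is not tied to a particular ABI or platform.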

Upload to PyPI

twine upload dist/*

You need a PyPI account to upload the package. After the upload is successful, others can install your package through the following command:

pip install mlpredictor

9. Install and use the package

After installing with pip, you can use this package in Python code:

from mlpredictor import MLPredictor

predictor = MLPredictor()
predictor.train()
prediction = predictor.predict([5.1, 3.5, 1.4, 0.2])
print("Predicted class:", prediction.item())

# Example output:
# Predicted class: 0

5. Summary

In the field of Python packaging, setup.py and pyproject.toml each have their own importance and applicable scenarios. Although setup.py still plays a role in traditional projects, the shift toward pyproject.toml represents a trend in the Python community towards safer, standardized practices. For new projects, adopting pyproject.toml is highly recommended, as it not only simplifies the packaging process but also improves compatibility with a variety of tools and libraries.

With the practical example in this article, you should be able to understand how to build, test, and publish a fully functional Python package using pyproject.toml. Whether it is an individual project or a team collaboration, following these best practices will improve the maintainability and scalability of your project.
