I. Introduction
To avoid the time-consuming task of writing large amounts of source code, and the need for specialized knowledge we cannot realistically acquire, we create as many objects as we can from classes that already exist in libraries, often in a single line of code. Libraries thus let us perform important tasks with a minimum of code, and I think this is one of the reasons Python is so widely used. Feel free to like and bookmark this article to come back to later.
II. Preliminaries
I forgot to mention earlier that I still recommend using an IDE with Python; Anaconda is my pick.
It makes managing these third-party library files much easier. The benefits only become clear once you use it, and, as usual, if you want to use it, go look it up yourself. As a teacher of mine said: "The two longest roads a programmer can take are searching Baidu yourself, and getting someone else to do it for you!" A truly memorable quote.
III. Python Standard Library
You may not realize just how much functionality the Python standard library offers. It provides rich features, including text and binary data processing, mathematical operations, functional programming, file and directory access, data persistence, data compression and archiving, cryptography, operating system services, concurrent programming, inter-process communication, network protocols, JSON, XML, and other Internet data formats, multimedia, internationalization, GUIs, debugging, parsing, and more. A selection of Python standard library modules is listed below, followed by a short sketch that uses a few of them.
- difflib: tools for comparing sequences and computing differences (e.g., text diffs).
- collections: additional data structures built on lists, tuples, dictionaries, and sets, such as namedtuple, deque, Counter, and defaultdict.
- csv: Processes files with comma-separated values.
- datetime, time: date and time operations.
- decimal: fixed-point or floating-point operations, including monetary calculations.
- doctest: simple unit testing by validating tests or expected results embedded in a docstring.
- json: processing JSON (JavaScript Object Notation) data for web services and NoSQL document databases.
- math: common math constants and operations.
- os: interacts with the operating system.
- queue: synchronized queue classes, including a first-in-first-out (FIFO) data structure.
- random: pseudo-random number generation.
- re: regular expressions for pattern matching.
- sqlite3: SQLite relational database access.
- statistics: mathematical statistics functions such as mean, median, mode, and variance.
- sys: access to command-line arguments and the standard input, output, and error streams.
- timeit: measuring the execution time of small code snippets for performance analysis.
- string: common string constants and string processing utilities
- textwrap: wrapping and filling of text paragraphs
- unicodedata: Unicode character database
- stringprep: preparation of Unicode strings for Internet protocols
- readline: interface to the GNU readline line-editing library.
- rlcompleter: completion function for use with GNU readline.
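As a quick illustration of how little code these modules need, here is a minimal sketch (the sample data is invented) combining collections, statistics, and json:

```python
import json
import statistics
from collections import Counter

# Invented sample data: exam scores keyed by student name.
scores = {"Ann": 91, "Bob": 78, "Cho": 91, "Dee": 64}
values = list(scores.values())

# statistics: mean, median, and mode of the score values.
print(statistics.mean(values))    # 81
print(statistics.median(values))  # 84.5
print(statistics.mode(values))    # 91

# collections.Counter: how often each score occurs.
print(Counter(values).most_common(1))  # [(91, 2)]

# json: serialize the dictionary to a JSON string and back.
text = json.dumps(scores)
print(json.loads(text) == scores)  # True
```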
Python has a large and still rapidly growing open source community with developers from many different fields. The large number of open source libraries in this community is one of the most important reasons for Python's popularity.
It can be amazing how many tasks can be accomplished with just a few lines of Python code. Listed below are some popular data science libraries.
IV. Scientific computing and statistics
- NumPy (Numerical Python): Python has no built-in array data structure; its list type is easier to use but slower to process. NumPy provides the high-performance ndarray data structure to represent arrays and matrices, along with operations for processing them; see the sketch after this list.
- SciPy (Scientific Python): SciPy builds on NumPy, adding routines for scientific computing such as integrals, differential equations, and additional matrix processing. SciPy and NumPy are maintained by the same open source community.
- StatsModels: provides support for estimating statistical models, running statistical tests, and exploring statistical data.
- IPython: part of the standard toolset for scientific computing in Python, IPython ties these tools together and behaves like an enhanced Python shell. Its purpose is to speed up writing, testing, and debugging Python code, and many university researchers and industry experts are fond of it because it is genuinely convenient.
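As a minimal sketch of what the ndarray type and a SciPy routine look like in practice (the array values and the integrand are arbitrary):

```python
import numpy as np
from scipy import integrate

# ndarray: element-wise arithmetic without writing explicit loops.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
print(a * 2)       # [[2. 4.] [6. 8.]]
print(a.mean())    # 2.5

# SciPy: numerically integrate x**2 over [0, 1]; the exact answer is 1/3.
area, error = integrate.quad(lambda x: x ** 2, 0.0, 1.0)
print(round(area, 6))  # 0.333333
```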
V. Data processing and analysis
pandas: a very popular data processing library. pandas takes full advantage of NumPy's ndarray type, and its two key data structures are Series (one-dimensional) and DataFrame (two-dimensional); a short sketch follows this list.
modin[14]: pandas acceleration library whose interface is highly consistent with pandas
dask[15]: pandas acceleration library whose interface is highly consistent with pandas
plydata[16]: pandas pipeline syntax library
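Here is a minimal pandas sketch of the two key data structures (the table contents are invented for illustration):

```python
import pandas as pd

# Series: a one-dimensional labeled array.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(s["b"])  # 20

# DataFrame: a two-dimensional labeled table built on NumPy arrays.
df = pd.DataFrame({
    "city": ["Beijing", "Shanghai", "Shenzhen"],
    "population_millions": [21.9, 24.9, 17.6],
})

# Filter rows and compute a simple aggregate.
big = df[df["population_millions"] > 20]
print(big["city"].tolist())              # ['Beijing', 'Shanghai']
print(df["population_millions"].mean())  # roughly 21.47
```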
VI. Visualization
Pyecharts: ECharts is an open source data visualization library from Baidu with good interactivity and polished chart design, and it has been well received by many developers. Python is an expressive language well suited to data processing, and when data analysis meets data visualization, pyecharts is the result!
Matplotlib: Highly customizable visualization and plotting library. Matplotlib can draw line, scatter, bar, contour, pie, vector field, grid, polar, and 3D plots, and add text annotations; see the short sketch after this list.
Seaborn: A higher level visualization library built on Matplotlib. Compared to Matplotlib, Seaborn improves the look and feel, adds visualization methods, and allows you to create visualizations with less code.
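A minimal Matplotlib sketch, with arbitrary data points, showing a line plot and a bar chart on the same axes:

```python
import matplotlib.pyplot as plt

# Arbitrary sample data.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

fig, ax = plt.subplots()
ax.plot(x, y, marker="o", label="line plot")        # line plot with markers
ax.bar(x, y, alpha=0.3, label="same data as bars")  # overlaid bar chart
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Matplotlib demo")
ax.legend()
plt.show()
```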
VII. Machine learning, deep learning and reinforcement learning
- scikit-learn: a top machine learning library; see the sketch after this list. Machine learning is a subset of AI, and deep learning is a subset of machine learning that focuses on neural networks.
- Keras: one of the easiest deep learning libraries to use. Keras runs on top of TensorFlow (Google), CNTK (Microsoft's Cognitive Toolkit for deep learning), or Theano (University of Montreal).
- TensorFlow: developed by Google, TensorFlow is the most widely used deep learning library. TensorFlow works with GPUs (Graphics Processing Units) or Google's custom TPUs (Tensor Processing Units) for optimal performance, and it holds a very important place in AI and big data analytics, where the demand for data processing is enormous.
- OpenAI Gym: a library and development environment for developing, testing, and comparing reinforcement learning algorithms.
- PyTorch: PyTorch is the Python version of Torch, a neural network framework open-sourced by Facebook and designed for GPU-accelerated deep neural network (DNN) programming. Torch is a classic tensor library for manipulating multidimensional matrix data, widely used in machine learning and other math-intensive applications. Unlike TensorFlow's static computational graphs, PyTorch's computational graphs are dynamic and can change at run time as the computation requires. However, because Torch was written in Lua, it remained a niche tool in China, and TensorFlow, with its Python support, gradually won away its users. As a port of the classic machine learning library Torch, PyTorch gives Python users a comfortable way to write this kind of code.
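As a minimal scikit-learn sketch using its bundled iris dataset (the choice of model and test split here is arbitrary):

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small bundled dataset and split it into training and test sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit a k-nearest-neighbors classifier and evaluate it on the held-out data.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(accuracy_score(y_test, predictions))  # typically above 0.9 on iris
```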
VIII. Natural language processing
- NLTK (Natural Language Toolkit): for natural language processing (NLP) tasks; a small sketch follows this list.
- TextBlob: an object-oriented NLP text processing library, built on the NLTK and pattern NLP libraries to simplify many NLP tasks.
- Gensim: functionality similar to NLTK. Typically used to build an index over a collection of documents and then determine how similar another document is to each document in the index.
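A minimal NLTK sketch (the sentence is made up, and a plain whitespace split is used so no corpus download is needed; richer tools such as word_tokenize require a one-time nltk.download of the relevant data):

```python
from nltk import FreqDist
from nltk.util import ngrams

# Made-up sentence; a simple whitespace split avoids any corpus downloads.
text = "python makes natural language processing approachable and python is fun"
tokens = text.split()

# Frequency distribution of tokens and the first few bigrams.
freq = FreqDist(tokens)
print(freq.most_common(2))          # [('python', 2), ('makes', 1)]
print(list(ngrams(tokens, 2))[:2])  # [('python', 'makes'), ('makes', 'natural')]
```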
That concludes this overview of Python libraries; how many of them did you already know? For more on Python libraries, please search my earlier posts or browse the related articles below, and I hope you will continue to support me in the future!