SoFunction
Updated on 2024-11-10

Python Viewing Data Types and Formats

When we receive a dataset, we usually start by checking how many rows and columns it has, what the individual fields are, and what data type each field holds. Before discussing data formats, we need to sort out the container types involved.

The libraries we use most often are NumPy and pandas. The core structure in NumPy is the array (ndarray); the core structures in pandas are the Series and the DataFrame.
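As a minimal sketch of how these structures relate, the snippet below builds a small DataFrame and pulls a Series and an ndarray out of it. The names frame, column, and values and the sample numbers are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for illustration.
frame = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

column = frame["a"]        # selecting one column gives a Series
values = frame.to_numpy()  # the same values as a NumPy ndarray

print(type(frame).__name__)   # DataFrame
print(type(column).__name__)  # Series
print(type(values).__name__)  # ndarray
print(frame.shape)            # (3, 2): 3 rows, 2 columns
print(list(frame.columns))    # ['a', 'b']
```

Here shape and columns answer the first questions about a new dataset: how many rows and columns it has and what the fields are called.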

Let's create some random data to test with

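import pandas as pd
import numpy as np
# 10 rows x 2 columns of random integers in [5, 10)
df = pd.DataFrame(np.random.randint(5, 10, size=(10, 2)), columns=['a', 'b'])
Array = np.random.randint(5, 10, size=(10, 2))
# Suppose we don't know what data types df and Array are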

Check whether the data is a DataFrame or an array (matrix)

Syntax: type(XXX), which works for tuple/list/array/ndarray/Series/DataFrame

print(type(df))
# Output: <class 'pandas.core.frame.DataFrame'>, so this is a DataFrame
# (the exact module path may differ between pandas versions)
print(type(Array))
# Output: <class 'numpy.ndarray'>, so this is a multidimensional array
print(type(tuple(Array)))
# Output: <class 'tuple'>, so this is a tuple
print(type(list(df['a'])))
# Output: <class 'list'>, so this is a list
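When the type needs to be checked inside a program rather than read off the screen, isinstance() is a common alternative to printing type(). The following is only a sketch; the helper describe_container and the sample array are hypothetical names introduced for this example.

```python
import numpy as np
import pandas as pd


def describe_container(obj):
    """Return a short label for the container types discussed above."""
    if isinstance(obj, pd.DataFrame):
        return "DataFrame"
    if isinstance(obj, pd.Series):
        return "Series"
    if isinstance(obj, np.ndarray):
        return "ndarray"
    if isinstance(obj, (list, tuple)):
        return type(obj).__name__
    return "other"


sample = np.arange(6).reshape(3, 2)  # hypothetical 3 x 2 array

print(describe_container(sample))                # ndarray
print(describe_container(pd.DataFrame(sample)))  # DataFrame
print(describe_container(tuple(sample)))         # tuple
print(describe_container(list(sample)))          # list
```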

Check whether the values are strings or numbers

NumPy and pandas differ slightly here: an ndarray has a single dtype attribute, while a DataFrame has dtypes, with one entry per column

print(Array.dtype)
# Output: int64
print(df.dtypes)
# Output: the data type of every column in df, here a: int64, b: int64
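To make the dtype / dtypes distinction concrete, here is a small sketch with columns of different types. The names arr, mixed, and as_int and their values are hypothetical and only serve the illustration.

```python
import numpy as np
import pandas as pd

arr = np.array([1.5, 2.5, 3.5])
print(arr.dtype)  # float64: one dtype for the whole array

# Hypothetical frame with one float column and one boolean column.
mixed = pd.DataFrame({"n": [1.0, 2.0], "flag": [True, False]})
print(mixed.dtypes)         # one dtype per column
print(mixed["n"].dtype)     # float64: a single column (Series) has dtype
print(mixed["flag"].dtype)  # bool

# astype() converts a column to another type when the values allow it.
as_int = mixed["n"].astype("int64")
print(as_int.dtype)         # int64
```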

The Python bytes data type

1 Characteristics of the bytes type

Since Python 3, text and binary data are separate types, and UTF-8 is the default encoding for converting between them

  • str, Python's default string type, is a sequence of Unicode characters; UTF-8 is the default encoding used to turn it into bytes
  • bytes is an immutable sequence of raw bytes (integers from 0 to 255), for example UTF-8 encoded text
  • bytearray is the mutable counterpart of bytes
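The difference between the immutable and mutable types can be sketched as follows. The variable names text, raw, buf, and error_name are made up for this example.

```python
text = "abc"                # str: Unicode text
raw = text.encode("utf-8")  # bytes: immutable
buf = bytearray(raw)        # bytearray: mutable copy of the same bytes

buf[0] = ord("x")           # allowed: bytearray supports item assignment
print(buf)                  # bytearray(b'xbc')

error_name = None
try:
    raw[0] = ord("x")       # bytes does not support item assignment
except TypeError as exc:
    error_name = type(exc).__name__

print(error_name)           # TypeError
print(raw)                  # b'abc' (unchanged)
```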

1.1 ASCII tables 
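The printable part of the ASCII table can be regenerated with the built-ins chr() and ord(). This is only a sketch: it covers codes 32 (space) through 126 (~), and the dictionary name printable is hypothetical.

```python
# Map each printable ASCII code (32..126) to its character.
printable = {code: chr(code) for code in range(32, 127)}

# Show a few well-known entries: decimal code, hex code, character.
for code in (48, 65, 97, 126):
    print(code, hex(code), printable[code])

print(ord("A"))  # 65: ord() is the inverse of chr()
```

Printing the whole dictionary instead of the four sample codes gives the full table.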

2 bytes type creation and conversion

2.1 bytes types and numbers

Numbers are not strings, so they cannot be encoded into bytes the way text is.

Instead, bytes() gives numeric arguments a special meaning

① When the argument is a single integer n, bytes(n) creates a sequence of n null bytes (\x00)

byte_str = bytes(10)
print(byte_str)
>>> b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

② When the argument is a sequence of integers (each from 0 to 255), it is converted directly into a byte sequence with the same values. Use this form to write raw numeric values to a low-level interface

byte_str = bytes([1, 10, 0xF])
print(byte_str)
# 10 is the newline character, so it is displayed as \n
>>> b'\x01\n\x0f'

③ When a byte value lies in the interval [32, 126], it is a printable ASCII character (32 is the space), and the corresponding character is displayed instead of a \x escape.

Creating bytes directly from such numbers:

byte_str = bytes([33, 48, 126])
print(byte_str)
>>> b'!0~'
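The forms above treat a number either as a length or as a single byte value. To serialize one integer as a multi-byte value, the built-in int.to_bytes() and int.from_bytes() methods are one option. In this sketch the value 1024, the 2-byte width, and the big-endian byte order are arbitrary choices.

```python
number = 1024
packed = number.to_bytes(2, byteorder="big")  # 1024 == 0x0400
print(packed)                                 # b'\x04\x00'

restored = int.from_bytes(packed, byteorder="big")
print(restored)                               # 1024

# Every item in a sequence passed to bytes() must fit in one byte (0..255).
out_of_range = False
try:
    bytes([256])
except ValueError:
    out_of_range = True
print(out_of_range)                           # True
```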

2.2 The bytes type and ASCII characters

2.2.1 Creating bytes of data

① Create with a b'' literal

byte_str = b'Python'
print(byte_str)
>>> b'Python'

② Use bytes() to create immutable sequences

byte_str = bytes('Python', encoding='utf-8')
print(byte_str)
>>> b'Python'

③ Use bytearray() to create mutable sequences

byte_str = bytearray('Python', encoding='utf-8')
print(byte_str)
>>> bytearray(b'Python')
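The str.encode() method is an equivalent way to get bytes from text. The sketch below compares it with the constructor form and shows that bytes() needs an explicit encoding when given a str. The variable names are hypothetical.

```python
text = "Python"

via_constructor = bytes(text, encoding="utf-8")
via_method = text.encode("utf-8")
print(via_constructor == via_method)  # True

# bytes() cannot guess how to encode a str, so the encoding is required.
missing_encoding = False
try:
    bytes(text)
except TypeError:
    missing_encoding = True
print(missing_encoding)               # True
```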

2.2.2 Decoding bytes data

① Decode immutable sequences using bytes.decode()

byte_str = bytes('Python', encoding='utf-8')
utf_str = byte_str.decode('utf-8')
print(utf_str)
>>> Python

② Decode mutable sequences using bytearray.decode()

byte_str = bytearray('Python', encoding='utf-8')
utf_str = byte_str.decode('utf-8')
print(utf_str)
>>> Python
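Decoding can fail when the bytes are not valid in the chosen encoding. As a sketch of the options, the example below uses the errors parameter of decode(); the input b'Py\xffthon' is a made-up value containing one byte that is invalid in UTF-8.

```python
data = b"Py\xffthon"  # \xff can never appear in valid UTF-8

strict_failed = False
try:
    data.decode("utf-8")  # the default is errors="strict"
except UnicodeDecodeError:
    strict_failed = True
print(strict_failed)      # True

# errors="replace" substitutes U+FFFD for each invalid byte,
# errors="ignore" silently drops it.
replaced = data.decode("utf-8", errors="replace")
ignored = data.decode("utf-8", errors="ignore")
print(ignored)            # Python
```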

2.3 bytes type and Chinese characters

In UTF-8, most common Chinese characters are encoded as 3 bytes each

# '我是' means 'I am'
byte_str = bytes('我是', encoding='utf-8')
print(byte_str)
>>> b'\xe6\x88\x91\xe6\x98\xaf'

Decode:

byte_str = b'\xe6\x88\x91\xe6\x98\xaf'
utf_str = byte_str.decode('utf-8')
print(utf_str)
>>> 我是
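The byte count depends on both the character and the encoding. The sketch below uses the two characters 我是, whose UTF-8 form is the b'\xe6\x88\x91\xe6\x98\xaf' sequence shown above, and compares UTF-8 with GBK. The variable names and the choice of GBK and U+20000 as comparison points are assumptions made for this example.

```python
text = "我是"

utf8_bytes = text.encode("utf-8")
gbk_bytes = text.encode("gbk")

print(len(text))        # 2 characters
print(len(utf8_bytes))  # 6 bytes: 3 per character in UTF-8
print(len(gbk_bytes))   # 4 bytes: 2 per character in GBK

# ASCII characters take 1 byte in UTF-8, and rare characters outside
# the Basic Multilingual Plane (here U+20000) take 4.
print(len("a".encode("utf-8")))           # 1
print(len("\U00020000".encode("utf-8")))  # 4
```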

3 Indexing, iterating over, and slicing bytes

① Indexing with bytes[index] returns the underlying int value.

byte_str = b'a'
print(type(byte_str[0]))
print(byte_str[0])
>>> <class 'int'>
>>> 97
byte_str = b'abc'
print(type(byte_str[2]))
print(byte_str[2])
>>> <class 'int'>
>>> 99

② Iterating with for ... in bytes also yields the underlying int values.

byte_str = b'abc'
for byte in byte_str:
    print(type(byte))
    print(byte)
    
>>> <class 'int'>
>>> 97
>>> <class 'int'>
>>> 98
>>> <class 'int'>
>>> 99

③ Slicing with bytes[start:end] returns a bytes object.

byte_str = b'a'
print(type(byte_str[:]))
print(byte_str[:])
>>> <class 'bytes'>
>>> b'a'
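Because indexing returns an int, a one-element slice is a simple way to get a single byte back as a bytes object. The following sketch shows a few ways to move between the two forms; the names data and pieces are hypothetical.

```python
data = b"abc"

print(data[0])           # 97: indexing gives an int
print(data[0:1])         # b'a': a one-element slice stays bytes
print(bytes([data[0]]))  # b'a': rebuild bytes from the int
print(chr(data[0]))      # a: the character for that ASCII code

# Split a bytes object into single-byte bytes objects.
pieces = [data[i:i + 1] for i in range(len(data))]
print(pieces)            # [b'a', b'b', b'c']
```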
