SoFunction
Updated on 2024-11-17

Solving a problem encountered when reading files with pandas read_csv()

As shown below:

Data files:

Shanghai Airports (sh600009) 24.11 3.58
Dongfeng Motor (sh600006) 74.25 1.74
China Guomao (sh600007) 26.38 2.66
Baosteel (sh600010) 61.01 2.35
WISCO (sh600005) 75.85 1.3
Pudong Development Bank (sh600000) 6.65 0.96

When the CSV file is read with the read_csv() API and a column is then filtered for values larger than a threshold, for example:

import pandas as pd
df = pd.read_csv(output_file, encoding='gb2312', names=['a', 'b', 'c'])
df[df['b'] > 20]

an error is reported:

TypeError: '>' not supported between instances of 'str' and 'int'

The error message points to a data type problem: the values in the column were read back as 'str', so they cannot be compared with an integer.

in : df.dtypes
out:
 a object
 b object
 c object
 dtype: object

This shows that all three columns have dtype object, which is how pandas stores strings.
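The situation can be reproduced with a small in-memory file. This is only a sketch: the sample rows and the '--' placeholder are assumptions made for illustration, since the article does not show what caused the column to be read as strings.

```python
import io

import pandas as pd

# Hypothetical sample data: the second row holds a non-numeric
# placeholder ('--') in column b, so pandas cannot infer a numeric type.
csv_text = (
    "Dongfeng Motor (sh600006),74.25,1.74\n"
    "China Guomao (sh600007),--,2.66\n"
)

df = pd.read_csv(io.StringIO(csv_text), names=['a', 'b', 'c'])

# Column b is not numeric here, while column c is inferred as float64.
print(df.dtypes)

raised = False
try:
    df[df['b'] > 20]
except TypeError as exc:
    # Comparing strings with an int is what triggers the error.
    raised = True
    print('TypeError:', exc)
```

Checking df.dtypes right after reading is a quick way to see which columns were not parsed as numbers.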

Check the description of the dtype parameter in the read_csv() documentation:

dtype : Type name or dict of column -> type, default None
Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32} (unsupported with engine='python'). Use str or object to preserve and not interpret dtype.

New in version 0.20.0: support for the Python parser.
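As a small illustration of "use str or object to preserve and not interpret dtype", the sketch below reads the same text twice, once with type inference and once with dtype={'code': str}. The column names and values are hypothetical and chosen only to make the difference visible.

```python
import io

import pandas as pd

# Hypothetical data: stock codes with leading zeros and a price column.
csv_text = "000001,6.65\n600009,24.11\n"

# Default (dtype=None): pandas infers the types, so the codes become integers.
inferred = pd.read_csv(io.StringIO(csv_text), names=['code', 'price'])

# dtype={'code': str}: the codes are kept exactly as written in the file.
preserved = pd.read_csv(
    io.StringIO(csv_text), names=['code', 'price'], dtype={'code': str}
)

print(inferred['code'].tolist())   # [1, 600009]
print(preserved['code'].tolist())  # ['000001', '600009']
```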

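With the default of None, pandas infers the type of each column by itself, and in this case the columns were kept as 'str' / 'object' instead of being converted to numbers.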

So all you need to do is set the 'dtype' parameter when reading the file:

df = pd.read_csv(output_file, encoding='gb2312', names=['a', 'b', 'c'], dtype={'b': np.float64})
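A self-contained version of this fix might look like the following sketch. It assumes a comma-separated file with clean numeric values and uses an in-memory string in place of output_file; the rows are taken from the data listed at the top of the article.

```python
import io

import numpy as np
import pandas as pd

# In-memory stand-in for output_file (assumed to be comma-separated).
csv_text = (
    "Dongfeng Motor (sh600006),74.25,1.74\n"
    "China Guomao (sh600007),26.38,2.66\n"
    "Pudong Development Bank (sh600000),6.65,0.96\n"
)

# Force column b to float64 instead of relying on type inference.
df = pd.read_csv(
    io.StringIO(csv_text),
    names=['a', 'b', 'c'],
    dtype={'b': np.float64},
)

print(df.dtypes)

# The numeric comparison now works and keeps the rows where b > 20.
big = df[df['b'] > 20]
print(big)
```

Note that if the column contains a value that cannot be parsed as a float, read_csv() raises an error when an explicit numeric dtype is given. In that case, converting after reading with pd.to_numeric(df['b'], errors='coerce') is one possible alternative, which is not part of the original fix.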

That is everything I have to share about solving this problem with reading files through pandas read_csv(). I hope it serves as a useful reference, and I hope you will continue to support me.