SoFunction
Updated on 2024-11-15

Example: converting Python's \u-escaped output to Chinese characters

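I crawled rental listings from the Piggy short-term rental website, but the Chinese text in the output came out as \u escape sequences:

[Image: Python output showing \u escape sequences instead of Chinese characters]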

After searching on Baidu, it turns out that encoding in Python 2.7 on Windows really is a pitfall!

The solution is as follows.

If the data is a dictionary, first convert it to a string, which requires importing the json library.

Then print it like this: print(json.dumps(data).decode("unicode-escape"))
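The call above is Python 2.7 specific, because in Python 3 a str has no decode method. As a rough sketch of the same idea in Python 3, assuming a made-up sample record (the values below are not from the crawled site), the escaped JSON string can be encoded to bytes and decoded with unicode-escape, or the escaping can be avoided altogether with ensure_ascii=False:

```python
import json

# Hypothetical sample record; the values are made up for illustration.
data = {'address': '北京', 'Price': '298'}

# By default json.dumps escapes every non-ASCII character as \uXXXX.
escaped = json.dumps(data)

# Python 3 counterpart of the Python 2 call str.decode("unicode-escape"):
# go through ASCII bytes first, then decode the escape sequences.
decoded = escaped.encode('ascii').decode('unicode-escape')

# Alternative: tell json.dumps not to escape non-ASCII characters at all.
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(decoded)
print(readable)
```

Of the two, ensure_ascii=False is usually the simpler choice, since unicode-escape also rewrites other backslash sequences that may appear in the JSON text.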

The complete code demo:

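# -*- coding: UTF-8 -*-
# Piggy crawler (Python 2.7)
import json

import requests
from bs4 import BeautifulSoup


def get_xinxi(i):
    # The site host is not given here; prepend it to this path before running.
    url = '/search-duanzufang-p%d-0/' % i
    html = requests.get(url)
    # Parse the page; 'html.parser' is the built-in parser (any installed parser works)
    soup = BeautifulSoup(html.text, 'html.parser')
    # Get addresses
    dizhis = soup.select(' div > a > span')
    # Get prices
    prices = soup.select(' span.result_price')
    # Get brief descriptions
    ems = soup.select(' div > em')
    for dizhi, price, em in zip(dizhis, prices, ems):
        data = {
            'Price': price.get_text(),
            'Information': em.get_text().replace('\n', '').replace(' ', ''),
            'address': dizhi.get_text()
        }
        # json.dumps escapes non-ASCII text as \uXXXX; decoding restores the Chinese
        print(json.dumps(data).decode("unicode-escape"))


i = 1
while i < 12:
    get_xinxi(i)
    i = i + 1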
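To make it easier to see which pages the loop requests, here is a small sketch that builds the same page paths with range. The helper name page_paths is hypothetical, and only the relative path from the demo is used, since the site host is not given:

```python
def page_paths(first, last):
    """Return the search-result paths for pages first..last (inclusive)."""
    return ['/search-duanzufang-p%d-0/' % page for page in range(first, last + 1)]


# The demo's loop runs while i < 12, which corresponds to pages 1 through 11.
paths = page_paths(1, 11)
print(len(paths))
print(paths[0])
print(paths[-1])
```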

This crawls 11 pages of listings (pages 1 through 11, since the loop stops once i reaches 12).

[Image: crawled listings printed as Chinese text]

Summary:

Points to note:

Creating the soup object

soup = BeautifulSoup(html.text, 'html.parser')

Iterating over several lists in parallel and unpacking multiple values

for dizhi,price,em in zip(dizhis,prices,ems):

Fixing the encoding of dictionary output

json.dumps(data).decode("unicode-escape")
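One thing worth knowing about this pattern is that zip stops at the shortest input, so a listing with a missing field silently shortens the result. A minimal sketch, using hypothetical plain strings in place of the scraped elements:

```python
# Hypothetical sample values standing in for the text of the scraped elements.
addresses = ['Chaoyang', 'Haidian', 'Dongcheng']
prices = ['298', '350']  # one price is missing on purpose
infos = ['1 bedroom', '2 bedrooms', 'studio']

listings = []
for address, price, info in zip(addresses, prices, infos):
    listings.append({'address': address, 'Price': price, 'Information': info})

# zip stops at the shortest input, so only two records are built.
print(len(listings))
```

If the three selectors can return lists of different lengths, comparing the lengths before zipping helps catch misaligned records.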

If you want the details of each listing, you can read the href attribute of its link.

#page_list > ul > li:nth-of-type(1) > a

Then read that attribute with get('href'), request each detail page, parse it for the information you want, and add that information to the data dictionary.
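The link-collection step can be sketched as follows. To keep the example free of third-party packages it uses the standard library's HTMLParser instead of BeautifulSoup; with BeautifulSoup the equivalent is calling get('href') on each element returned by soup.select(...). The HTML snippet, the paths in it, the class name HrefCollector, and the detail_url key are all hypothetical:

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change the depth.
VOID_TAGS = {'br', 'img', 'input', 'hr', 'meta', 'link'}


class HrefCollector(HTMLParser):
    """Collect href values of <a> tags nested inside the element with id="page_list"."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # greater than 0 while inside #page_list
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
            if tag == 'a' and attrs.get('href'):
                self.hrefs.append(attrs['href'])
        elif attrs.get('id') == 'page_list':
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1


# Hypothetical markup shaped like the selector above; the paths are made up.
sample_html = '''
<div id="page_list"><ul>
<li><a href="/fangzi/1.html"><img src="a.jpg"></a></li>
<li><a href="/fangzi/2.html"></a></li>
</ul></div>
<a href="/about">About</a>
'''

collector = HrefCollector()
collector.feed(sample_html)
detail_links = collector.hrefs

# Attach the detail link to a listing record before fetching the detail page.
data = {'address': 'example'}
data['detail_url'] = detail_links[0]
print(detail_links)
```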

That is all for this example of converting Python's \u-escaped output to Chinese. I hope it serves as a useful reference.