SoFunction
Updated on 2024-11-21

Implementing a self-service data extraction tool in Python

Building on the underlying data is not difficult: the user's input variables serve as filter conditions, the parameters are mapped into a SQL statement, and the generated statement is executed against the database.
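A minimal sketch of that mapping, assuming a hypothetical `build_query` helper and using the standard-library `sqlite3` module as a stand-in for Oracle (both accept `:name` bind variables). The table contents are made up for illustration.

```python
import sqlite3


def build_query(table, filters):
    """Turn a dict of user-selected filters into SQL text plus bind parameters.

    `table` and the filter keys must come from a fixed whitelist, because
    identifiers cannot be passed as bind variables.
    """
    clauses = ["{0} = :{0}".format(col) for col in filters]
    sql = "select * from " + table
    if clauses:
        sql += " where " + " and ".join(clauses)
    return sql, dict(filters)


# In-memory stand-in database with made-up rows
con = sqlite3.connect(":memory:")
con.execute("create table b_build_info (id integer, buildcode text, city text)")
con.executemany(
    "insert into b_build_info values (?, ?, ?)",
    [(1, "A001", "Shanghai"), (2, "A002", "Beijing"), (3, "A003", "Shanghai")],
)

# The filter values travel as bind parameters, never as part of the SQL text
sql, params = build_query("b_build_info", {"city": "Shanghai"})
rows = con.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Only the filter values are user-controlled here; the column names are formatted into the statement, so they should be checked against a known list before the helper is called.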

Finally, Qt is used to build a GUI: the user clicks through the interface to choose filter conditions, and the buttons' signals trigger the bound slot functions, which execute the query.

The approach in detail:

I. Database connection classes

Reading from and writing to the Oracle database is done through pandas.

II. Main function module

1) Input parameter module: takes the externally supplied condition parameters and maps them to the database key fields

--Note: when the filter fields are read from an external txt file, a key-value conversion may be needed.

2) SQL statement collection module: the business SQL statements to be executed are stored here in one place

3) Data processing function factory

4) Extracting data using multiple threads

I. Database connection classes

cx_Oracle is a Python extension module that acts as the Oracle database driver for Python. It lets you query and update Oracle databases through the database API common to all Python database access modules.

Pandas is built on NumPy and was developed for data analysis tasks. It brings together a number of libraries and standard data models, and provides the classes and functions needed to manipulate large data sets efficiently.

pandas offers three main functions for reading from a database: read_sql_table, read_sql_query, and read_sql.

This article introduces the use of the read_sql_query method in Pandas.

1: pd.read_sql_query()
Reads custom data through a SQL script and returns it as a DataFrame; the script may be a query or an insert, delete or update statement.
pd.read_sql_query(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, chunksize=None)
sql: the SQL script to execute, as text
con: the database connection
index_col: the column(s) of the result set to use as the index, given as a string or a list of strings
coerce_float: very useful; attempts to convert non-string, non-numeric values (such as decimal.Decimal) to floating point
parse_dates: converts columns of date strings to datetime data, similar to the pd.to_datetime function
params: the parameters passed into the SQL script; the officially supported types are list, tuple and dict. The syntax for passing parameters depends on the database driver.
chunksize: if an integer is given, a generator is returned that yields that many rows per chunk

read_sql_query() accepts any SQL statement, but only SELECT returns a result. DELETE, INSERT INTO and UPDATE have no return value (although they are still executed in the database), so the program raises an exception (with a SQLAlchemy connection, ResourceClosedError) and terminates. To keep the program running, catch the exception with try/except.
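The parameters above can be tried without an Oracle server. The following is a small sketch that assumes an in-memory `sqlite3` database as a stand-in (pandas accepts a `sqlite3` connection directly, and `sqlite3` supports the same `:name` placeholder style used later in this article). The table layout and rows are invented for the example.

```python
import sqlite3

import pandas as pd

# In-memory stand-in for the real database, with made-up rows
con = sqlite3.connect(":memory:")
con.execute("create table d_energy_item_month (buildid integer, recorddate text, value real)")
con.executemany(
    "insert into d_energy_item_month values (?, ?, ?)",
    [
        (1, "2021-03-31", 10.0),
        (1, "2021-04-01", 11.5),
        (2, "2021-04-15", 7.25),
        (2, "2021-04-30", 8.0),
    ],
)
con.commit()

sql = """select * from d_energy_item_month
         where recorddate >= :Start_time and recorddate < :End_time
         order by recorddate asc"""
sparm = {"Start_time": "2021-04-01", "End_time": "2021-05-01"}

# params fills the :name placeholders; parse_dates converts the text column
df = pd.read_sql_query(sql, con, params=sparm, parse_dates=["recorddate"])

# With chunksize the call returns an iterator of DataFrames instead
chunk_sizes = [len(chunk) for chunk in pd.read_sql_query(sql, con, params=sparm, chunksize=2)]

print(df)
print(chunk_sizes)
```

Reading in chunks keeps memory use bounded when a query returns more rows than comfortably fit in one DataFrame.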
 
2: pd.read_sql_table()
Reads a whole table from the database by table name and returns it as a DataFrame.
import pandas as pd
pd.read_sql_table(table_name, con, schema=None, index_col=None, coerce_float=True, parse_dates=None, columns=None, chunksize=None)

3: pd.read_sql()
Reads from the database using either a SQL script or a table name.
import pandas as pd
pd.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)

The following creates the connection class Oracle_DB for connecting to an Oracle database.

It has two main methods for querying data.

import os

import cx_Oracle
# pandas is used to read from and write to the Oracle database
import pandas as pd

# Avoid garbled characters caused by encoding issues
os.environ['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.UTF8'


class Oracle_DB(object):
    def __init__(self):
        try:
            # Connect to Oracle
            # Method 1: create_engine() provided by sqlalchemy
            # from sqlalchemy import create_engine
            # engine = create_engine('oracle+cx_oracle://username:password@ip:1521/ORCL')
            # Method 2: cx_Oracle.connect()
            # (the attribute name was lost in the source; self.connect is assumed)
            self.connect = cx_Oracle.connect('username', 'password', 'ip:1521/database')

        except cx_Oracle.Error as e:
            print("Error: %s" % e)
            exit()

    # Query with bind parameters
    def search_one(self, sql, sparm):
        try:
            # Run the SQL statement and load the result into a DataFrame
            # sparm -- values for the bind parameters in the statement
            df = pd.read_sql_query(sql, self.connect, params=sparm)

            # (the call was lost in the source; a commit on the connection is assumed)
            self.connect.commit()

        except Exception as e:
            return "Error " + str(e)

        return df

    # Query without parameters
    def search_all(self, sql):
        try:
            # Run the SQL statement and load the result into a DataFrame
            df = pd.read_sql_query(sql, self.connect)

            # (the call was lost in the source; a commit on the connection is assumed)
            self.connect.commit()

        except Exception as e:
            return "Error " + str(e)

        return df

II. Main function module for data extraction


1) External input parameter module

The input is a txt file containing a single column of data. Its first line is the column name and is skipped when the file is read.

# Build a dictionary mapping building code to ID
def buildid():
    sqlid = """select * from b_build_info"""
    db = Oracle_DB()  # Instantiate an object
    b_build_info = db.search_all(sqlid)
    ID_bUILDCODE = b_build_info.set_index("BUILDCODE")["ID"].to_dict()
    return ID_bUILDCODE


# Read the list of items to export from a text file
def read_task_list():
    build_code = buildid()
    tasklist = []
    is_first_line = True
    with open("./b_lst.txt") as lst:
        for line in lst:
            if is_first_line:
                is_first_line = False
                continue
            # Key-value conversion: building code -> ID
            tasklist.append(build_code.get(line.strip('\n')))
    return tasklist
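The header skipping and key-value conversion can be checked in isolation. The sketch below assumes a hypothetical `load_task_ids` helper that takes the lines and the code-to-ID dictionary as arguments, so it runs without a database; the codes and IDs are invented. Unlike a plain `dict.get`, it skips unknown codes rather than putting `None` into the task list.

```python
import io


def load_task_ids(lines, code_to_id):
    """Skip the header line, then map each building code to its ID."""
    tasklist = []
    for line_no, line in enumerate(lines):
        code = line.strip()
        if line_no == 0 or not code:
            continue  # header line or blank line
        if code in code_to_id:
            tasklist.append(code_to_id[code])
        else:
            print("unknown building code skipped:", code)
    return tasklist


# Made-up mapping and file contents standing in for b_build_info and b_lst.txt
code_to_id = {"A001": 101, "A002": 102}
sample_file = io.StringIO("BUILDCODE\nA001\nA002\n\nZ999\n")

tasklist = load_task_ids(sample_file, code_to_id)
print(tasklist)
```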

2) Collection of business sql statements

Note that the {0} after in is not quoted because it is filled in with a tuple, while the date range is passed through the params argument as sparm.

sparm = {'Start_time':'2021-04-01','End_time':'2021-05-01'}; these values can be changed as required.

def sql_d(lst):
    # Monthly data
    sql_d_energy_item_month = """select * from d_energy_item_month
           where recorddate >= to_date(:Start_time, 'yyyy-MM-dd')
           and recorddate < to_date(:End_time, 'yyyy-MM-dd')
           and  buildid  in {0}
           order by recorddate asc""".format(lst)

    # Monthly data
    # NOTE: the alias-qualified column names in this statement were lost in the source
    sql_d_energy_month = """select d.*, from d_energy_month d join t_device_info t on  = 
           where  >= to_date(:Start_time, 'yyyy-MM-dd')
           and  < to_date(:End_time, 'yyyy-MM-dd')
           and  = '{0}'
           order by  asc""".format(lst)

    # Query the data of the current day
    sql_energy_item_hour_cheak = """select * from d_energy_item_hour
            where trunc(sysdate)=trunc(recorddate)
            order by recorddate asc"""

    # Some of the SQL statements listed here are omitted from this article
    sql_collection = [sql_d_energy_item_month, sql_d_energy_item_day, sql_d_energy_item_hour, sql_d_energy_month,
                      sql_d_energy_day, sql_d_energy_hour, sql_energy_hour_cheak]
    return sql_collection
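One thing to watch with the `in {0}` formatting: a tuple with a single element is rendered with a trailing comma, which is not valid SQL. A possible alternative, sketched here with a hypothetical `in_clause` helper and `sqlite3` as a stand-in for Oracle, is to generate one bind variable per value.

```python
import sqlite3


def in_clause(column, values, prefix="b"):
    """Return an `IN (...)` fragment with one bind variable per value."""
    if not values:
        raise ValueError("values must not be empty")
    names = ["{0}{1}".format(prefix, i) for i in range(len(values))]
    fragment = "{0} in ({1})".format(column, ", ".join(":" + n for n in names))
    return fragment, dict(zip(names, values))


# The pitfall with str.format on a tuple: one element leaves a trailing comma
single = "buildid in {0}".format((5,))
several = "buildid in {0}".format((5, 6))

fragment, binds = in_clause("buildid", (5,))

# Made-up table to show the generated fragment running
con = sqlite3.connect(":memory:")
con.execute("create table d_energy_item_month (buildid integer, value real)")
con.executemany("insert into d_energy_item_month values (?, ?)", [(5, 1.0), (6, 2.0)])
rows = con.execute("select * from d_energy_item_month where " + fragment, binds).fetchall()

print(single)
print(several)
print(fragment, binds, rows)
```

The generated dictionary can be merged with the date parameters, for example `dict(sparm, **binds)`, and passed as `params` in a single call.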

3) Business data processing

The business data processing, that is, the post-processing of the raw data, is not described here:

def db_extranction(lst, sparm, sql_type):
    """sql_type - index of the SQL statement to run, as ordered in sql_d()"""
    sql_ = sql_d(lst)[sql_type]  # Select the SQL statement
    db = Oracle_DB()  # Instantiate an object
    res = db.search_one(sql_, sparm)
    # Post-process the raw data
    RES = Data_item_factory(res)  # Omitted here
    # res = db.search_all(sql_d_energy_item_month)
    print(RES)
    return RES
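Since the article omits `Data_item_factory`, the following is only a hypothetical illustration of what a "data processing function factory" could look like: a registry that maps each `sql_type` index to a post-processing function. All names, handlers and records here are assumptions, and plain dictionaries stand in for DataFrame rows.

```python
def sort_by_date(records):
    """Example handler: order records by their date field."""
    return sorted(records, key=lambda r: r["recorddate"])


def total_by_building(records):
    """Example handler: sum the value field per building."""
    totals = {}
    for r in records:
        totals[r["buildid"]] = totals.get(r["buildid"], 0) + r["value"]
    return totals


# Hypothetical registry: sql_type index -> post-processing function
HANDLERS = {0: sort_by_date, 1: total_by_building}


def data_factory(sql_type):
    """Return the handler registered for sql_type, or a pass-through."""
    return HANDLERS.get(sql_type, lambda records: records)


records = [
    {"buildid": 2, "recorddate": "2021-04-02", "value": 3.0},
    {"buildid": 1, "recorddate": "2021-04-01", "value": 1.5},
    {"buildid": 2, "recorddate": "2021-04-03", "value": 2.0},
]

ordered = data_factory(0)(records)
totals = data_factory(1)(records)
print(ordered[0])
print(totals)
```

Keeping the handlers in one dictionary means a new query type only needs a new entry, without touching the extraction function.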

4) Multi-threaded data extraction: the entries in tasklist are extracted in multiple threads

import threading
# pandas is used to read from and write to the Oracle database
from tools.Data_Update_oracle import Oracle_DB
import pandas as pd
from concurrent import futures

if __name__ == '__main__':
    # Externally supplied task list
    tasklist = read_task_list()
    print(tasklist)
    # Time range parameters for the query; can be modified manually
    sparm = {'Start_time': '2021-04-01', 'End_time': '2021-05-01'}
    lst = tuple(tasklist)

    # Serial number of the business SQL statement; can be modified manually
    sql_type = 0

    # Extract all
    db_extranction(lst, sparm, sql_type)

    # Multi-threaded batch extraction, one task per entry
    # Method 1: create threads with the Thread class of the threading module
    # threads = [threading.Thread(target=db_extranction, args=(lst, sparm, sql_type)) for lst in tasklist]
    # [threads[i].start() for i in range(len(threads))]

    # Method 2: use Python's concurrent.futures library, a standard-library wrapper built on threading
    # with futures.ThreadPoolExecutor(len(tasklist)) as executor:
    #     executor.map(lambda lst: db_extranction(lst, sparm, sql_type), tasklist)
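The thread pool variant can be exercised without a database. In this sketch a hypothetical `extract` function stands in for `db_extranction` and simply returns its arguments; the task IDs are made up. The point it shows is that `executor.map` receives the callable itself plus the iterable of tasks, and calls the function once per task in the worker threads.

```python
from concurrent import futures
from functools import partial
import threading


def extract(build_id, sparm, sql_type):
    """Stand-in for db_extranction: returns a description instead of querying."""
    return (build_id, sql_type, sparm["Start_time"], threading.current_thread().name)


tasklist = [101, 102, 103]
sparm = {"Start_time": "2021-04-01", "End_time": "2021-05-01"}

# Fix the arguments shared by every task; only the building ID varies
worker = partial(extract, sparm=sparm, sql_type=0)

# Results come back in the order of tasklist, regardless of which thread finishes first
with futures.ThreadPoolExecutor(max_workers=len(tasklist)) as executor:
    results = list(executor.map(worker, tasklist))

print([r[0] for r in results])
```

In the article's code each call to `db_extranction` instantiates its own `Oracle_DB`, so the threads do not share a connection.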

At this point the development of the data extraction tool is complete. The last step is sharing it with colleagues; packaging it as a GUI application is not covered in detail here. Build an independent Python environment and you can release your application quickly.
