Developing on top of the underlying data is not difficult: the user's input variables become filter conditions, the parameters are mapped into a SQL statement, and the generated statement is executed against the database.
Finally, Qt is used to build the GUI: the user clicks to select filter conditions, and each button's signal triggers its bound slot function to carry out the corresponding operation.
The design in outline:
I. Database connection classes
Reads from and writes to the Oracle database using pandas.
II. Main function module
1) Input parameter module: external condition parameters are read in and mapped to the database key fields
   Note: when reading an external txt file, the filter fields may need key-value conversion.
2) SQL statement collection module: the business SQL statements to be executed are stored here in one place
3) Data processing function factory
4) Multi-threaded data extraction
I. Database connection classes
cx_Oracle is a Python extension module that serves as the Python driver for Oracle databases. It supports querying and updating Oracle databases through the database API common to all Python database access modules.
Pandas is built on NumPy and was developed for data analysis tasks. It incorporates a number of libraries and standard data models, and provides the classes and functions needed to manipulate large data sets efficiently.
Pandas offers three main functions for reading from a database: read_sql_table, read_sql_query, and read_sql.
This article introduces the use of the read_sql_query method in Pandas.
1: pd.read_sql_query()
Runs a custom SQL query and returns the result as a DataFrame. The SQL script passed in may also contain insert, delete and update statements.

pd.read_sql_query(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, chunksize=None)

sql: the SQL script to execute, as text
con: the database connection
index_col: the column to use as the index of the result set, given as a string or a list of strings
coerce_float: very useful; attempts to convert non-string, non-numeric values (such as decimal.Decimal) to float
parse_dates: converts columns of date-like strings to datetime data, similar to the pd.to_datetime function
params: the parameters passed into the SQL script; the officially supported types are list, tuple and dict. The placeholder syntax depends on the database driver.
chunksize: if an integer is given, a generator is returned and each item it yields contains that many rows

read_sql_query() accepts any SQL statement. DELETE, INSERT INTO and UPDATE statements return no rows: they are still executed in the database, but the program then raises an error (sqlalchemy's ResourceClosedError when the connection is a SQLAlchemy engine) and terminates. A SELECT returns its result. To keep the program running, catch the exception with try/except.

2: pd.read_sql_table()
Reads a whole database table by name and returns it as a DataFrame.

import pandas as pd
pd.read_sql_table(table_name, con, schema=None, index_col=None, coerce_float=True, parse_dates=None, columns=None, chunksize=None)

3: pd.read_sql()
Reads from the database given either a SQL script or a table name.

import pandas as pd
pd.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)
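As a minimal sketch of the params, parse_dates and chunksize arguments, the example below uses an in-memory SQLite database in place of Oracle so that it runs without a server. The table name readings and its columns are made up for illustration. SQLite's driver happens to accept the same :name placeholder style that appears later in this article; other drivers may differ.

```python
import sqlite3
import pandas as pd

# Hypothetical demo table; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("create table readings (buildid integer, recorddate text, value real)")
conn.executemany(
    "insert into readings values (?, ?, ?)",
    [(1, "2021-04-02", 10.5), (2, "2021-04-15", 7.0), (1, "2021-05-03", 3.2)],
)

sql = """select * from readings
         where recorddate >= :Start_time and recorddate < :End_time
         order by recorddate asc"""
sparm = {"Start_time": "2021-04-01", "End_time": "2021-05-01"}

# Whole result as one DataFrame, with the date column parsed to datetime.
df = pd.read_sql_query(sql, conn, params=sparm, parse_dates=["recorddate"])

# With chunksize, an iterator of DataFrames is returned instead.
chunks = list(pd.read_sql_query(sql, conn, params=sparm, chunksize=1))

conn.close()
```

Only the two April rows match the date range, so df has two rows and the chunked call yields two one-row DataFrames.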
The following creates the connection class Oracle_DB for connecting to an Oracle database.
It provides two main methods for querying data: search_one and search_all.
import os
import cx_Oracle
# Pandas read/write manipulation of Oracle databases
import pandas as pd

# Avoid garbled text caused by encoding issues
os.environ['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.UTF8'


class Oracle_DB(object):
    def __init__(self):
        try:
            # Connect to Oracle
            # Method 1: create_engine() provided by sqlalchemy
            # from sqlalchemy import create_engine
            # engine = create_engine('oracle+cx_oracle://username:password@ip:1521/ORCL')
            # Method 2: cx_Oracle.connect()
            self.conn = cx_Oracle.connect('username', 'password', 'ip:1521/database')
        except cx_Oracle.Error as e:
            print("Error: %s" % e)
            exit()

    # Query with bind parameters
    def search_one(self, sql, sparm):
        try:
            # Run the SQL statement; sparm supplies the bind parameter values
            df = pd.read_sql_query(sql, self.conn, params=sparm)
            self.conn.close()
        except Exception as e:
            return "Error " + str(e)
        return df

    # Query without parameters
    def search_all(self, sql):
        try:
            # Run the SQL statement and fetch the full result
            df = pd.read_sql_query(sql, self.conn)
            self.conn.close()
        except Exception as e:
            return "Error " + str(e)
        return df
II. Data Extraction Master Function Module
1) External input parameter module
The input is a txt file containing a single column of data. The first line holds the column name and is skipped when reading.
# Build the BUILDCODE -> ID dictionary
def buildid():
    sqlid = """select * from b_build_info"""
    db = Oracle_DB()  # Instantiate an object
    b_build_info = db.search_all(sqlid)
    ID_bUILDCODE = b_build_info.set_index("BUILDCODE")["ID"].to_dict()
    return ID_bUILDCODE


# Read the list of builds to export from a text file
def read_task_list():
    build_code = buildid()
    tasklist = []
    is_first_line = True
    with open("./b_lst.txt") as lst:
        for line in lst:
            if is_first_line:
                # Skip the column-name line
                is_first_line = False
                continue
            tasklist.append(build_code.get(line.strip('\n')))  # Key-value pair conversion
    return tasklist
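The read-and-map step can also be sketched without a database, which makes it easier to test. In the version below the mapping is passed in as an argument; the function name load_task_ids and the sample codes are hypothetical. Unlike read_task_list(), this sketch drops blank lines and unknown codes instead of adding None to the list, which is a design choice and not something the original does.

```python
import os
import tempfile


def load_task_ids(path, code_to_id):
    """Read one build code per line, skip the header, map codes to IDs."""
    task_ids = []
    with open(path, encoding="utf-8") as fh:
        next(fh, None)  # skip the column-name line
        for line in fh:
            code = line.strip()
            if not code:
                continue  # ignore blank lines
            if code in code_to_id:
                task_ids.append(code_to_id[code])
    return task_ids


# Demo with a temporary file and a made-up mapping.
code_to_id = {"B001": 11, "B002": 12}
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("BUILDCODE\nB001\n\nB999\nB002\n")
task_ids = load_task_ids(tmp.name, code_to_id)
os.remove(tmp.name)
print(task_ids)
```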
2) Collection of business sql statements
Note that the {0} after "in" is not quoted: the list of IDs is formatted into the statement as a tuple. The date range is passed through the params argument as sparm = {'Start_time':'2021-04-01','End_time':'2021-05-01'}, which can be changed as required.
def sql_d(lst):
    # Monthly data
    sql_d_energy_item_month = """select * from d_energy_item_month
        where recorddate >= to_date(:Start_time, 'yyyy-MM-dd')
        and recorddate < to_date(:End_time, 'yyyy-MM-dd')
        and buildid in {0}
        order by recorddate asc""".format(lst)
    # Monthly data
    sql_d_energy_month = """select d.*, from d_energy_month d
        join t_device_info t on =
        where >= to_date(:Start_time, 'yyyy-MM-dd')
        and < to_date(:End_time, 'yyyy-MM-dd')
        and = '{0}'
        order by asc""".format(lst)
    # Query the data of the current day
    sql_energy_item_hour_cheak = """select * from d_energy_item_hour
        where trunc(sysdate)=trunc(recorddate)
        order by recorddate asc""".format(lst)
    # Some of the SQL statements are omitted here
    sql_collection = [sql_d_energy_item_month,
                      sql_d_energy_item_day,
                      sql_d_energy_item_hour,
                      sql_d_energy_month,
                      sql_d_energy_day,
                      sql_d_energy_hour,
                      sql_energy_hour_cheak]
    return sql_collection
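Formatting a Python tuple into the statement works for two or more IDs, but a one-element tuple renders as (5,), which is not valid SQL, and string formatting bypasses bind variables. One possible alternative, sketched here on the assumption that the IDs arrive as a list, is to generate one named placeholder per ID and merge the values into the params dictionary. The helper name in_clause and the placeholder names id0, id1, ... are made up for this example.

```python
def in_clause(column, values, prefix="id"):
    """Return a 'col in (:id0, :id1, ...)' fragment and its bind dict."""
    if not values:
        raise ValueError("values must not be empty")
    names = ["{0}{1}".format(prefix, i) for i in range(len(values))]
    fragment = "{0} in ({1})".format(column, ", ".join(":" + n for n in names))
    binds = dict(zip(names, values))
    return fragment, binds


fragment, binds = in_clause("buildid", [11, 12, 13])
sparm = {"Start_time": "2021-04-01", "End_time": "2021-05-01"}
sparm.update(binds)  # merge the ID binds with the date-range binds

sql = """select * from d_energy_item_month
         where recorddate >= to_date(:Start_time, 'yyyy-MM-dd')
           and recorddate < to_date(:End_time, 'yyyy-MM-dd')
           and {0}
         order by recorddate asc""".format(fragment)
print(fragment)
```

The resulting sql and sparm can be handed to search_one() unchanged. Oracle limits an IN list to 1000 expressions, so very long ID lists would still need to be split into batches.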
3) Business data processing
The post-processing applied to the raw business data is not described here:
def db_extranction(lst, sparm, sql_type):
    """sql_type: index of the SQL statement to execute"""
    sql_ = sql_d(lst)[sql_type]  # Select the SQL statement
    db = Oracle_DB()  # Instantiate an object
    res = db.search_one(sql_, sparm)
    # Data processing
    RES = Data_item_factory(res)  # Omitted here
    # res = db.search_all(sql_d_energy_item_month)
    print(RES)
    return RES
4) Multi-threaded data extraction
The entries in tasklist are extracted using multiple threads.
import threading
# Pandas read/write manipulation of Oracle databases
from tools.Data_Update_oracle import Oracle_DB
import pandas as pd
from concurrent import futures

if __name__ == '__main__':
    # External input
    tasklist = read_task_list()
    print(tasklist)
    # Time range parameters for the query; can be modified manually
    sparm = {'Start_time': '2021-04-01', 'End_time': '2021-05-01'}
    lst = tuple(list(tasklist))
    # Index of the business SQL statement; can be modified manually
    sql_type = 0
    # Extract everything
    db_extranction(lst, sparm, sql_type)
    # Multi-threaded batch extraction by field
    # Method 1: create threads with the Thread class of the threading module
    # threads = [threading.Thread(target=db_extranction, args=(lst, sparm, sql_type)) for lst in tasklist]
    # [threads[i].start() for i in range(len(threads))]
    # Method 2: use concurrent.futures, the wrapper built on threading
    # (part of the standard library in Python 3)
    # with futures.ThreadPoolExecutor(len(tasklist)) as executor:
    #     executor.map(lambda lst: db_extranction(lst, sparm, sql_type), tasklist)
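The thread-pool variant (Method 2) can be tried without a database by swapping in a stand-in for db_extranction. In the sketch below, fake_extract is a hypothetical placeholder that only returns a label, and the cap of 4 workers is an arbitrary assumption; the point is the calling pattern, where functools.partial fixes the shared arguments and executor.map varies only the build ID.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def fake_extract(lst, sparm, sql_type):
    """Stand-in for db_extranction: returns a label instead of querying."""
    return "build {0}, sql {1}, from {2}".format(lst, sql_type, sparm["Start_time"])


tasklist = [11, 12, 13]
sparm = {"Start_time": "2021-04-01", "End_time": "2021-05-01"}
sql_type = 0

# Fix the shared arguments so map() only varies the build ID.
worker = partial(fake_extract, sparm=sparm, sql_type=sql_type)

# Cap the pool size; one thread per task could overload the database.
with ThreadPoolExecutor(max_workers=min(4, len(tasklist))) as executor:
    results = list(executor.map(worker, tasklist))  # results keep input order
```

Because executor.map returns results in the order of its inputs, results[i] always corresponds to tasklist[i], even when the threads finish out of order.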
At this point the development of the database data-retrieval tool is complete. The only remaining step is to share it with colleagues. Packaging it as a GUI application and building a standalone Python environment for a quick release are not covered in detail here.