Quantitative factors are usually measured by simulating trades and calculating various metrics. The libraries involved are:
- Third-party libraries used for measurement: numpy, pandas, talib
- Third-party libraries used for plotting: matplotlib, seaborn
Other libraries can be added as the strategy requires.
Factor measurement framework
Here I share the process I often use for measurement, in the hope that we can improve together!
The whole chain from factor to return is: strategy (factor combination) -> buy and sell signals -> buy and sell points -> return
We therefore run the following measurement on each individual stock:
1. Pre-processing of stock data
First, here are the commonly used imports: the libraries for measurement and for plotting, plus the settings that fix Chinese characters being rendered as blank boxes in figures.

    # For measurement
    import numpy as np
    import pandas as pd
    from copy import deepcopy
    from tqdm import tqdm
    from datetime import datetime
    import talib

    # For plotting
    import matplotlib as mpl
    import matplotlib.pyplot as plt
    import seaborn as sns
    %matplotlib inline  # Jupyter magic; remove when running as a plain script

    # Make plots display Chinese
    sns.set()  # call name lost in extraction; sns.set() is assumed here
    plt.rcParams["figure.figsize"] = (20, 10)
    plt.rcParams['font.sans-serif'] = ['Arial Unicode MS']  # A font that supports Chinese
    plt.rcParams['axes.unicode_minus'] = False  # Stop the minus sign '-' being drawn as a square

    # Other
    import warnings
    warnings.filterwarnings("ignore")
Next, loop over the data files and read each stock:

    import os

    def readfile(path, limit=None):
        """Return the paths of the files directly under path (at most limit of them)."""
        files = os.listdir(path)
        file_list = []
        for file in files:  # Traverse the folder
            full_path = os.path.join(path, file)
            if not os.path.isdir(full_path):  # Test the full path, not the bare file name
                file_list.append(path + '/' + file)
        if limit:
            return file_list[:limit]
        return file_list

    stock_dict = {}
    for _file in tqdm(readfile("../data/stock_data")):
        if not _file.endswith(".pkl"):
            continue
        # TODO Add a filter here if only some stocks should enter the measured stock pool
        file_df = pd.read_pickle(_file)
        file_df.set_index(["Date"], inplace=True)
        file_df.index.name = ""
        file_df.index = pd.to_datetime(file_df.index)
        # The keys must match the column names in your own data files
        file_df.rename(columns={'Opening':'open',"Closing.":"close","Highest.":"high","Minimum.":"low","Volume.":"volume"}, inplace=True)
        stock_code = _file.split("/")[-1].replace(".pkl", '')
        # TODO Slice by date here to keep only part of the data
        stock_dict[stock_code] = file_df
The above section processes the stock data, and the processed data is stored in the stock_dict variable, where the key is the stock code and the value is the stock data.
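The second TODO in the loading loop mentions keeping only part of the data by date. A minimal sketch of one way to do that, assuming the index is already a sorted DatetimeIndex as in the loading code; the synthetic prices and the start_date / end_date values are made up for illustration:

```python
import numpy as np
import pandas as pd

# Synthetic daily data standing in for one entry of stock_dict (values are made up)
dates = pd.date_range("2015-01-01", "2016-12-31", freq="D")
file_df = pd.DataFrame({"close": np.linspace(10.0, 20.0, len(dates))}, index=dates)

# Hypothetical window; label slicing on a sorted DatetimeIndex includes both endpoints
start_date, end_date = "2015-06-01", "2015-12-31"
sliced_df = file_df.loc[start_date:end_date]

print(sliced_df.index.min(), sliced_df.index.max(), len(sliced_df))
```

In the loading loop, the same slice would be applied to file_df just before it is stored in stock_dict.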
2. Indicator measurement
To measure the indicators, take one stock at a time:

    for _index, _stock_df in tqdm(stock_dict.items()):
        measure_df = deepcopy(_stock_df)
In the code:
- measure_df is the DataFrame to be measured
- deepcopy is used so that the measurement process cannot modify the original data.
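A small check of why the copy matters, using a made-up two-row DataFrame: changes made to the copy (a new column, an overwritten price) do not reach the original.

```python
from copy import deepcopy

import pandas as pd

# Made-up data standing in for one stock's DataFrame
original_df = pd.DataFrame({"close": [1.0, 2.0]})

measure_df = deepcopy(original_df)
measure_df["signal"] = [True, False]   # new column only on the copy
measure_df.loc[0, "close"] = 99.0      # overwrite a value only on the copy

print(original_df.columns.tolist())    # the original still has only 'close'
print(original_df.loc[0, "close"])     # and its first price is unchanged
```

For a DataFrame, measure_df = original_df.copy() gives the same protection, since copy() is deep by default.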
We can then loop over each row of this stock (one row per trading day) and apply the following trading rules:
- Buy rule: a buy signal is issued and there is no current position, so buy
- Sell rule: a sell signal is issued and there is a current position, so sell

    # Start measuring
    trade_record_list = []
    this_trade: dict = None
    for _mea_i, _mea_series in measure_df.iterrows():  # Loop over every day
        # 'buy_signal' and 'sell_signal' are placeholder boolean columns;
        # substitute your own strategy's conditions
        if _mea_series['buy_signal']:
            if this_trade is None:  # No current position, so buy
                this_trade = {
                    "buy_date": _mea_i,
                    "close_record": [_mea_series['close']],
                }
        elif _mea_series['sell_signal']:
            if this_trade is not None:  # Holding a position, so sell
                this_trade['sell_date'] = _mea_i
                this_trade['close_record'].append(_mea_series['close'])
                trade_record_list.append(this_trade)
                this_trade = None
        else:
            if this_trade is not None:  # Holding a position, record the close
                this_trade['close_record'].append(_mea_series['close'])
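To see the loop run end to end, here is a self-contained sketch on made-up prices. The signal rule (close above or below its 3-day simple moving average) is only an illustrative assumption, not the strategy from this article, and the column names buy_signal / sell_signal and the function simulate_trades are hypothetical. Unlike the loop above, this variant also records the close on days when a buy signal repeats while a position is already held.

```python
import pandas as pd

def simulate_trades(measure_df):
    """Return the list of completed trades (buy -> hold -> sell)."""
    trade_record_list = []
    this_trade = None
    for _mea_i, _mea_series in measure_df.iterrows():
        if _mea_series["buy_signal"] and this_trade is None:
            # Buy signal and no position: open a trade
            this_trade = {"buy_date": _mea_i, "close_record": [_mea_series["close"]]}
        elif _mea_series["sell_signal"] and this_trade is not None:
            # Sell signal while holding: close the trade and save it
            this_trade["sell_date"] = _mea_i
            this_trade["close_record"].append(_mea_series["close"])
            trade_record_list.append(this_trade)
            this_trade = None
        elif this_trade is not None:
            # Holding with nothing to execute: just record the close
            this_trade["close_record"].append(_mea_series["close"])
    return trade_record_list

# Made-up closing prices on consecutive calendar days
closes = [10.0, 9.0, 8.0, 9.0, 11.0, 12.0, 11.0, 9.0, 8.0, 10.0, 12.0]
measure_df = pd.DataFrame(
    {"close": closes},
    index=pd.date_range("2015-01-01", periods=len(closes), freq="D"),
)

# Illustrative rule only: compare the close with its 3-day simple moving average
sma3 = measure_df["close"].rolling(3).mean()
measure_df["buy_signal"] = measure_df["close"] > sma3
measure_df["sell_signal"] = measure_df["close"] < sma3

trade_record_list = simulate_trades(measure_df)
print(trade_record_list)
```

On this data one trade completes (bought on 2015-01-04, sold on 2015-01-07); the position opened on 2015-01-10 is still open when the data ends, so it is not recorded.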
In the code above, every complete transaction (buy -> hold -> sell) is saved in the trade_record_list variable. Each complete transaction is recorded like this:

    {
        'buy_date': Timestamp('2015-08-31 00:00:00'),  # Time of purchase
        'close_record': [41.1, 42.0, 40.15, 40.65, 36.6, 32.97],  # Record of closing prices
        'sell_date': Timestamp('2015-10-12 00:00:00'),  # Time of sale
        # TODO custom metrics can also be recorded here
    }
3. Organization of measurements
Use pd.DataFrame(trade_record_list) directly to view all the transaction results.
Organizing the results is also fairly simple: loop over each trade and calculate the desired metrics. For example, the annualized return of a single trade can be computed like this:

    trade_record_df = pd.DataFrame(trade_record_list)
    for _i, _trade_series in trade_record_df.iterrows():
        closes = _trade_series['close_record']
        holding_days = (_trade_series['sell_date'] - _trade_series['buy_date']).days
        # Annualized return: return of the trade, scaled linearly to 365 days
        trade_record_df.loc[_i, 'Annualized rate of return'] = (closes[-1] - closes[0]) / closes[0] / holding_days * 365
        # TODO Add more metrics here based on the results you want
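As a worked check of the formula, the sketch below applies it to the sample trade record shown earlier. The helper name annualized_return and the column name annualized_return are hypothetical; the calculation is the same simple linear scaling, not a compounded rate.

```python
import pandas as pd

def annualized_return(trade):
    """Return of one trade, scaled linearly from its holding period to 365 days."""
    closes = trade["close_record"]
    holding_days = (trade["sell_date"] - trade["buy_date"]).days
    total_return = (closes[-1] - closes[0]) / closes[0]
    return total_return / holding_days * 365

# The sample trade record from the article
trade_record_list = [{
    "buy_date": pd.Timestamp("2015-08-31"),
    "close_record": [41.1, 42.0, 40.15, 40.65, 36.6, 32.97],
    "sell_date": pd.Timestamp("2015-10-12"),
}]

trade_record_df = pd.DataFrame(trade_record_list)
# apply(axis=1) passes each row to the helper, replacing the explicit loop
trade_record_df["annualized_return"] = trade_record_df.apply(annualized_return, axis=1)
print(trade_record_df["annualized_return"].round(4).tolist())
```

The trade lost about 19.8% over 42 days, which scales to roughly -1.72 (that is, -172%) per year. Linear scaling exaggerates short, large moves like this one, so such figures are best compared across trades rather than read as achievable yearly returns.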
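4. Plotting the results
Plotting code is usually fairly fixed. For example, a win-rate chart (total_measure_record holds the win rates collected during measurement):

    # Clear the drawing cache
    plt.cla()
    plt.clf()
    # Start plotting
    plt.figure(figsize=(10, 14), dpi=100)
    # Use seaborn to draw the win rates as a heatmap
    fig = sns.heatmap(pd.DataFrame(total_measure_record).round(2), annot=True, cmap="RdBu_r", center=0.5)
    plt.title("Winning percentage chart")
    scatter_fig = fig.get_figure()
    # Save locally (matplotlib adds its default extension when none is given)
    scatter_fig.savefig("Winning percentage chart")
    scatter_fig.show()  # Finally, display it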
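The article does not show how total_measure_record is built. One possible way to produce a win-rate table of the right shape is sketched below, assuming a win means a trade with a positive return; the stock codes, years, returns, and column names are all made up for illustration.

```python
import pandas as pd

# Hypothetical completed trades for two made-up stock codes
trades = pd.DataFrame({
    "stock_code": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "BBB"],
    "buy_year": [2015, 2015, 2016, 2015, 2016, 2016, 2016],
    "trade_return": [0.05, -0.02, 0.10, -0.01, 0.03, 0.04, -0.06],
})

# A trade counts as a win when its return is positive (1.0 = win, 0.0 = loss)
trades["is_win"] = (trades["trade_return"] > 0).astype(float)

# Win rate = mean of the win flags per (year, stock) cell
win_rate_table = trades.pivot_table(
    index="buy_year", columns="stock_code", values="is_win", aggfunc="mean"
)
print(win_rate_table.round(2))
```

A table like win_rate_table can stand in for pd.DataFrame(total_measure_record) in the heatmap call. With center=0.5 and a diverging colormap such as RdBu_r, cells above and below a 50% win rate get opposite colors.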