quantpylib.simulator.gene
quantpylib.simulator.gene houses powerful features for numerical computations involving market and non-market variables, including a no-code mathematical parser-evaluator that computes trading signals/indicators from formulaic, well-defined Python str objects. The parser-evaluator is exposed via the quantpylib.simulator.gene.Gene class APIs, which internally use a tree data structure to encode trading formulas. The quantpylib.simulator.gene.GeneticAlpha class leverages this parser-evaluator, together with the backtest engine provided by the quantpylib.simulator.alpha.Alpha class, to provide a no-code solution for backtesting trading strategies. The GeneticAlpha class extends the Alpha class and implements all the necessary methods for signal computation, forecast generation, position sizing, risk management, volatility targeting and backtest logic. All the performance metrics and hypothesis testing suites available to Alpha objects are also available to any GeneticAlpha instance via the same function signatures.
GeneticAlpha
(Bases: quantpylib.simulator.alpha.Alpha)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
genome | Gene or str | Genome representation as | required |
**kwargs | | Backtest parameters required to instantiate | {} |
Parameters in **kwargs are required, are passed into quantpylib.simulator.alpha.Alpha, and are as follows:
Parameters:
Name | Type | Description | Default |
---|---|---|---|
start | datetime | Start of backtest simulation. If not tz-aware, assumed UTC. If not given, assume min of OHLC dataset in dfs. | None |
end | datetime | End of backtest simulation. If not tz-aware, assumed UTC. If not given, assume max of OHLC dataset in dfs. | None |
dfs | dict | Mapping of inst : OHLCV/other DataFrames used for computations. Default is an empty dictionary. | {} |
instruments | list | List of traded instruments. | [] |
execrates | ndarray | Execution rates for each instrument. Default is None. | None |
commrates | ndarray | Commission rates for each instrument. Default is None. | None |
longswps | ndarray | Long annualized swap/funding rates for each instrument. Positive swaps mean long positions incur swap fees. Default is None. | None |
shortswps | ndarray | Short annualized swap/funding rates for each instrument. Positive swaps mean short positions incur swap fees. Default is None. | None |
granularity | Period | The granularity of each trading signal evaluation. Datapoints of lower granularity than specified are ignored. Last known datapoint of multiple entries in the same granularity interval is taken. Default is DAILY. | DAILY |
positional_inertia | float | Parameter controlling position change inertia. Default is 0. | 0 |
portfolio_vol | float | Target portfolio volatility. Default is 0.20, representing 20% annualized volatility. | 0.2 |
weekend_trading | bool | Indicates if there is weekend trading, such as in cryptocurrency markets. Defaults to False. | False |
around_the_clock | bool | Indicates if there is 24H trading, such as in cryptocurrency and fx markets. Defaults to False. | False |
currency_denomination | str | Currency denomination for the portfolio. Default is "USD". | 'USD' |
starting_capital | float | Amount to begin the backtest with. Defaults to 10000.0. | 10000.0 |
Notes:
execrates, commrates, longswps and shortswps are specified as decimals. For example, execrates = [0.001, 0.005, ...] encodes that 0.1% of the notional value transacted is deducted as execution costs for the first instrument, 0.5% for the second instrument, and so on.
commrates specify commissions in the same units (as a fraction of notional value), and the same applies to the overnight swap rates for both long and short positions.
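The decimal convention in the notes above can be illustrated with a short arithmetic sketch. The function `transaction_costs` and the sample notionals are hypothetical and are not part of quantpylib; the sketch only assumes that both rates are applied to the absolute notional value transacted.

```python
# Hypothetical inputs for two instruments; rates are decimals of notional traded.
execrates = [0.001, 0.005]               # 0.1% and 0.5% execution cost
commrates = [0.0005, 0.0005]             # 0.05% commission on both instruments
notional_traded = [10_000.0, -2_000.0]   # signed notional; sells are negative


def transaction_costs(notional, exec_rate, comm_rate):
    """Return (execution cost, commission) charged on the absolute notional traded."""
    size = abs(notional)
    return size * exec_rate, size * comm_rate


costs = [
    transaction_costs(n, e, c)
    for n, e, c in zip(notional_traded, execrates, commrates)
]
for (exec_cost, commission), n in zip(costs, notional_traded):
    print(f"notional={n:>10.2f}  execution={exec_cost:.2f}  commission={commission:.2f}")
```

Swap rates are left out of the sketch because their accrual convention is not specified in this section.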
run_simulation
async
Runs the entire backtest.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
verbose | bool | Optional flag to print out backtest simulation information at runtime. | False |
Returns:
Type | Description |
---|---|
DataFrame | A DataFrame containing backtest statistics. Contains information about contracts held throughout the backtest, portfolio weights, portfolio leverage, nominal exposure, execution costs, commissions, swaps, PnL, portfolio capital and so on. |
get_performance_measures
Computes the performance metrics for the trading strategy.
Returns:
Type | Description |
---|---|
dict | A dictionary containing various performance metrics: |
hypothesis_tests
async
Conducts Monte Carlo permutation p-value hypothesis tests on the performance of the trading strategy represented by the object instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_decision_shuffles | int | Number of decision shuffles for Monte Carlo permutation tests. Default is 1000. | 1000 |
num_data_shuffles | int | Number of data shuffles for permutation tests. Default is 10 (computationally expensive). | 10 |
Returns:
Type | Description |
---|---|
dict | A dictionary containing the results of hypothesis tests. |
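As a rough illustration of what a decision-shuffle permutation p-value measures, the sketch below shuffles a signal series against fixed returns and counts how often the shuffled strategy does at least as well as the original. This is a generic textbook construction, not the procedure quantpylib runs internally; `permutation_p_value`, the test statistic (summed signal times return) and the sample data are all assumptions made for the example.

```python
import random


def permutation_p_value(signals, returns, num_shuffles=1000, seed=0):
    """One-sided permutation p-value for the statistic sum(signal * return)."""
    rng = random.Random(seed)
    observed = sum(s * r for s, r in zip(signals, returns))
    shuffled = list(signals)
    at_least_as_good = 0
    for _ in range(num_shuffles):
        rng.shuffle(shuffled)  # break the link between decisions and returns
        stat = sum(s * r for s, r in zip(shuffled, returns))
        if stat >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_good + 1) / (num_shuffles + 1)


# Hypothetical data: a signal with perfect foresight of the return sign.
returns = [0.01, -0.02, 0.015, -0.005, 0.03, -0.01, 0.02, -0.025]
signals = [1 if r > 0 else -1 for r in returns]
p_value = permutation_p_value(signals, returns)
print(f"p-value: {p_value:.4f}")
```

A small p-value indicates that random reorderings of the same decisions rarely match the observed performance. Raising the number of shuffles tightens the estimate at higher computational cost.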
Gene
Represents a formulaic alpha expression used to encode trading rules.
This class internally represents a trading rule as a tree data structure, where each node can either be a terminal (leaf) node or a functional node. Terminal nodes represent data points or constants, while functional nodes represent operations on their child nodes.
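The tree encoding described above can be sketched with a toy parser. `Node`, `parse` and `_parse_at` are hypothetical names written for this illustration and do not reproduce the library's Gene implementation; the sketch assumes only the formula shape used in the examples on this page, namely `prim` or `prim_space`, optionally followed by a parenthesised, comma-separated argument list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy stand-in for a formula tree node (not the library's Gene class)."""
    prim: str
    space: Optional[str] = None
    is_terminal: bool = True
    children: List["Node"] = field(default_factory=list)


def _parse_at(text, pos):
    """Parse one node starting at text[pos]; return (node, next position)."""
    start = pos
    while pos < len(text) and text[pos] not in "(),":
        pos += 1
    token = text[start:pos]
    if not token:
        raise ValueError(f"expected a primitive at position {start}")
    prim, _, space = token.partition("_")  # split name from its Space value
    node = Node(prim=prim, space=space or None)
    if pos < len(text) and text[pos] == "(":
        node.is_terminal = False
        pos += 1
        while True:
            if pos >= len(text):
                raise ValueError("unbalanced parentheses")
            if text[pos] == ")":
                pos += 1
                break
            child, pos = _parse_at(text, pos)
            node.children.append(child)
            if pos < len(text) and text[pos] == ",":
                pos += 1
    return node, pos


def parse(formula):
    """Parse a whole formula string into a tree of Node objects."""
    text = formula.replace(" ", "")
    node, pos = _parse_at(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected trailing text: {text[pos:]!r}")
    return node


root = parse("plus(open,delay_1(close))")
print(root.prim, [child.prim for child in root.children])
```

Here `plus` and `delay` become functional nodes, while `open` and `close` are terminal leaves; the `1` in `delay_1` is stored as the node's space value.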
str_to_gene
staticmethod
__init__
Initializes a Gene object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
prim | str | The primary function/terminal of the gene. | required |
space | str | The space value associated with the gene, specifying details of the primitive. For example, in the context of financial trading, this could represent parameters such as window size or lookback period of a rolling correlation function. Defaults to None. | None |
is_terminal | bool | Indicates whether the gene is a terminal node. | required |
parent | Gene | The parent gene. Defaults to None. | None |
children | list | The list of child genes. Defaults to an empty list. | [] |
The list of prim primitives supported by our library, their behavior and their interpretations can be found in the List of Primitives section.
evaluate_node
Recursively evaluates a node in the formulaic alpha expression. When called on the root node in the gene representation, this function evaluates the entire formulaic expression.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
insts | list | The list of instrument names. | required |
dfs | dict | A dictionary containing pricing/alternative data DataFrames for each instrument. | required |
idx | Index | The index for alignment. | required |
Returns:
Type | Description |
---|---|
DataFrame | The evaluated node with DataFrame.index=idx and DataFrame.columns=insts. |
{
'BRKB':
open high low close adj_close volume
datetime
2000-01-03 00:00:00+00:00 1825.0000 1829.0000 1741.0000 1765.0000 35.3000 873500
2000-01-04 00:00:00+00:00 1725.0000 1733.0000 1695.0000 1704.0000 34.0800 1380000
2000-01-05 00:00:00+00:00 1707.0000 1773.0000 1695.0000 1732.0000 34.6400 997000
2000-01-06 00:00:00+00:00 1745.0000 1804.0000 1727.0000 1804.0000 36.0800 917000
2000-01-07 00:00:00+00:00 1830.0000 1848.0000 1805.0000 1820.0000 36.4000 1001500
... ... ... ... ... ... ...
2009-12-24 00:00:00+00:00 3281.9999 3295.9899 3274.9999 3286.9999 65.7400 607600
2009-12-28 00:00:00+00:00 3279.9999 3289.9999 3274.9999 3285.3699 65.7074 1080250
2009-12-29 00:00:00+00:00 3284.9999 3289.6899 3269.9999 3279.9999 65.6000 1105300
2009-12-30 00:00:00+00:00 3282.9999 3289.6499 3279.9999 3289.6499 65.7930 560350
2009-12-31 00:00:00+00:00 3289.9999 3300.9899 3279.9999 3285.9999 65.7200 972900
}
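The recursive evaluation can be sketched on plain Python lists. In this toy version a node is a hypothetical `(prim, space, children)` tuple, only `const`, `minus` and `delay` are implemented, and the result is a dict of lists rather than a DataFrame. The close prices are the first four BRKB rows from the sample above; the helper names `evaluate` and `evaluate_all` are assumptions for illustration only.

```python
NAN = float("nan")


def evaluate(node, series):
    """Recursively evaluate a (prim, space, children) tuple on one instrument's columns."""
    prim, space, children = node
    if not children:
        if prim == "const":
            length = len(next(iter(series.values())))
            return [float(space)] * length
        return list(series[prim])  # terminal: look up a data column
    # Functional node: evaluate the children first, then combine them.
    args = [evaluate(child, series) for child in children]
    if prim == "minus":
        return [a - b for a, b in zip(args[0], args[1])]
    if prim == "delay":
        shifted = [NAN] * int(space) + args[0]
        return shifted[: len(args[0])]
    raise ValueError(f"unsupported primitive in this sketch: {prim}")


def evaluate_all(node, insts, dfs):
    """Evaluate the same formula tree for every instrument."""
    return {inst: evaluate(node, dfs[inst]) for inst in insts}


# minus(close,delay_1(close)): the one-period change in close.
tree = ("minus", None, [("close", None, []), ("delay", "1", [("close", None, [])])])
dfs = {"BRKB": {"close": [1765.0, 1704.0, 1732.0, 1804.0]}}
result = evaluate_all(tree, ["BRKB"], dfs)
print(result["BRKB"])
```

Calling the evaluator on the root evaluates the whole expression because each functional node first evaluates its children and then applies its own operation.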
make_dot
Generate a DOT language representation of the tree structure rooted at this node.
Returns:
Type | Description |
---|---|
str | A string containing the DOT language representation of the tree. |
Notes
This method uses pre-order traversal to generate the DOT representation of the tree rooted at the current node. Each node in the tree corresponds to a vertex in the DOT graph, and each edge represents the parent-child relationship between nodes.
The generated DOT string can be rendered into a graphical visualization using graphviz or other tools that support the DOT language.
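A pre-order traversal that emits DOT might look like the following sketch. The tuple-based tree and the function `tree_to_dot` are hypothetical, and the exact node names and labels the library emits may differ; the point is only that each node is written before its children and each parent-child link becomes an edge.

```python
def tree_to_dot(root):
    """Build a DOT digraph for a (prim, space, children) tuple tree via pre-order traversal."""
    lines = ["digraph G {"]
    counter = 0

    def visit(node, parent_id):
        nonlocal counter
        node_id = counter
        counter += 1
        prim, space, children = node
        label = prim if space is None else f"{prim}_{space}"
        lines.append(f'    n{node_id} [label="{label}"];')    # vertex for this node
        if parent_id is not None:
            lines.append(f"    n{parent_id} -> n{node_id};")  # parent-child edge
        for child in children:  # children are visited after the node itself
            visit(child, node_id)

    visit(root, None)
    lines.append("}")
    return "\n".join(lines)


tree = ("minus", None, [("close", None, []), ("delay", "1", [("close", None, [])])])
dot = tree_to_dot(tree)
print(dot)
```

The resulting string can be saved to a file and rendered with the Graphviz `dot` command-line tool, for example `dot -Tpng tree.dot -o tree.png`.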
height
depth
size
pre_ord_apply
Apply a function to each node in the tree using pre-order traversal.
This method traverses the tree in a pre-order fashion, meaning it applies the function to the current node before recursively traversing its children. The function is applied to each node along with any additional keyword arguments provided.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | callable | A function to be applied to each node in the tree. | required |
**kwargs | | Additional keyword arguments to be passed to the function. | {} |
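The traversal order can be seen in a small stand-alone sketch. The dict-based nodes and the free function `pre_ord_apply` below are hypothetical; the sketch assumes the callback receives the node as its first argument followed by the keyword arguments, which is one plausible reading of the description above.

```python
def pre_ord_apply(node, func, **kwargs):
    """Apply func to node, then recurse into its children from left to right."""
    func(node, **kwargs)
    for child in node["children"]:
        pre_ord_apply(child, func, **kwargs)


def record_prim(node, visited):
    """Example callback: append the node's primitive name to a shared list."""
    visited.append(node["prim"])


# Toy tree for minus(close,delay_1(close)).
tree = {
    "prim": "minus",
    "children": [
        {"prim": "close", "children": []},
        {"prim": "delay", "children": [{"prim": "close", "children": []}]},
    ],
}

visited = []
pre_ord_apply(tree, record_prim, visited=visited)
print(visited)
```

Because the same keyword arguments are forwarded to every call, a mutable object such as the `visited` list is a simple way to accumulate results across the traversal.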
List of Primitives
The Op value idx_op defaults to union_idx_op, which is also used when the primitive is explicitly paired with the un Space value. It takes intersect_idx_op when paired with the ix Space value. Examples are plus(open,close), plus_un(open,close) and plus_ix(open,close).
The different operators and their behavior are documented here.
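The difference between the un and ix variants can be pictured with a toy alignment on date-keyed dicts. This sketch assumes that the union variant evaluates on the union of the operands' indices, producing NaN where an operand has no value, and that the intersect variant keeps only the shared index; that behaviour is inferred from the operator names and is not confirmed by this section. The function `plus` and its `how` argument are hypothetical.

```python
NAN = float("nan")


def plus(a, b, how="union"):
    """Add two date-keyed series on either the union or the intersection of their dates."""
    if how == "union":
        index = sorted(set(a) | set(b))
    elif how == "intersect":
        index = sorted(set(a) & set(b))
    else:
        raise ValueError(f"unknown alignment: {how}")
    # A missing value becomes NaN, so the sum is NaN wherever one side is absent.
    return {t: a.get(t, NAN) + b.get(t, NAN) for t in index}


open_px = {"2000-01-03": 1.0, "2000-01-04": 2.0}
close_px = {"2000-01-04": 10.0, "2000-01-05": 20.0}

union_sum = plus(open_px, close_px, how="union")          # analogue of plus_un(open,close)
intersect_sum = plus(open_px, close_px, how="intersect")  # analogue of plus_ix(open,close)
print(union_sum)
print(intersect_sum)
```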
Primitive | Space | Op | Terminal | Args | Meaning | Example |
---|---|---|---|---|---|---|
const | int,float | - | Yes | - | represents a constant numerical value of x | const_3.14 |
open | - | - | Yes | - | open price | open |
high | - | - | Yes | - | high price | high |
low | - | - | Yes | - | low price | low |
close | - | - | Yes | - | close price | close |
volume | - | - | Yes | - | trade volume | volume |
* | - | - | Yes | - | custom variable | * (e.g. epsEst, sentiment) |
abs | - | self_idx_op2 | No | 1 | absolute value | abs(minus(close,open)) |
neg | - | self_idx_op2 | No | 1 | negation | neg(minus(close,open)) |
log | - | self_idx_op2 | No | 1 | natural logarithm (replacing inf with NaN) | log(volume) |
sign | - | self_idx_op2 | No | 1 | sign function | sign(minus(close,open)) |
tanh | - | self_idx_op2 | No | 1 | tanh function | tanh(cszscre(logret_1())) |
sigmoid | - | self_idx_op2 | No | 1 | sigmoid function | sigmoid(cszscre(logret_1())) |
recpcal | - | self_idx_op2 | No | 1 | reciprocal (replacing inf with NaN) | recpcal(close) |
pow | int | self_idx_op2 | No | 1 | power function (replacing inf with NaN) | pow_2(close) |
csrank | - | all_idx_op | No | 1 | cross-sectional rank (smallest=1, ties receive the average rank) | csrank(volume) |
cszscre | - | all_idx_op | No | 1 | cross-sectional Z-score | cszscre(volume) |
percentile | - | all_idx_op | No | 1 | cross-sectional percentile values | percentile(volume) |
ls | int,float / int,float | all_idx_op | No | 1 | -1 for values below 25 percentile and +1 for values above 75 percentile | ls_25/75(volume) |
delta | int | self_idx_op | No | 1 | time-series change in variable over time | delta_1(close) |
delay | int | self_idx_op | No | 1 | time-series delay by specified number of periods | delay_1(close) |
forward | int | self_idx_op | No | 1 | time-series lookahead by specified number of periods | forward_1(close) |
sum | int | self_idx_op | No | 1 | sum of time-series values | sum_5(volume) |
prod | int | self_idx_op | No | 1 | product of time-series values | prod_5(volume) |
mean, sma | int | self_idx_op | No | 1 | simple mean of time-series values | mean_5(volume) |
ema, ewma | int | self_idx_op | No | 1 | exponentially weighted moving average | ewma_5(volume) |
median | int | self_idx_op | No | 1 | median of time-series values | median_5(volume) |
std | int | self_idx_op | No | 1 | standard deviation of time-series values | std_5(volume) |
var | int | self_idx_op | No | 1 | variance of time-series values | var_5(volume) |
skew | int | self_idx_op | No | 1 | skewness of time-series values | skew_5(volume) |
kurt | int | self_idx_op | No | 1 | kurtosis of time-series values | kurt_5(volume) |
tsrank | int | self_idx_op | No | 1 | time-series rank | tsrank_5(volume) |
tsmax | int | self_idx_op | No | 1 | maximum value over time | tsmax_5(volume) |
tsmin | int | self_idx_op | No | 1 | minimum value over time | tsmin_5(volume) |
tsargmax | int | self_idx_op | No | 1 | index of maximum value over time | tsargmax_5(volume) |
tsargmin | int | self_idx_op | No | 1 | index of minimum value over time | tsargmin_5(volume) |
tszscre | int | self_idx_op | No | 1 | time-series Z-score | tszscre_5(volume) |
max | -,un,ix | idx_op | No | >=2 | maximum over arguments | max_ix(open,close,high) |
plus | -,un,ix | idx_op | No | >=2 | sum over arguments | plus_un(open,close,high) |
minus | -,un,ix | idx_op | No | 2 | subtraction | minus(high,low) |
mult | -,un,ix | idx_op | No | 2 | multiplication | mult(open,close) |
div | -,un,ix | idx_op | No | 2 | division | div(open,close) |
and | -,un,ix | idx_op | No | 2 | logical AND | and(gt(high,low),lt(high,low)) |
or | -,un,ix | idx_op | No | 2 | logical OR | or(gt(high,low),lt(high,low)) |
eq | -,un,ix | idx_op | No | 2 | logical EQUALS | eq(gt(high,low),lt(high,low)) |
gt | -,un,ix | idx_op | No | 2 | greater-than comparison | gt(open,close) |
gte | -,un,ix | idx_op | No | 2 | greater-than-equals comparison | gte(open,close) |
lt | -,un,ix | idx_op | No | 2 | less-than comparison | lt(open,close) |
lte | -,un,ix | idx_op | No | 2 | less-than-equals comparison | lte(open,close) |
ite | -,un,ix | idx_op | No | 3 | if-then-else operation | ite(or(gt(high,low),lt(high,low)),const_1,const_-1) |
cor | int | slow_idx_op | No | 2 | rolling-correlation | cor_12(volume,close) |
kentau | int | slow_idx_op | No | 2 | rolling-Kendall's tau correlation | kentau_12(volume,close) |
cov | int | slow_idx_op | No | 2 | rolling-covariance | cov_12(volume,close) |
dot | int | slow_idx_op | No | 2 | rolling-dot product | dot_12(volume,close) |
wmean, wma | int | slow_idx_op | No | 2 | weighted moving average | wmean_12(close,weights) |
grssret | int | - | Pseudo | 0 | period gross returns | grssret_12() |
logret | int | - | Pseudo | 0 | period log returns | logret_12() |
netret | int | - | Pseudo | 0 | period net returns (gross returns - 1) | netret_12() |
volatility | int | - | Pseudo | 0 | volatility (standard deviation of log returns) | volatility_12() |
rsi | int | - | Pseudo | 0 | relative strength index indicator | rsi_12() |
mvwap | int | - | Pseudo | 0 | moving volume-weighted average price indicator | mvwap_12() |
obv | int | - | Pseudo | 0 | on-balance volume indicator | obv_12() |
atr | int | - | Pseudo | 0 | average true range indicator | atr_12() |
tr | - | - | Pseudo | 0 | true range indicator | tr() |
adx | int | - | Pseudo | 0 | average directional movement index | adx_12() |
addv | int | - | Pseudo | 0 | average daily dollar volume | addv_12() |
mac | int / int | - | No | 1 | moving average crossover indicator function for fast/slow | mac_20/50 |
vma, vwma | int | - | No | 1 | volume weighted moving average | vma_20(close) |
vwvar | int | - | No | 1 | volume weighted variance | vwvar_20(logret_1()) |
vwstd | int | - | No | 1 | volume weighted standard deviation | vwstd_20(logret_1()) |
between | int,float / int,float | - | No | 1 | indicator function for a <= x <= b | between_10/90(percentile(volume)) |
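As one worked instance of the cross-sectional primitives in the table, the sketch below ranks a single date's values across instruments, with the smallest value ranked 1 and tied values assigned their average rank, which is how this sketch reads the csrank description. The function `cross_sectional_rank` and the sample volumes are hypothetical, and the library's own handling of NaN or other edge cases is not covered.

```python
def cross_sectional_rank(values):
    """Rank values ascending from 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


# Hypothetical volumes for four instruments on one date.
instruments = ["A", "B", "C", "D"]
volumes = [300, 100, 300, 200]
ranks = dict(zip(instruments, cross_sectional_rank(volumes)))
print(ranks)
```

The two instruments tied at 300 would occupy ranks 3 and 4, so each receives 3.5.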