Multi-factor Quantitative Stock Selection Strategy Based on Sparsity Penalty
Shu Shike (舒时克); Li Lu (李路)
Abstract:
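Features in high-dimensional data sets are related in complex ways, and the conventional L1 penalty does not satisfy the unbiasedness requirement of the oracle property. To address both issues, the L1 penalty in the logistic regression elastic net (LR-Elastic Net) is replaced with the SCAD (Smoothly Clipped Absolute Deviation) and MCP (Minimax Concave Penalty) penalties, giving the LR-SCAD and LR-MCP models, which retain sparsity while also achieving unbiasedness. Both models are solved with the ADMM (Alternating Direction Method of Multipliers) algorithm. Simulation experiments show that LR-Elastic Net handles small-sample data with correlated features well, whereas LR-SCAD and LR-MCP perform better on large-sample data with correlated features. LR-Elastic Net, LR-SCAD and LR-MCP strategies are then built and applied to data on the constituent stocks of the CSI 300 (沪深300) index. Backtest results show that the LR-SCAD and LR-MCP strategies outperform the LR-Elastic Net strategy on data in which the stocks are strongly correlated.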
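The abstract names the SCAD and MCP penalties without stating them. The following is a minimal sketch of the two penalty functions as they are usually defined in the literature (Fan and Li for SCAD, Zhang for MCP), set next to the L1 penalty they replace. The function names, the defaults `a=3.7` and `gamma=3.0`, and the sample values are illustrative assumptions, not taken from the paper, and the sketch does not reproduce how the paper combines these penalties with the ridge term or solves the model with ADMM.

```python
def l1_penalty(t, lam):
    """L1 (lasso) penalty: grows linearly without bound in |t|."""
    return lam * abs(t)


def scad_penalty(t, lam, a=3.7):
    """SCAD penalty in its usual form; a > 2 (3.7 is a conventional default, assumed here)."""
    if a <= 2:
        raise ValueError("SCAD requires a > 2")
    x = abs(t)
    if x <= lam:
        # Same as L1 near zero, which is what produces sparsity.
        return lam * x
    if x <= a * lam:
        # Quadratic transition zone: the penalty slope decays toward zero.
        return (2 * a * lam * x - x * x - lam * lam) / (2 * (a - 1))
    # Constant beyond a*lam: large coefficients receive no further shrinkage.
    return lam * lam * (a + 1) / 2


def mcp_penalty(t, lam, gamma=3.0):
    """MCP penalty in its usual form; gamma > 1 (3.0 is an assumed default)."""
    if gamma <= 1:
        raise ValueError("MCP requires gamma > 1")
    x = abs(t)
    if x <= gamma * lam:
        # Slope starts at lam and decreases linearly to zero at gamma*lam.
        return lam * x - x * x / (2 * gamma)
    # Constant beyond gamma*lam.
    return gamma * lam * lam / 2


if __name__ == "__main__":
    lam = 1.0  # illustrative regularization strength
    for t in (0.5, 2.0, 10.0):
        print(t, l1_penalty(t, lam), scad_penalty(t, lam), mcp_penalty(t, lam))
```

Both nonconvex penalties level off to a constant for large coefficients, whereas the L1 penalty keeps growing. That flattening is the mechanism behind the unbiasedness the abstract refers to: beyond the threshold the penalty has zero slope, so large coefficients are no longer shrunk.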
Keywords: Elastic Net; SCAD; MCP; ADMM algorithm; logistic regression; multi-factor stock selection
Foundation: National Natural Science Foundation of China (11501055, 11801362)