Systems and Means of Informatics
2021, Volume 31, Issue 4, pp 18-26
A PROGRAM FOR CONSTRUCTING OF QUITE INTERPRETABLE AND RTF-ADEQUATE LINEAR REGRESSION MODELS
Abstract
The article addresses the problem of feature selection in regression models estimated by ordinary least squares. Models obtained through such selection are often inadequate and difficult to interpret. Definitions of "quite interpretable" and "RTF-adequate" regression models are formulated for the first time. A previously proposed effective algorithm for solving the feature selection problem is reviewed, and on its basis an algorithm is developed for constructing quite interpretable and RTF-adequate linear regression models. For each candidate regression, the algorithm sequentially checks: the "informativeness" of the variables; multicollinearity; agreement of the coefficient signs with the physical meaning of the factors; adequacy of the model in terms of the coefficient of determination and overall significance by Fisher's F-test; and significance of the coefficients by Student's t-test.
The proposed algorithm is implemented as a program for the Gretl econometric package. The program is general-purpose and can be applied to a wide range of data analysis tasks.
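The sequence of checks listed in the abstract can be illustrated with a minimal Python sketch. This is not the author's Gretl program or the selection algorithm from the article: it simply enumerates all subsets of regressors and applies the checks in the stated order. Several things here are assumptions made for illustration only: the function names (`solve`, `corr`, `ols`, `screen`), the use of pairwise correlation as the multicollinearity check, the fixed thresholds `r_max`, `r2_min`, `f_crit` and `t_crit` (real critical values depend on the degrees of freedom), and the toy data. The "informativeness" check is omitted because the abstract does not define it.

```python
from itertools import combinations
from math import sqrt


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def corr(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((z - mb) ** 2 for z in b)
    return sab / sqrt(saa * sbb)


def ols(cols, y):
    """OLS with intercept; returns (coefficients, t statistics, R^2, F statistic)."""
    n, k = len(y), len(cols)
    p = k + 1  # number of estimated parameters, including the intercept
    X = [[1.0] + [c[i] for c in cols] for i in range(n)]
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    rss = sum((y[i] - sum(X[i][a] * beta[a] for a in range(p))) ** 2
              for i in range(n))
    ybar = sum(y) / n
    tss = sum((v - ybar) ** 2 for v in y)
    r2 = 1.0 - rss / tss
    s2 = rss / (n - p)  # residual variance estimate
    # Diagonal of (X'X)^-1, obtained by solving against unit vectors.
    diag = [solve(XtX, [1.0 if i == a else 0.0 for i in range(p)])[a]
            for a in range(p)]
    t = [beta[a] / sqrt(s2 * diag[a]) for a in range(p)]
    f = (r2 / k) / ((1.0 - r2) / (n - p))
    return beta, t, r2, f


def screen(data, y, signs, r_max=0.8, r2_min=0.7, f_crit=4.0, t_crit=2.0):
    """Return (subset, R^2) pairs that pass every check, highest R^2 first."""
    accepted = []
    names = sorted(data)
    for size in range(1, len(names) + 1):
        if len(y) <= size + 1:  # no residual degrees of freedom left
            break
        for subset in combinations(names, size):
            # Multicollinearity (assumed form): reject strongly correlated pairs.
            if any(abs(corr(data[a], data[b])) > r_max
                   for a, b in combinations(subset, 2)):
                continue
            beta, t, r2, f = ols([data[v] for v in subset], y)
            # Coefficient signs must match the expected direction of each factor.
            if any(beta[j + 1] * signs[v] <= 0 for j, v in enumerate(subset)):
                continue
            # Adequacy: coefficient of determination and overall F statistic.
            if r2 < r2_min or f < f_crit:
                continue
            # Significance of each slope coefficient (intercept not tested).
            if any(abs(t[j + 1]) < t_crit for j in range(size)):
                continue
            accepted.append((subset, r2))
    return sorted(accepted, key=lambda m: -m[1])


# Toy data (hypothetical): x3 is nearly collinear with x1.
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x2 = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
x3 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8, 18.1, 19.9]
noise = [0.1, -0.2, 0.15, -0.1, 0.05, -0.15, 0.2, -0.05, 0.1, -0.1]
y = [2 * a - 3 * b + e for a, b, e in zip(x1, x2, noise)]
data = {"x1": x1, "x2": x2, "x3": x3}
models = screen(data, y, signs={"x1": 1, "x2": -1, "x3": 1})
for subset, r2 in models:
    print(subset, round(r2, 4))
```

In this sketch a subset is discarded at the first check it fails, so the cheaper checks (correlation, signs) run before the significance tests, mirroring the sequential order given in the abstract. Exhaustive enumeration is practical only for a small number of candidate regressors.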
About this article
Cover Date
2021-12-10
DOI
10.14357/08696527210402
Print ISSN
0869-6527
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
feature selection; ordinary least squares; quite interpretable and RTF-adequate regression; variable "informativeness" criterion; multicollinearity; Fisher's F-test; Student's t-test
Authors
M. P. Bazilevskiy
Author Affiliations
Department of Mathematics, Irkutsk State Transport University, 15 Chernyshevskogo Str., Irkutsk 664074, Russian Federation