The value of financial statements in predicting the innovation potential of SMEs in science and technology: a machine learning approach
Prof. Bin KE
Professor | Provost’s Chair
Accounting Department, Business School
National University of Singapore
Governments around the world allocate billions to spur research and innovation by small and medium-sized enterprises (SMEs) in science and technology. However, how effectively grant agencies select awardees remains a pertinent policy concern. Traditionally, entities submit grant proposals, which are evaluated by third-party experts, and funding goes to the top-rated proposals, subject to discretionary adjustments by the agency. A core challenge is that R&D investments take a long time to produce outcomes, which makes innovation potential difficult to predict even for seasoned experts. Moreover, critics often point to potential biases such as nepotism.
To compete for government innovation grants, SMEs are typically required to submit a detailed grant proposal containing the following blocks of information: (i) background information on the entity, including basic demographic information, the top management team, current capabilities in innovation and management skills, and history and risk management; (ii) detailed information on the technological innovation itself, including basic information, the team, a description of the technology, and commercialization feasibility; (iii) product competition; (iv) the business model; (v) financial forecasts; and (vi) historical financial statements (typically for the most recent one or two years).
The objective of this study is to examine whether the most recent year’s financial statements included in SMEs’ grant proposals, used on their own with advanced machine learning, can predict innovation potential better than conventional human expert evaluations. We test this idea using a sample of applications to China’s InnoFund innovation grant submitted by SMEs in science and technology.
We focus exclusively on historical financial statements because they are the only structured data in grant proposals that are readily available for quantitative analysis. Financial statements are a standardized representation of a firm’s financing, investing, and operating activities, but whether the financial statements of SMEs in China are authentic and have predictive value is an empirical question.
We measure a firm’s innovation potential using two complementary proxies: the volume of subsequent patent applications and the receipt of venture capital or private equity (VC/PE) funding after the grant proposal is submitted. The machine learning model employed is XGBoost.
Our out-of-sample results show that a baseline model using expert evaluations alone outperforms random predictions. Yet a model using financial statement data alone consistently beats the baseline model by a significant margin. Combining expert scores with financial data does not further improve performance. Our results suggest that the financial statements of Chinese SMEs in innovative industries contain predictive information missed by human experts.
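The comparison against random predictions can be made concrete with a ranking metric. The abstract does not name the metric used, so the sketch below assumes ROC AUC, for which an uninformative model scores 0.5. The function names and the toy holdout data are hypothetical and purely illustrative; they are not results from the study.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def compare_models(labels: Sequence[int],
                   scores_by_model: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Evaluate several models' held-out scores against the same labels."""
    return {name: roc_auc(labels, s) for name, s in scores_by_model.items()}


if __name__ == "__main__":
    # Synthetic holdout set: 1 = firm later showed the innovation outcome.
    holdout_labels = [1, 0, 1, 0]
    toy_scores = {
        "uninformative": [0.5, 0.5, 0.5, 0.5],  # AUC 0.5, the random reference
        "partial": [0.9, 0.8, 0.7, 0.1],        # AUC 0.75
        "perfect": [0.9, 0.1, 0.8, 0.2],        # AUC 1.0
    }
    for name, auc in compare_models(holdout_labels, toy_scores).items():
        print(name, auc)
```

Under this assumption, "surpasses random predictions" would mean an out-of-sample AUC above 0.5, and one model "beats" another when its AUC on the same held-out applications is higher.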
Using SHAP values, we find that experts and our machine learning models rely on different evaluation criteria. While human experts tend to emphasize key financial components such as net income and cash reserves, the ML models draw on many more financial line items and give particular weight to expense and liability details. These findings demonstrate the machine learning model’s ability to detect and exploit intricate data patterns that human evaluators often overlook, and they highlight the potential of combining expert judgment with ML-driven insights from financial data for better innovation assessment.
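SHAP values are based on Shapley values, which split a model's prediction for one firm across its input features. As a minimal sketch of the idea, the code below computes exact Shapley values by brute force over all feature subsets. It is not the procedure used in the study: SHAP implementations for tree models use faster, specialized algorithms, and the toy scoring function, the feature names, and the single all-zero baseline are assumptions made only for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict


def shapley_values(model: Callable[[Dict[str, float]], float],
                   x: Dict[str, float],
                   baseline: Dict[str, float]) -> Dict[str, float]:
    """Exact Shapley attribution of model(x) - model(baseline) to each feature."""
    names = list(x)
    n = len(names)

    def value(subset: set) -> float:
        # Features in `subset` take the firm's values; the rest stay at baseline.
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(f: Dict[str, float]) -> float:
    # Hypothetical score with one additive term and one interaction term.
    return 2.0 * f["net_income"] + f["rd_expense"] * f["total_liabilities"]


if __name__ == "__main__":
    firm = {"net_income": 1.0, "rd_expense": 2.0, "total_liabilities": 3.0}
    zero = {k: 0.0 for k in firm}
    print(shapley_values(toy_model, firm, zero))
```

In this toy case the prediction of 8 is split into 2 for net_income and 3 each for rd_expense and total_liabilities, because the interaction term is shared equally between the two features that produce it. Averaging the absolute attributions over many firms gives the kind of feature ranking that can then be compared with the items experts appear to emphasize.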