AI-BAAM: AI-Driven Bank Statement Analytics as Alternative Data for Malaysian MSME Credit Scoring
Abstract
Micro, Small, and Medium Enterprises (MSMEs) account for 96.1% of all businesses in Malaysia, yet access to financing remains one of their most persistent challenges. Newly established businesses are often excluded from formal credit markets because traditional underwriting approaches rely heavily on credit bureau data. This study investigates the potential of bank statement data as an alternative data source for credit assessment to promote financial inclusion in emerging markets. First, we propose a cash flow-based underwriting pipeline that uses bank statement data for end-to-end data extraction and machine learning credit scoring. Second, we introduce a novel dataset of 611 loan applicants from a Malaysian consulting firm. Third, we develop and evaluate credit scoring models based on application information and bank transaction-derived features. Empirical results demonstrate that incorporating bank statement features yields substantial improvements, with our best model achieving an AUROC of 0.806 on the validation set, a 24.6% improvement over models using application information only. Finally, we will release the anonymized bank transaction dataset to facilitate further research on MSME financial inclusion within Malaysia’s emerging economy.