The art of categorising bank statements 

Categorising transactions extends beyond simple organisation; it entails better understanding the context of each transaction and grouping it with others of similar nature. Assigning transactions to specific groups enhances their identification and enables us to extract insights from vast amounts of data.  

We sat down with our co-founder, Christopher Ball, to discuss transaction categorisation, why we do it, and the progress our team has made in this area. 

How are we categorising bank statement transactions? 

Gathr employs supervised machine learning models to assign each transaction to a category, leveraging all available variables within the bank statement. These variables include transaction descriptions, amounts, dates and merchant information, along with a few others that form part of our secret sauce at Gathr.  

For instance, if a transaction description mentions “Restaurant” and the amount corresponds to a typical meal purchase, our model can accurately categorise it as a “Groceries & Meals” expense. 
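To make the idea concrete, here is a minimal sketch of a supervised text classifier in Python. It is not Gathr's actual model: it assumes a simple Naive Bayes approach using only the standard library, and the training rows, amount buckets, and the names `features` and `NaiveBayesCategoriser` are hypothetical, chosen purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def features(description, amount):
    """Turn a transaction into tokens: description words plus a coarse amount bucket."""
    tokens = re.findall(r"[a-z]+", description.lower())
    # Hypothetical bucket boundaries, for illustration only.
    if amount < 100:
        bucket = "amt_small"
    elif amount < 1000:
        bucket = "amt_medium"
    else:
        bucket = "amt_large"
    return tokens + [bucket]


class NaiveBayesCategoriser:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, rows):
        # Each row is (description, amount, label): labelled examples to learn from.
        for description, amount, label in rows:
            self.class_counts[label] += 1
            for tok in features(description, amount):
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, description, amount):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n / total)
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in features(description, amount):
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled transactions (made up for this sketch).
TRAINING = [
    ("CORNER RESTAURANT", 350.00, "Groceries & Meals"),
    ("KFC DRIVE THRU", 89.90, "Groceries & Meals"),
    ("NANDOS RESTAURANT", 210.00, "Groceries & Meals"),
    ("PREPAID AIRTIME TOPUP", 50.00, "Prepaid Airtime"),
    ("AIRTIME PURCHASE", 30.00, "Prepaid Airtime"),
    ("PREPAID MOBILE AIRTIME", 100.00, "Prepaid Airtime"),
]

model = NaiveBayesCategoriser().fit(TRAINING)
print(model.predict("PIZZA RESTAURANT", 180.00))
print(model.predict("AIRTIME TOPUP", 20.00))
```

With this toy data, the word "restaurant" and a mid-sized amount push the first transaction towards "Groceries & Meals", while "airtime" and a small amount push the second towards "Prepaid Airtime". A production model would draw on far more data and more variables than this sketch does.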

What is the purpose of categorising bank statements? 

The objective behind categorising bank statements is multifaceted. Primarily, we believe that lenders and telecommunication companies underutilise the vast amount of data available to them for making risk assessments. 

By categorising this data, we aim to delve deeper into individuals’ or businesses’ financial profiles, allowing for more detailed insights and analysis. This granular understanding enables better-informed risk decisions and enhances overall operational efficiency. 

What insights are we seeking through this process?  

We’re interested in more than just income versus expenses; we’re delving into behavioural patterns of individuals. This includes examining where and how they shop, as well as their entertainment preferences and activities. By exploring such details, we gain a deeper understanding of consumer behaviour and preferences, which can inform various aspects of our decision-making processes. 

Insights from one million transactions 

Here are the top 40 words extracted from 1 million transactions. This snapshot offers a glimpse into spending habits, trends, and the diverse nature of transactions, painting a vivid picture of consumer behaviour and financial dynamics. Credit – Stephan Schoeman, Senior Software Engineer at Finch Technologies.  

[Image: top 40 words extracted from one million transactions]
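A word-frequency view like this can be produced with a few lines of Python. The sketch below is an assumption about the general approach rather than the code used to build the snapshot above; the function name `top_words` and the sample descriptions are hypothetical.

```python
import re
from collections import Counter


def top_words(descriptions, n=40):
    """Count words across transaction descriptions and return the n most frequent."""
    counts = Counter()
    for description in descriptions:
        # Lowercase and keep alphabetic runs only, so "KFC" and "kfc" count together.
        counts.update(re.findall(r"[a-z]+", description.lower()))
    return counts.most_common(n)


# Hypothetical sample; a real run would stream all one million descriptions.
sample = ["KFC DRIVE THRU", "PREPAID AIRTIME", "AIRTIME PURCHASE", "KFC", "KFC"]
print(top_words(sample, n=2))
```

In practice, one would likely also filter out bank-specific filler words and reference numbers before counting, so that the top words reflect merchants and spending types.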

Why were specific categorisation groups chosen, such as “Groceries & Meals” or “Prepaid Airtime”? 

The selection of core categories aligns closely with the guidelines set by the national credit regulator regarding affordability analysis. Additionally, these categorisation groups enable us to detect both risk factors and positive trends in spending behaviour. By organising transactions into these categories, we can more effectively monitor financial patterns and identify potential areas of concern or opportunity. 

What are some of the challenges when categorising bank statements? 

The main consideration we’ve grappled with is how granular we want to go. Banking data has its boundaries – we won’t know the specifics of someone’s KFC order, but we can tell whether they’re buying KFC or Nando’s. 

The good thing is, we can quickly iterate our product. We’ll find out where our models need improvement and work on making them better over time. It’s an ongoing process of refining our system to enhance its effectiveness. 

How do banks establish transaction naming conventions, given how much they differ from one bank to the next?

Exploring the data firsthand has been enlightening. There’s a clear distinction between traditional banks and newer ones in terms of data structure. Each bank has its unique approach and terminology, which presents challenges but also opportunities for us to develop a resilient, universal product. 

If you’d like to find out more about how our categorisations work and how this feature can benefit your business, book a live demo with our team.
