Generating bank transaction sequences with tabular GAN models

Mehri, Hamideh (2024) Generating bank transaction sequences with tabular GAN models. Masters thesis, Memorial University of Newfoundland.

PDF (English) - Accepted Version
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

Abstract

The digital age has equipped financial institutions with vast amounts of data, but privacy concerns have made it difficult to harness this data’s full potential. Generating synthetic data is one of the most promising ways to allow analysis of the patterns and trends contained in this data without compromising privacy. Although initial methods for generating synthetic data were basic, emerging generative models have expanded the possibilities. However, generating synthetic data for unique datasets, such as bank transaction sequences, remains challenging. These sequences exhibit complex variability driven by varied customer transaction behaviors, which distinguishes them from the more predictable patterns in other data types. We propose BankGAN, a conditional tabular GAN architecture designed specifically for synthesizing bank transaction sequences that exhibit non-uniform date patterns. We show that BankGAN achieves closer statistical resemblance to real data than a recurrent neural network (RNN)-based model. Moreover, it replicates features of periodic transactions better than both the RNN-based and transformer-based models. BankGAN generates privacy-preserving synthetic data without compromising data quality, in contrast to existing models, where adding privacy-preserving guarantees typically degrades performance.

Item Type: Thesis (Masters)
URI: http://research.library.mun.ca/id/eprint/16560
Item ID: 16560
Additional Information: Includes bibliographical references (pages 82-91)
Keywords: synthetic data, deep learning, generative models, sequential tabular data, decoder-only transformers
Department(s): Science, Faculty of > Computer Science
Date: April 2024
Date Type: Submission
Library of Congress Subject Heading: Data protection; Deep learning (Machine learning); Banks and banking--Data processing; Electronic data processing; Privacy, Right of
