Detecting Fraud Using Transaction Frequency Data

Roheena Khan, Andrew Clark, George Mohay, Suriadi Suriadi


Despite all attempts to prevent fraud, it continues to be a major threat to industry and government. In this paper, we present a fraud detection method that detects irregular transaction usage frequency in an Enterprise Resource Planning (ERP) system. We discuss the design, development and empirical evaluation of outlier detection and distance measuring techniques that detect frequency-based anomalies within an individual user's profile, relative to other similar users. We propose three automated techniques for detecting transaction frequency anomalies within each transaction profile: a univariate method, the boxplot, which is based on the sample median, and two multivariate methods that use Euclidean distance. The two multivariate approaches detect potentially fraudulent activities by identifying: (1) users for whom the Euclidean distance between transaction-type sets exceeds a certain threshold and (2) users/data points that lie far from other users/clusters or that belong to a small cluster, using k-means clustering. The proposed methodology allows an auditor to investigate the transaction frequency anomalies and to adjust parameters, such as the outlier threshold and the Euclidean distance threshold, to tune the number of alerts. The novelty of the proposed technique lies in its ability to trigger alerts automatically from transaction profiles, based on transaction usage over a period of time. Experiments were conducted using a real dataset obtained from the production client of a large organization that runs its business on SAP R/3 (presently the predominant ERP system). The results of this empirical research demonstrate the effectiveness of the proposed approach.
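The boxplot and distance-threshold ideas described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the fence multiplier `k = 1.5` (Tukey's conventional value), the use of a component-wise median profile as the reference point, and the sample data are all assumptions made for illustration. The k-means variant is not covered.

```python
import math
import statistics


def boxplot_outliers(counts, k=1.5):
    """Return users whose transaction count lies outside the boxplot fences.

    counts: dict mapping user -> frequency of one transaction type.
    k: fence multiplier (assumed 1.5; an auditor could tune it).
    """
    values = sorted(counts.values())
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return {user for user, c in counts.items() if c < lower or c > upper}


def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distant_users(profiles, threshold):
    """Return users whose profile is farther than `threshold` from the
    component-wise median profile of their peer group (an assumed
    reference point, chosen only for this sketch).

    profiles: dict mapping user -> list of counts, one per transaction type.
    """
    columns = zip(*profiles.values())
    reference = [statistics.median(col) for col in columns]
    return {
        user
        for user, vec in profiles.items()
        if euclidean(vec, reference) > threshold
    }


# Hypothetical data: one transaction type, five similar users.
sample_counts = {"u1": 10, "u2": 12, "u3": 11, "u4": 13, "u5": 95}
print(boxplot_outliers(sample_counts))

# Hypothetical data: three transaction types per user.
sample_profiles = {
    "u1": [10, 2, 0],
    "u2": [12, 3, 1],
    "u3": [11, 2, 0],
    "u4": [9, 40, 1],
}
print(distant_users(sample_profiles, threshold=5.0))
```

Raising `k` or `threshold` reduces the number of flagged users, which corresponds to the tuning of alert volume that the abstract attributes to the auditor.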


Keywords: Anomaly Detection; Enterprise Resource Planning Systems; Fraud Detection




This work is licensed under a Creative Commons Attribution 3.0 License.


IT in Industry (2012 - ); ISSN (Online): 2203-1731; ISSN (Print): 2204-0595