this post was submitted on 14 Nov 2023

Machine Learning

I am building a system to flag potentially fraudulent transactions. Flagged cases will then be manually verified, which in turn helps me build a labelled dataset over time.

For now I have transaction data and customer behavior information.

Here is how I intend to approach it. The possible fraud cases include:

- Abuse of cashbacks and discounts (coupons / vouchers / auto refund)
- Retailing
- Acquiring sensitive SKUs

Since I don't really have labelled data, I am going with an unsupervised learning approach (isolation forest). I plan on having 3 modules: Users, SKUs, Localities.

For the last two I am struggling to set meaningful thresholds. I standardized the slope and intercept of the sales trend, then divided them to get a compound variable, which I am using for comparison.

Please share thoughts and/or resources.
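To make the SKU module concrete, here is a minimal sketch of the idea using scikit-learn's `IsolationForest`. Everything in it is an assumption for illustration: the sales matrix is synthetic, the helper names `trend_features` and `flag_outliers` are hypothetical, and `expected_fraud_rate=0.05` is a placeholder guess, not a measured value. It also keeps the standardized slope and intercept as two separate features rather than dividing them, since a standardized variable is centered on zero and a ratio can blow up when the denominator is close to zero.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler


def trend_features(sales):
    """Fit a straight line to each row (one SKU) of daily sales.

    Returns an array of shape (n_skus, 2) holding (slope, intercept).
    """
    days = np.arange(sales.shape[1])
    # With a 2-D y, np.polyfit fits every column independently,
    # so the matrix is transposed to put one SKU per column.
    slope, intercept = np.polyfit(days, sales.T, deg=1)
    return np.column_stack([slope, intercept])


def flag_outliers(features, expected_fraud_rate=0.05, seed=0):
    """Standardize the features and score them with an isolation forest.

    expected_fraud_rate is an assumed share of anomalies; it sets the
    cut-off that predict() uses. Lower scores mean more anomalous.
    """
    scaled = StandardScaler().fit_transform(features)
    model = IsolationForest(contamination=expected_fraud_rate, random_state=seed)
    model.fit(scaled)
    scores = model.score_samples(scaled)
    flags = model.predict(scaled) == -1
    return flags, scores


# Synthetic stand-in for real data: 200 SKUs, 30 days of unit sales.
rng = np.random.default_rng(0)
n_skus, n_days = 200, 30
sales = rng.poisson(lam=20, size=(n_skus, n_days)).astype(float)
# Give the first five SKUs made-up steep upward trends.
sales[:5] += np.outer([2, 3, 4, 5, 6], np.arange(n_days))

features = trend_features(sales)
flags, scores = flag_outliers(features)

# Send the lowest-scoring SKUs to manual review first.
review_queue = np.argsort(scores)[:10]
```

Ranking by `scores` and reviewing a fixed number of cases per day may be easier to defend than a hard threshold, because the cut-off then follows review capacity instead of a guessed fraud rate. Once some verified labels exist, the `contamination` value can be revisited against them.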

flamboyantkoala@programming.dev 1 points 10 months ago

This seems like it’ll be a really long journey without labeled data. Do you have an idea of how much fraud is in your sample set?