ICANN
Methodology to Classify Unsolicited Email Threats
Pages: 16
Time to read: 29 mins
Language: English
This technical report presents a methodology for classifying unsolicited email, focusing on spam, scam, phishing, and adult content. It outlines the challenges that unsolicited email poses and explains why accurate classification is needed to strengthen security measures. The analysis rests on a dataset of 10.8 million unsolicited emails collected over four and a half years. The methodology covers data collection, processing, and the generation of ground truth for classification. The report discusses machine learning models, in particular Long Short-Term Memory (LSTM) networks and TF-IDF measures, for classifying emails accurately across multiple languages. It also presents a longitudinal analysis of how unsolicited email threats have evolved, highlighting trends in spam and phishing activity. The findings emphasize the importance of identifying threat indicators that help incident response teams mitigate the risks of unsolicited email.
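To make the TF-IDF idea concrete, the following is a minimal sketch rather than the report's implementation. It weights terms by TF-IDF and labels a message by cosine similarity to its closest labeled example, a simple nearest-neighbor stand-in for the models the report discusses. The function names, the toy `EXAMPLES` list, and the whitespace tokenizer are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting; a real pipeline would
    # need language-aware tokenization.
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(message, labeled):
    """Return the label of the labeled example most similar to message."""
    texts = [text for text, _ in labeled] + [message]
    vectors = tf_idf(texts)
    query = vectors[-1]
    scores = [
        (cosine(query, vec), label)
        for vec, (_, label) in zip(vectors, labeled)
    ]
    return max(scores)[1]


# Hypothetical toy examples, not taken from the report's dataset.
EXAMPLES = [
    ("verify your account password at this link now", "phishing"),
    ("cheap pills discount offer buy today", "spam"),
    ("you won the lottery send bank details to claim prize", "scam"),
]

if __name__ == "__main__":
    print(classify("please verify your password", EXAMPLES))
```

Terms that appear in every document get an inverse document frequency of zero and so contribute nothing to the similarity, which is the property that makes TF-IDF useful for separating category-specific vocabulary from common words.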