Term frequency inverse document frequency (TF-IDF) technique and artificial neural network in email classification system
Electronic mail has become an efficient and widely accepted communication mechanism as the Internet community has grown, which has drawn attention to the urgent need to manage and maintain e-mail. Email messages are sent and accumulated in a repository for recurring use; their content ranges from static institutional information to discussions, which makes it difficult for users to prioritize saved and new email content. This research classified accumulated electronic messages into a three-class dataset: trivial, important and non-trivial. Email contents were extracted, and the Term Frequency Inverse Document Frequency (TF-IDF) technique was used to identify the keywords in each message that are useful for classification. Nuclass 7.1 artificial neural network software was used to classify email into user-defined word identity classes with an associative learning approach, in which the network was trained on input patterns and their matching output patterns. The performance evaluation showed that neural networks can be used successfully for automated email classification.
Keywords: Electronic mail, Term Frequency Inverse Document Frequency, artificial neural network, email classification, email message.
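
As an illustration of the keyword-extraction step described in the abstract, the following is a minimal sketch of TF-IDF scoring in Python. The abstract does not state the exact weighting formula, so the sketch assumes a common variant: term frequency as the term count divided by message length, and inverse document frequency as the natural logarithm of N/df, where N is the number of messages and df is the number of messages containing the term. The function names `tokenize` and `tfidf_keywords` and the sample messages are hypothetical and are not taken from the study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(documents, top_k=3):
    """Return, for each document, its top_k (term, score) pairs by TF-IDF.

    Assumed weighting (not specified in the abstract):
    tf = count / document length, idf = ln(N / df).
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty messages
        scores = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append(ranked[:top_k])
    return results


# Hypothetical sample messages, for illustration only.
emails = [
    "meeting budget report meeting",
    "lunch menu report",
    "budget deadline report",
]
keywords = tfidf_keywords(emails)
for message_keywords in keywords:
    print(message_keywords)
```

Under this weighting, a term that occurs in every message (here, "report") receives a score of zero, while a term concentrated in a single message ranks highest for that message. Scores of this kind could serve as the input patterns for a neural network classifier.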