sheeerrrllly

Serlinda Vionita Dewi

Recently Published

From Spam to Ham: Linguistic Features in SMS for Scam Detection
This research aims to identify the linguistic characteristics that separate spam messages from legitimate (ham) SMS exchanges. Understanding these distinguishing features can improve the effectiveness of spam detection systems and reduce the prevalence of fraudulent communication techniques. The research looks into:

1. Identification of key words: Using log odds ratio analysis, the study detects words commonly used in spam, such as "claim" and "free," demonstrating how spammers craft messages to deceive.
2. Sentiment analysis: The study finds little difference in sentiment between spam and ham messages, although spam tends to show an overly positive sentiment.
3. Word significance analysis (TF-IDF): This statistical measure highlights terms that are unusually common in spam, such as "guaranteed," emphasizing their fraudulent use.
4. Bigram analysis: Examining word pairs reveals distinctive patterns in spam, such as "free entry" and "guaranteed prize," which contrast with the everyday phrasing of ham messages.

This research adds to ongoing efforts in digital communication security by presenting a complete linguistic profile of spam SMS texts. The insights gathered not only help develop spam detection technology but also educate users and regulators about the characteristics of potentially dangerous messages. Applying these results may lead to more secure digital messaging environments and a better understanding of fraudulent online communication techniques.
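
As a rough illustration of the log odds ratio and bigram steps described above, here is a minimal Python sketch. It is not the study's own code: the four messages are invented toy data, the function names (`tokenize`, `log_odds_ratio`, `bigram_counts`) are hypothetical, and the add-one smoothing is an assumption made so that words absent from one class do not cause a division by zero.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def log_odds_ratio(spam_msgs, ham_msgs, smoothing=1.0):
    """Return {word: log(odds in spam / odds in ham)} with additive smoothing.

    Positive scores mark words more typical of spam, negative scores
    mark words more typical of ham.
    """
    spam_counts = Counter(w for m in spam_msgs for w in tokenize(m))
    ham_counts = Counter(w for m in ham_msgs for w in tokenize(m))
    vocab = set(spam_counts) | set(ham_counts)

    # Smoothed class totals (assumed add-one style smoothing).
    spam_total = sum(spam_counts.values()) + smoothing * len(vocab)
    ham_total = sum(ham_counts.values()) + smoothing * len(vocab)

    scores = {}
    for word in vocab:
        s = spam_counts[word] + smoothing
        h = ham_counts[word] + smoothing
        spam_odds = s / (spam_total - s)
        ham_odds = h / (ham_total - h)
        scores[word] = math.log(spam_odds / ham_odds)
    return scores


def bigram_counts(msgs):
    """Count adjacent word pairs within each message."""
    counts = Counter()
    for m in msgs:
        tokens = tokenize(m)
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Invented toy messages, for illustration only.
toy_spam = [
    "Free entry to claim your guaranteed prize",
    "Claim your free prize now",
]
toy_ham = [
    "Are you free for lunch today",
    "See you at home later",
]

scores = log_odds_ratio(toy_spam, toy_ham)
spam_bigrams = bigram_counts(toy_spam)

# Words ranked from most spam-like to most ham-like.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[:5])
print(spam_bigrams.most_common(3))
```

On a real SMS corpus the same ranking would be computed over thousands of messages, and stop words would likely be removed before counting bigrams; both choices are left out here to keep the sketch short.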