Scholarly open access journal, peer-reviewed and refereed, Impact Factor 8.14 (calculated by Google Scholar and Semantic Scholar | AI-Powered Research Tool), multidisciplinary, monthly, indexed in all major databases & metadata, Citation Generator, Digital Object Identifier (DOI)
Abstract—This systematic literature review examines age and gender detection from instant messaging (IM) data, with particular emphasis on cross-lingual and code-switched contexts. Drawing on 49 peer-reviewed studies (2000–2024) from six major databases, it identifies linguistic, behavioral, and paralinguistic features (e.g., syntax, emoji use, word choice, and message patterns) as key demographic markers. The field has shifted from traditional machine learning (SVMs, Naïve Bayes) to deep learning and transformer-based models (e.g., BERT, Text2Gender), which perform better on informal and multilingual data. Major challenges include the scarcity of annotated IM datasets, the lack of support for low-resource and code-switched languages, and ethical concerns around privacy, consent, and gender inclusivity. Current approaches often assume binary gender categories, which limits fairness. The review calls for inclusive, privacy-preserving, and culturally adaptable NLP frameworks, and points future research toward scalable multilingual solutions, diverse gender and linguistic representation, and ethically responsible design.
Keywords:
Instant Messaging, Age and Gender Detection, Cross-Lingual Language, Emoji Usage, Code-Switching, Machine Learning, Author Profiling.
Cite Article:
"Digital Linguistic Markers for Age and Gender Prediction from Chat Data - A Literature Review", International Journal for Research Trends and Innovation (www.ijrti.org), ISSN: 2455-2631, Vol. 10, Issue 11, pp. a558-a561, November 2025. Available: http://www.ijrti.org/papers/IJRTI2511066.pdf
ISSN:
2456-3315 | IMPACT FACTOR: 8.14 Calculated By Google Scholar | ESTD YEAR: 2016