A Computational Study of Males’ and Females’ Patterns of Language Use in Arabic and English on Twitter in 2012 – 2013

Safinaz Muhammed Saeed Tawfiek

Abstract


This study investigates the lexical choices made by 500 Egyptian Twitter users (250 males and 250 females) writing in Modern Standard Arabic (MSA) and Egyptian Colloquial Arabic (ECA), using a selected corpus of 30,000 tweets posted between 2012 and 2013. It examines whether gender-based variation holds in computer-mediated discourse and how such variation can support authorship studies. Users are identified as male or female from their names, aliases, or bios. Gender-preferential features drawn from previous sociolinguistic and computational studies (e.g. function words, insults, taboo words, intensifiers, and interrogatives) are selected and applied to the tweets. The research examines selected morphological, stylometric, and sociolinguistic gender-based features. The Perl programming language is used to run the analysis code, and the bag-of-words (BoWs) model is used to represent documents as sets of words. Finally, statistical analysis is performed. On the morphological level, results show that adding ta ta’aneeth (the gender inflectional suffix) to derived nouns and adjectives is a significant feature that characterizes female authors. On the stylometric level, the repeated use of pronouns marks females’ style, while the recurrent use of demonstratives and prepositions marks males’ style. On the sociolinguistic level, results show that females use insults and interrogatives more frequently, whereas males use taboo words and intensifiers more frequently. Concerning authors’ choice of domains, results show that females prefer to talk about their bodies and life partners, while males prefer to discuss sports, the economy, and politics, and also use more loanwords.
Keywords: Gender, Twitter, Arabic, BoWs model, Perl programming language, Morphological features, Stylometric features, Sociolinguistic features.
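
As a rough illustration of the bag-of-words step described in the abstract, the sketch below counts word tokens per tweet and reports how often each feature category occurs. It is written in Python for brevity, whereas the study itself used Perl. The lexicons, the function names bag_of_words and feature_rates, and the English sample words are all hypothetical and are not taken from the thesis.

```python
import re
from collections import Counter

# Hypothetical feature lexicons for illustration only; the study's actual
# word lists (Arabic or English) are not given in the abstract.
FEATURE_LEXICONS = {
    "pronouns": {"i", "you", "she", "he", "we", "they"},
    "demonstratives": {"this", "that", "these", "those"},
    "prepositions": {"in", "on", "at", "to", "from", "with"},
}


def bag_of_words(tweet):
    """Lowercase a tweet and count its word tokens; word order is discarded."""
    tokens = re.findall(r"\w+", tweet.lower())
    return Counter(tokens)


def feature_rates(tweets):
    """Return each feature category's share of all tokens in a list of tweets."""
    total = Counter()
    for tweet in tweets:
        total.update(bag_of_words(tweet))
    n_tokens = sum(total.values())
    if n_tokens == 0:
        return {name: 0.0 for name in FEATURE_LEXICONS}
    return {
        name: sum(total[word] for word in words) / n_tokens
        for name, words in FEATURE_LEXICONS.items()
    }
```

Under these assumptions, calling feature_rates once on the tweets of male authors and once on those of female authors would give per-group rates that a later statistical test could compare.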


Other data

Title: A Computational Study of Males’ and Females’ Patterns of Language Use in Arabic and English on Twitter in 2012 – 2013
Other Titles: دراسة حاسوبية لٳستخدام الأنماط اللغوية للذكور والإناث في اللغتين العربية والإنجليزية عبر تويتر في 2012–2013
Authors: Safinaz Muhammed Saeed Tawfiek
Issue Date: 2020


Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.