AN INTELLIGENT MODEL FOR AGE AND GENDER PREDICTION FROM ARABIC TWITTER USER’S POSTS

المخلافي, فهد بن صادق ; AL MEKHLAFI, FAHAD (2017-06)

Naif Arab University for Security Sciences, College of Computer and Information Security

Thesis

The number of Twitter users in Arab countries grows every day, which makes classification and prediction over this massive, constantly growing body of data a major challenge. User-generated text on social media sites contains a great deal of information that can be used to establish different aspects of its author. However, this data is heterogeneous: posts cover every topic and follow no writing rules, so tweets often carry noisy information. Noisy data has a strong negative impact on the accuracy and performance of classifiers, so the data must first be prepared by functions that make it clean and understandable, helping the classifier to classify or predict correctly. The contributions of this thesis are an enhanced text preprocessing technique that improves accuracy and performance, and an enhanced way of combining multiple labels in the prepared dataset, together with feature extraction and selection for the classifier. We apply our dataset to two classification models, a machine learning (ML) naïve Bayes classifier and a data mining (DM) neural network, to predict the age and gender of Twitter users, and compare the results. Finally, this study concludes that the neural network classifier performs better than naïve Bayes in multilabel classification.
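The abstract does not list the exact preprocessing steps or label scheme, so the following is only a minimal sketch of what noise removal for Arabic tweets and the combination of age and gender into a single label might look like. The cleaning rules, the function names `clean_tweet` and `combine_labels`, and the example age group are all assumptions made for illustration, not the thesis's actual method.

```python
import re

# Hypothetical cleaning rules; the thesis does not specify its own.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# Arabic short-vowel marks (U+064B..U+0652) and tatweel (U+0640).
DIACRITICS_RE = re.compile(r"[\u064B-\u0652\u0640]")
# Any character repeated three or more times in a row.
REPEAT_RE = re.compile(r"(.)\1{2,}")
# Alef with madda, hamza above, hamza below.
ALEF_VARIANTS_RE = re.compile("[\u0622\u0623\u0625]")


def clean_tweet(text: str) -> str:
    """Remove common tweet noise and normalize Arabic spelling variants."""
    text = URL_RE.sub(" ", text)            # drop links
    text = MENTION_RE.sub(" ", text)        # drop @mentions
    text = text.replace("#", " ").replace("_", " ")  # keep hashtag words
    text = DIACRITICS_RE.sub("", text)      # strip diacritics and tatweel
    text = ALEF_VARIANTS_RE.sub("\u0627", text)  # unify alef forms
    text = REPEAT_RE.sub(r"\1", text)       # collapse elongated letters
    return " ".join(text.split())           # normalize whitespace


def combine_labels(age_group: str, gender: str) -> str:
    """Merge two labels into one class name (an assumed scheme)."""
    return f"{gender}_{age_group}"


if __name__ == "__main__":
    print(clean_tweet("@user \u0645\u0631\u062d\u0628\u0627\u0627\u0627\u0627 http://t.co/x #\u062a\u0648\u064a\u062a\u0631"))
    print(combine_labels("18-24", "male"))
```

Merging the two labels into one class name turns the task into ordinary multiclass classification, which is one simple way to let a single naïve Bayes or neural network model predict both attributes at once; whether the thesis combines its labels this way is not stated.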