Machine Understanding through Unsupervised Web Semantification

ميشيل نعيم نجيب جرجس

Abstract

This thesis summarizes our efforts to build three modules in the direction of machine understanding. The first module, ClassifyWiki, is a framework for building classifiers from any set of Wikipedia pages, at any level of granularity and possibly of any size. We tested the framework on more than 100 entity classes using our dataset based on schema.org. Unlike previous systems, ClassifyWiki does not learn a fixed set of specific classes; in theory, it can generate a classifier for any entity class. We report an 83% macro-averaged F1-score using 50 positive training instances.
The second module is WikiTrends, which creates a new analytics layer on top of a source of semi-structured and unstructured data. WikiTrends can generate any mix of data to present a new understanding of the world. Sample analytics reports include assigning each country its unforgettable contributions to humanity, following the gender battle back to 1000 BC, tracking trending occupations, musical instruments, and film genres, and summarizing the world view in heat maps.
The last module is ASU, a system submitted to the W-NUT workshop at COLING 2016 for the Twitter Named Entity Recognition task. The system demonstrates experimentally an incremental approach to designing two LSTM models: one for entity detection and the other for extracting entities and classifying them into a set of 10 fine-grained classes. The study presents the measured effect of adding or removing many features in the input representation, along with an analysis of the network design. We report a 39% F1-score for the typed model on the test set and 55% for the non-typed one, which places ASU fifth out of ten participants.

Other data

Title	Machine Understanding through Unsupervised Web Semantification
Other Titles	تمكين الحاسب من الفهم عن طريق تحديد دلالات الألفاظ للشبكة العنكبوتية بدون إشراف
Authors	ميشيل نعيم نجيب جرجس
Issue Date	2017

Attached Files

File	Size	Format
J2390.pdf	401.67 kB	Adobe PDF
