School of Humanities and Social Science
Division of Social Science

Identifying Protest Events in China with Social Media Data
Supervisor: ZHANG Han / SOSC
Student: WANG Yiwei / QSA
Course: UROP2100, Fall

This UROP2100 report is a follow-up to my previous UROP1100 project, Identifying Protest Events in China with Social Media Data, conducted under the supervision of Professor Zhang Han, School of Humanities and Social Science. The data for this study come from the 9.5 million Weibo (the Chinese Twitter) posts from 2011 to 2017 classified by the CASM (collective action from social media) system (Zhang & Pan, 2019). This project continued the data coding from the last UROP project and completed some basic analysis of the coded data by source type, issue type, event duration, and province. Among the 1,001 coded events, 91.6% of posts are from Weibo; issues related to minorities, taxi drivers, and commercial fraud are the most likely to become protests. The mean event duration is 54.8 days, and the median is 4 days. Among all the provinces, “Guangdong”, “Sichuan”, and “Shandong” have the highest probability of events occurring, while no events were recorded in Tibet.

Representation Learning of Social Survey Data
Supervisor: ZHANG Han / SOSC
Student: LYU Hanfang / MATH-AM
Course: UROP1100, Spring

Social scientists have used surveys as a research technique for many years. With the emergence of computational social science, quantitative analysis of survey data from a social science viewpoint has attracted increasing interest. While classical statistical machine learning strategies have been employed extensively in survey analysis, few state-of-the-art machine learning approaches have been applied in this field of research.
In this report, we employed graph learning methods, such as Graph Convolutional Networks (GCN), to accomplish a pre-defined survey analysis task, cross-domain response prediction, and we also report results from a similar analysis using statistical machine learning methods. This allows a thorough comparison of the two methodologies, which will help us better understand future survey analysis using GCN.

Representation Learning of Social Survey Data
Supervisor: ZHANG Han / SOSC
Student: ZHANG Hao / COMP
Course: UROP1100, Summer

We aim to use graph convolutional networks (GCN) to analyze social survey data. We are currently focusing on the missing data problem. We have applied different models and methods to the data set zuobiao1214_clean.dta to predict some of the missing data. So far, however, we have not found an approach whose prediction accuracy exceeds 65%. In this report, I will elaborate on how we constructed the graph and which models we have tried.