Social Media Analysis Methods and Applications
CS7290 Fall 2021
- Instructor: Silvio Amir
- Meetings: Tue & Fri 9:50am - 11:30am @ Ryder Hall 153
- Office hours: Thur 12pm - 2pm @ Office 2207, 177 Huntington Avenue and @ Zoom
- Discussion/Announcements: piazza
Course Description
Seminar class in which we read, critique and discuss recent research papers about the methods, applications and implications of social media analysis. Social media analysis is a multidisciplinary subject that encompasses a broad range of topics. In this class we will focus on: (i) cutting-edge Machine Learning/Deep Learning and Natural Language Processing methods to infer signals from single posts, individual users, and user networks; (ii) applications of social media analysis to real-world analyses in domains such as the social sciences, political science, mental health and epidemiology; and (iii) the implications, as well as the ethical and moral considerations, of deploying social media analysis systems and Human-centered AI systems more broadly.
Students will take turns presenting a paper each week (2 presentations per class), which we will discuss together critically during class meetings. The presentations should also cover any relevant background material; this is important because in general we will have a strong bias toward very recent work. Critical discussion will follow, and we will conclude class by talking about potential means of extending the work. Students will also be responsible for writing short summaries of the research papers we read each week. Student-led presentations will be complemented with several guest lectures from top researchers working on different facets of social media analysis.
In addition to the presentations and paper reviews, students will propose and execute a mini project culminating in a research paper (draft). The results and findings of the project will be presented in a class, structured as a research talk. Therefore, this class will very much be research-oriented.
⚠️ As per university policy, the class will be in-person only and everyone must wear a mask.
Prerequisites
Students should have an interest in conducting (or learning how to conduct) research. There are no official pre-reqs for this class; however, students are expected to have some background in data science, machine learning and statistical natural language processing. This is mainly to make students' lives easier, since we will not cover background knowledge related to the papers in class.
Assignments
Oral Presentations
Students will rotate responsibility for presenting (one of the) assigned papers and leading the ensuing discussion. The presentation should provide necessary background, the core contribution of the paper, (perceived) strengths and weaknesses, and ideas for improving the work.
Written Critiques
Prior to every class, all students are to write a brief “review” of one of the assigned papers for that class (their choice). These should include a concise summary of the work and enumerate at least two strong points and at least two weak points therein. Furthermore, these critiques should note ways in which the work could potentially be extended/improved (which may be coupled with the weak points). Assessments are to be submitted via Canvas by 11:59pm of the day prior to class (i.e. Mondays and Thursdays). Note that students are expected to read all the assigned papers, but may read only one in depth. The idea is to ensure robust discussion during class meetings.
Project
The class will culminate in projects with accompanying papers (along with the respective code, when possible). The papers will be structured as research papers following a template, available here, and the code should be submitted as a Jupyter notebook (if possible). Students may work together, but in that case a concomitant increase in complexity and a clear delineation of contributions are expected, for fairness' sake.
Students will be able to choose which projects to work on. Projects may involve replicating a state-of-the-art paper we read, or, ideally, extending one of these or developing new ideas in the area. Students are to write a short (1 page) project proposal including the motivation, goals, expected results and a timeline of key milestones. Project proposals will then be reviewed and approved by the instructor. Some class time will be reserved for students to present their project ideas, results and findings (these will be structured as research talks).
Projects will be evaluated for their relevance, scientific rigour, and originality. The papers will be evaluated as paper draft submissions using the template available here.
Grades
- 55% Project and write-up
- 20% In-class presentations of papers
- 15% Participation
- 10% Written summaries/critiques of papers (on Canvas)
Schedule
Tentative weekly schedule. The assigned papers are unlikely to change but the presentation dates might move to accommodate the (busy) schedules of the invited speakers.
Date | Agenda | Speaker | ||
---|---|---|---|---|
Fri 9/10 | 🎉 Introduction | Silvio | ||
Hate-speech Detection | ||||
Tue 9/14 | 🎓 Lecture: Social Media Analysis: methods, applications and implications | Silvio | ||
📃 Contextualizing Hate Speech Classifiers with Post-hoc Explanation, Kennedy, B., Jin, X., Davani, A. M., Dehghani, M., & Ren, X. (2020) | Silvio | |||
additional readings | 📚 A Survey on Automatic Detection of Hate Speech in Text; Fortuna, P., & Nunes, S. (2018). | |||
📚 Towards generalisable hate speech detection: a review on obstacles and solutions; Yin, W., & Zubiaga, A. (2021). | ||||
📚 A Survey on Hate Speech Detection using Natural Language Processing; Schmidt, A., & Wiegand, M. (2017). | ||||
Misinformation Detection | ||||
Fri 9/17 | 📃 Deep Structure Learning for Rumor Detection on Twitter, Huang, Q., Zhou, C., Wu, J., Wang, M., & Wang, B. (2019). | Sanjana | ||
📃 Detecting Propaganda Techniques in Memes, Dimitrov, D., Ali, B.B., Shaar, S., Alam, F., Silvestri, F., Firooz, H., Nakov, P. & Da San Martino, G., (2021) | Hye Sun | |||
additional readings | 📚 Combating fake news: A survey on identification and mitigation techniques; Sharma, K., Qian, F., Jiang, H., Ruchansky, N., Zhang, M., & Liu, Y. (2019). | |||
📚 A survey on fake news and rumour detection techniques; Bondielli, A., & Marcelloni, F. (2019). | ||||
📚 The science of fake news; Lazer, D.M., Baum, M.A., Benkler, Y., Berinsky, A.J., Greenhill, K.M., Menczer, F., Metzger, M.J., Nyhan, B., Pennycook, G., Rothschild, D. & Schudson, M., (2018) | ||||
Digital Activism | ||||
Tue 9/21 | 📃 Reclaiming Stigmatized Narratives: The Networked Disclosure Landscape of #MeToo, Gallagher et al. (2019) | Xiaoyu | ||
📃 Say Their Names: Resurgence in the Collective Attention toward Black Victims of Fatal Police Violence Following the Death of George Floyd, Wu, H.H., Gallagher, R.J., Alshaabi, T., Adams, J.L., Minot, J.R., Arnold, M.V., Welles, B.F., Harp, R., Dodds, P.S. & Danforth, C.M., (2021) | Mayur | |||
additional readings | 📚 Storywrangler: A massive exploratorium for sociolinguistic, cultural, socioeconomic, and political timelines using Twitter, Alshaabi, T., Adams, J.L., Arnold, M.V., Minot, J.R., Dewhurst, D.R., Reagan, A.J., Danforth, C.M. and Dodds, P.S., (2021) | |||
Fri 9/24 | 📃 Demographic Representation and Collective Storytelling in the Me Too Twitter Hashtag Activism Movement, Mueller, A., Wood-Doughty, Z., Amir, S., Dredze, M., & Nobles, A. L. (2021) | Sinjini | ||
🎓 Guest Lecture: #HashtagActivism: Networks of Race and Gender Justice | Brooke Foucault Welles | |||
additional readings | 📚 #HashtagActivism: Networks of race and gender justice; Jackson, S. J., Bailey, M., & Welles, B. F. (2020). | |||
Deep Learning | ||||
Tue 9/28 | 📃 Compositional Demographic Word Embeddings, Welch, C., Kummerfeld, J. K., Pérez-Rosas, V., & Mihalcea, R. (2020) | Michael | ||
📃 Developing a Twitter bot that can join a discussion using state-of-the-art architectures, Çetinkaya, Y. M., Toroslu, İ. H., & Davulcu, H. (2020). | Grainne | |||
Fri 10/1 | 📃 SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics, Yin, D., Meng, T., & Chang, K. W. (2020) | Shijia | ||
📃 Adversarial Learning for Zero-Shot Stance Detection on Social Media, Allaway, E., Srikanth, M., & McKeown, K. (2021) | Jessica | |||
Mental Health | ||||
Tue 10/5 | 📃 Suicide Ideation Detection via Social and Temporal User Representations using Hyperbolic Learning, Sawhney, R., Joshi, H., Shah, R., & Flek, L. (2021) | Aldo | ||
📃 Characterizing Anxiety Disorders with Online Social and Interactional Networks, Dutta, S., & De Choudhury, M. (2020). | Carlos | |||
Fri 10/8 | 📃 Inferring Social Media Users’ Mental Health Status from Multimodal Information, Xu, Z., Pérez-Rosas, V., & Mihalcea, R. (2020) | Alex | ||
📃 Person-Centered Predictions of Psychological Constructs with Social Media Contextualized by Multimodal Sensing, Saha, K., Grover, T., Mattingly, S.M., Swain, V.D., Gupta, P., Martinez, G.J., Robles-Granda, P., Mark, G., Striegel, A. & De Choudhury, M., (2021). | Sanjana | |||
additional readings | 📚 Methods in predictive techniques for mental health status on social media: a critical review, Chancellor, S., & De Choudhury, M. (2020) | |||
📚 Do Models of Mental Health Based on Social Media Data Generalize?, Harrigian, K., Aguirre, C., & Dredze, M. (2020) | ||||
Multimodal Social Media Analysis | ||||
Tue 10/12 | 📃 MARMOT: A Deep Learning Framework for Constructing Multimodal Representations for Vision-and-Language Tasks, Wu, P. Y., & Mebane Jr, W. R. (2021) | Xiaoyu Fan | ||
📃 Reasoning with Multimodal Sarcastic Tweets via Modeling Cross-Modality Contrast and Semantic Association, Xu, N., Zeng, Z., & Mao, W. (2020) | Grainne | |||
Public Health | ||||
Fri 10/15 | 📃 Quantifying Community Characteristics of Maternal Mortality Using Social Media, Abebe, R., Giorgi, S., Tedijanto, A., Buffone, A., & Schwartz, H. A. (2020) | Manaswini | ||
📃 Examining Peer-to-Peer and Patient-Provider Interactions on a Social Media Community Facilitating Ask the Doctor Services, Nobles, A. L., Leas, E. C., Dredze, M., & Ayers, J. W. (2020) | Jessica | |||
additional readings | 📚 Social Monitoring for Public Health; Paul, M. J., & Dredze, M. (2017). Synthesis Lectures on Information Concepts, Retrieval, and Services [preprint] | |||
Covid-19 | ||||
Tue 10/19 | 📃 Explaining the ‘Trump Gap’ in Social Distancing Using COVID Discourse, Van Loon, A., Stewart, S., Waldon, B., Lakshmikanth, S.K., Shah, I., Guntuku, S.C., Sherman, G., Zou, J. & Eichstaedt, J. (2020) | Hye Sun | ||
📃 COVID-19 Surveillance through Twitter using Self-Supervised and Few Shot Learning, Lwowski and Rad (2020) | Sinjini | |||
Fri 10/22 | 🚀 Discussion of Project Ideas | |||
🎓 Discussion: From #HashtagActivism to Data Justice | Brooke Foucault Welles | |||
additional readings | 📚 Twitter and Facebook posts about COVID-19 are less likely to spread false and low-credibility content compared to other health topics, Broniatowski, D. A., Kerchner, D., Farooq, F., Huang, X., Jamison, A. M., Dredze, M., & Quinn, S. C. (2020). | |||
Social Sciences | ||||
Tue 10/26 | 📃 Psychosocial Effects of the COVID-19 Pandemic: Large-scale Quasi-Experimental Study on Social Media, Saha, K., Torous, J., Caine, E. D., & De Choudhury, M. (2020) | Shijia | ||
🎓 Guest Lecture: The Light and Dark Side of Social Media | Nick Beauchamp | |||
Fri 10/29 | 📃 Who Says What with Whom: Using Bi-Spectral Clustering to Organize and Analyze Social Media Protest Networks, Joseph et al. (2020). | Michael | ||
🎓 Guest Lecture: Measuring Algorithmically Infused Societies [paper] | Tina Eliassi-Rad | |||
🚀 Project proposal deadline | ||||
Political Science | ||||
Tue 11/2 | 📃 RAFFMAN: Measuring and Analyzing Sentiment in Online Political Forum Discussions with an Application to the Trump Impeachment, Tachaiya, J., Gharibshah, J., Esterling, K. E., & Faloutsos, M. (2021). | Jessica | ||
🎓 Guest Lecture: Measuring Misinformation on Social Media | David Lazer | |||
Fri 11/5 | 📃 How Metaphors Impact Political Discourse: A Large-Scale Topic-Agnostic Study Using Neural Metaphor Detection, Prabhakaran, V., Rei, M., & Shutova, E. (2021) | Mayur | ||
📃 Unsupervised User Stance Detection on Twitter, Darwish, K., Stefanov, P., Aupetit, M., & Nakov, P. (2020) | Carlos | |||
Bias and Fairness | ||||
Tue 11/9 | 📃 The Risk of Racial Bias in Hate Speech Detection, Sap, M., Card, D., Gabriel, S., Choi, Y., & Smith, N. A. (2019) | Aldo | ||
📃 Social Bias Frames: Reasoning about Social and Power Implications of Language, Sap, M., Gabriel, S., Qin, L., Jurafsky, D., Smith, N. A., & Choi, Y. (2020) | Grainne | |||
Fri 11/12 | 📃 A Taxonomy of Ethical Tensions in Inferring Mental Health States from Social Media, Chancellor, S., Birnbaum, M. L., Caine, E. D., Silenzio, V. M., & De Choudhury, M. (2019) | Manaswini | ||
📃 Gender and racial fairness in depression research using social media, Aguirre, C., Harrigian, K., & Dredze, M. (2021) | Xiaoyu Fan | |||
Projects I | ||||
Tue 11/16 | 🎉 No class; work on projects | |||
Fri 11/19 | 🚀 Project Presentations (Preliminary) | |||
Ethics & Moral I | ||||
Tue 11/23 | 📚 How Twitter Gamifies Communication, Nguyen, C. T., & Lackey, J. (2021) | |||
📚 Big Data’s End Run around Anonymity and Consent, Barocas, S., & Nissenbaum, H. (2014) | ||||
🎓 Guest Lecture | John Basl, Vance Ricks and Meica Magnani | |||
Fri 11/26 | 🎉 Thanksgiving | |||
Ethics & Moral II | ||||
Tue 11/30 | 🎓 Guest Lecture: Ethics and Fairness of Mental Health Research using Social Media | Stevie Chancellor and Carlos Aguirre | ||
Fri 12/3 | 🎉 No class; work on projects | |||
Projects II | ||||
Tue 12/7 | 🚀 Project Presentations (Final) |