BigSurv20 program


Can we make computers "think" like us?

Moderator: Amelia Burke-Garcia (burkegarcia-amelia@norc.org)

Friday 13th November, 10:00 - 11:30 (ET, GMT-5)
7:00 - 8:30 (PT, GMT-8)
16:00 - 17:30 (CET, GMT+1)

Thinking like a computational social scientist: Organising thoughts and organising data

Dr J. Kasmire (UK Data Service) - Presenting Author
Dr Diarmuid McDonnell (UK Data Service)

Download presentation


Computational social science requires a unique blend of knowing how to do social science research and knowing how to do computationally intensive research. These are, in some interesting ways, very difficult skills to blend. Social science research demands an ability to think like a human, with abstract concepts, context-dependence, inference and awareness of the probabilities of shared background knowledge. In contrast, computationally intensive research demands an ability to think like a computer, with concrete terms, absolutes, hierarchies, and potentially unique frames of reference. Thus, computational social science requires a capacity to think in multiple ways that are at best occasionally difficult to combine and at worst fundamentally incompatible.

For example, social scientists might use a variety of traditional social science methods to research fuzzy and very human concepts like “trust”. To do so, they could use natural languages, semi-structured processes and multiple theoretical frameworks, all without needing a singular or universal definition of “trust”. However, were they to use computational methods, they would have to explicitly define how trust would be represented in a computer-readable format, perhaps as a number between 0 and 100, between –1 and 1, or as a statistical model with multiple factors, each measured numerically. Further, they might need to explicitly define what is understood as a “default level” of trust alongside clear rules by which that default value changes over time as a consequence of behaviour or interaction.
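
As a purely illustrative aside (not part of the paper), the sketch below shows one way the fuzzy concept of “trust” might be forced into a computer-readable form: a bounded numeric score with an explicit default level and explicit update rules. The class name, bounds, default and learning rate are all assumptions.

```python
# Illustrative sketch only: "trust" as a bounded score with an assumed default
# value and an explicit rule for how interactions change it over time.

from dataclasses import dataclass


@dataclass
class TrustScore:
    value: float = 0.0          # assumed default level of trust on a -1..1 scale
    learning_rate: float = 0.1  # how strongly each interaction shifts the score

    def update(self, interaction_outcome: float) -> None:
        """Move trust toward the outcome of an interaction (-1 = betrayal, 1 = cooperation)."""
        self.value += self.learning_rate * (interaction_outcome - self.value)
        self.value = max(-1.0, min(1.0, self.value))  # keep the score inside its bounds


# Example: trust rises after repeated cooperative interactions.
t = TrustScore()
for _ in range(5):
    t.update(1.0)
print(round(t.value, 3))
```
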
The disparate computational social science skills present a challenge; those drawn to social science by an affinity for abstract concepts, human communication, context-dependence or applying generalised societal knowledge might struggle to gain or use computational research skills. At the same time, those drawn to computational research through an intuitive grasp of how information can be structured, organised, or manipulated might become very frustrated by research topics that have ill-defined boundaries, that change meaning in different contexts, or that rely on assumptions or shared experience for interpretation.

Although challenging, this dichotomy can be at least partially resolved through careful training that fully acknowledges that some trainees may find some concepts blindingly obvious while others are totally impenetrable. This paper compares and contrasts the two independent skill sets needed for computational social science and highlights why researchers may be unevenly adept at each, especially if they have already established a history of work in one or the other. It reviews the evidence underpinning pedagogical approaches for developing computational skills, presents a framework for training future computational social scientists in the basics of both skill sets, and lays out some considerations for further training and skill development.



The (ro)bots are coming! Detecting, preventing, and remediating bots in surveys

Dr Ashley Amaya (RTI International) - Presenting Author
Dr Stephanie Eckman (RTI International)
Dr Craig Hill (RTI International)
Mr Ron Thigpen (RTI International)

Non-probability surveys have made a resurgence, as researchers demand more data on rarer populations, web access becomes more prevalent, and data collection costs rise. Many non-probability surveys are open to anyone and recruit respondents via posts on social media or crowdsourcing platforms. While these recruitment methods offer several advantages, they also create risks to data quality.

One risk is the introduction of fraudulent data into the survey by bots. Individuals or groups may build or buy a bot to complete the survey and collect the incentive. Limited research has been conducted to quantify the proportion of responses that are fraudulent, but the proportion is hypothesized to be sizable.

Some researchers or survey platforms implement methods designed to minimize the number of fraudulent responses. However, little research has been conducted on best practices 1) to deter bots from attempting to take the survey; 2) to detect bots once they have started the survey; or 3) to remediate or remove bot responses.

To advance our knowledge about the level of risk introduced by bots in surveys and provide guidance on best practices, we launched a 3x3 test in which we released three bots on three survey platforms. We tested our ability to deter, detect, and remediate bot-perpetrated fraud using methods including browser fingerprinting, honeypot questions, and captcha challenges. In this presentation, we report on the success of these methods and how it varies by bot and survey platform. We provide recommendations for reducing fraudulent data in surveys, which will interest all researchers conducting online research via social media or crowdsourcing platforms.
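
For illustration only, the sketch below shows the kind of honeypot-based screening mentioned above: a field that human respondents never see (e.g., hidden via CSS) but that simple bots fill in, combined with an assumed minimum completion time. The field names and threshold are hypothetical, not the authors' implementation.

```python
# Hypothetical sketch of honeypot-based bot screening: flag any response that
# answered the hidden field or finished implausibly quickly.

def flag_suspect_responses(responses):
    """Return response IDs that answered the honeypot field or finished too fast."""
    flagged = []
    for r in responses:
        answered_honeypot = bool(r.get("honeypot_answer"))
        too_fast = r.get("duration_seconds", 9999) < 60  # assumed minimum plausible duration
        if answered_honeypot or too_fast:
            flagged.append(r["response_id"])
    return flagged


# Example usage with toy data
responses = [
    {"response_id": 1, "honeypot_answer": "", "duration_seconds": 540},
    {"response_id": 2, "honeypot_answer": "yes", "duration_seconds": 35},
]
print(flag_suspect_responses(responses))  # -> [2]
```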

Representativeness and weighting of web archive data

Mr Matouš Pilnáček (Institute of Sociology of the Czech Academy of Sciences)
Ms Paulína Tabery (Institute of Sociology of the Czech Academy of Sciences) - Presenting Author
Mr Martin Šimon (Institute of Sociology of the Czech Academy of Sciences)

Web archives that capture the changing nature of the Internet are potentially a useful source of data for social scientists. However, the size of the current web is enormous, and its content is changing so quickly that it is impossible to archive the entire web. Therefore, the representativeness of the archived data is a crucial issue for researchers in social sciences using data from web archives.
This paper tests the representativeness of archived data using an experiment. Randomly selected second-order domains under the first-order Czech domain are downloaded first in full and then in several variations of partial download that imitate different crawler settings used by web archives. We then compare the distributions of various characteristics that might interest researchers, such as word frequency, readability score or text sentiment, across the different data collection methods.
In the second step, we determine the distribution of the key weighting variables of the second-order domains and web pages (domain size, filename extensions, page depth, etc.) from the complete data. Based on these distributions, we adjust the representativeness of the incomplete data using weighting techniques known mainly from survey methodology. After this procedure, we compare the distributions of the target variables between weighted and unweighted data. The results of the experiment allow better insight into the possibilities of inference from data archived in web archives.
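
As a hedged illustration of the weighting step described above, the sketch below post-stratifies pages from a partial crawl so that their distribution over one assumed weighting variable (page depth) matches the complete crawl; the authors' actual variables and procedure may differ.

```python
# Minimal post-stratification sketch: weight pages from a partial crawl so that
# their distribution over `var` matches the distribution in the complete crawl.

import pandas as pd


def poststratify(partial: pd.DataFrame, full: pd.DataFrame, var: str) -> pd.DataFrame:
    """Attach a weight to each row of `partial` so its distribution over `var` matches `full`."""
    target = full[var].value_counts(normalize=True)       # distribution in the complete crawl
    observed = partial[var].value_counts(normalize=True)  # distribution in the partial crawl
    weights = (target / observed).rename("weight")        # weight for each category of `var`
    return partial.merge(weights.to_frame(), left_on=var, right_index=True, how="left")


# Toy example: the partial crawl over-represents shallow pages (depth 1).
full = pd.DataFrame({"depth": [1] * 50 + [2] * 30 + [3] * 20})
partial = pd.DataFrame({"depth": [1] * 70 + [2] * 20 + [3] * 10})
weighted = poststratify(partial, full, "depth")
print(weighted.groupby("depth")["weight"].first())
```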

Topic Identification in Web Archives

Mr Matouš Pilnáček (Institute of Sociology of the Czech Academy of Sciences) - Presenting Author
Mr Jan Lehečka (Faculty of Applied Sciences of the University of West Bohemia)
Ms Paulína Tabery (Institute of Sociology of the Czech Academy of Sciences)
Mr Pavel Ircing (Faculty of Applied Sciences of the University of West Bohemia)

The content of web archives can serve as a useful source of data that helps researchers in social sciences draw a picture of the dynamic change of contemporary society and its communication. However, entering web archives and analyzing their content might be a difficult task for social scientists because the data are incredibly voluminous and heterogeneous. Therefore, it is worth considering whether the web archive infrastructure should also provide web page topic identification that enables thematic selection of data or tracking thematic trends over time.
Using the example of an interface for the Czech web archive, this paper focuses on the development of such automated web page topic identification through supervised learning. The aim is to create a classifier that assigns one or more of the 658 topics to any web page from a web archive. The paper presents an experiment that attempts to find the best source of externally annotated data suitable for training such a classifier. Three training data sources are available: a) news articles, b) articles from Wikipedia, and c) websites found by the Google search engine for specified keywords.
We are currently evaluating the classifier's performance on a subset of 15 environmental topics only, comparing the automatically assigned topics with the manually assigned ones. The results will help us select the appropriate data source(s) for training a classifier able to assign all 658 topics.
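
The sketch below is a minimal stand-in for the kind of supervised topic classifier described above, using TF-IDF features and logistic regression on toy data; the study's actual model, features, training sources and 658-topic label set are not reproduced here.

```python
# Hedged sketch of a supervised page-topic classifier. In the study, training
# labels would come from news articles, Wikipedia, or Google search results for
# topic keywords; the real task is also multi-label (a page may get several of
# the 658 topics), which would need e.g. a one-vs-rest wrapper.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: (page text, topic label)
texts = [
    "air pollution limits in cities",
    "recycling of plastic waste",
    "football league results",
    "hockey championship final",
]
labels = ["environment", "environment", "sport", "sport"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["new rules for waste sorting and recycling"]))
```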

Can recurrent neural networks code interviewer question-asking behaviors across surveys?

Mr Jerry Timbrook (University of Nebraska-Lincoln) - Presenting Author

Survey researchers commonly use behavior coding to identify whether or not interviewers read survey questions exactly as worded in the questionnaire (a potential source of interviewer variance). This method allows researchers and data collection agencies to identify: 1) interviewers who regularly deviate from exact reading and need re-training (Fowler and Mangione 1990), and 2) survey questions that may need to be revised because interviewers deviate from exact reading when asking them (Oksenberg et al. 1991). However, manually behavior coding an entire survey is expensive and time-consuming.

Recurrent Neural Networks (RNNs; a machine learning technique) have been used to partially automate this coding process to save time and money (Timbrook and Eck 2019). These RNNs demonstrated reliability comparable to humans when coding question-asking behaviors on many items within one survey. However, these networks were trained using data collected under a particular set of essential survey conditions (e.g., a particular organization fielded the survey during a specific time frame using a particular group of interviewers). It is unknown if these trained networks can be used to code question-asking behaviors on the same questions administered under different essential survey conditions.

In this paper, I use RNNs that are trained using data from the Work and Leisure Today 2 telephone survey (WLT2; AAPOR RR3=7.8%) to code question-asking behaviors on the same items in a different telephone survey fielded at a different time, Work and Leisure Today 1 (WLT1; AAPOR RR3=6.3%). To create the RNN training dataset, human behavior coders use transcripts from 13 items (n=9,215 conversational turns) in WLT2 to identify when interviewers asked questions: 1) exactly as worded, 2) with minor changes (i.e., changes not affecting question meaning), or 3) with major changes (i.e., changes affecting question meaning). Using these same transcripts, I train RNNs to classify interviewer question-asking behaviors into these same categories. Then, I use these WLT2-trained RNNs to code question-asking behaviors on the same 13 questions administered in WLT1 (n=4,678 conversational turns). Undergraduate research assistants also code these same WLT1 question-asking behaviors (i.e., a dataset that represents the traditional method of human behavior coding).
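
For readers unfamiliar with the technique, the sketch below outlines a generic recurrent classifier for three-way coding of question-asking turns; the framework, layer sizes and vocabulary size are assumptions for illustration, not the author's model.

```python
# Illustrative recurrent classifier for coding interviewer question-asking turns
# into three categories: exact reading, minor change, major change.

import tensorflow as tf

VOCAB_SIZE = 10000   # assumed vocabulary size for the tokenized transcripts
MAX_LEN = 60         # assumed maximum turn length in tokens
NUM_CLASSES = 3      # exact reading / minor change / major change

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 64),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training would use tokenized turns padded to MAX_LEN (integer sequences) with
# integer labels 0-2; a model fitted on one survey's turns could then be applied
# to another survey's turns via:
# model.fit(x_train, y_train, epochs=..., validation_split=0.1)
# predicted_codes = model.predict(x_other_survey).argmax(axis=1)
```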

All WLT1 question-asking turns were also master-coded by graduate research assistants to evaluate inter-coder reliability (i.e., kappa). I compare the reliability of RNN coding on WLT1 versus the master coders to the reliability of human undergraduate coding on WLT1 versus the master coders. Preliminary results indicate that for a question asking about the number of individuals in a respondent’s household, an RNN trained on WLT2 data coded WLT1 question-asking behaviors for this item with reliability equal to the human undergraduate coders (p<.05). I conclude with implications for using RNNs to behavior code across different telephone surveys.
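
As a minimal illustration of the reliability comparison, Cohen's kappa can be computed as below; the codes shown are toy values, not the study's data.

```python
# Toy example of comparing two coders' reliability against master codes.

from sklearn.metrics import cohen_kappa_score

# 0 = exact, 1 = minor change, 2 = major change (toy codes, not real data)
master_codes = [0, 0, 1, 2, 0, 1]
rnn_codes = [0, 0, 1, 2, 1, 1]
human_codes = [0, 1, 1, 2, 0, 1]

print("RNN vs. master:  ", cohen_kappa_score(master_codes, rnn_codes))
print("Human vs. master:", cohen_kappa_score(master_codes, human_codes))
```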