The role of theory in the research process differs across the five schools of computational social science (see Table 1.1). For some, theory was not the main focus of the computational transformation (data- and methods-oriented approaches); others sought to borrow a research paradigm from other disciplines (for example, using physics-based models). As a social scientist, I have been taught to think of theory as a more important activity: for me, it is critical to integrate research with theory. (This holds for any and all meanings of theory: disciplines within the social sciences and computational social science may have different ideas about what constitutes theory.) However, there are many legitimate ways to achieve this integration.
Within the social sciences, theory can be integrated with empirical research either by driving the research or by building new theory based on the results. Theory-driven research is often seen as deductive: hypotheses or explicit research questions are defined and the data are used to answer them. Theory-building research is often a more open-ended process and is seen as inductive: evidence is gathered during the research to make proposals about the phenomena. While many scholars have strong opinions about both approaches, neither exists in such a clear form when research is actually done. Deductive researchers may have many different hypotheses, or they may conduct additional analysis post hoc, after they have already answered their research question or hypothesis, and thus appear more inductive. Inductive researchers may drive the analysis towards particular academic discussions, and extensive theoretical work might take place ex ante, before any empirical work has taken place, and thus appear more deductive. Computational social sciences seem to have elements of both inductive and deductive research processes: their co-existence is acknowledged. Moreover, computational social sciences have pushed these tensions further and made it possible to discuss and challenge them. For example, Kitchin (2014) describes how “data-driven science seeks to hold to the tenets of the scientific method, but is more open to using a hybrid combination of abductive, inductive and deductive approaches to advance the understanding of a phenomenon”, while Halford and Savage (2017) highlight how computational social sciences establish “a new mode of argumentation that reconfigures the relationship between data, method and theory.” These observations invite us to consider new ways of doing research, relaxing the traditionally strong division between theory-first (deductive) and theory-last (inductive) thinking.
However, even with this freedom, academics often build on previous traditions and scholarship. In these cases, the traditions may also establish how the topic should be approached. For example, in media and communication studies, the analysis of media frames is a well-established procedure (e.g., Gamson, 1989). Here computational methods, such as machine learning techniques, can make the analysis process more efficient (among others Burscher et al., 2014, 2016) by increasing the scale of datasets, decreasing the time required for the analysis, or decreasing costs, often all three together. However, classification is not the only area where researchers can build on traditions and scholarship. Similarly, exploring under which conditions societies achieve democratic leadership (Jurek and Scime, 2014) is a well-established research domain in political science. The research goals are the same as in the established tradition even when advanced algorithmic data analysis is used. More qualitatively oriented approaches, like grounded theory, can also provide a baseline that is mimicked using computational methods (whether this succeeds is still an open question; e.g., Muller et al., 2016; Baumer et al., 2017; Nelson et al., 2018). Sometimes the role of theories is also to define the research process: the traditional process is translated into computation through computational thinking and, finally, into an algorithm. In these cases, computational social scientists primarily follow established practices but utilise computational methods in different stages of the research process to improve scholarship.
Another opportunity is to see computational methods as a tool to push boundaries and rethink traditions, and even how problems are formulated. For example, simulation models can be used for curiosity-driven theoretical inquiry while forcing these thoughts to be expressed in a formal manner. Network analysis serves as a lens to examine any phenomenon through nodes and their connections. One of the shifts in the traditions is a reconsideration of the role of theory more broadly. McFarland et al. (2016) highlight how more descriptive studies are legitimate, calling them computational ethnography. They refer to a discovery-oriented process, where “an initial hunch or intuition” about social processes invites scholars to process the data, seeking interesting patterns and results. McFarland et al. (2016) even suggest that these results be presented the way ethnographers present observations and patterns to demonstrate their thinking process. Sometimes the boundary-pushing can be a smaller step: proposing a new perspective on how to approach the topic. When examining the affective economy, Hokka and Nelimarkka (2020) translated the question into a network analysis challenge. This slightly evolved, conceptually, the idea that affective content is sticky: affect was understood by examining, on a large scale, the content that circulated.
To summarise, computational social sciences provide many ways of doing science: either mimicking established work and translating its ideas into an algorithm, or being more open-ended about the research process and its translations. The difference between these relates mostly to how the argumentation is presented. If the work focuses on translating conceptual tools, then care should be taken to ensure that they are translated correctly and that the operationalisation holds. If the aim is to push boundaries, then careful attention to what is actually done, and reflection on what can be said based on it, is required.
However, common to all formulations is the importance of guiding concepts, or hunches and intuition, in the process: these methods are versatile and can give different meanings to results. At the same time, looking for a needle in a haystack is difficult: the data can be large and require time and effort to manipulate, and the methods may have many different parameters and tuning opportunities. Using these methods exploratively in the hope that something interesting emerges can therefore be difficult. To avoid fully open exploration, scholars can use something less strong than theories and hypotheses but more than nothing. I call them concepts. It is easy to say that the work is going to be about the media agenda, but this focus already rules out many other kinds of interesting or potentially more fruitful perspectives on the data. A valid question is how to choose the concepts to focus on. The idea of sociological imagination can help here: being able to think between individual and sociological levels and to see the connections between them. When thinking of the concepts, I also try to envision what the outcomes might look like and what could be said about them. This requires some technical competence to know what is possible, what is easy and what is difficult. For example, in Hokka and Nelimarkka (2020) we envisioned the network plots when thinking about the research problem: would it be interesting to see circulation within the groups and examine the international connections each group had? Naturally, at this stage we did not have the results ready, but even the idea of seeing the connections was interesting. In Pantti et al. (2019), the idea of comparing different media using something like Table 3 was present early in the process, followed by ideas about timelines. Again, while we did not know what the results would be, the idea of seeing differences in discursive contexts seemed interesting to all of us.
Guiding concepts help to narrow the research, and thinking about what the results might look like can help to navigate the research process.