The second perspective highlights computational social science through its novel methods. This area seems to be emphasised by statisticians and methodologists. For example, Cioffi-Revilla (2010) states that `automated information extraction systems, social network analysis, social geographic information systems (GIS), complexity modelling, and social simulation models' are the basis for computational social sciences. He says that computational social science is analysis conducted `through the medium of advanced computational systems'. The examples discussed above show how the analysis is conducted through such systems; thus, computational social science is framed through the use of such methods. He situates computational social science within a longer history of using computing technology in quantitative research, going back to the Statistical Package for the Social Sciences (SPSS) in the 1960s. Hindman (2015) similarly suggests that previously used and standardised approaches in the social sciences, such as ordinary least squares regression, perform worse than more advanced machine learning methods. He argues that methods familiar from big data analysis ought to be applied to mid-to-small-sized data as well. Hindman (2015, 49) states that `machine learning methods now provide us with better alternatives. [- -] [M]ost data sets should be subjected to the kind of ensemble machine learning techniques now standard in computer science'. This perspective bears similarity to the data-driven perspective (Lazer et al., 2009), but the core idea is not novel data but advanced and more rigorous methods that allow closer investigation than before.
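Hindman's argument can be sketched with a minimal example: when the data contain a non-linear signal, an ensemble learner can outperform ordinary least squares even on a modest sample. The synthetic data, scikit-learn, and the random forest learner here are my illustrative assumptions, not Hindman's actual analysis.

```python
# Illustrative sketch: OLS vs an ensemble learner on synthetic data
# with a non-linear signal. Data and model choices are assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))
# Outcome depends non-linearly on the predictors, plus small noise
y = np.sin(X[:, 0]) * X[:, 1] + rng.normal(scale=0.1, size=500)

ols = LinearRegression()
forest = RandomForestRegressor(n_estimators=200, random_state=0)

# 5-fold cross-validated R^2 for each model
ols_r2 = cross_val_score(ols, X, y, cv=5).mean()
forest_r2 = cross_val_score(forest, X, y, cv=5).mean()
print(f"OLS mean R^2:    {ols_r2:.2f}")
print(f"Forest mean R^2: {forest_r2:.2f}")
```

The point is not that ensembles always win, but that cross-validated comparison makes the performance gap on such data directly visible.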
An aspect not extensively discussed by the above authors is the application of novel methods to qualitative data. By understanding `text as data' (Grimmer and Stewart, 2013), social scientists are increasingly studying textual data using novel methods emerging from computational research communities. As discussed above, automated text and image classification allows analysing text and image data at scales not previously accessible to researchers (Bosch et al., 2019; Gonzalez-Bailon and Paltoglou, 2015; Rossini et al., 2018). Beyond that, automated methods can be used to analyse data without known or a priori categories (Grimmer and Stewart, 2013): the method can generate categories through data-driven analysis.
For example, Farrell (2016) studied polarisation in the climate change debate. He examined whether actors who received corporate funding differed in background from those who did not. His data consisted of `every text about climate change produced by every organisation between 1993 and 2013', amounting to over 40,000 texts. Using automated methods, he identified 26 thematic categories in the data. Through an analysis of these data, he claimed that `corporate funding influences the actual thematic content of these polarisation efforts, and the discursive prevalence of that thematic content over time'. This gave novel insights into the study of polarisation.
Similarly, a study of democratic leadership in states (Jurek and Scime, 2014) is an example of applying novel methods to address a research question. As elaborated above, the authors studied which types of rules matched up with democratic leadership, as indicated in the Freedom House data set. The rules emerged from the data through an association rules approach, a machine learning technique used to generate candidate rules and explore their fit to the data set. This paper serves as an example of a small data study applying novel methods: the data set consisted of countries and fewer than 10 variables. This emphasises the ability of these methods to address questions and themes even with small data sets and to generate insights into the data.
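The spirit of the association rules approach can be sketched in a few lines: from a small set of country records with binary attributes, enumerate candidate rules `A -> B' and keep those with sufficient support and confidence. The records, attribute names, and thresholds below are invented for illustration; this is not Jurek and Scime's actual data or algorithm.

```python
# Hand-rolled association-rules sketch: generate rules "A -> B" and
# keep those meeting support/confidence thresholds. Data is invented.
from itertools import combinations

# Hypothetical country records: attribute -> True/False
data = [
    {"free_press": True,  "free_elections": True,  "democratic": True},
    {"free_press": True,  "free_elections": True,  "democratic": True},
    {"free_press": False, "free_elections": True,  "democratic": False},
    {"free_press": True,  "free_elections": False, "democratic": False},
    {"free_press": False, "free_elections": False, "democratic": False},
]

def support(items):
    """Share of records in which every attribute in `items` is True."""
    return sum(all(r[a] for a in items) for r in data) / len(data)

rules = []
attrs = list(data[0])
for lhs_size in (1, 2):
    for lhs in combinations(attrs, lhs_size):
        for rhs in attrs:
            if rhs in lhs:
                continue
            supp = support(lhs + (rhs,))   # support of the full rule
            base = support(lhs)            # support of the antecedent
            if base == 0:
                continue
            conf = supp / base             # confidence of the rule
            if supp >= 0.4 and conf >= 0.9:  # illustrative thresholds
                rules.append((lhs, rhs, supp, conf))

for lhs, rhs, supp, conf in rules:
    print(f"{' & '.join(lhs)} -> {rhs} (support={supp:.2f}, conf={conf:.2f})")
```

On this toy data the mined rules include `free_press & free_elections -> democratic', showing how candidate rules emerge from the data rather than being specified in advance.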
Put together, there is a fair share of truth in this perspective, as with the data perspective presented above. Methods for computational social science can execute analysis on larger scales of data, and faster, than traditional manual methods. Some methodological approaches give `agency' to the computing technologies to explore data in fashions similar to manual inductive research methods. However, computational methods do not simply replace manual efforts: for example, the grounded theory method and the common computational approaches used to derive categories from data differ as research processes (Baumer et al., 2017). Through these methods, researchers can conduct data analysis on the above-mentioned large-scale data sets. However, the scope is larger through this lens: the focus on methods, instead of data, does not limit these methods to digital trace data but extends them to various other sources, including small register-based data sets or non-digital data sets. It has even been argued that these methods excel in this area (Hindman, 2015).
However, I argue that this perspective has its limitations, like the data perspective above. First, the earlier discussion on the role of social science theory is a relevant critique of this perspective as well. Second, the focus on computing can become a slippery slope. Many traditional techniques for statistical analysis are computationally based but have also been reinvented thanks to advanced computational systems. For example, bootstrapping techniques, where synthetic data sets are created by resampling the observed data to estimate the sampling distribution of a statistic, are replacing comparisons to traditional normal or t distributions. What, if any, differences are there between advanced quantitative research methods and computational research methods? Furthermore, some of the suggested methods were used extensively before any advanced computational infrastructures existed, especially in the area of network analysis.
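The bootstrapping example illustrates the blurred line: the technique is conceptually old statistics, but it is routine computation today. A minimal sketch, using only the Python standard library and a synthetic skewed sample of my own invention:

```python
# Bootstrap sketch: resample the observed data with replacement to
# build the sampling distribution of the mean empirically, instead of
# comparing to a theoretical normal or t distribution. Data is synthetic.
import random
import statistics

random.seed(1)
sample = [random.lognormvariate(0, 1) for _ in range(200)]  # skewed data

boot_means = []
for _ in range(2000):
    resample = random.choices(sample, k=len(sample))  # with replacement
    boot_means.append(statistics.mean(resample))

boot_means.sort()
lo, hi = boot_means[49], boot_means[1949]  # ~95% percentile interval
print(f"mean = {statistics.mean(sample):.2f}, 95% CI ~ ({lo:.2f}, {hi:.2f})")
```

The entire procedure is a loop over resamples; what was once computationally prohibitive is now a few lines, which is exactly why the boundary between `quantitative' and `computational' methods is hard to draw.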