In today's digital society, social scientists and computer scientists alike recognise the challenges of algorithmic decision making, ranging from discrimination and bias to opacity. These challenges also concern the computational social sciences. Some consider algorithmic discrimination and bias to be a central focus of the computational social sciences, but even for those with a different conceptualisation of the field, these concerns limit the interpretations that can be drawn from algorithmic analysis. For example, many algorithms are black boxes: it can be unclear how and why particular outcomes emerge from specific inputs. In algorithmic decision making this is a challenge, as organisations cannot justify the decisions that were made. Computing scholars have worked to develop methods and techniques that increase the transparency and explainability of these systems. However, this discussion has focused on the use of algorithms in public and private organisations, in areas where algorithmic systems interact with laypersons.
The same black-box challenge matters for scholars. If we cannot understand what an algorithm does, to what degree should we trust that it leads to correct or meaningful results? This can be partly addressed by engaging in computational thinking to help decipher algorithms. However, we often rely on tools and libraries developed by others, and these can be black boxes within our analysis: we use a tool that takes an input, performs some processing based on that input and produces an output. We may trust previous research that the tools are correct, but we rarely verify this ourselves. It may even remain unclear to us how the output is produced and what kinds of threats to validity and reliability it presents. We cannot rely on computational thinking alone in these cases: the code used in libraries or tools may be extensive or unavailable for inspection, or we may be unable to access the data. There are examples where failing to understand algorithms and their limitations, or to notice mistakes, has led to calls to retract published papers. Eklund et al. (2016) showed that for over 20 years, tools used for the statistical analysis of brain scans relied on tests whose assumptions were invalid, producing inflated false-positive rates. They argue that this may have led to wrong interpretations in several of the 40,000 published papers concerned and that some of those results are incorrect. In the ensuing scholarly discussion, the full scale and implications of the flawed statistical tests for the affected papers were debated. Similarly, Hoffman et al. (2017) showed how a politeness analysis tool was moved outside its original context, leading to mistakes. Here the cause was not directly an algorithmic black box or a lack of transparency in the analysis code; rather, a lack of clarity about the context of the algorithm led to mistakes in its use.
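One partial remedy, short of reading a tool's source code, is to probe it with inputs whose correct outputs we already know before trusting it on real data. The following is a minimal sketch of such a sanity check; the function names and the toy "politeness classifier" are hypothetical stand-ins for an opaque third-party tool, not any specific library.

```python
def sanity_check(model, labelled_cases):
    """Run a black-box model on cases whose correct output we know,
    and report any disagreements before trusting it on real data."""
    failures = []
    for text, expected in labelled_cases:
        got = model(text)
        if got != expected:
            failures.append((text, expected, got))
    return failures

# A toy stand-in for an opaque third-party classifier (hypothetical):
# it labels text "polite" only if it contains the word "please".
def toy_politeness_model(text):
    return "polite" if "please" in text.lower() else "impolite"

cases = [
    ("Could you please send the file?", "polite"),
    ("Send it now.", "impolite"),
    ("Would you kindly review this?", "polite"),  # no "please": the model fails
]

print(sanity_check(toy_politeness_model, cases))
# → [('Would you kindly review this?', 'polite', 'impolite')]
```

Even a handful of such hand-labelled cases can reveal that a tool's notion of a construct (here, politeness) diverges from our own, without any access to its internals.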
Beyond these kinds of mistakes, additional challenges emerge because many tools and libraries are not tied only to a specific algorithm. Rather, their outcomes may be determined by the data used during the development of the tool or library, as well as by the pre-processing applied to those data. Even if we could examine the code and validate that the implementation correctly follows the algorithm's ideas and its statistical routines, we cannot automatically be assured that the data used to fuel these algorithms are correct or conceptually valid. This is clearly visible in algorithmic data analysis. As discussed above, context matters in these cases. For example, classifying politeness or party affiliation from text can fail because of differences in the learning data (Yu et al., 2008; Hoffman et al., 2017). This is because machine learning models focus on predicting cases, not explaining them (Wallach, 2018). However, algorithmic data analysis is not the only method affected. If a simulation model's parameter estimates are taken from others' work, they can be a black box. If a network visualisation uses advanced libraries, those libraries are often, in practice, black boxes to their users.
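The dependence on learning data can be illustrated with a deliberately simple word-matching classifier. The sketch below uses only hypothetical toy data: a "model" trained on formal parliamentary vocabulary says nothing meaningful about casual social-media text, because the two domains share almost no words. Real classifiers fail more subtly, but for the same underlying reason.

```python
from collections import Counter

def train(docs):
    """Count word frequencies per label from (text, label) pairs."""
    counts = {}
    for text, label in docs:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def classify(counts, text, default="unknown"):
    """Score each label by how often the text's words appear in its
    training vocabulary; fall back to default when nothing matches."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default

# Hypothetical learning data drawn from formal parliamentary debate.
train_docs = [
    ("honourable members budget motion", "party_a"),
    ("honourable colleagues taxation bill", "party_b"),
]
model = train(train_docs)

# In-domain text is classified; out-of-domain text shares no vocabulary
# with the learning data, so the model cannot say anything about it.
print(classify(model, "the budget motion"))         # → party_a
print(classify(model, "lol this tax thing sucks"))  # → unknown
```

A polished library would hide the `train`/`classify` steps behind a single call, which is precisely why the mismatch between its learning data and our data can go unnoticed.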
However, these comments should not be read as criticism of using external libraries or tools in analysis. It is impractical not to use them, even though they add some opaqueness to the process. Rather, the main aims of this section are to acknowledge these factors and to cultivate the skills needed to recognise potential threats to validity. We discuss approaches to mitigating these challenges later.