The simplest approach to algorithmic data analysis is rule-based. Here, researchers themselves define the rules of the analysis, which requires them to operationalise the phenomena under investigation using computational thinking. In practice, this often leads to many if--then statements. Each rule must be a logical statement that is either true or false, so each variable must be dichotomised to yield such a clear outcome.
Rule-based approaches are readily applied to computational data selection and filtering.
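As a minimal sketch of what dichotomising a variable can look like in code, the following Python snippet turns a continuous measure into a true/false rule. The variable (post length in words), the threshold of 50 and the function name `is_long_post` are hypothetical illustrations and are not taken from any study discussed here.

```python
# Hypothetical threshold: a post counts as "long" at 50 words or more.
LONG_POST_THRESHOLD = 50


def is_long_post(text: str) -> bool:
    """Dichotomise post length: True if the post is long, False otherwise."""
    word_count = len(text.split())
    return word_count >= LONG_POST_THRESHOLD


# Two toy posts: one with two words, one with sixty words.
posts = ["Short remark.", "word " * 60]
labels = [is_long_post(post) for post in posts]
print(labels)
```

The choice of threshold is itself a research decision: moving it changes which posts satisfy the rule, so it would need to be justified like any other operationalisation.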
For example, Nelimarkka et al. (2020) needed to select social media posts that demonstrated interaction between a candidate and a constituent during an election period.
This was defined as a Facebook post and comment threads or Twitter message threads that
. The authors regarded the last criterion as essential for ensuring that interaction occurred between the candidate and the constituent. Each of the three criteria could be measured computationally (see Example 3.1).
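To illustrate how several true/false rules can be combined into a data filter, here is a small Python sketch. It does not reproduce the criteria of the study above: the three rules, the `author_role` field and all function names are hypothetical placeholders chosen only to show the general pattern.

```python
from typing import Callable, Dict, List

# A thread is a chronologically ordered list of messages.
# The "author_role" field is a hypothetical annotation, not a platform field.
Thread = List[Dict[str, str]]
Rule = Callable[[Thread], bool]


def has_candidate_message(thread: Thread) -> bool:
    """Hypothetical rule 1: a candidate wrote at least one message."""
    return any(m["author_role"] == "candidate" for m in thread)


def has_constituent_message(thread: Thread) -> bool:
    """Hypothetical rule 2: a constituent wrote at least one message."""
    return any(m["author_role"] == "constituent" for m in thread)


def candidate_replies_after_constituent(thread: Thread) -> bool:
    """Hypothetical rule 3: a candidate message follows a constituent message."""
    seen_constituent = False
    for m in thread:
        if m["author_role"] == "constituent":
            seen_constituent = True
        elif m["author_role"] == "candidate" and seen_constituent:
            return True
    return False


RULES: List[Rule] = [
    has_candidate_message,
    has_constituent_message,
    candidate_replies_after_constituent,
]


def select_threads(threads: List[Thread], rules: List[Rule] = RULES) -> List[Thread]:
    """Keep only the threads for which every rule evaluates to True."""
    return [t for t in threads if all(rule(t) for rule in rules)]


cand = {"author_role": "candidate"}
const = {"author_role": "constituent"}
threads = [
    [cand, const, cand],  # candidate replies after a constituent
    [cand, const],        # constituent comments, candidate never replies
    [const, const],       # no candidate involved
]
selected = select_threads(threads)
print(len(selected))
```

Keeping each rule as a separate function makes the operationalisation explicit: a reader can inspect, challenge or replace a single criterion without touching the rest of the filter.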
A rule-based approach is also common in simple computational text analysis. Dictionary-based methods depend on researcher-defined lists of words (or dictionaries) to quantify the text. For example, simple tools for quantifying the emotional value of texts are based on expert-curated lists of words and their emotional values. An expert could determine that words like `happy' and `joyful' have a positive emotional value and words like `sad' and `unhappy' have a negative emotional value. The emotional value of the sentence `I am happy and joyful to learn computational social science but sad that it rained today' would then be measured by counting the positive words (2) and the negative words (1). Because the positive words outnumber the negative ones, the sentence is classified as positive in emotional value.
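The counting procedure described above can be sketched in a few lines of Python. The two word lists contain only the four example words from the text; the tokenisation rule, the function names and the decision to label ties as `neutral` are assumptions made for this illustration and do not describe any particular published tool.

```python
import re
from typing import Tuple

# Toy dictionary containing only the example words from the text.
POSITIVE_WORDS = {"happy", "joyful"}
NEGATIVE_WORDS = {"sad", "unhappy"}


def count_emotion_words(text: str) -> Tuple[int, int]:
    """Return (number of positive words, number of negative words)."""
    # Lowercase the text and split it into word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    return positive, negative


def classify(text: str) -> str:
    """Label a text by comparing the two counts (ties are an assumed 'neutral')."""
    positive, negative = count_emotion_words(text)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"


sentence = (
    "I am happy and joyful to learn computational social science "
    "but sad that it rained today."
)
counts = count_emotion_words(sentence)
label = classify(sentence)
print(counts, label)
```

Matching whole tokens rather than substrings matters here: a substring search for `happy' would also fire inside `unhappy' and count a negative word as positive.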
Beyond data selection and analysis, rule-based approaches are also used in the development of interactive systems (we discuss them more in chapter XXXxxx).
For example, Munger (2017) ran an experiment in which Twitter bots responded to offensive tweets with messages that sought to reduce the offensiveness.
His data selection was based on the following conditions:
. Again, a complex set of criteria determines which users the bot reacts to.
These three examples show various application areas for rule-based algorithmic analysis in the social sciences. At the same time, they illustrate the difficulty of these models: they depend entirely on domain expertise. For example, the definition of interaction could be challenged, and various additions to the list of racist slurs could be proposed. These approaches are as effective, and as limited, as the researcher who chose the criteria. The following two sections illustrate alternative approaches in which the rule-generation process is different.