The main research question driving Levy and Franklin (2014) is whether citizens and interest group organisations raise different concerns during public policy discussions. They studied a total of 3569 comments made at different public hearings for U.S. trucking regulations, together with metadata indicating whether each comment originated from a citizen or an interest group. To answer their question, they needed to know whether the two groups differ in the kinds of issues they raise in their comments.
To understand the potential issues raised in the comments, they applied unsupervised machine learning to identify eight different themes from the comments. During the process they ran `multiple instances of the model with different numbers of topics, starting from randomly selected seeds.' In other words, they did not fit a single unsupervised model for the analysis. They fitted several and then, `for interpretability', selected the eight-theme solution. After creating the eight-theme classification, they returned to analyse each document. The unsupervised machine learning method allowed them to determine how much of each theme was represented in each comment, so a comment could be, for example, 80% theme 1 and 20% theme 2. Because the metadata recorded whether a comment was written by an individual citizen or on behalf of an organisation, they could evaluate the degree to which citizens and organisations raised different themes by computing, for each group, the average share of each of the eight themes across that group's comments.
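The final averaging step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `mean_theme_share_by_group`, the group labels, and the toy numbers are all invented for the example, and it uses two themes instead of eight to keep the data readable. It assumes each comment has already been assigned a list of theme shares that sum to 1.

```python
from collections import defaultdict


def mean_theme_share_by_group(theme_shares, groups):
    """Average each theme's share over the comments belonging to each group.

    theme_shares: one list of theme proportions per comment.
    groups: one group label per comment, in the same order.
    """
    sums = {}
    counts = defaultdict(int)
    for shares, group in zip(theme_shares, groups):
        if group not in sums:
            sums[group] = [0.0] * len(shares)
        # Add this comment's shares to the running total for its group.
        sums[group] = [total + s for total, s in zip(sums[group], shares)]
        counts[group] += 1
    # Divide each running total by the number of comments in the group.
    return {g: [t / counts[g] for t in totals] for g, totals in sums.items()}


# Toy data, invented for illustration: 4 comments, 2 themes.
theme_shares = [
    [0.8, 0.2],  # written by a citizen
    [0.6, 0.4],  # written by a citizen
    [0.1, 0.9],  # written on behalf of an organisation
    [0.3, 0.7],  # written on behalf of an organisation
]
groups = ["citizen", "citizen", "organisation", "organisation"]

averages = mean_theme_share_by_group(theme_shares, groups)
for group, shares in averages.items():
    print(group, [round(s, 2) for s in shares])
```

In this toy data, citizens devote on average about 70% of their comments to theme 1, while organisations devote about 80% to theme 2. Comparing such group averages theme by theme is the kind of evidence the study uses to judge whether the two groups raise different concerns.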