Storing a network for computation requires a format that expresses both nodes and ties. This is the only meaningful way to capture a network for computational analysis. There are two main formats to represent networks: edge lists and adjacency matrices (see Figure 4.3 for examples). An edge list is a list of all ties in the network, and for each tie we identify the two nodes that the tie connects. An adjacency matrix instead presents the network in a table format. The rows and columns correspond to nodes in the network. The value of a cell depends on whether there is a tie between the node indicated by the column and the node indicated by the row. In my experience, the storage format is not that important, especially if the network is fairly small. However, sometimes it is easier to transform the data into an adjacency matrix, and other times it is easier to form an edge list from the data.
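As a minimal sketch of the two formats, the Python snippet below stores a small undirected network as an edge list and converts it to an adjacency matrix and back. The node names and the helper functions `edge_list_to_adjacency` and `adjacency_to_edge_list` are made up for illustration and are not tied to any particular library; the sketch also assumes unweighted, undirected ties.

```python
# Hypothetical undirected network with four nodes; names are illustrative only.
edge_list = [("Ann", "Bob"), ("Ann", "Cat"), ("Bob", "Cat"), ("Cat", "Dan")]


def edge_list_to_adjacency(edges):
    """Return (nodes, matrix); matrix[i][j] is 1 if nodes i and j share a tie."""
    nodes = sorted({node for edge in edges for node in edge})
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # undirected: mirror the tie
    return nodes, matrix


def adjacency_to_edge_list(nodes, matrix):
    """Recover the ties from the upper triangle of a symmetric 0/1 matrix."""
    return [
        (nodes[i], nodes[j])
        for i in range(len(nodes))
        for j in range(i + 1, len(nodes))
        if matrix[i][j]
    ]


nodes, matrix = edge_list_to_adjacency(edge_list)
for name, row in zip(nodes, matrix):
    print(name, row)
```

Both representations carry the same information here, which is why the choice between them is mostly a matter of convenience for small networks. Note that a node without any ties cannot appear in an edge list alone, whereas it still has a row and a column in an adjacency matrix.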
To represent a network in either of these formats, one first needs to build it from data. There are many ways to collect such data. The versatility of the network paradigm means that many different phenomena can be explored as a network.
Many researchers have utilised surveys when exploring social interaction (for classical and contemporary examples, see Cross et al., 2002; Bearman et al., 2004; Wellman, 1979; Laumann, 1973; Reagans et al., 2004). However, it is not trivial to ask people about their social contacts. McCallister and Fischer (1978) suggested that one must be methodical about how to collect networks. They and others highlight how different question formulations may lead to a collection of different types of networks (for organisational studies, see Cross et al., 2002). For example, a social network could express friendships or collaboration depending on how the question is formulated in the survey. For small groups, it is possible to ask for a complete network, that is, everyone's relationship with everyone else in the group. For a group of size N, this requires N − 1 questions for each person, such as: ‘How close do you feel with N. N.?’ (with a Likert scale). As different questions may capture different theoretical concepts, you may ask, for example, three different questions and ask each person to rate all the other people on all of them. For a group of 10 people, this would already be 27 different questions for each person, a lot for any survey. Alternatively, researchers ask for social connections through name generators, which are questions that help people think about relevant contacts. Formulations for name generators can include priming questions, such as listing all persons ‘who would care for your home if you went out of town?’ (McCallister and Fischer, 1978) or ‘with whom you talk about personal matters’, or even asking about access to resources, such as ‘who could fix your car?’ (Van Der Gaag and Snijders, 2005). These questions and their varied forms show that surveying social networks requires careful consideration.
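The survey burden of a complete-network design can be worked out with a few lines of Python. This is only a sketch and assumes that every respondent rates every other group member once per question; the function names are hypothetical.

```python
def questions_per_respondent(group_size, n_questions=1):
    """Items one person answers: each question asked about every other member."""
    return n_questions * (group_size - 1)


def total_ratings(group_size, n_questions=1):
    """Ratings collected across the whole group."""
    return group_size * questions_per_respondent(group_size, n_questions)


print(questions_per_respondent(10, 3))  # 27 items for each respondent
print(total_ratings(10, 3))             # 270 ratings in total
```

Because the number of items grows with the group size for each respondent, and roughly with its square for the whole study, complete-network surveys quickly become impractical for larger groups.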
Nowadays, similar types of data about social relationships can be collected through digital traces and sensors. These include call records, messaging, social media activity and even with whom a person hangs out (Karikoski and Nelimarkka, 2010; Raento et al., 2005; Eagle and Pentland, 2006). However, these different data sources naturally represent different perspectives on one's social contacts. Analysis results may differ depending on which data source was used to establish ties between individuals (Karikoski and Nelimarkka, 2010). Figure 4.4 shows how an organisational network differs depending on which sources of data are used to create it. Each layer in the figure shows an individual data source. Furthermore, the use of a particular communication channel is also culturally bounded. For example, telephone calls were used among very good friends while face-to-face interaction was used more among acquaintances, at least by teens in Belgium in the late 2000s (Van Cleemput, 2010). These limitations must be accounted for when collecting social network data. The inferred social network is only as good as the data: the common rule of garbage in, garbage out holds for networks as well.
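To illustrate how the choice of data source changes the inferred network, the sketch below compares the ties found in two layers. The data are invented, and the Jaccard overlap is just one simple comparison chosen here for illustration; it is not a measure taken from the studies cited above.

```python
# Hypothetical ties observed in two data sources for the same group of people.
layers = {
    "calls": [("Ann", "Bob"), ("Bob", "Cat")],
    "messages": [("Bob", "Ann"), ("Cat", "Dan"), ("Ann", "Cat")],
}


def tie_set(edges):
    """Treat ties as undirected so that (Ann, Bob) equals (Bob, Ann)."""
    return {frozenset(edge) for edge in edges}


calls = tie_set(layers["calls"])
messages = tie_set(layers["messages"])

shared = calls & messages          # ties present in both layers
only_calls = calls - messages      # ties visible only in call records
only_messages = messages - calls   # ties visible only in messaging

# Share of all observed ties that both sources agree on.
jaccard = len(shared) / len(calls | messages)
print(len(shared), len(only_calls), len(only_messages), jaccard)
```

In this toy example, only one of the four observed ties appears in both layers, so an analysis based on calls alone would describe a rather different network than one based on messages.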
Beyond social networks, other kinds of data can be shaped into networks for analysis. We already discussed many ways that texts can be transformed into networks, seeking relationships between actors and vocabulary use (Bail, 2016) or a collection of words in the documents (Baumer et al., 2018). Real-world networks (e.g. flight routes and train lines) often require less transformation and consideration before network analysis.
The final opportunity when building a network is to combine the relational networked data with other types of data that describe the nodes or ties. Friendship networks could include attributes such as age or gender; the number of runways might describe airports in a flight route network; and organisational position and tenure could be relevant for studying networks in organisations. These attributes can support analysis or visualisation. For example, in an organisational context, one might study the relationship between tenure and degree in the network. To do so, researchers must be able to connect an individual node (and its degree) to another data set where employees' tenures are listed. There are several ways that this can be achieved in practice, depending on how data are stored and how algorithms are implemented. In any case, additional attributes can be used in a versatile manner to support and enrich the analysis. They also create additional dimensions for the network analysis.
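One possible way to connect node degrees to a separate attribute data set is sketched below in Python. The employee names, the `tenure_years` lookup and the resulting `node_table` are hypothetical, and the sketch assumes that both data sets identify people by the same name.

```python
from collections import Counter

# Hypothetical collaboration ties and a separate attribute data set.
edge_list = [("Ann", "Bob"), ("Ann", "Cat"), ("Bob", "Cat"), ("Cat", "Dan")]
tenure_years = {"Ann": 2, "Bob": 5, "Cat": 9, "Dan": 1}

# Degree: the number of ties each node takes part in.
degree = Counter(node for edge in edge_list for node in edge)

# Join the two data sets on the node name; a Counter returns 0 for
# employees who have no ties, so isolates are kept in the table.
node_table = [
    {"node": node, "degree": degree[node], "tenure": tenure_years[node]}
    for node in sorted(tenure_years)
]

for row in node_table:
    print(row)
```

Once degree and tenure sit in the same table, their relationship can be examined with ordinary statistical tools. In practice, the hard part is often making sure that node identifiers match across the data sets.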