Network analysis often begins with a descriptive analysis of the network: computing key descriptive numbers, much as one computes the mean, median or variance of a statistical data set. Networks can also be visualised; many examples have already been shown above. Key indicators describe either elements of the network or the network as a whole. The former describe individual nodes or individual ties, while the latter summarise all nodes or all ties.
Network analysis can use descriptive indicators to tell how connected a node is in the network. The degree measures how many connections a node has to other nodes: it is the number of ties the node has. In a directed network, we can further separate ties that originate from the node (out-degree) from ties that are destined to the node (in-degree). For example, in Figure 4.5, node A has a degree of three because it has three connections, while node E has a degree of only two. However, the degree only examines the immediate environment of a node. Figure 4.1 shows how this is problematic. Node E bridges two otherwise isolated clusters (A-D and F-I), but its degree is smaller than any other node's degree in the network.
Because of this, network scientists also use other types of centrality measurements to better capture the node's position and role in the network. There are many different ways to define centrality. For example, one can count how many of the shortest paths between pairs of other nodes pass through the node and thereby compute the node's betweenness centrality. Another definition asks what the average shortest distance is between the node and the other nodes in the network. In this closeness centrality, highly central nodes are those that are close to other nodes, that is, the fewest ties away from them. Given that these definitions differ, a valid concern is which one to use in research. There is no trivial answer to this question. Different approaches to computational social science (see Chapter 1) and different disciplines may have their own practices and preferences. My own experience with network analysis suggests that the choice should be guided by theoretical insights and the formulation of the research question.
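The contrast between degree and closeness can be sketched in a few lines of Python. The network below is a hypothetical one, assumed for illustration only: two fully connected clusters (A-D and F-I) joined through node E. It is not a copy of any figure, and the function names `degree`, `distances_from` and `average_distance` are likewise illustrative.

```python
from collections import deque

# Hypothetical undirected network (an assumption, not taken from a figure):
# two fully connected clusters, A-D and F-I, joined only through node E.
ties = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"),
    ("F", "G"), ("F", "H"), ("F", "I"), ("G", "H"), ("G", "I"), ("H", "I"),
]

# Adjacency list: for each node, the set of nodes it is tied to.
neighbours = {}
for a, b in ties:
    neighbours.setdefault(a, set()).add(b)
    neighbours.setdefault(b, set()).add(a)


def degree(node):
    """Number of ties the node has."""
    return len(neighbours[node])


def distances_from(start):
    """Breadth-first search: ties on the shortest path from start to each node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def average_distance(node):
    """Average shortest distance to all other reachable nodes (lower = more central)."""
    dist = distances_from(node)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others)


for node in sorted(neighbours):
    print(node, degree(node), average_distance(node))
```

In this assumed network, E has the lowest degree (two) but also the lowest average distance to the other nodes (1.75, against 2.5 for A), so a closeness-based measure ranks it as the most central node even though the degree does not.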
When exploring the network as a whole, obvious descriptive indicators include the number of nodes and the number of ties in the network. More advanced indicators include density and average path length. The network density compares how many ties the network has to the number of ties it could have if ties existed between all pairs of nodes in the network. The average path length is the average length of the shortest paths between all pairs of nodes in the network, taking two nodes at a time.
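As a small worked illustration of density, the following Python sketch divides the observed number of ties by the number of possible ties. It assumes a network without self-ties and without repeated ties between the same pair of nodes; the function name `density` and its parameters are illustrative.

```python
def density(n_nodes, n_ties, directed=False):
    """Share of possible ties that actually exist (self-ties not counted)."""
    if n_nodes < 2:
        return 0.0  # no pair of nodes, so no tie is possible
    # Each node can be tied to every other node.
    possible = n_nodes * (n_nodes - 1)
    if not directed:
        # In an undirected network, A-B and B-A are the same tie.
        possible = possible / 2
    return n_ties / possible


print(density(4, 3))                 # undirected: 3 of 6 possible ties
print(density(4, 3, directed=True))  # directed: 3 of 12 possible ties
```

A network of four nodes can have at most six undirected ties, so three observed ties give a density of 0.5; read as a directed network, the same three ties give 0.25.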
These descriptive indicators must be calculated from the network data. For example, a node's degree is computed from the edge list using a gatherer-variable (see Example 4.1). The code goes through all ties (separated by line breaks, similar to Figure 4.3b) and checks whether the node whose degree we are counting is part of that tie. If it is, we increase the node's degree by one. The average path length is computed with a similar use of gatherer-variables (see Example 4.2). As the definition refers to the shortest paths between all pairs of nodes, we need to go through all nodes in the network using two nested for-structures. We then compute the shortest path between each pair of nodes and use these lengths to compute the average of all shortest paths. (Finding a shortest path is a non-trivial task, but it follows the idea described later in Section 4.4.1.)
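A minimal Python sketch of both computations is given here, under assumptions that may differ from Examples 4.1 and 4.2: the edge list is a hypothetical one with one tie per line and two node names separated by a space, the network is undirected, and shortest paths are found with a breadth-first search. All names (`edge_list`, `degree`, `shortest_path_length`, `average_path_length`) are illustrative.

```python
from collections import deque

# Hypothetical edge list (an assumption): one tie per line, nodes separated by a space.
edge_list = """A B
A C
B C
C D"""

ties = [tuple(line.split()) for line in edge_list.splitlines()]


def degree(node):
    """Count the ties the node takes part in, using a gatherer variable."""
    count = 0  # gatherer variable
    for a, b in ties:
        if node == a or node == b:
            count += 1
    return count


def shortest_path_length(start, goal):
    """Breadth-first search; returns None if goal cannot be reached."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return dist[current]
        for a, b in ties:
            # Follow the tie in either direction (undirected network).
            if current == a:
                nxt = b
            elif current == b:
                nxt = a
            else:
                continue
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return None


def average_path_length():
    """Average shortest path length over all connected pairs of distinct nodes."""
    nodes = sorted({node for tie in ties for node in tie})
    total = 0  # gatherer variable: sum of shortest path lengths
    pairs = 0  # gatherer variable: number of pairs counted
    for first in nodes:
        for second in nodes:
            if first == second:
                continue
            length = shortest_path_length(first, second)
            if length is not None:
                total += length
                pairs += 1
    return total / pairs


print(degree("C"))
print(average_path_length())
```

The two for-structures visit every ordered pair of nodes, so each undirected pair is counted twice; this does not change the average. Pairs with no path between them are skipped here, which is one possible convention for disconnected networks rather than the only one.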