Graph Distance Concepts
Hop Distance
The hop distance (or geodesic distance) between two nodes is the minimum number of edges in any path connecting them. It's the fundamental distance metric in unweighted graphs.
Shortest Path
The path with minimum hop distance between two nodes. Multiple shortest paths may exist. Key algorithms:
- BFS: For unweighted graphs
- Dijkstra: For weighted graphs with non-negative weights
- Floyd-Warshall: All-pairs shortest paths
Graph Diameter
The maximum shortest path length between any two nodes. Indicates the "width" of the graph.
Centrality Measures
Degree Centrality
C_degree(v) = degree(v) / (n-1)
- Measures: Direct connections
- Identifies: Highly connected hubs
- Use Case: Social influence, popular users
Closeness Centrality
C_closeness(v) = (n-1) / Σ d(v,u)
- Measures: Average distance to all nodes
- Identifies: Nodes that can quickly reach others
- Use Case: Facility location, information spread
Betweenness Centrality
C_betweenness(v) = Σ σ_st(v) / σ_st
- Measures: Fraction of shortest paths through node
- Identifies: Bridge nodes, bottlenecks
- Use Case: Critical infrastructure, gatekeepers
Eigenvector Centrality
x_v = (1/λ) Σ x_u for u ∈ N(v)
- Measures: Importance based on neighbor importance
- Identifies: Influential nodes in networks
- Use Case: PageRank, influence propagation
Path Types
Simple Path
No repeated vertices. Most paths in analysis are simple paths.
Cycle
Path that starts and ends at the same vertex. Indicates feedback loops.
Hamiltonian Path
Visits each vertex exactly once. NP-complete to find.
Eulerian Path
Uses each edge exactly once. Exists if graph has 0 or 2 odd-degree vertices.
Clustering Coefficient
Local Clustering Coefficient
CC(v) = 2e / (k(k-1))
Where:
- e = edges between v's neighbors
- k = degree of v
Measures how connected a node's neighborhood is.
Global Clustering Coefficient
Average of all local clustering coefficients. Indicates overall clustering tendency.
Transitivity
T = 3 × (number of triangles) / (number of connected triples)
Alternative global measure weighted by high-degree nodes.
Graph Properties
Connectivity
- Connected: Path exists between all node pairs
- k-Connected: Remains connected after removing k-1 nodes
- Components: Maximal connected subgraphs
Density
Density = 2|E| / (|V|(|V|-1))
Ratio of actual edges to possible edges.
Degree Distribution
- Regular: All nodes have same degree
- Power Law: P(k) ∼ k^(-γ), scale-free networks
- Poisson: Random graphs
Applications
Social Networks
- Identify influencers (high centrality)
- Detect communities (clustering)
- Predict information spread (paths)
Transportation
- Find critical intersections (betweenness)
- Optimize routes (shortest paths)
- Measure accessibility (closeness)
Biology
- Protein interaction networks
- Gene regulatory networks
- Neural connectivity
Web & Internet
- PageRank (eigenvector centrality)
- Network robustness (connectivity)
- Routing optimization
Computational Complexity
| Metric | Time Complexity |
|---|---|
| Degree Centrality | O(1) per node |
| BFS/DFS | O(V + E) |
| Shortest Path (Dijkstra) | O((V + E) log V) |
| All-Pairs Shortest Path | O(V³) |
| Betweenness Centrality | O(VE) |
| Clustering Coefficient | O(V × d²) |
Best Practices
- Choose Appropriate Centrality: Different measures capture different importance aspects
- Consider Graph Type: Directed vs undirected, weighted vs unweighted
- Normalize Measures: For comparing across different graphs
- Use Approximations: For large graphs (sampling, parallel algorithms)
- Combine Metrics: Multiple centralities give fuller picture
Related concepts
Learn adaptive tiling in vision transformers: dynamically partition images based on visual complexity to reduce token counts while preserving detail.
Learn batch normalization in deep learning: how normalizing layer inputs accelerates training, improves gradient flow, and acts as regularization.
BatchNorm normalizes over the batch and spatial axes; LayerNorm normalizes over the channel and spatial axes for each sample. The choice changes whether your model trains stably with batch=1, depends on batch composition at inference, and behaves consistently across train and eval.
How the Calinski-Harabasz index evaluates clustering quality by measuring the ratio of between-cluster to within-cluster variance — fast, intuitive, and ideal for k-selection with convex clusters.
Understanding complete, dimensional, and cluster collapse — the failure modes that every self-supervised method must prevent. Learn why collapse happens and how contrastive, asymmetric, regularization, and masking approaches solve it.
Understand contrastive loss for representation learning: interactive demos of InfoNCE, triplet loss, and embedding space clustering with temperature tuning.
