Graph Embeddings and Node2Vec

Summary: Learning low-dimensional vector representations of graphs through random walks, DeepWalk, Node2Vec, and skip-gram models

What are Graph Embeddings?

Graph embeddings transform nodes, edges, or entire graphs into dense vector representations that preserve structural and semantic properties. These embeddings enable machine learning on graph data by converting discrete graph structures into continuous vector spaces.

Random Walk Strategies

DeepWalk

Uniform Random Walks: Equal probability to all neighbors
Captures Neighborhood Proximity: Nodes that frequently co-occur on walks, typically nearby nodes in the same community, get similar embeddings
Simple and Scalable: Works well for large graphs
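
The uniform walk described above can be sketched in a few lines. This is a minimal illustration, not the reference DeepWalk implementation: the adjacency-dict graph format, the function names, and the toy graph are all assumptions made for the example.

```python
import random

def uniform_random_walk(adj, start, walk_length, rng):
    """Walk from `start`, picking each next node uniformly among neighbors."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk

def generate_walks(adj, num_walks, walk_length, seed=0):
    """Start `num_walks` walks from every node, shuffling the node order each pass."""
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(uniform_random_walk(adj, node, walk_length, rng))
    return walks

# Hypothetical toy graph: two triangles joined by the edge 2-3.
toy_graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
walks = generate_walks(toy_graph, num_walks=2, walk_length=5)
```

Each walk is a list of node ids that can be treated like a sentence when training skip-gram.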

Node2Vec

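Biased Random Walks: Second-order walks controlled by parameters p and q; the next step depends on both the current node and the previous node
Return Parameter (p): Controls likelihood of returning to the previous node (unnormalized weight 1/p)
- Low p: Encourages revisiting, local exploration
- High p: Discourages backtracking
In-Out Parameter (q): Controls whether the walk moves away from the previous node or stays near it (unnormalized weight 1/q for nodes not adjacent to the previous node)
- Low q: DFS-like behavior, explores outward across the graph
- High q: BFS-like behavior, stays local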
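
To make the roles of p and q concrete, the sketch below computes the unnormalized transition weights for one step on an unweighted graph: 1/p for returning to the previous node, 1 for a node that is also a neighbor of the previous node, and 1/q for anything farther away. The function names, the adjacency-dict format, and the toy graph are assumptions for illustration. Practical implementations usually precompute these probabilities (for example with alias sampling) rather than recomputing them at every step.

```python
import random

def node2vec_weights(adj, prev, curr, p, q):
    """Unnormalized weights for each neighbor of `curr`, given the walk came from `prev`."""
    prev_neighbors = set(adj[prev])
    weights = []
    for nxt in adj[curr]:
        if nxt == prev:               # step back to the previous node
            weights.append(1.0 / p)
        elif nxt in prev_neighbors:   # stay at distance 1 from the previous node
            weights.append(1.0)
        else:                         # move outward, distance 2 from the previous node
            weights.append(1.0 / q)
    return weights

def node2vec_walk(adj, start, walk_length, p, q, rng):
    """Second-order biased walk; the first step has no history and is uniform."""
    walk = [start]
    while len(walk) < walk_length:
        curr = walk[-1]
        neighbors = adj[curr]
        if not neighbors:
            break
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))
        else:
            weights = node2vec_weights(adj, walk[-2], curr, p, q)
            walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk

# Hypothetical toy graph: two triangles joined by the edge 2-3.
toy_graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
# Walk arrived at node 2 from node 1; neighbors of 2 are [0, 1, 3].
weights = node2vec_weights(toy_graph, prev=1, curr=2, p=0.5, q=2.0)
```

With p=0.5 and q=2.0, returning to node 1 gets the largest weight and crossing the bridge to node 3 gets the smallest, which is the local, BFS-like regime.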

Skip-Gram Training

The skip-gram model learns embeddings by predicting context nodes given a center node:

Generate Random Walks: Sample sequences of nodes
Create Training Pairs: (center node, context node) within window
Optimize Embedding: Maximize probability of observing context given center
Loss Function: Negative log-likelihood, approximated with negative sampling to avoid a softmax over all nodes
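
The pair-creation step and the negative-sampling objective above can be sketched as follows. This is a simplified, pure-Python illustration under stated assumptions: the function names are hypothetical, embeddings are plain lists of floats, and the loss is shown for a single (center, context) pair with already-sampled negatives. Real training would use an optimized library and update the vectors by gradient descent.

```python
import math

def training_pairs(walk, window_size):
    """All (center, context) pairs within `window_size` positions along one walk."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window_size)
        hi = min(len(walk), i + window_size + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sgns_loss(center_vec, context_vec, negative_vecs):
    """Negative-sampling loss for one pair:
    -log sigmoid(c . v) - sum over negatives of log sigmoid(-(n . v))."""
    positive_term = log_sigmoid(dot(center_vec, context_vec))
    negative_term = sum(log_sigmoid(-dot(center_vec, n)) for n in negative_vecs)
    return -(positive_term + negative_term)

pairs = training_pairs([0, 1, 2, 3], window_size=1)
```

Minimizing this loss pulls the center and context vectors together while pushing the center vector away from the sampled negatives.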

Key Properties Preserved

Homophily

Nodes that are close in the graph should have similar embeddings.

Structural Equivalence

Nodes with similar structural roles should have similar embeddings.

Community Structure

Nodes in the same community cluster together in embedding space.

Applications

Node Classification: Predict node labels using embeddings as features
Link Prediction: Score candidate edges by the similarity (e.g., dot product or cosine) between the two endpoint embeddings
Community Detection: Cluster nodes in embedding space
Graph Visualization: Use 2D/3D projections of embeddings
Recommendation Systems: Find similar items via embedding similarity
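
Several of these applications reduce to comparing embedding vectors. The sketch below ranks nodes by cosine similarity to a query node, which is one simple way to score candidate links or find similar items. The embeddings are made-up two-dimensional vectors and the function names are hypothetical; in practice the vectors would come from a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(embeddings, query, top_k=3):
    """Rank all other nodes by cosine similarity to `query`, highest first."""
    scores = [
        (node, cosine_similarity(embeddings[query], vec))
        for node, vec in embeddings.items()
        if node != query
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_k]

# Made-up embeddings for illustration only.
toy_embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
ranked = most_similar(toy_embeddings, "a", top_k=2)
```

Here node "b" ranks above node "c" for the query "a" because its vector points in nearly the same direction.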

Comparison of Methods

Method	Walk Strategy	Training	Pros	Cons
DeepWalk	Uniform	Skip-gram	Simple, scalable	No edge weights
Node2Vec	Biased (p,q)	Skip-gram	Flexible	More parameters
LINE	1st/2nd order	Direct	Preserves proximity	Memory intensive
GraphSAGE	Sampling	Aggregation	Inductive	Complex

Best Practices

Walk Parameters: Start with walk_length=80, num_walks=10
Window Size: Use window_size=10 for skip-gram
Embedding Dimension: Typically 64-256 dimensions
Node2Vec Tuning:
- Grid search p, q ∈ {0.25, 0.5, 1, 2, 4}
- p=1, q=1 gives DeepWalk behavior
Evaluation: Judge embeddings by downstream task performance (e.g., node classification accuracy or link prediction quality)
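
As a rough sketch of how these defaults and the p, q search might be organized, the snippet below builds one configuration per (p, q) combination and selects the best one with a caller-supplied scoring function. The candidate values {0.25, 0.5, 1, 2, 4}, the choice of 128 dimensions, the dictionary keys, and the `evaluate` callback are assumptions for illustration; the scoring function is expected to train embeddings with a given configuration and return a downstream-task score.

```python
from itertools import product

# Starting values from the guidance above; 128 is one assumed choice in the 64-256 range.
base_config = {
    "walk_length": 80,
    "num_walks": 10,
    "window_size": 10,
    "dimensions": 128,
}

# Assumed candidate values for p and q.
search_values = [0.25, 0.5, 1, 2, 4]

# One configuration per (p, q) combination.
grid = [dict(base_config, p=p, q=q) for p, q in product(search_values, repeat=2)]

def best_config(configs, evaluate):
    """Return the configuration with the highest score from `evaluate` (hypothetical callback)."""
    return max(configs, key=evaluate)
```

The grid contains the p=1, q=1 configuration, so the uniform-walk baseline is evaluated alongside the biased variants.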
