Contrastive Learning
Master contrastive learning for vector embeddings: how InfoNCE loss and self-supervised techniques train models to create high-quality semantic representations.
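For orientation, here is a minimal NumPy sketch of the InfoNCE objective over a batch of paired embeddings; the temperature value, array shapes, and toy data are illustrative assumptions, not a full training setup:

```python
import numpy as np

def info_nce_loss(queries, positives, temperature=0.07):
    """InfoNCE over a batch: each query's positive is the same-index row in
    `positives`; every other row in the batch serves as an in-batch negative."""
    # L2-normalize so dot products are cosine similarities
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    logits = q @ p.T / temperature          # (batch, batch) similarity matrix
    labels = np.arange(len(q))              # positives sit on the diagonal

    # cross-entropy per row: -log softmax(logits)[i, i]
    logits = logits - logits.max(axis=1, keepdims=True)            # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[labels, labels].mean()

# Toy batch: 4 query/positive pairs in an 8-dim embedding space
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
p = q + 0.1 * rng.normal(size=(4, 8))       # positives are noisy copies of queries
print(f"InfoNCE loss: {info_nce_loss(q, p):.4f}")
```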
Dense and sparse embeddings, quantization techniques, and advanced retrieval methods for semantic search.
Learn cross-lingual embedding alignment techniques like VecMap and MUSE for multilingual vector retrieval and zero-shot language transfer in search systems.
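As a rough illustration of the supervised alignment step used by methods in the VecMap and MUSE family, the orthogonal Procrustes solution maps source-language vectors onto target-language vectors from a seed dictionary; the toy data and dimensions below are assumptions:

```python
import numpy as np

def procrustes_alignment(X_src, Y_tgt):
    """Find the orthogonal W minimizing ||X_src @ W - Y_tgt||_F.
    X_src / Y_tgt: embeddings of seed-dictionary word pairs, shape (n_pairs, dim)."""
    # Closed-form orthogonal Procrustes: W = U V^T where U S V^T = svd(X^T Y)
    u, _, vt = np.linalg.svd(X_src.T @ Y_tgt)
    return u @ vt

# Toy seed dictionary: source vectors are a rotated, noisy copy of target vectors
rng = np.random.default_rng(1)
dim, n_pairs = 16, 200
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
Y = rng.normal(size=(n_pairs, dim))
X = Y @ true_rotation.T + 0.01 * rng.normal(size=(n_pairs, dim))

W = procrustes_alignment(X, Y)
print("mean alignment error:", np.linalg.norm(X @ W - Y, axis=1).mean())
```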
Domain adaptation for embeddings: transfer learning to fine-tune retrieval models across domains while preventing catastrophic forgetting.
Learn how binary embeddings use 1-bit quantization for ultra-compact vector representations, enabling billion-scale similarity search with 32x memory reduction.
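A minimal sketch of 1-bit quantization and Hamming-distance search, assuming NumPy and toy data; 128 float32 dimensions (512 bytes) become 16 packed bytes, which is where the 32x figure comes from:

```python
import numpy as np

def binarize(embeddings):
    """1-bit quantization: keep only the sign of each dimension."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)    # 8 dimensions per byte -> 32x smaller than float32

def hamming_distances(query_code, db_codes):
    """Hamming distance = number of differing bits between packed codes."""
    xor = np.bitwise_xor(db_codes, query_code)
    return np.unpackbits(xor, axis=1).sum(axis=1)

rng = np.random.default_rng(2)
db = rng.normal(size=(10_000, 128))     # 10k float vectors, 128 dims
codes = binarize(db)                    # 16 bytes per vector instead of 512

query = db[42] + 0.05 * rng.normal(size=128)
dists = hamming_distances(binarize(query[None, :]), codes)
print("nearest by Hamming distance:", np.argsort(dists)[:5])
```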
Build hybrid retrieval systems combining BM25 sparse search with dense vector embeddings using reciprocal rank fusion for superior semantic search performance.
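A minimal sketch of the reciprocal rank fusion step, assuming two toy result lists; the document IDs are illustrative, and k=60 is the commonly used constant:

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse multiple ranked result lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: BM25 and dense retrieval disagree; RRF rewards documents both rank well
bm25_results = ["doc3", "doc1", "doc7", "doc2"]
dense_results = ["doc1", "doc5", "doc3", "doc9"]
print(reciprocal_rank_fusion([bm25_results, dense_results]))
```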
Master the BM25 algorithm, the probabilistic ranking function powering Elasticsearch and Lucene for keyword-based document retrieval and search systems.
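For reference, a compact sketch of Okapi BM25 scoring over a tokenized toy corpus; the k1 and b defaults are the usual textbook values, and everything else here is illustrative:

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one document against a query with BM25 (Okapi variant)."""
    tf = Counter(doc_terms)
    avgdl = sum(len(d) for d in corpus) / len(corpus)
    N = len(corpus)
    score = 0.0
    for term in query_terms:
        n_t = sum(1 for d in corpus if term in d)             # document frequency
        idf = math.log((N - n_t + 0.5) / (n_t + 0.5) + 1)     # smoothed IDF
        freq = tf[term]
        score += idf * (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * len(doc_terms) / avgdl))
    return score

corpus = [
    "the cat sat on the mat".split(),
    "dogs and cats are pets".split(),
    "vector search ranks documents".split(),
]
for doc in corpus:
    print(f"{bm25_score('cat mat'.split(), doc, corpus):.3f}", " ".join(doc))
```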
Compare the major approximate nearest neighbor algorithms side by side: HNSW, IVF-PQ, LSH, Annoy, and ScaNN. Find the best approach for your use case.
Interactive visualization of HNSW, the graph-based algorithm that powers modern vector search with logarithmic complexity.
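As a toy illustration of the navigate-by-neighbors idea behind HNSW, here is a greedy walk over a single-layer proximity graph; real HNSW adds hierarchical layers, beam search, and incremental insertion heuristics, and everything below (graph construction, sizes) is an assumption for the demo:

```python
import numpy as np

def greedy_graph_search(vectors, neighbors, query, entry_point=0):
    """Greedy best-first walk over a proximity graph: hop to whichever neighbor
    is closer to the query until no neighbor improves the distance."""
    current = entry_point
    current_dist = np.linalg.norm(vectors[current] - query)
    while True:
        improved = False
        for nbr in neighbors[current]:
            d = np.linalg.norm(vectors[nbr] - query)
            if d < current_dist:
                current, current_dist, improved = nbr, d, True
        if not improved:
            return current, current_dist

# Build a brute-force k-NN graph over toy data (HNSW builds its graph incrementally, in layers)
rng = np.random.default_rng(3)
vectors = rng.normal(size=(500, 32))
dists = np.linalg.norm(vectors[:, None, :] - vectors[None, :, :], axis=-1)
neighbors = np.argsort(dists, axis=1)[:, 1:9]       # 8 neighbors per node, self excluded

query = rng.normal(size=32)
found, _ = greedy_graph_search(vectors, neighbors, query)
print("greedy result:", found,
      "true nearest:", np.argmin(np.linalg.norm(vectors - query, axis=1)))
```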
Explore the fundamental data structures powering vector databases: trees, graphs, hash tables, and hybrid approaches for efficient similarity search.
Learn how IVF-PQ combines clustering and compression to enable billion-scale vector search with minimal memory footprint.
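A compact, assumption-laden sketch of the two IVF-PQ stages (coarse inverted lists, then product quantization); production IVF-PQ typically encodes residuals against the coarse centroid and probes several lists, which this toy version omits:

```python
import numpy as np

def kmeans(data, k, iters=10, seed=0):
    """Plain Lloyd's k-means; enough for a toy demo."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(np.linalg.norm(data[:, None] - centroids[None], axis=-1), axis=1)
        for c in range(k):
            if np.any(assign == c):
                centroids[c] = data[assign == c].mean(axis=0)
    return centroids, assign

rng = np.random.default_rng(4)
vectors = rng.normal(size=(2000, 64)).astype(np.float32)

# Stage 1 (IVF): coarse clustering -> an inverted list of vector ids per centroid
n_lists = 16
coarse_centroids, coarse_assign = kmeans(vectors, n_lists)
inverted_lists = {c: np.where(coarse_assign == c)[0] for c in range(n_lists)}

# Stage 2 (PQ): split each vector into 8 subvectors of 8 dims, quantize each
# subvector to one of 256 codewords -> 8 bytes per vector instead of 256
n_sub, sub_dim, n_codes = 8, 8, 256
codebooks, codes = [], np.empty((len(vectors), n_sub), dtype=np.uint8)
for s in range(n_sub):
    sub = vectors[:, s * sub_dim:(s + 1) * sub_dim]
    cb, assign = kmeans(sub, n_codes, seed=s)
    codebooks.append(cb)
    codes[:, s] = assign

# Query time: probe only the closest coarse list, score candidates by summing
# per-subvector distances looked up from precomputed query-to-codeword tables
query = rng.normal(size=64).astype(np.float32)
probe = np.argmin(np.linalg.norm(coarse_centroids - query, axis=1))
candidates = inverted_lists[probe]
tables = [np.linalg.norm(codebooks[s] - query[s * sub_dim:(s + 1) * sub_dim], axis=1) ** 2
          for s in range(n_sub)]
approx_dists = sum(tables[s][codes[candidates, s]] for s in range(n_sub))
print("top candidates in probed list:", candidates[np.argsort(approx_dists)[:5]])
```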
Explore how LSH uses probabilistic hash functions to find similar vectors in sub-linear time, making it well suited to streaming and high-dimensional data.
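A minimal random-hyperplane LSH sketch for cosine similarity, assuming NumPy and toy data; real systems use several hash tables (and often multi-probing) to boost recall, which this single-table demo leaves out:

```python
import numpy as np

def lsh_signatures(vectors, hyperplanes):
    """Sign of the projection onto each random hyperplane gives one hash bit;
    vectors with a small cosine angle tend to share many bits."""
    return (vectors @ hyperplanes.T > 0).astype(np.uint8)

rng = np.random.default_rng(5)
dim, n_bits = 64, 16
hyperplanes = rng.normal(size=(n_bits, dim))

db = rng.normal(size=(5000, dim))
buckets = {}
for i, sig in enumerate(lsh_signatures(db, hyperplanes)):
    buckets.setdefault(sig.tobytes(), []).append(i)     # bucket key = the 16-bit signature

query = db[7] + 0.01 * rng.normal(size=dim)             # near-duplicate of vector 7
key = lsh_signatures(query[None], hyperplanes)[0].tobytes()
print("candidates sharing the query's bucket:", buckets.get(key, []))
```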
Master vector compression techniques from scalar to product quantization. Learn how to reduce memory usage by 10-100× while preserving search quality.
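On the scalar end of that spectrum, a minimal per-dimension 8-bit quantization sketch (the 4x reduction case); the data and min-max calibration scheme here are illustrative assumptions:

```python
import numpy as np

def scalar_quantize(vectors):
    """Per-dimension affine quantization of float32 to 8-bit codes (4x smaller)."""
    lo, hi = vectors.min(axis=0), vectors.max(axis=0)
    scale = (hi - lo) / 255.0
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(6)
db = rng.normal(size=(1000, 128)).astype(np.float32)

codes, lo, scale = scalar_quantize(db)
recon = dequantize(codes, lo, scale)
print("max absolute reconstruction error:", np.abs(db - recon).max())
```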
Understand the fundamental differences between independent (bi-encoder) and joint (cross-encoder) encoding architectures for neural retrieval systems.
Interactive visualization of high-dimensional vector spaces, word relationships, and semantic arithmetic operations.
Matryoshka embeddings: nested representations that can be truncated to smaller dimensions without retraining, enabling flexible retrieval at multiple cost points.
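The truncation step itself is trivial, as this sketch shows; the toy vectors below are random stand-ins, whereas a Matryoshka-trained model concentrates the signal in the leading dimensions so that similarity survives aggressive truncation:

```python
import numpy as np

def truncate(vec, dim):
    """Matryoshka-style reduction: keep the first `dim` coordinates, renormalize."""
    v = vec[:dim]
    return v / np.linalg.norm(v)

rng = np.random.default_rng(7)
doc = rng.normal(size=768)
query = doc + 0.3 * rng.normal(size=768)    # a vector "close" to doc, standing in for a relevant query

for dim in (768, 256, 64):
    cos = float(truncate(query, dim) @ truncate(doc, dim))
    print(dim, "dims -> cosine:", round(cos, 3))
```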
Explore ColBERT and other multi-vector retrieval models that use fine-grained token-level matching for superior search quality.
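The scoring rule behind that token-level matching is ColBERT's MaxSim late interaction, sketched below with random stand-in token embeddings (a real system would produce them with a trained encoder):

```python
import numpy as np

def maxsim_score(query_tokens, doc_tokens):
    """ColBERT-style late interaction: for each query token embedding, take its
    maximum similarity over all document token embeddings, then sum."""
    q = query_tokens / np.linalg.norm(query_tokens, axis=1, keepdims=True)
    d = doc_tokens / np.linalg.norm(doc_tokens, axis=1, keepdims=True)
    sims = q @ d.T                      # (n_query_tokens, n_doc_tokens)
    return sims.max(axis=1).sum()       # MaxSim per query token, summed

rng = np.random.default_rng(8)
query = rng.normal(size=(5, 128))                   # 5 query token embeddings
doc_a = np.vstack([query + 0.1 * rng.normal(size=(5, 128)),
                   rng.normal(size=(20, 128))])     # has a near-match for every query token
doc_b = rng.normal(size=(25, 128))                  # unrelated tokens only
print("doc_a:", round(float(maxsim_score(query, doc_a)), 3),
      " doc_b:", round(float(maxsim_score(query, doc_b)), 3))
```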
Embedding quantization simulator: explore memory-accuracy trade-offs from float32 to int8 and binary representations for retrieval.
Compare lexical (BM25/TF-IDF) and semantic (BERT) retrieval approaches, understanding their trade-offs and hybrid strategies.