The E2EE Search Paradox: Implementing Blind Indexing
You have successfully built an End-to-End Encrypted (E2EE) application. Your user's data is scrambled on their device using AES-GCM before it ever touches your server. You have achieved Zero-Knowledge.
And then, the inevitable feature request arrives: "I need a search bar."
The E2EE Paradox
How do you run a SQL query like SELECT * FROM notes WHERE text LIKE '%query%' when the server only sees ciphertext that is indistinguishable from random bytes? You can't.
Historically, developers fell back to two terrible solutions:
- The Performance Killer: Download the entire encrypted database to the client, decrypt it all in memory, and search it locally. This wastes bandwidth and can exhaust memory on mobile browsers.
- The Security Compromise: Send the search query to the server, decrypt the data temporarily in RAM to search it, and send the results back. This completely violates the Zero-Knowledge promise.
The Solution: Blind Indexing
Blind Indexing (a simple form of Searchable Symmetric Encryption) allows the server to look up records by exact keyword match without knowing what those records contain.
Instead of just encrypting the payload, the client extracts keywords, creates cryptographic hashes (HMACs) of those keywords using a separate, secret Index_Key, and sends those hashes to the server alongside the encrypted payload.
Implementation in JavaScript (Web Crypto API)
To do this securely, we cannot use a plain SHA-256 hash. If we did, the server could hash a dictionary of common words itself and match the results against the stored values (a dictionary or rainbow table attack). We must use an HMAC (Hash-based Message Authentication Code), which requires a key the server does not have.
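To make the risk concrete, here is a minimal sketch of what a curious server could do if the index held unkeyed SHA-256 hashes. The names `sha256Hex` and `dictionaryAttack`, and the wordlist, are hypothetical and exist only for illustration.

```javascript
// Web Crypto is a global in browsers; the fallback covers older Node.js
// versions when this sketch is run outside a browser.
const subtle = (globalThis.crypto ?? require('node:crypto').webcrypto).subtle;

// Hex-encode an ArrayBuffer.
function toHex(buffer) {
  return Array.from(new Uint8Array(buffer))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');
}

// Unkeyed hash: anyone, including the server, can compute this.
async function sha256Hex(word) {
  const digest = await subtle.digest('SHA-256', new TextEncoder().encode(word));
  return toHex(digest);
}

// Hash every candidate word and compare it against a stored index value.
// Returns the recovered keyword, or null if the wordlist does not contain it.
async function dictionaryAttack(storedHash, wordlist) {
  for (const word of wordlist) {
    if ((await sha256Hex(word)) === storedHash) return word;
  }
  return null;
}
```

With an HMAC the same loop gets nowhere: without the Index_Key, the server cannot compute candidate tokens to compare against.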
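// Step 1: Derive a secondary key just for indexing.
// Do NOT use your AES encryption key for this!
async function deriveIndexKey(masterPassword) {
  const enc = new TextEncoder();
  // Import the password as raw key material for PBKDF2
  const keyMaterial = await window.crypto.subtle.importKey(
    "raw",
    enc.encode(masterPassword),
    { name: "PBKDF2" },
    false, // not extractable
    ["deriveKey"]
  );
  // We derive an HMAC key, not an AES key
  return await window.crypto.subtle.deriveKey(
    {
      name: "PBKDF2",
      // A fixed, purpose-specific salt keeps the index key reproducible on
      // every device. Mixing in a per-user value (such as the user ID) would
      // stop two users with the same password from deriving the same key.
      salt: enc.encode("static_blind_index_salt"),
      // OWASP's guidance for PBKDF2-HMAC-SHA256 at the time of writing
      iterations: 600000,
      hash: "SHA-256"
    },
    keyMaterial,
    { name: "HMAC", hash: "SHA-256", length: 256 },
    false, // the derived key can sign but cannot be exported
    ["sign"]
  );
}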
// Step 2: Generate the Blind Index token for a keyword
async function generateBlindIndex(keyword, hmacKey) {
  // Normalize data: Unicode-normalize, lowercase, remove punctuation, trim.
  // \p{L} and \p{N} keep letters and digits in any script (\w is ASCII-only
  // and would silently drop characters such as "é").
  const normalizedWord = keyword
    .normalize("NFKC")
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, "")
    .trim();
  const signatureBuffer = await window.crypto.subtle.sign(
    "HMAC",
    hmacKey,
    new TextEncoder().encode(normalizedWord)
  );
  // Convert the buffer to a hex string for database storage
  const hashArray = Array.from(new Uint8Array(signatureBuffer));
  return hashArray.map(b => b.toString(16).padStart(2, '0')).join('');
}
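The functions above handle a single keyword, while a real note contains many. The following is a rough sketch of how a client might turn a whole note into a set of tokens. `extractKeywords` and `indexNote` are hypothetical names, and the word-splitting rule (anything that is not a letter or digit separates words) is an assumption you may want to adapt, for example by filtering stop words.

```javascript
// Split free text into unique, normalized keywords.
// Assumption: any run of characters that is not a letter or digit
// separates two words.
function extractKeywords(text) {
  const words = text
    .normalize('NFKC')
    .toLowerCase()
    .split(/[^\p{L}\p{N}]+/u)
    .filter(word => word.length > 0);
  // A Set removes duplicates while keeping first-seen order.
  return [...new Set(words)];
}

// Map every keyword to its blind index token.
// `toToken` is any async function (keyword) => hex string.
async function indexNote(text, toToken) {
  const keywords = extractKeywords(text);
  return Promise.all(keywords.map(keyword => toToken(keyword)));
}
```

A call might look like `indexNote(noteText, k => generateBlindIndex(k, hmacKey))`, with the resulting array sent to the server alongside the encrypted blob.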
How the Database Schema Looks
Your server database now stores two things: the encrypted payload (AES-GCM ciphertext) and an array of blind index hashes.
-- Supabase / PostgreSQL Schema
CREATE TABLE encrypted_notes (
  id UUID PRIMARY KEY,
  aes_encrypted_blob TEXT NOT NULL,
  -- Store the blind indices in a JSONB array or secondary table
  blind_indices JSONB NOT NULL
);
-- A GIN index on the JSONB column lets containment lookups (@>, ?)
-- use the index instead of scanning every row
CREATE INDEX idx_blind_indices ON encrypted_notes USING GIN (blind_indices);
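As an illustration of a lookup against this schema, the sketch below builds a parameterized query on a JavaScript backend. `buildSearchQuery` is a hypothetical helper, and the `{ text, values }` shape is an assumption that matches the query config accepted by the node-postgres client; adapt it to whichever driver you use. The `@>` containment operator is one of the operators the default JSONB GIN index supports.

```javascript
// A blind index token is the hex form of an HMAC-SHA-256 output,
// so it is always 64 lowercase hex characters.
const TOKEN_PATTERN = /^[0-9a-f]{64}$/;

// Build a parameterized query that finds every note whose blind_indices
// array contains the given token.
function buildSearchQuery(token) {
  if (!TOKEN_PATTERN.test(token)) {
    throw new Error('Invalid blind index token');
  }
  return {
    text:
      'SELECT id, aes_encrypted_blob FROM encrypted_notes ' +
      'WHERE blind_indices @> $1::jsonb',
    // The parameter is a one-element JSON array, e.g. '["ab12..."]'.
    values: [JSON.stringify([token])],
  };
}
```

Rejecting anything that is not a well-formed token up front also means a client cannot send a plaintext keyword to this endpoint by mistake.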
Executing the Search
When the user types "finance" into the search bar, the client-side JavaScript normalizes the word and hashes it using the secret hmacKey.
The browser then sends a request to the server:
GET /api/search?q=a8f5f167f44f4964e6c998dee827110c
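The server simply looks for that exact hex string in the blind_indices column and returns the matching encrypted blobs. (The token above is shortened for readability; a real HMAC-SHA-256 token is 64 hex characters.) The server never learns that the user searched for "finance". It only sees an opaque token. Because tokens are deterministic, the server can still observe which records share a token and how often a token is queried, so blind indexing hides the keywords themselves, not the access pattern.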
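The client side of this flow can be sketched as follows. `searchByKeyword` is a hypothetical helper, the JSON response shape (an array of `{ id, aes_encrypted_blob }` objects) is an assumption about the `/api/search` endpoint, and `toToken` stands for any function that turns a keyword into a token, such as `k => generateBlindIndex(k, hmacKey)`.

```javascript
// Turn a keyword into a token, ask the server for matches, and return the
// still-encrypted results. `fetchFn` is injectable so the function can be
// exercised without a network.
async function searchByKeyword(keyword, toToken, fetchFn = globalThis.fetch) {
  // Only the token leaves the device, never the keyword itself.
  const token = await toToken(keyword);
  const response = await fetchFn(`/api/search?q=${encodeURIComponent(token)}`);
  if (!response.ok) {
    throw new Error(`Search failed with status ${response.status}`);
  }
  // Assumed response shape: [{ id, aes_encrypted_blob }, ...]
  return response.json();
}
```

The returned blobs still have to be decrypted on the client with the AES key, which the server never sees.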
Secure Your Payloads with ZeroKey
Building E2EE architecture is incredibly complex. If you just need to share sensitive data securely without the headache, use our zero-knowledge encryptor.