lsh

Extension for locality-sensitive hashing (LSH)

Maintainer(s): yoonspark, ericmanning

Installing and Loading

INSTALL lsh FROM community;
LOAD lsh;

Example

-- Create toy data
CREATE TEMPORARY TABLE temp_names AS
SELECT * FROM (
    VALUES
        ('Alice Johnson'),
        ('Robert Smith'),
        (NULL),
        ('Charlotte Brown')
) AS t(name);

-- Apply MinHash
SELECT lsh_min(name, 2, 3, 2, 123) AS hash FROM temp_names;
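To make "band hashes based on a MinHash signature" concrete, here is a minimal Python sketch of the general technique. It is not the extension's implementation. The function names, the parameter names (`ngram_width`, `band_count`, `band_size`, `seed`), the choice of BLAKE2b as the hash, and the handling of NULL and empty input are all assumptions made for illustration, and the parameters are not claimed to correspond to the positional arguments of `lsh_min`.

```python
import hashlib


def char_shingles(text, ngram_width):
    """Return the set of character n-grams (shingles) of the given width."""
    if len(text) < ngram_width:
        # Assumption: a short non-empty string is treated as a single shingle.
        return {text} if text else set()
    return {text[i:i + ngram_width] for i in range(len(text) - ngram_width + 1)}


def hash64(data, seed):
    """Seeded 64-bit hash (BLAKE2b is an illustrative choice, not the extension's)."""
    digest = hashlib.blake2b(
        data.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_band_hashes(text, ngram_width, band_count, band_size, seed):
    """Compute one 64-bit hash per band of a MinHash signature."""
    if text is None:
        return None  # assumption: NULL input yields NULL output
    grams = char_shingles(text, ngram_width)
    if not grams:
        return []  # assumption: nothing to hash for an empty string

    # MinHash signature: for each of band_count * band_size hash functions,
    # keep the minimum hash value over all shingles.
    signature = [
        min(hash64(gram, seed + i) for gram in grams)
        for i in range(band_count * band_size)
    ]

    # Banding: split the signature into consecutive groups of band_size
    # values and hash each group, tagged with its band index.
    band_hashes = []
    for band in range(band_count):
        rows = signature[band * band_size:(band + 1) * band_size]
        payload = ",".join(str(value) for value in rows)
        band_hashes.append(hash64(f"{band}:{payload}", seed))
    return band_hashes


hashes = minhash_band_hashes("Alice Johnson", 2, 3, 2, 123)
print(len(hashes))  # one hash per band
```

In a scheme like this, two inputs become a candidate pair when they share a hash at the same band position, so similar strings can be found by joining on band hashes instead of comparing every pair.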

About lsh

For more information regarding usage, see the documentation.

Added Functions

| function_name | function_type | description | comment | examples |
|---|---|---|---|---|
| lsh_min | scalar | Computes band hashes for each input string (or list of existing shingles) based on its MinHash signature | Produces list of 64-bit band hashes | NULL |
| lsh_min32 | scalar | Computes band hashes for each input string (or list of existing shingles) based on its MinHash signature | Reduces each band hash to 32 bits | NULL |
| lsh_euclidean | scalar | Computes band hashes for each input point based on its Euclidean LSH signature | Produces list of 64-bit band hashes | NULL |
| lsh_euclidean32 | scalar | Computes band hashes for each input point based on its Euclidean LSH signature | Reduces each band hash to 32 bits | NULL |
| lsh_jaccard | scalar | Computes Jaccard similarity for each input string pair | Accepts ngram argument, unlike core Jaccard function | NULL |
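
The n-gram variant of Jaccard similarity described for `lsh_jaccard` can be sketched in Python as follows. This is a conceptual illustration only. The function names, the argument order, and the behavior for NULL or too-short inputs are assumptions and may differ from the extension.

```python
def ngram_set(text, n):
    """Return the set of character n-grams of width n (empty if text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard_ngrams(a, b, n):
    """Jaccard similarity |A ∩ B| / |A ∪ B| over character n-gram sets."""
    if a is None or b is None:
        return None  # assumption: NULL input yields NULL output
    grams_a, grams_b = ngram_set(a, n), ngram_set(b, n)
    union = grams_a | grams_b
    if not union:
        return None  # assumption: undefined when neither string has an n-gram
    return len(grams_a & grams_b) / len(union)


# "night" -> {ni, ig, gh, ht}; "nacht" -> {na, ac, ch, ht}
# They share 1 bigram out of 7 distinct bigrams.
print(jaccard_ngrams("night", "nacht", 2))  # 1/7 ≈ 0.1429
```

Larger values of `n` make the measure stricter, because longer n-grams are less likely to be shared by two strings that differ in a few characters.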