Mathematical notation for Hash Tables

1.8k Views Asked by At

A matrix is nice for modelling arrays, eg

const A = [1, 2, 3, 4]
const a1 = A[0]
const a2 = A[1]
const a3 = A[2]
const a4 = A[3]

can be modelled with $$ A = \begin{bmatrix} 1\\ 2\\ 3\\ 4 \end{bmatrix} $$

$$ a1 = A_1\\ a2 = A_2\\ a3 = A_3\\ a4 = A_4 $$

I am curious if there is some similar construct to this for modelling hash maps.

const A = {
    "one": 1,
    "two": 2,
    "three": 3,
    "four": 4
}
const A_one = A["one"]
const A_two = A["two"]
const A_three = A["three"]
const A_four = A["four"]

in a such a way that

$$ a1 = A_{"one"} \\ a2 = A_{"two"} \\ a3 = A_{"three"}\\ a4 = A_{"four"} $$

I was considering augmenting a matrix, but it's not standard as far as I know.

$$\left[ \begin{array}{c|c} "one"&1\\ "two"&2\\ "three"&3\\ "four"&4 \end{array} \right]$$

I know that hash tables are not matrices; as matrices are ordered, hash tables are unordered, and matrices have a bunch of properties/operations that do not map directly 1:1 to hash tables.

I know that one way a hash tables can be constructed is given a bunch of key, value pairs a input

eg.

$$ \left[ \begin{array}{c@{}c@{}} \left[ \begin{array}{cc} "one"\\ 1 \end{array} \right] \\ \left[ \begin{array}{cc} "two"\\ 2 \end{array} \right] \\ \left[ \begin{array}{cc} "three"\\ 3 \end{array} \right] \\ \left[ \begin{array}{cc} "four"\\ 4 \end{array} \right] \end{array} \right] $$

But an associator function to make this a proper map is still needed.

Surely there must be some existing way of representing hash tables/"spare" arrays without inventing my own notation for them, as maps are more natural in mathematics than matrices, as a matrix can be represented as an ordered map, but a maps can't be represented by matrices because the matrix construct cannot "skip" rows, and cannot have noninteger indexing.

Note:

It seems that unordered map can be represented by discontinuous functions, and ordered maps can be represented as a product of a function and an ordered key vector; eg

const f = x => 
    x == "one" ? 1 :
    x == "two" ? 2 :
    x == "three" ? 3 :
    x == "four" ? 4 
    : undefined
    ;

const keys = ["one", "two", "three", "four"];
const m = keys.map(x => [x, f(x)]);

// [ [ 'one', 1 ], [ 'two', 2 ], [ 'three', 3 ], [ 'four', 4 ] ]
console.log(m);

The above is effectively the same as

const f = {
    "one": 1,
    "two": 2,
    "three": 3,
    "four": 4
};
const m = Object.keys(f).map(x => [x, f[x]]);

// [ [ 'one', 1 ], [ 'two', 2 ], [ 'three', 3 ], [ 'four', 4 ] ]
console.log(m);

and they can be composed naturally to add entries

'use strict';

const UnorderedMap = (f, domain) => {
    let res;
    res = x => res[x];
    domain.forEach(x => res[x] = f(x));
    res['domain'] = domain;
    res.composeWith = (f2, domain2) => {
        var domainSet = new Set();
        domain.forEach(x => domainSet.add(x));
        domain2.forEach(x => domainSet.add(x));

        return UnorderedMap((x => {        
            return res(x) !== undefined ? res(x) : f2(x);
        }), domainSet);
    };

    return res;
};    

let myMap = UnorderedMap((x => x === "zero" ? 0 : undefined), new Set(["zero"]));
myMap = myMap.composeWith((x => x === "one" ? 1 : undefined), new Set(["one"]));
myMap = myMap.composeWith((x => x === "two" ? 2 : undefined), new Set(["two"])); 
myMap = myMap.composeWith((x => x === "three" ? 3 : undefined), new Set(["three"])); 

console.log(myMap("zero")); // 0
console.log(myMap["zero"]); // 0
console.log(myMap("one")); // 1
console.log(myMap["one"]); // 1
console.log(myMap("two")); // 2
console.log(myMap["two"]); // 2
console.log(myMap("three")); // 3
console.log(myMap["three"]); // 3
console.log(myMap["domain"]); // Set { 'zero', 'one', 'two', 'three' } 
1

There are 1 best solutions below

1
On BEST ANSWER

If $V$ is a set of values and $K$ is a set of keys, then a hash is just a surjective map $h : V \to K$. If $k \in K$ is a key, then the subset $h^{-1} (k)$ is made up of all the values associated with the key $k$.

Notice that if $h$ is injective, then (since it is assumed surjective) it is bijective, therefore invertible, which defeats the practical purpose of a hash. Therefore, a hash function must be "as non-injective as practically possible". Notice, though, that a delicate balance must be attained, because if a hash function is "too non-injective", then it produces too many collisions (if $h(k) = h(k')$ the it is said that $k$ and $k'$ collide under $h$).

In computer science one usually takes $V$ and $K$ to be finite, and the cardinal of $K$ to be significantly smaller than the one of $V$. One also introduces some topology on $V$ and $K$ (such as the order topology) and requires $h$ to be "very discontinuous", in order to make approximation of zeros difficult (i.e. finding approximate solutions to the equation $f(v) = k$), i.e. making the hash function difficult to break.