This question arose from an informatics problem, but I do believe Math SE is the right stack to ask because I am not asking for a algorithm in a specific language but for properties to check using informatics.
Note: I am not used to discrete mathematics and relations notation. I do my best to make it understandable, logic and use the right terminology. Please do not hesitate to rephrase my post in order to make it consistent.
I have data which are always associated to a status word of 8 bytes (bigint, 64 bits), they are stored into a table $X$. What I can say from those bits:
- Some of them have a meaning, others do not (unknown or not used);
- Some groups of bits are mutually exclusive, others are compatible;
- Some bits are redundant (they contain similar informations).
I have created a set of rules that maps those status words to another subset of status called Quality Code (char, 8 bits, 1 byte) using a masked comparison technique. The set of rules registers a final Quality Code ($s$) in $\mathbb{B}^{8}$ based on a binary mask ($m$) and a binary value ($v$) coming from $\mathbb{B}^{64}$. Each rule is in the form:
$$ (m,v,q) \in \mathbb{B}^{64}\times \mathbb{B}^{64}\times \mathbb{B}^{8} $$
And are stored into a table $Q$.
As far as I understand my operation, I am defining a relation that generates classes, and I am defining it in the following way:
The status word $s$ is in relation with quality code $q_i$ if, when masked by $m_i$ ($\odot$ stands for bit-wise and operator), it equals $v_i$. I think in mathematics it should be:
$$ s \,R\, q_i \equiv (s,q_i)\in\mathbb{B}^{64}\times \mathbb{B}^{8} \quad|\quad (s \odot m_i) = v_i $$
Technically the operation is performed by a bit-wise JOIN, using Standard SQL and the result is stored into a table called $J$:
CREATE TABLE J AS (SELECT * FROM X JOIN Q ON ((X.s & Q.m)=Q.v));
Where tables $X$ and $Q$ are defined as:
CREATE TABLE X(t TIMESTAMP, x FLOAT, s BIGINT);
CREATE TABLE Q(q CHAR(1), m BIGINT, v BIGINT);
Then I have investigated what behaviour this mapping could have, and I have reached the following conclusions. For a given set of rules in $Q$, three cases are possible:
- The relation is not injective, then not all status words $s$ are in relation with at least one quality code $q_i$ (sub-partitioning). Table $X$ contains more records than table $J$, some records are not mapped, leads to missing data;
- The relation is injective and single-valued, then all status words $s$ are associated to exactly one quality code $q_i$ (exact partitioning). Table $X$ contains the same amount of records than table $J$, all records are mapped, no data loss, no duplicate.;
- The relation is injective and multi-valued, then all status words $s$ are associated to one or more quality codes $q_i$ (over partitioning, there is overlaps between classes). Table $X$ contains less records than table $J$, there are duplicate records with different quality codes.
Computational Example:
For the sake of clarity we will play in $\mathbb{B}^8 \times \mathbb{B}^8$
We have two status words:
s1 = 1000 1011
s2 = 1100 0011
And a set of three rules:
(m1,v1,q1) = (0000 1111, 0000 1011, 'A')
(m2,v2,q2) = (0000 0111, 0000 0011, 'B')
(m3,v3,q3) = (0000 1111, 0000 0011, 'C')
Matching relation:
s1 = 1000 1011
m1 = 0000 1111
---------
s1 * m1 = 0000 1011
=
v1 = 0000 1011
It means $s_1$ is in relation with $q_1$.
Not matching relation:
s1 = 1000 1011
m3 = 0000 1111
---------
s1 * m3 = 0000 1011
!
v3 = 0000 0011
It means $s_1$ is not in relation with $q_3$.
Join results, table $J$ contains the following couples:
$$x \quad R\quad q = \left\{ (s_1,q_1), (s_1,q_2), (s_2,q_2), (s_2,q_3) \right\}$$
This example shows an over-partitioning jointure.
I hope I have been clear enough, please comment if you need clarifications.
I would like to know: How can I verify that a set of rules leads to an exact partitioning? I am asking for a solution involving records in $Q$ (mask $m_i$ and value $v_i$) only. I do not want to perform my check by comparing the amount of lines between data table $X$ and jointure table $J$.