I'm studying computational learning theory and trying to wrap my head around a statement I stumbled upon:
A $k$-CNF function is a function $f : \{0, 1\}^n \to \{0, 1\}$ of the form $f(x_1, \ldots, x_n) = C_1 \wedge C_2 \wedge \ldots \wedge C_t$, where each clause $C_i$ is of the form $\bigvee_{j=1}^{k} \ell_j$ and the $\ell_j$'s are literals (to get fewer than $k$ literals, the same literal can appear more than once).
Later, it is stated that, assuming $k$ is constant, learning $k$-CNF can be reduced to learning monomials.
I assume it can be done with De Morgan's laws.
My question is: am I correct, and how can it be proved?
EDIT: clarification about the reduction.
This learning problem can be reduced to learning monomials; that is, you can use the monomial learning algorithm (as is) to solve this problem.
Suppose you want to learn a $2$-CNF over features $x_1, x_2, x_3, x_4$. There are quadratically many clauses with $2$ literals:
$$ (x_1 \vee x_1), (x_1 \vee \neg x_1), (\neg x_1 \vee \neg x_1), (x_1 \vee x_2), \ldots, (\neg x_4 \vee \neg x_4) \enspace. $$
Some of these clauses are tautologies (e.g., $x_1 \vee \neg x_1$) and you need at most one of them, but the total number is still quadratic. More generally, for $k$-CNF, the number of clauses is bounded by $(2n)^k$, where $n$ is the number of features. The factor of $2$ accounts for each variable contributing a positive and a negative literal.
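To make the counting concrete, here is a small sketch (variable names are mine) that enumerates all candidate $k$-clauses as ordered tuples of literals, matching the $(2n)^k$ bound above:

```python
from itertools import product

n, k = 4, 2
# The 2n literals: each variable index paired with a sign (True = positive).
literals = [(i, sign) for i in range(1, n + 1) for sign in (True, False)]

# Every ordered k-tuple of literals is a candidate clause (repeats allowed),
# which is exactly the (2n)^k count; duplicates and tautologies are included,
# so the true number of distinct clauses is somewhat smaller.
clauses = list(product(literals, repeat=k))
print(len(clauses))  # (2 * 4)^2 = 64
```

Deduplicating (a clause is really a *set* of literals) would shrink the list, but the polynomial bound is all that matters for the argument.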
Introduce a new variable for each possible clause. A simple way to name such variables is to use the clause they correspond to as a subscript:
$$ x_1 \vee \neg x_3 \text{ corresponds to } y_{x_1 \vee \neg x_3} \enspace. $$
This notation makes it clear that we have a bijection.
Then it is a simple matter to associate a monomial in the $y$ variables to a given $k$-CNF formula. For example,
$$ (x_2 \vee \neg x_3) \wedge (x_4 \vee x_4) $$
corresponds to
$$ y_{x_2 \vee \neg x_3} \wedge y_{x_4 \vee x_4} \enspace. $$
Once a monomial in the $y$ variables is learned, the corresponding $k$-CNF is derived by replacing each $y$ variable with the corresponding clause.
In the context of PAC learning, the size of the hypothesis space $H$ determines whether a concept class is considered learnable. Specifically, $\log |H|$ has to be polynomial in $n$. For monomials over $n$ variables, $|H| = 3^n$ (each variable appears positively, negatively, or not at all), so that $\log|H| = n \log 3$ and all is well. For $k$-CNF, we get
$$ \log|H| = \log 3^{(2n)^k} = (2n)^k \log 3 \enspace. $$
This is where the fixed-$k$ assumption comes in, because for fixed $k$, $(2n)^k \log 3$ is a polynomial.
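A quick numeric sanity check of that bound (function names are mine, just spelling out the formulas above):

```python
import math

def log_H_monomial(n):
    # |H| = 3^n for monomials over n variables
    return n * math.log(3)

def log_H_kcnf(n, k):
    # After the reduction: monomials over (2n)^k y variables, |H| = 3^{(2n)^k}
    return (2 * n) ** k * math.log(3)

# For fixed k, the k-CNF bound is polynomial in n (degree k):
print(log_H_kcnf(10, 2))  # (20)^2 * log 3, roughly 439
# If k were allowed to grow with n, (2n)^k would be super-polynomial
# and the log |H| condition would fail.
```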