Introduce the function $g(n)$ defined for positive integer arguments $n$ as the count of squares of positive integers among the numbers that can be formed by taking some subsequence of the digits of the binary representation of $n$ and interpreting this subsequence as a binary number. All subsequences contribute.
For example, $5 = (101)_2$ and the subsequences of digits are $$ (1)_2 = 1, (0)_2 = 0, (1)_2 = 1, (10)_2 = 2, (11)_2 = 3, (01)_1 = 1, (101)_2 = 5$$ and this includes the square "one" three times, so $g(5) = 3.$
When $n = 9 = (1001)_2$, we obtain the following subsequences: $$ (1)_2 = 1, (0)_2 = 0, (0)_2 = 0, (1)_2 = 1, (10)_2 = 2, (10)_2 = 2, (11)_2 = 3, (00)_2 = 0, (01)_2 = 1, (01)_2 = 1$$ and $$ (100)_2 = 4, (101)_2 = 5, (101)_2 = 5, (001)_2 = 1, (1001)_2 = 9$$ and this includes seven squares, so $g(9) = 7.$
The behavior of $g(n)$ is highly erratic, ranging from linear for $n=2^k$ to logarithmic for $n=2^k-1,$ which motivates us to ask the following question. What is the first term of the asymptotic expansion of the average order of $g(n)$? I.e. find the asymptotics of $$ \frac{1}{n} \sum_{k=1}^n g(k).$$
It might be useful to establish and prove closed form expressions for special values of $g(n)$, e.g. we have $g(2^k) = 2^{k-1}$ and $g(2^k-1) = k$ as pointed out earlier (prove these). Here are the first few terms of $g(n):$ $$1, 1, 2, 2, 3, 2, 3, 4, 7, 4, 5, 4, 4, 3, 4, 8, 15, 9, 12, 8, 9, 6, 7, 8, 11$$
This problem is not suited to numeric experimentation because $g(n)$ fluctuates so rapidly. For those who want to try anyway here is some C code. (I was going to memoize this the same way I memoized the Maple routine, but never got around to it. This is why the code looks as it does.)

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <readline/readline.h>
unsigned long long isqrt(unsigned long long n)
{
if(!n) return 0;
unsigned long long a = 1, b = n, mid;
do {
mid = (a+b)/2;
if(mid*mid>n){
b = mid;
}
else{
a = mid;
}
} while(b-a>1);
return a;
}
unsigned long long g(unsigned long long n)
{
int bits[256], len = 0;
while(n>0){
bits[len++] = n%2;
n >>= 1;
}
unsigned long long res = 0, ind;
for(ind=0; ind<(1<<len); ind++){
unsigned long long varind = ind;
int subseq[256], srcpos=0, pos=0;
while(varind>0){
if(varind%2){
subseq[pos++] = bits[srcpos];
}
srcpos++;
varind >>= 1;
}
unsigned long long val = 0;
int seqpos = 0;
while(seqpos<pos){
val += (1<<seqpos)*subseq[seqpos];
seqpos++;
}
unsigned long long cs = isqrt(val);
if(val>0 && cs*cs == val){
res++;
}
}
return res;
}
unsigned long long gavg(unsigned long long max)
{
unsigned long long res = 0, n;
for(n=1; n<=max; n++){
res += g(n);
}
return res;
}
int main(int argc, char *argv)
{
unsigned long long max;
char *line;
while((line = readline("> ")) != NULL){
if(sscanf(line, "%llu", &max)==1){
unsigned long gsum = gavg(max);
long double asympt = (long double)gsum;
asympt = asympt/(long double)max;
printf("%llu %lle\n", gsum, asympt);
}
free(line);
}
exit(0);
}
This is the Maple code, which is noticeably slower.
g :=
proc(n)
option remember;
local dlist, ind, flind, sseq, len, sel, val, r, res;
dlist := convert(n, base, 2);
len := nops(dlist);
res := 0;
for ind from 0 to 2^len-1 do
sel := convert(ind, base, 2);
sseq := [];
for flind to nops(sel) do
if sel[flind] = 1 then
sseq := [op(sseq), dlist[flind]];
fi;
od;
val := add(sseq[k]*2^(k-1), k=1..nops(sseq));
r := floor(sqrt(val));
if val>0 and r*r = val then
res := res+1;
fi;
od;
res;
end;
avg := n -> add(g(k), k=1..n);
This is an answer to the (corrected) bounty question, but I would still like to know if there is an asymptotic formula for the average value of $g(k)$ or whether it fluctuates between multiples of $n^b$.
Let $n = 2^m$ be a power of two, and let $F(n)$ be the number of squares found as subsequences of all binary strings of length $m$, and let $G(n)$ be the exact value of $\sum_{k=0}^{n-1} g(k)$. (It is convenient to start counting at $0$ rather than $1$ but it does not affect the asymptotics.)
It is certainly not true that $F(n) = G(n)$, but it is a useful approximation in two ways. On one hand, $F(n) \ge G(n)$, because every binary string of length $m$ corresponds to some integer in $[0,2^m-1]$ but it has strictly more subsequences because of zero-padding. On the other hand, $F(n) \le \sum_{k=n}^{2n-1} g(k) = G(2n) - G(n)$, because adding a leading $1$ to each binary string of length $m$ yields a legitimate binary integer in $[2^m,2^{m+1}-1]$.
The benefit of this simplified model is that it is very easy to estimate $F(n)$. For each length $1 \le \ell \le m$, there are $m\choose\ell$ subsequences of length $\ell$ within any $m$-bit string. For each choice there will be about $\sqrt{2^\ell}$ (more precisely, $\lfloor \sqrt{2^\ell-1}\rfloor+1$) ways to choose a perfect square to fill in that subsequence. Then there are $2^{m-\ell}$ ways to fill in the remaining bits. Therefore
$$F(n) \approx \sum_{\ell=1}^m {m\choose\ell} (\sqrt{2})^{\ell} 2^{m-\ell} = (\sqrt{2}+2)^m - 2^m,$$ by the binomial formula. Thus $F(n) \sim n^c$ where $c = \log_2(2+\sqrt{2})$, and $\tfrac1n F(n) \sim n^b$ where $b = c-1 = 0.77155\ldots$.
Recall that $G(n) \le F(n) \le G(2n)-G(n)$. Since $F(n) \sim n^c$, the left inequality easily gives $G(n) = O(n^c)$, while the right inequality yields $G(n) \ge G(n)-G(n/2) \ge F(n/2) = \Omega(n^c)$. Therefore $G(n) = \Theta(n^c)$ when $n$ is a power of $2$, and since it is monotonically increasing, this growth rate can be seen to extend to all $n$. The desired function $q(n)$ is thus $n^b$.