In the following Python snippet, which uses the @ operator and tf.GradientTape, I cannot understand how x of shape (3, 1) and w of shape (1, 2) produce a gradient of [6., 6.]. Can anyone please explain it to me?
import tensorflow as tf
x = tf.constant([[1.], [2.], [3.]]) #shape=(3, 1)
w = tf.Variable(tf.ones((1, 2))) #shape=(1, 2)
b = tf.Variable(tf.ones((3,2)))
with tf.GradientTape(persistent=True) as tape:
    y = x @ w + b  # (3, 1) @ (1, 2) + (3, 2) -> shape=(3, 2)
dy_dw = tape.gradient(y,w)
print(dy_dw)
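
For reference, when the target passed to tape.gradient is not a scalar, TensorFlow returns the gradient of the sum of the target's elements. Each y[i][j] = x[i] * w[j] + b[i][j] contributes x[i] to the derivative with respect to w[j], so each entry of the result should be 1 + 2 + 3 = 6. Below is a minimal plain-Python sketch of that arithmetic, with no TensorFlow involved. The helper name total and the step size eps are my own, chosen only for illustration.

```python
# Plain-Python sketch (no TensorFlow) of what tape.gradient(y, w) adds up.
x = [1.0, 2.0, 3.0]            # the single column of x, shape (3, 1)
w = [1.0, 1.0]                 # the single row of w, shape (1, 2)
b = [[1.0, 1.0] for _ in x]    # shape (3, 2)

def total(w_row):
    """Sum of all elements of y = x @ w + b (hypothetical helper)."""
    return sum(x[i] * w_row[j] + b[i][j]
               for i in range(len(x)) for j in range(len(w_row)))

# Analytic: d y[i][j] / d w[j] = x[i], summed over the three rows i.
analytic = [sum(x) for _ in w]

# Numeric check by central differences; y is linear in w, so any eps works.
eps = 0.5
numeric = []
for j in range(len(w)):
    up = list(w)
    up[j] += eps
    down = list(w)
    down[j] -= eps
    numeric.append((total(up) - total(down)) / (2 * eps))

print(analytic, numeric)  # [6.0, 6.0] [6.0, 6.0]
```

Both lists match the [6., 6.] that dy_dw prints, which suggests the value is the column sum of x and not a matrix product of the two shapes.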