Gradient of a matrix multiplication with GradientTape


In the following Python snippet, which uses the @ operator and tf.GradientTape, I cannot understand how x with shape (3, 1) and w with shape (1, 2) produce the gradient [[6., 6.]]. Can anyone please explain it to me?

import tensorflow as tf

x = tf.constant([[1.], [2.], [3.]])    #shape=(3, 1)
w = tf.Variable(tf.ones((1, 2)))       #shape=(1, 2)
b = tf.Variable(tf.ones((3, 2)))       #shape=(3, 2)

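# persistent=True is only needed if tape.gradient() is called more than once
with tf.GradientTape(persistent=True) as tape:
    y = x @ w + b                      #shape=(3, 2)

dy_dw = tape.gradient(y, w)            #shape=(1, 2), same as w
print(dy_dw)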
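
One way to see where [[6., 6.]] comes from: when the target passed to tape.gradient is not a scalar, TensorFlow returns the gradient of the sum of all its elements. Here y[i, j] = x[i, 0] * w[0, j] + b[i, j], so the derivative of sum(y) with respect to w[0, j] is x[0, 0] + x[1, 0] + x[2, 0] = 1 + 2 + 3 = 6 for each of the two columns. The sketch below checks this three ways; the names total, grad_sum, jac, grad_from_jac, and manual are introduced here for illustration and are not part of the original snippet.

```python
import tensorflow as tf

x = tf.constant([[1.], [2.], [3.]])    # shape (3, 1)
w = tf.Variable(tf.ones((1, 2)))       # shape (1, 2)
b = tf.Variable(tf.ones((3, 2)))       # shape (3, 2)

with tf.GradientTape(persistent=True) as tape:
    y = x @ w + b                      # shape (3, 2)
    total = tf.reduce_sum(y)           # scalar: sum of all six entries of y

# What the question computes: gradient of a non-scalar target.
grad_y = tape.gradient(y, w)

# Gradient of the explicitly summed target.
grad_sum = tape.gradient(total, w)

# Full Jacobian d y[i, j] / d w[0, k], shape (3, 2, 1, 2),
# then summed over the two axes that index y.
jac = tape.jacobian(y, w)
grad_from_jac = tf.reduce_sum(jac, axis=[0, 1])

# The same quantity by hand: x^T @ ones_like(y).
manual = tf.transpose(x) @ tf.ones((3, 2))

print(grad_y.numpy(), grad_sum.numpy(), grad_from_jac.numpy(), manual.numpy())
```

All four results should have shape (1, 2), matching w. If the per-element derivatives are what you actually want, tape.jacobian(y, w) is the call to use instead of tape.gradient.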