TensorFlow
The result of reduction operations (e.g. tf.math.reduce_sum, tf.math.reduce_std, torch.sum,
torch.mean, etc.) depends heavily on the shape of the tensor provided.
import tensorflow as tf
x = tf.constant([[1, 1, 1], [1, 1, 1]])
tf.math.reduce_sum(x)
In the example above, the reduction of the two-dimensional tensor returns the value 6, as all the elements are added together. By
default, TensorFlow’s reduction operations are applied across all axes. When an axis is specified, the result is completely different.
import tensorflow as tf
x = tf.constant([[1, 1, 1], [1, 1, 1]])
tf.math.reduce_sum(x, axis=0)
Here the result will be [2, 2, 2], as the reduction is applied only along axis 0.
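The complementary case, reducing along axis 1 instead, sums each row rather than each column. A minimal sketch with the same 2x3 tensor:

```python
import tensorflow as tf

x = tf.constant([[1, 1, 1], [1, 1, 1]])

# axis=1 collapses the columns, producing one sum per row.
row_sums = tf.math.reduce_sum(x, axis=1)
print(row_sums.numpy())  # [3 3]
```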
TensorFlow’s default behavior can be confusing, especially when reducing arrays of different shapes.
Consider the following example:
import tensorflow as tf
x = tf.constant([[1], [2]])
y = tf.constant([1, 2])
tf.math.reduce_sum(x + y)
Here the result will be 12 instead of the 6 one might expect. This is because implicit broadcasting stretches
the first array to [[1, 1], [2, 2]], which is then added to the broadcast of y, [[1, 2], [1, 2]], resulting in [[2, 3],
[3, 4]]. As the reduction happens across all dimensions, the result is 2 + 3 + 3 + 4 = 12. It is not clear from the
example whether this was intentional or the user made a mistake.
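The implicit broadcast can be made visible with tf.broadcast_to, which materializes the shape each operand takes before the addition (a sketch of the same example):

```python
import tensorflow as tf

x = tf.constant([[1], [2]])   # shape (2, 1)
y = tf.constant([1, 2])       # shape (2,)

# Broadcasting stretches both operands to the common shape (2, 2).
xb = tf.broadcast_to(x, [2, 2])  # [[1, 1], [2, 2]]
yb = tf.broadcast_to(y, [2, 2])  # [[1, 2], [1, 2]]

# Summing the broadcast operands reproduces the surprising result.
total = tf.math.reduce_sum(xb + yb)
print(total.numpy())  # 12
```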
This is why it is good practice to always specify the axis along which to perform the reduction.
For example:
import tensorflow as tf
x = tf.constant([[1], [2]])
y = tf.constant([1, 2])
tf.math.reduce_sum(x + y, axis=0)
In the example above, specifying the axis clarifies the intent, as the result is now [5, 7]. If the intent was indeed to reduce
across all dimensions, the user should provide the list of axes, axis=[0, 1], or clearly state that the default behavior should apply with
axis=None.
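Both spellings make the full reduction explicit and produce the same scalar; a short sketch of the two options:

```python
import tensorflow as tf

x = tf.constant([[1], [2]])
y = tf.constant([1, 2])

# Explicitly listing every axis makes the full reduction intentional.
all_axes = tf.math.reduce_sum(x + y, axis=[0, 1])
# axis=None spells out that the default "reduce everything" is wanted.
default = tf.math.reduce_sum(x + y, axis=None)
print(all_axes.numpy(), default.numpy())  # 12 12
```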
The PyTorch equivalent
The same behavior occurs in PyTorch, but the argument is called dim instead of axis.
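A sketch of the same pitfall in PyTorch, where the reduction argument is named dim:

```python
import torch

x = torch.tensor([[1], [2]])  # shape (2, 1)
y = torch.tensor([1, 2])      # shape (2,)

# Broadcasting behaves the same way, so the full reduction again yields 12.
total = torch.sum(x + y)
print(total.item())  # 12

# Specifying dim=0 (PyTorch's name for axis) clarifies the intent.
col_sums = torch.sum(x + y, dim=0)
print(col_sums.tolist())  # [5, 7]
```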