# Automatic Differentiation

Automatic differentiation is achieved through the use of two primary functions in the Torch.Autograd module: makeIndependent and grad.

## Independent Tensors

makeIndependent is used to instantiate an independent tensor variable from which a compute graph is constructed for differentiation, while grad uses the compute graph to compute gradients.

makeIndependent takes a tensor as input and returns an IO action which produces a Torch.Autograd.IndependentTensor:

```haskell
makeIndependent :: Tensor -> IO IndependentTensor
```


What is the definition of the IndependentTensor type produced by the makeIndependent action? It's defined in the Hasktorch library as:

```haskell
newtype IndependentTensor = IndependentTensor {toDependent :: Tensor} deriving (Show)
```


Thus IndependentTensor is simply a wrapper around the underlying Tensor that is passed in as the argument to makeIndependent. Building up computations using ops applied to the toDependent tensor of an IndependentTensor will implicitly construct a compute graph to which grad can be applied.

All tensors have an underlying property, retrieved using the Torch.Autograd.requiresGrad function, that indicates whether they are differentiable values in a compute graph.[^requires-grad]

```haskell
let x = asTensor ([1, 2, 3] :: [Float])
y <- makeIndependent (asTensor ([4, 5, 6] :: [Float]))
let y' = toDependent y
let z = x + y'

requiresGrad x =>
False

requiresGrad y' =>
True

requiresGrad z =>
True
```


In summary, tensors built from tensor constructors (e.g. ones, zeros, fill, randIO, etc.) outside the context of an IndependentTensor are not differentiable. Tensors derived from computations on the toDependent value of an IndependentTensor are differentiable, as the example above illustrates.

Once a compute graph has been constructed by applying ops and computing derived quantities stemming from the toDependent value of an IndependentTensor, a gradient can be taken with the grad function, passing as the first argument the tensor corresponding to the function value of interest and as the second a list of independent tensor variables with respect to which the derivative is taken:

```haskell
grad :: Tensor -> [IndependentTensor] -> [Tensor]
```
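
As a minimal sketch of how these pieces fit together (a hypothetical session, assuming the hasktorch Torch modules are in scope, including sumAll from Torch.Functional), consider the gradient of a sum of squares:

```haskell
-- Sketch: gradient of y = sum (x * x) with respect to x.
-- Since d y / d x_i = 2 * x_i, for x = [2, 3] we expect gradients [4, 6].
sumOfSquaresGrad :: IO ()
sumOfSquaresGrad = do
  x <- makeIndependent (asTensor ([2, 3] :: [Float]))
  let x' = toDependent x
      y  = sumAll (x' * x')
  print (grad y [x])
```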


Let's demonstrate this with a concrete example. We create a tensor and derive an IndependentTensor from it:

```haskell
a <- makeIndependent (ones' [2, 2])
let a' = toDependent a

a' =>
Tensor Float [2,2] [[ 1.0000   ,  1.0000   ],
                    [ 1.0000   ,  1.0000   ]]
```


Now do some computations on the dependent tensor:

```haskell
let b = a' + 2

b =>
Tensor Float [2,2] [[ 3.0000   ,  3.0000   ],
                    [ 3.0000   ,  3.0000   ]]
```


Since b is dependent on the independent tensor a, it is differentiable:

```haskell
requiresGrad b =>
True
```


Applying more operations:

```haskell
let c = b * b * 3
let out = mean c

c =>
Tensor Float [2,2] [[ 27.0000   ,  27.0000   ],
                    [ 27.0000   ,  27.0000   ]]
```


Finally, taking the gradient of out with respect to a:

```haskell
grad out [a] =>
[Tensor Float [2,2] [[ 4.5000   ,  4.5000   ],
                     [ 4.5000   ,  4.5000   ]]]
```
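
The expected result can be worked out by hand: out = mean c = (1/4) Σ 3(a_ij + 2)², so each entry of the gradient is 1.5 (a_ij + 2), which is 4.5 when a_ij = 1. A plain-Haskell sketch (independent of Hasktorch) checking this both analytically and by central finite differences:

```haskell
-- Hand-check of grad out [a] above, in plain Haskell (no Hasktorch needed).
-- out = (1/4) * sum_ij 3 * (a_ij + 2)^2, so for each element:
--   d out / d a_ij = (1/4) * 6 * (a_ij + 2) = 1.5 * (a_ij + 2)
analyticGrad :: Double -> Double
analyticGrad a = 1.5 * (a + 2)

-- Central finite difference on one element of a, holding the other
-- three elements fixed at their value of 1.
numericGrad :: Double -> Double
numericGrad a = (f (a + h) - f (a - h)) / (2 * h)
  where
    h = 1e-6
    f x = (3 * (x + 2) ^ 2 + 3 * 3 * (1 + 2) ^ 2) / 4

main :: IO ()
main = do
  print (analyticGrad 1.0)                     -- 4.5, matching each entry of grad out [a]
  print (abs (numericGrad 1.0 - 4.5) < 1e-3)   -- True
```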