# Automatic Differentiation

Automatic differentiation is achieved through the use of two primary
functions in the `Torch.Autograd` module: `makeIndependent` and `grad`.

## Independent Tensors

`makeIndependent` is used to instantiate an independent tensor variable
from which a compute graph is constructed for differentiation, while
`grad` uses the compute graph to compute gradients.

`makeIndependent` takes a tensor as input and returns an IO action which
produces a `Torch.Autograd.IndependentTensor`:

```
makeIndependent :: Tensor -> IO IndependentTensor
```

What is the definition of the `IndependentTensor` type produced by the
`makeIndependent` action? It's defined in the Hasktorch library as:

```
newtype IndependentTensor = IndependentTensor {toDependent :: Tensor} deriving (Show)
```
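
Since `toDependent` is the record accessor of this newtype, it follows from the definition above that unwrapping has the type:

```
toDependent :: IndependentTensor -> Tensor
```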

Thus `IndependentTensor` is simply a wrapper around the underlying
`Tensor` that is passed in as the argument to `makeIndependent`.
Building up computations using ops applied to the `toDependent` tensor
of an `IndependentTensor` will implicitly construct a compute graph to
which `grad` can be applied.

All tensors have an underlying property that can be retrieved using
the `Torch.Autograd.requiresGrad` function, which indicates whether
they are differentiable values in a compute graph.[^requires-grad]

```
let x = asTensor ([1, 2, 3] :: [Float])
y <- makeIndependent (asTensor ([4, 5, 6] :: [Float]))
let y' = toDependent y
let z = x + y'
```

```
requiresGrad x =>
False
```

```
requiresGrad y' =>
True
```

```
requiresGrad z =>
True
```
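
As a preview of the `grad` function covered in the next section, the compute graph built through `z` can already be differentiated with respect to `y`. In this sketch, `sumAll` (assumed to be in scope from `Torch.Functional`) reduces `z` to a scalar first; the gradient of a sum with respect to `y'` is a tensor of ones:

```
grad (sumAll z) [y] =>
[ Tensor Float [3] [ 1.0000, 1.0000, 1.0000 ]]
```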

In summary, tensors that are computed from tensor constructors
(e.g. `ones`, `zeros`, `fill`, `randIO`, etc.) outside the context of
an `IndependentTensor` are not differentiable. Tensors that are derived
from computations on the `toDependent` value of an `IndependentTensor`
are differentiable, as the above example illustrates.
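
For instance, a tensor that comes straight from a constructor reports `False` under the same `requiresGrad` check used above:

```
requiresGrad (ones' [2, 2]) =>
False
```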

## Gradients

Once a computation graph is constructed by applying ops and computing
derived quantities stemming from a `toDependent` value of an
`IndependentTensor`, a gradient can be taken with the `grad` function,
specifying as the first argument the tensor corresponding to the
function value of interest and as the second a list of
`IndependentTensor` variables that the derivative is taken with
respect to:

```
grad :: Tensor -> [IndependentTensor] -> [Tensor]
```

Let's demonstrate this with a concrete example. We create a tensor and
derive an `IndependentTensor` from it:

```
a <- makeIndependent (ones' [2, 2])
let a' = toDependent a
```

```
a' =>
Tensor Float [2,2] [[ 1.0000 , 1.0000 ],
[ 1.0000 , 1.0000 ]]
```

Now do some computations on the dependent tensor:

```
let b = a' + 2
```

```
b =>
Tensor Float [2,2] [[ 3.0000 , 3.0000 ],
[ 3.0000 , 3.0000 ]]
```

Since `b` is dependent on the independent tensor `a`, it is differentiable:

```
requiresGrad b =>
True
```

Applying more operations:

```
let c = b * b * 3
let out = mean c
```

```
c =>
Tensor Float [2,2] [[ 27.0000 , 27.0000 ],
[ 27.0000 , 27.0000 ]]
```

Now retrieve the gradient:

```
grad out [a] =>
[ Tensor Float [2,2] [[ 4.5000 , 4.5000 ],
[ 4.5000 , 4.5000 ]]]
```
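
This agrees with a hand calculation: `out` is the mean of the four entries of `c`, i.e. `(1/4) * sum (3 * (a' + 2)^2)` elementwise, so the derivative with respect to each entry of `a'` is `(3/2) * (a' + 2)`, which evaluates to `1.5 * 3 = 4.5` at `a' = 1`.

The snippets above are REPL fragments. As a minimal self-contained sketch of the full example (assuming, as in recent Hasktorch versions, that the top-level `Torch` module re-exports `makeIndependent`, `toDependent`, `grad`, `ones'`, and `mean`):

```
import Torch

main :: IO ()
main = do
  -- Create an independent tensor; this is the variable
  -- we differentiate with respect to
  a <- makeIndependent (ones' [2, 2])
  let a' = toDependent a
      b = a' + 2   -- ops on the dependent tensor extend the compute graph
      c = b * b * 3
      out = mean c -- a scalar function value
  -- Expect a 2x2 tensor filled with 4.5
  print (grad out [a])
```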