# ADCME Basics: Tensor, Type, Operator, Session & Kernel

## Tensors and Operators

`Tensor`

is a data structure for storing structured data, such as a scalar, a vector, a matrix or a high dimensional tensor. The name of the ADCME backend, `TensorFlow`

, is also derived from its core framework, `Tensor`

. Tensors can be viewed as symbolic versions of Julia's `Array`

.

A tensor is a collection of $n$-dimensional arrays. ADCME represents tensors using a `PyObject`

handle to the TensorFlow `Tensor`

data structure. A tensor has three important properties

`name`

: Each Tensor admits a unique name.`shape`

: For scalars, the shape is always an empty tuple`()`

; for $n$-dimensional vectors, the shape is`(n,)`

; for matrices or higher order tensors, the shape has the form`(n1, n2, ...)`

`dtype`

: The type of the tensors. There is a one-to-one correspondence between most TensorFlow types and Julia types (e.g.,`Int64`

,`Int32`

,`Float64`

,`Float32`

,`String`

, and`Bool`

). Therefore, we have overloaded the type name so users have a unified interface.

An important difference is that `tensor`

object stores data in the row-major while Julia's default for `Array`

is column major. The difference may affect performance if not carefully dealt with, but more often than not, the difference is not relevant if you do not convert data between Julia and Python often. Here is a representation of ADCME `tensor`

There are 4 ways to create tensors.

`constant`

. As the name suggests,`constant`

creates an immutable tensor from Julia Arrays.

```
constant(1.0)
constant(rand(10))
constant(rand(10,10))
```

`Variable`

. In contrast to`constant`

,`Variable`

creates tensors that are mutable. The mutability allows us to update the tensor values, e.g., in an optimization procedure. It is very important to understand the difference between`constant`

and`Variable`

: simply put, in inverse modeling, tensors that are defined as`Variable`

should be the quantity you want to invert, while`constant`

is a way to provide known data.

```
Variable(1.0)
Variable(rand(10))
Variable(rand(10,10))
```

`placeholder`

.`placeholder`

is a convenient way to specify a tensor whose values are to be provided in the runtime. One use case is that you want to try out different values for this tensor and scrutinize the simulation result.

```
placeholder(Float64, shape=[10,10])
placeholder(rand(10)) # default value is `rand(10)`
```

`SparseTensor`

.`SparseTensor`

is a special data structure to store a sparse matrix. Although it is not very emphasized in machine learning, sparse linear algebra is one of the cores to scientific computing. Thus possessing a strong sparse linear algebra support is the key to success inverse modeling with physics based machine learning.

```
using SparseArrays
SparseTensor(sprand(10,10,0.3))
SparseTensor([1,2,3],[2,2,2],[0.1,0.3,0.5],3,3) # specify row, col, value, number of rows, number of columns
```

Now we know how to create tensors, the next step is to perform mathematical operations on those tensors.

`Operator`

can be viewed as a function that takes multiple tensors and outputs multiple tensors. In the computational graph, operators are represented by nodes while tensors are represented by edges. Most mathematical operators, such as `+`

, `-`

, `*`

and `/`

, and matrix operators, such as matrix-matrix multiplication, indexing and linear system solve, also work on tensors.

```
a = constant(rand(10,10))
b = constant(rand(10))
a + 1.0 # add 1 to every entry in `a`
a * b # matrix vector production
a * a # matrix matrix production
a .* a # element wise production
inv(a) # matrix inversion
```

## Session

With the aforementioned syntax to create and transform tensors, we have created a computational graph. However, at this point, all the operations are symbolic, i.e., the operators have not been executed yet.

To trigger the actual computing, the TensorFlow mechanism is to create a session, which drives the graph based optimization (like detecting dependencies) and executes all the operations.

```
a = constant(rand(10,10))
b = constant(rand(10))
c = a * b
sess = Session()
run(sess, c) # syntax for triggering the execution of the graph
```

If your computational graph contains `Variables`

, which can be listed via `get_collection`

, then you must initialize your graph before any `run`

command, in which the Variables are populated with initial values

`init(sess)`

## Kernel

The kernels provide the low level C++ implementation for the operators. ADCME augments users with missing features in TensorFlow that are crucial for scientific computing and tailors the syntax for numerical schemes. Those kernels, depending on their implementation, can be used in CPU, GPU, TPU or heterogenious computing environments.

All the intensive computations are done either in Julia or C++, and therefore we can achieve very high performance if the logic is done appropriately. For performance critical part, users may resort to custom kernels using `customop`

, which allows you to incooperate custom designed C++ codes.

## Summary

ADCME performances operations on tensors. The actual computations are pushed back to low level C++ kernels via operators. A session is need to drive the executation of the computation. It will be easier for you to analyze computational cost and optimize your codes with this computation model in mind.

## Tensor Operations

Here we show a list of commonly used operators in ADCME.

Description | API |
---|---|

Constant creation | `constant(rand(10))` |

Variable creation | `Variable(rand(10))` |

Get size | `size(x)` |

Get size of dimension | `size(x,i)` |

Get length | `length(x)` |

Resize | `reshape(x,5,3)` |

Vector indexing | `v[1:3]` ,`v[[1;3;4]]` ,`v[3:end]` ,`v[:]` |

Matrix indexing | `m[3,:]` , `m[:,3]` , `m[1,3]` ,`m[[1;2;5],[2;3]]` |

3D Tensor indexing | `m[1,:,:]` , `m[[1;2;3],:,3]` , `m[1:3:end, 1, 4]` |

Index relative to end | `v[end]` , `m[1,end]` |

Extract row (most efficient) | `m[2]` , `m[2,:]` |

Extract column | `m[:,3]` |

Convert to dense diagonal matrix | `diagm(v)` |

Convert to sparse diagonal matrix | `spdiag(v)` |

Extract diagonals as vector | `diag(m)` |

Elementwise multiplication | `a.*b` |

Matrix (vector) multiplication | `a*b` |

Matrix transpose | `m'` |

Dot product | `sum(a*b)` |

Solve | `A\b` |

Inversion | `inv(m)` |

Average all elements | `mean(x)` |

Average along dimension | `mean(x, dims=1)` |

Maximum/Minimum of all elements | `maximum(x)` , `minimum(x)` |

Squeeze all single dimensions | `squeeze(x)` |

Squeeze along dimension | `squeeze(x, dims=1)` , `squeeze(x, dims=[1;2])` |

Reduction (along dimension) | `norm(a)` , `sum(a, dims=1)` |

Elementwise Multiplication | `a.*b` |

Elementwise Power | `a^2` |

SVD | `svd(a)` |

`A[indices] = updates` | `A = scatter_update(A, indices, updates)` |

`A[indices] += updates` | `A = scatter_add(A, indices, updates)` |

`A[indices] -= updates` | `A = scatter_sub(A, indices, updates)` |

`A[idx, idy] = updates` | `A = scatter_update(A, idx, idy, updates)` |

`A[idx, idy] += updates` | `A = scatter_add(A, idx, idy, updates)` |

`A[idx, idy] -= updates` | `A = scatter_sub(A, idx, idy, updates)` |

In some cases you might find some features missing in ADCME but present in TensorFlow. You can always use `tf.<function_name>`

. It's compatible.