# What is a Logit?

The term logit has different meanings in math and in the TensorFlow library. In TensorFlow it means “Per-label activations, typically a linear output. These activation energies are interpreted as unnormalized log probabilities.” What this means is: the logits are the vector you get as the output of the last layer of your neural network. It is the quantity you apply the softmax function to in order to transform the raw output into a probability distribution. If this doesn't make sense to you, you may want to read about neural networks, or logistic regression for a start. A very nice introductory article can be found here.
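
To make the TensorFlow meaning concrete, here is a minimal sketch in plain Python (not TensorFlow's own implementation) of applying softmax to a vector of logits. The function name `softmax` and the example values are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Turn a list of raw scores (logits) into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged
    # but keeps math.exp from overflowing on large logits.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical last-layer outputs for a three-class problem (made-up values).
example_logits = [2.0, 1.0, 0.1]
probs = softmax(example_logits)
```

The logits can be any real numbers, while the resulting `probs` are all positive, sum to 1, and preserve the ordering of the logits.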

In math, a logit is a number that can be assigned to a probability. If $p$ is a probability, then
\begin{equation}
L(p) =\log \left( \frac{p}{(1-p)}\right)
\end{equation}
is called the log-odds or logit. It is called log-odds because the term
\begin{equation}
\frac{p}{(1-p)}
\end{equation}
is called the odds. It is the ratio between the probability that an event occurs and the probability that it doesn’t occur. If $p$ is large, i.e. close to 1, then $1-p$ is close to 0 and the odds become a very large number (infinite in the limit $p = 1$). If $p$ is close to 0, then $1-p$ is close to 1 and the odds are close to 0. Taking the logarithm of a large number dampens it, but the result can still grow without bound. The logarithm of a positive number smaller than 1 is negative, and its magnitude grows the closer the input is to 0. At even odds, $p = 0.5$, the odds are 1 and the logit is 0. So we see that the logit maps a probability between 0 and 1 to an output that can be arbitrarily large, positive or negative. In math: $L: [0,1] \to [-\infty, \infty]$.
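
The definitions above can be sketched in a few lines of plain Python. The function names `odds` and `logit` are illustrative choices, and the explicit handling of $p = 0$ and $p = 1$ is an assumption that follows the convention of mapping the endpoints to $\mp\infty$.

```python
import math


def odds(p):
    """Ratio between the probability of an event and that of its complement."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if p == 1.0:
        return math.inf  # assumed convention for the limiting case
    return p / (1.0 - p)


def logit(p):
    """Log-odds of a probability p."""
    o = odds(p)
    if o == 0.0:
        return -math.inf  # log(0) is treated as minus infinity here
    if o == math.inf:
        return math.inf
    return math.log(o)
```

For instance, `logit(0.5)` is 0, probabilities above 0.5 give positive logits, and probabilities below 0.5 give negative ones.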

As a side remark, note that the inverse of this operation is a function that maps the real numbers to a probability: $f: [-\infty, \infty] \to [0,1]$. Writing $l = L(p)$ for the logit, this inverse function is $p = f(l) = \frac{e^l}{1+e^l} = \frac{1}{1+e^{-l}}$, which is known as the logistic function, expit, or sigmoid.
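
As a quick check that the logistic function really undoes the logit, here is a minimal sketch in plain Python. The names `sigmoid` and `logit` are illustrative. Switching between the two equivalent forms of the formula depending on the sign of `l` is a common numerical precaution and is not something the text above prescribes.

```python
import math


def logit(p):
    """Log-odds of a probability p with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def sigmoid(l):
    """Logistic function: maps a real-valued logit l back to a probability."""
    if l >= 0:
        # 1 / (1 + e^{-l}) is safe here because e^{-l} <= 1.
        return 1.0 / (1.0 + math.exp(-l))
    # For negative l use e^l / (1 + e^l), so exp never sees a large argument.
    e = math.exp(l)
    return e / (1.0 + e)


# Round trip: probability -> logit -> probability.
p = 0.3
recovered = sigmoid(logit(p))
```

Up to floating-point rounding, `recovered` equals the original `p`, and `sigmoid(0.0)` returns 0.5, matching a logit of 0 at even odds.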