You need to understand that logistic regression is a special case of a more general non-linear paradigm (the neural network), which can solve problems that linear relations cannot. This does not mean that a NN is a silver bullet, as linear approaches are still used heavily to solve many problems; what is important, though, is to have the knowledge and experience to apply the most suitable algorithm to each case.

To get a simplified and intuitive understanding of the multinomial logit, you can think of the initial assumption as follows: the ability of each horse (for example, expressed in speed figures) follows a normal distribution whose mean and sigma depend on the features of each individual horse.

For example, for two horses that are facing each other using the logit method you can imagine that each of them is assigned a mean speed figure and a sigma. Let’s say that these values are as follows:

A: mean 98 sigma 6

B: mean 102 sigma 12

If you plot both of them in the same graph, the probability of A beating B equals the probability of A running a larger figure than B, and vice versa:

Note that in this example, although the 102/12 horse is faster on average than the 98/6 horse, you can visualize that the latter still wins many times (when the blue area is higher than the orange).

Try to work out how this probability can be extracted from the graph, and think about why this approach might be wrong; you will see that it has a lot of room for improvement.
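As a rough check on the two-horse example, here is a minimal sketch of how the probability of A beating B could be computed. It assumes the two speed figures are independent normal variables, which the text above does not state and which is one of the things you may want to question. Under that assumption the difference A - B is also normal, with mean 98 - 102 and sigma sqrt(6^2 + 12^2). The function names are hypothetical, chosen only for illustration.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_first_beats_second(mean_a: float, sigma_a: float,
                            mean_b: float, sigma_b: float) -> float:
    """P(A > B) for two normally distributed speed figures.

    Assumption (not from the original text): A and B are independent,
    so A - B is normal with the mean and sigma computed below.
    """
    diff_mean = mean_a - mean_b
    diff_sigma = math.sqrt(sigma_a ** 2 + sigma_b ** 2)
    # P(A - B > 0) = Phi(diff_mean / diff_sigma)
    return normal_cdf(diff_mean / diff_sigma)


# The example from the text: A is 98/6, B is 102/12.
p_a = prob_first_beats_second(98, 6, 102, 12)
p_b = 1.0 - p_a
print(f"P(A beats B) = {p_a:.3f}")
print(f"P(B beats A) = {p_b:.3f}")
```

With these numbers A comes out as the winner roughly 38% of the time, even though its mean figure is four points lower. Note that the result depends only on the two means and sigmas; anything that makes the horses' performances correlated (pace, track condition, and so on) is ignored in this sketch.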