In order to illustrate the concept of graphical models, and Bayesian networks in particular, we shall take a more thorough look at one of the examples mentioned in the course description: the pregnancy test problem. We shall assume that we are dealing with a sow, but the example is easily adapted to cover a cow, a ewe, etc.
The first figure shows a very simple Bayesian network. It has only two variables (shown as ellipses), "Inseminated" and "Pregnant", and an edge (arrow) between them. Both variables may take the values "Yes" and "No". The possible values of a variable are referred to as states, and the variables themselves are called nodes of the network. The edge from "Inseminated" to "Pregnant" represents a causal relation. It indicates that the value of "Inseminated" ("Yes" or "No") influences the probability distribution of the value of "Pregnant". This dependence may be expressed as a conditional probability table like this:
| P("Pregnant" = x \| "Inseminated" = y) | "Inseminated" = "Yes" | "Inseminated" = "No" |
| --- | --- | --- |
| "Pregnant" = "Yes" | 0.85 | 0.00 |
| "Pregnant" = "No" | 0.15 | 1.00 |
The table expresses the assumption, realistic for sows, that if a sow has been inseminated, there is a probability of 0.85 that it has actually conceived and a probability of 0.15 that it has not. On the other hand, if the sow has not been inseminated, we know for certain that it is not pregnant.
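As a minimal sketch, the conditional probability table above can be stored as a nested dictionary. The names `P_PREGNANT` and `columns_sum_to_one` are illustrative assumptions and not part of any particular Bayesian network software. The check confirms that each column of the table is a proper probability distribution.

```python
# Conditional probability table P(Pregnant | Inseminated), copied from the table above.
# Outer key: state of the parent "Inseminated"; inner key: state of "Pregnant".
P_PREGNANT = {
    "Yes": {"Yes": 0.85, "No": 0.15},
    "No": {"Yes": 0.00, "No": 1.00},
}


def columns_sum_to_one(cpt, tol=1e-9):
    """Return True if every conditional distribution (table column) sums to 1."""
    return all(abs(sum(dist.values()) - 1.0) < tol for dist in cpt.values())


print(columns_sum_to_one(P_PREGNANT))  # True
print(P_PREGNANT["Yes"]["Yes"])        # belief in pregnancy after insemination: 0.85
```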
Even though "Inseminated" and "Pregnant" are both variables, each having a distinct value, there is a marked difference between them. Whereas "Inseminated" may be directly observed by the manager (i.e. he will know whether or not the sow has been inseminated), the true value of "Pregnant" is unobservable. We only know that if the sow has been inseminated, there is a 0.85 probability that it has become pregnant. We shall refer to the probability distribution (0.85, 0.15) as our belief in the true state of the "Pregnant" variable given that "Inseminated" = "Yes". If we extend the initial network slightly, we will see that we may make other observations that can change the belief. Assume that the sow is observed for heat after 1, 2 and 3 oestrus cycles. We may include a variable for each heat observation in the network as shown below:

All three new variables also have two states, "Yes" or "No", depending on whether or not the manager observes a heat. We have edges from "Pregnant" to each of the "Heat a" variables (a = 1, 2, 3), illustrating that the probability of observing heat in an oestrus cycle depends on whether or not the sow is pregnant. If the sow is pregnant (i.e. "Pregnant" = "Yes"), we assume that the probability that the manager (erroneously) observes heat is 0.05. If the sow is not pregnant, the probability of observing a heat is assumed to be as high as 0.80. The corresponding probability table is shown below:
| P("Heat a" = x \| "Pregnant" = y) | "Pregnant" = "Yes" | "Pregnant" = "No" |
| --- | --- | --- |
| "Heat a" = "Yes" | 0.05 | 0.80 |
| "Heat a" = "No" | 0.95 | 0.20 |
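As a worked example of how a single heat observation changes the belief, assume the sow has been inseminated (prior belief 0.85) and the manager observes heat in the first cycle. Applying Bayes' theorem with the values from the two tables above gives:

$$
P(\text{Pregnant}=\text{Yes} \mid \text{Heat 1}=\text{Yes},\ \text{Inseminated}=\text{Yes})
= \frac{0.85 \times 0.05}{0.85 \times 0.05 + 0.15 \times 0.80}
= \frac{0.0425}{0.1625} \approx 0.26
$$

A single observed heat thus lowers the belief in pregnancy from 0.85 to about 0.26.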
An interesting trait of the revised network is that the three variables "Heat 1", "Heat 2" and "Heat 3" are directly observable by the manager. Each time he performs a heat detection and determines whether or not (he thinks that) the sow is in heat, the outcome will change his belief in the true pregnancy status of the sow. If he actually observes heat, his belief in the true value of "Pregnant" will shift towards "No". In sow production, the manager also has (after some time) a pregnancy test at his disposal. The outcome may be "Positive" or "Negative", and we may include the variable as a new node of our network:

The probability table corresponding to the edge from "Pregnant" to "Test" will depend on the sensitivity and the specificity of the test. It may for instance be as specified below:
| P("Test" = x \| "Pregnant" = y) | "Pregnant" = "Yes" | "Pregnant" = "No" |
| --- | --- | --- |
| "Test" = "Positive" | 0.95 | 0.10 |
| "Test" = "Negative" | 0.05 | 0.90 |
In other words, the probability of a false positive test result is 0.10, and the probability of a false negative result is 0.05. Again, the "Test" variable is directly observable and the value will influence our belief in the true state of "Pregnant".
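The link between the table and the test characteristics can be sketched as follows. The table above corresponds to a sensitivity of 0.95 and a specificity of 0.90. The helper name `build_test_cpt` is a hypothetical one chosen for this illustration.

```python
def build_test_cpt(sensitivity, specificity):
    """Build P(Test | Pregnant) from the sensitivity and specificity of a test.

    Outer key: true state of "Pregnant"; inner key: test outcome.
    """
    return {
        "Yes": {"Positive": sensitivity, "Negative": 1.0 - sensitivity},
        "No": {"Positive": 1.0 - specificity, "Negative": specificity},
    }


cpt = build_test_cpt(sensitivity=0.95, specificity=0.90)
print(round(cpt["No"]["Positive"], 2))   # false positive rate: 0.1
print(round(cpt["Yes"]["Negative"], 2))  # false negative rate: 0.05
```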
Note that the figures shown above have a precise meaning. A node (ellipse) represents a variable with a specified number of possible states, and an edge (arrow) corresponds to a conditional probability table with specified values. There is a comprehensive theory behind the concept, and given that some of the variables are observed, algorithms are available for calculating the probability distribution of the remaining variables (in the example only the "Pregnant" variable) given the observations. Bayes' theorem plays a central role in these algorithms. Several software systems are available for building and using Bayesian networks. In this course we shall use the free Hugin Lite™ system.
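To make the idea of such an algorithm concrete, the sketch below computes the belief in "Pregnant" by brute-force enumeration: it sums the joint probability over all states of the child nodes that agree with the observations and then normalises. This is only an illustration for this small network. It is not the method used by Hugin, practical tools rely on far more efficient algorithms, and all names in the code are assumptions made for the example.

```python
from itertools import product

# P(Pregnant | Inseminated): outer key is the state of "Inseminated".
P_PREGNANT = {"Yes": {"Yes": 0.85, "No": 0.15}, "No": {"Yes": 0.00, "No": 1.00}}

# P(child | Pregnant): outer key is the state of "Pregnant".
HEAT = {"Yes": {"Yes": 0.05, "No": 0.95}, "No": {"Yes": 0.80, "No": 0.20}}
CHILD_CPTS = {
    "Heat 1": HEAT,
    "Heat 2": HEAT,
    "Heat 3": HEAT,
    "Test": {
        "Yes": {"Positive": 0.95, "Negative": 0.05},
        "No": {"Positive": 0.10, "Negative": 0.90},
    },
}


def joint(inseminated, pregnant, children):
    """P(pregnant, children | inseminated) as a product of table entries."""
    p = P_PREGNANT[inseminated][pregnant]
    for name, state in children.items():
        p *= CHILD_CPTS[name][pregnant][state]
    return p


def belief_pregnant(inseminated, evidence):
    """P(Pregnant = Yes | inseminated, evidence) by summing over the joint."""
    names = list(CHILD_CPTS)
    totals = {"Yes": 0.0, "No": 0.0}
    for pregnant in totals:
        state_lists = [list(CHILD_CPTS[n][pregnant]) for n in names]
        for states in product(*state_lists):
            assignment = dict(zip(names, states))
            # Keep only the assignments that agree with what was observed.
            if all(assignment[k] == v for k, v in evidence.items()):
                totals[pregnant] += joint(inseminated, pregnant, assignment)
    return totals["Yes"] / (totals["Yes"] + totals["No"])


print(round(belief_pregnant("Yes", {}), 2))                # 0.85
print(round(belief_pregnant("Yes", {"Heat 1": "No"}), 2))  # 0.96
```

Enumeration visits every combination of states, so its cost grows exponentially with the number of variables. That is acceptable for five nodes but not for larger networks.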
If we build the network just described, and successively enter observations of "Heat 1", "Heat 2", "Heat 3" and "Test", we obtain the following belief in the state "Pregnant" = "Yes":
| Observation | "Inseminated" = "Yes" | "Inseminated" = "No" |
| --- | --- | --- |
| None | 0.85 | 0.00 |
| "Heat 1" = "No" | 0.96 | 0.00 |
| "Heat 2" = "No" | 0.99 | 0.00 |
| "Heat 3" = "No" | 0.998 | 0.00 |
| "Test" = "Positive" | 0.9998 | 0.00 |
As long as we only know that the sow has been inseminated, the probability that it is pregnant is just 0.85. If we have further observed that it was not in heat one cycle later, the probability increases to 0.96. Further observations in the same direction will increase our belief in pregnancy to near certainty, as the table shows. On the other hand, if the sow has not been inseminated, the probability of pregnancy remains zero no matter what is observed. In the table, all observations in some sense confirm pregnancy. By use of the Bayesian network we may, however, calculate the pregnancy probability for any combination of the observations ("Heat 1", "Heat 2", "Heat 3", "Test"). If we observe ("No", "Yes", -, -), the probability of pregnancy is 0.63. The sequence ("No", "Yes", "No", -) will increase it to 0.89, and if there is furthermore a positive test ("No", "Yes", "No", "Positive"), it becomes 0.99.
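The figures quoted above can be checked by hand or with a few lines of code. The sketch below assumes an inseminated sow and relies on the structure of this particular network: every observation depends only on "Pregnant", so the likelihoods of the observed values can simply be multiplied, and unobserved variables (shown as "-" above) are skipped. The names `LIKELIHOODS` and `belief` are illustrative assumptions.

```python
# For each observable variable and state:
# (P(observation | pregnant), P(observation | not pregnant)), from the tables above.
HEAT = {"Yes": (0.05, 0.80), "No": (0.95, 0.20)}
LIKELIHOODS = {
    "Heat 1": HEAT,
    "Heat 2": HEAT,
    "Heat 3": HEAT,
    "Test": {"Positive": (0.95, 0.10), "Negative": (0.05, 0.90)},
}


def belief(observations, prior=0.85):
    """P(Pregnant = Yes | observations) for an inseminated sow."""
    yes, no = prior, 1.0 - prior
    for variable, state in observations.items():
        like_yes, like_no = LIKELIHOODS[variable][state]
        yes *= like_yes
        no *= like_no
    return yes / (yes + no)


print(round(belief({"Heat 1": "No", "Heat 2": "Yes"}), 2))                   # 0.63
print(round(belief({"Heat 1": "No", "Heat 2": "Yes", "Heat 3": "No"}), 2))   # 0.89
print(round(belief({"Heat 1": "No", "Heat 2": "Yes", "Heat 3": "No",
                    "Test": "Positive"}), 2))                                # 0.99
```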
For references to real-world applications in agriculture, environment and biology, you should consult the list provided.