Bayesian Networks in Python

AI/Data Science Digest
4 min read · Aug 23, 2021


Probability Refresher

  • Random variables are written with upper-case letters (e.g. B)
  • Values of a random variable are written with lower-case letters (e.g. b)
  • Conditional probability: P(a | b) = P(a,b)/P(b) (i.e. the probability of a given b = the joint probability of a and b divided by the probability of b)
  • Product rule: P(a,b) = P(a|b)P(b) (this is the joint probability, which measures the probability of the two events happening at the same time)
  • Bayes' theorem: P(b|a) = P(a|b)P(b)/P(a) (see the quick numeric check below)
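
As a quick sanity check, here is a tiny numeric example (the numbers are made up purely for illustration) that verifies the product rule and Bayes' theorem:

# Hypothetical numbers for two events a and b (illustration only)
p_b = 0.4          # P(b)
p_a_given_b = 0.5  # P(a | b)
p_a = 0.3          # P(a)

# Product rule: P(a, b) = P(a | b) * P(b)
p_ab = p_a_given_b * p_b                      # 0.2

# Conditional probability recovered from the joint: P(a | b) = P(a, b) / P(b)
assert abs(p_ab / p_b - p_a_given_b) < 1e-12

# Bayes' theorem: P(b | a) = P(a | b) * P(b) / P(a)
p_b_given_a = p_a_given_b * p_b / p_a         # ~0.667
print(p_ab, p_b_given_a)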

A Simple Bayesian Network

A Bayesian network with three random variables: winning the Lottery (L), Rain (R), and the ground being Wet (W). Notice that it is a directed acyclic graph. Each variable can take two values: True or False.

What we want to calculate is P(L, R, W).

We can write this probability as P(L) x P(R) x P(W | R).

  • Winning the lottery has no influence on rain or ground being wet
  • Ground being wet depends on Rain
For the larger four-variable network in the figure (variables R, C, W, S, where W depends on R and C, and S depends on W), the same idea gives:
P(R, C, W, S) = P(R) x P(C) x P(W | R, C) x P(S | W)
A real-world example of a Bayesian network, with variables such as exam Level e (easy, difficult) and IQ i (high, low). Factorizing the joint probability distribution: P(a, m, i, e, s) = P(e) P(i) P(m|e, i) P(s|i) P(a|m).
In a Bayesian network, the conditional distribution of each random variable depends only on its parents.
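
To make the factorization concrete, here is a minimal sketch (with made-up probabilities) that computes P(L, R, W) = P(L) P(R) P(W | R) for the lottery/rain/wet-ground network:

# Made-up probabilities, purely for illustration
P_L = {True: 0.01, False: 0.99}   # P(Lottery)
P_R = {True: 0.2, False: 0.8}     # P(Rain)
P_W_given_R = {                   # P(Wet | Rain)
    True:  {True: 0.9,  False: 0.1},
    False: {True: 0.05, False: 0.95},
}

def joint(l, r, w):
    """P(L=l, R=r, W=w) = P(L=l) * P(R=r) * P(W=w | R=r)."""
    return P_L[l] * P_R[r] * P_W_given_R[r][w]

# e.g. probability of no lottery win, rain, and wet ground
print(joint(False, True, True))   # 0.99 * 0.2 * 0.9 = 0.1782

# sanity check: the full joint distribution sums to 1
total = sum(joint(l, r, w)
            for l in (True, False)
            for r in (True, False)
            for w in (True, False))
assert abs(total - 1.0) < 1e-12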

Bayesian Network vs. Naive Bayes Classifier (NBC)

You have probably used an NBC to perform classification tasks. A key assumption in NBC is that all features are conditionally independent given the class, which is rarely the case in practice. BNs, on the other hand, can take dependencies between variables into account. Also, when the prior probabilities of variables change or new variables are introduced, BNs can be updated with the new knowledge.

Inference

  • Given a Bayesian network describing P(X, Y, Z), what is P(Y)?

Two methods:

  • Enumeration
  • Variable Elimination

Enumeration

What is the probability of rain given that it is slippery, P(r | s)? Sum the joint distribution over the hidden variables to get the marginal P(r, s), then divide by P(s). The number of terms grows exponentially in n, the number of hidden random variables; here there are only 2, but if there are many, this computation becomes really slow.
A computation tree for calculating the probability. In the tree above we are not taking advantage of the structure; one simple example is that we repeat the computation of P(c). How can we do better? Store computed values and look them up when they are needed again.
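
Below is a minimal sketch of plain inference by enumeration for the rain/slippery query, P(r | s) = (sum over c and w of P(r) P(c) P(w | r, c) P(s | w)) / P(s). The CPT numbers are made up for illustration:

from itertools import product

# Made-up CPTs for the network P(R, C, W, S) = P(R) P(C) P(W|R,C) P(S|W)
P_R = {True: 0.2, False: 0.8}
P_C = {True: 0.1, False: 0.9}
P_W = {  # P(W=True | R, C); P(W=False | ...) is the complement
    (True, True): 0.99, (True, False): 0.9,
    (False, True): 0.8, (False, False): 0.05,
}
P_S = {True: 0.7, False: 0.01}    # P(S=True | W)

def p_w(w, r, c):
    return P_W[(r, c)] if w else 1 - P_W[(r, c)]

def p_s(s, w):
    return P_S[w] if s else 1 - P_S[w]

def joint(r, c, w, s):
    return P_R[r] * P_C[c] * p_w(w, r, c) * p_s(s, w)

def p_r_given_s(s=True):
    # enumerate (sum out) the hidden variables c and w for each value of r
    scores = {r: sum(joint(r, c, w, s)
                     for c, w in product((True, False), repeat=2))
              for r in (True, False)}
    z = sum(scores.values())      # this is P(s)
    return {r: v / z for r, v in scores.items()}

print(p_r_given_s())              # P(R | S=True) under the made-up numbers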

Variable Elimination

Let's say we want to eliminate the terms that depend on c. There are two such terms: P(c) and P(w|c,r). We sum c out of their product to create a new factor f_c over the remaining variables, and then plug that factor back into the expression (see the sketch after the list below).
  • Every variable that is not an ancestor of a query variable or evidence variable is irrelevant to the query
  • The iterative procedure:
    Choose a variable to eliminate
    Sum terms relevant to variable, generate a new factor
    Repeat until no more variables to eliminate
  • Exact inference is #P-hard in general (in tree-structured BNs it takes time linear in the number of CPT entries)
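
Here is a minimal sketch of that single elimination step, reusing the made-up CPTs from the enumeration example above; f_c is the new factor produced by summing out c:

# Sum c out of the two factors that mention it: P(c) and P(w | c, r).
# The result is a new factor f_c(r, w) = sum over c of P(c) * P(w | c, r).
f_c = {(r, w): sum(P_C[c] * p_w(w, r, c) for c in (True, False))
       for r in (True, False)
       for w in (True, False)}

# Plug the factor back in: unnormalized P(r, s) = sum over w of P(r) * f_c(r, w) * P(s | w)
def p_r_given_s_ve(s=True):
    scores = {r: sum(P_R[r] * f_c[(r, w)] * p_s(s, w) for w in (True, False))
              for r in (True, False)}
    z = sum(scores.values())
    return {r: v / z for r, v in scores.items()}

print(p_r_given_s_ve())   # matches the enumeration result, with fewer repeated products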

Independence in Bayes Nets

Given we observe A, what variables are independent of C? B and D.
When you observe C and D, E is independent of A and B.
Markov Blanket of C. Given we observe the Markov blanket of C, C is conditionally independent of B (which is outside of the Markov blanket)

Modeling Monty Hall Problem

Monty Hall Problem: There are three doors, and the prize is hidden behind one of them. The guest selects a door. Monty (the moderator) then opens a door that is neither the guest's door nor the prize door. Now the guest is given the chance to either stick with the original choice or switch.

We will model the problem as a BN.

We have three variables: guest, prize and monty. Each can take three values (three doors): A, B, C

# for creating Bayesian Belief Networks (BBN)
from pybbn.graph.dag import Bbn
from pybbn.graph.edge import Edge, EdgeType
from pybbn.graph.jointree import EvidenceBuilder
from pybbn.graph.node import BbnNode
from pybbn.graph.variable import Variable
from pybbn.pptc.inferencecontroller import InferenceController

# the guest's initial door selection is completely random
guest = BbnNode(Variable(0, 'guest', ['A', 'B', 'C']), [1.0/3, 1.0/3, 1.0/3])
# the door the prize is behind is also completely random
prize = BbnNode(Variable(1, 'prize', ['A', 'B', 'C']), [1.0/3, 1.0/3, 1.0/3])
# monty's choice depends on both guest and prize
monty = BbnNode(Variable(2, 'monty', ['A', 'B', 'C']),
                [0, 0.5, 0.5,    # guest=A, prize=A
                 0, 0, 1,        # guest=A, prize=B
                 0, 1, 0,        # guest=A, prize=C
                 0, 0, 1,        # guest=B, prize=A
                 0.5, 0, 0.5,    # guest=B, prize=B
                 1, 0, 0,        # guest=B, prize=C
                 0, 1, 0,        # guest=C, prize=A
                 1, 0, 0,        # guest=C, prize=B
                 0.5, 0.5, 0])   # guest=C, prize=C
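
Each group of three numbers is Monty's distribution over doors A, B and C for one combination of the guest's door and the prize's door. For example, the first row (guest = A, prize = A) says Monty opens B or C with probability 0.5 each; he never opens the guest's door or the prize door. A quick sanity check that every row is a proper probability distribution:

# the CPT is a flat list of 9 rows x 3 probabilities; each row must sum to 1
cpt = [0, 0.5, 0.5,  0, 0, 1,  0, 1, 0,
       0, 0, 1,  0.5, 0, 0.5,  1, 0, 0,
       0, 1, 0,  1, 0, 0,  0.5, 0.5, 0]
for i in range(0, len(cpt), 3):
    assert abs(sum(cpt[i:i+3]) - 1.0) < 1e-12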

Above, we created the three nodes; now we are going to create the actual BN.

# Create Network
bbn = Bbn() \
.add_node(guest) \
.add_node(prize) \
.add_node(monty) \
.add_edge(Edge(guest, monty, EdgeType.DIRECTED)) \
.add_edge(Edge(prize, monty, EdgeType.DIRECTED))
# Convert the BBN to a join tree
join_tree = InferenceController.apply(bbn)

We use the following helper function to print marginal probabilities:

# Define a function for printing marginal probabilities
def print_probs():
    for node in join_tree.get_bbn_nodes():
        potential = join_tree.get_bbn_potential(node)
        print("Node:", node)
        print("Values:")
        print(potential)
        print('----------------')

What are the initial marginal probabilities (before any evidence is observed)?

print_probs()

The output is as follows:

Marginal probabilities of each node when no evidence is available. As you can see, each variable can take the value A, B, or C with equal probability.

We use the following helper function to update the beliefs.

# To add evidence of events that happened so that the probability
# distribution can be recalculated
def evidence(ev, nod, cat, val):
    ev = EvidenceBuilder() \
        .with_node(join_tree.get_bbn_node_by_name(nod)) \
        .with_evidence(cat, val) \
        .build()
    join_tree.set_observation(ev)

Let’s say that the guest selects A.

evidence('ev1', 'guest', 'A', 1.0)

# Print marginal probabilities
print_probs()
Updated marginal probabilities after the guest selects A. Notice that the prize node has not changed, as it is independent of the guest node. However, the marginal probabilities of monty change, since Monty can now only select B or C.

Now, let’s say Monty selects B.

evidence('ev2', 'monty', 'B', 1.0)

# Print marginal probabilities
print_probs()
Marginal probabilities after the guest selects A and Monty selects B. Notice that the marginal probabilities of the prize node have changed.

Given this evidence, we can see that P(prize behind A) = 1/3 and P(prize behind C) = 2/3. So the guest is better off switching!
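
As a cross-check, the same posterior follows directly from Bayes' theorem and Monty's CPT above: with guest = A, P(monty = B | prize = A) = 0.5, P(monty = B | prize = B) = 0 and P(monty = B | prize = C) = 1, so the posterior over the prize is proportional to (1/3 x 0.5, 0, 1/3 x 1), which normalizes to (1/3, 0, 2/3).

# Posterior over the prize given guest = A and monty = B, via Bayes' theorem
prior = {'A': 1/3, 'B': 1/3, 'C': 1/3}
likelihood = {'A': 0.5, 'B': 0.0, 'C': 1.0}   # P(monty=B | prize, guest=A), read off the CPT
unnorm = {door: prior[door] * likelihood[door] for door in prior}
z = sum(unnorm.values())
posterior = {door: unnorm[door] / z for door in unnorm}
print(posterior)   # {'A': 0.333..., 'B': 0.0, 'C': 0.666...}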

I hope your updated prior beliefs convinced you that BNs are not only cool but also help you make informed decisions as the world around you is constantly changing.
