Using pyDatalog to join data simply

In this post, I hope to persuade you to try pyDatalog as a simple means of performing “database” queries. pyDatalog is a logic programming module for Python, similar to Prolog, but closer to Datalog. I will assume that you have the latest version installed.

Question: I ate 3 pork chops, 4 lamb chops, 2 peas and 1 bean. How much meat and veg did I eat?

Pork and lamb are meat, peas and beans are veg.

Let’s import the pyDatalog module and create some basic terms:

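from pyDatalog import pyDatalog as pdl

# set up some basic terms; every name used below must be declared here
# (sum_ is pyDatalog's aggregate, listed so that the name is defined)
pdl.create_terms('what, eat, geat, sumg, qtys, sum_, F, G, Q')

# the facts of the problem
#        F      G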
+what('pork', 'meat')
+what('lamb', 'meat')
+what('peas', 'veg')
+what('beans', 'veg')

#      F     Q
+eat('pork', 3)
+eat('lamb', 4)
+eat('peas', 2)
+eat('beans', 1)

The create_terms() function declares the names that pyDatalog should treat as logic terms, so that they can be used directly in Python expressions. The lines beginning with “+” then assert the facts of our problem. For example, “+what('pork', 'meat')” says that pork is a meat, and “+eat('pork', 3)” means that I ate 3 pork chops.

All of the names we declared are of type “Term”, which you can verify as follows:

>>> print(type(what))
<class 'pyDatalog.pyParser.Term'>

Note that I have created three extra terms: F, G, and Q, which I will use to represent Food, Group and Quantity respectively. I don’t have to do things this way, but I think it’s an approach which helps me reason about the problem as a human.

So far, we have related foods to groups and foods to quantities. What we need to do is relate groups to quantities:

geat(G, Q) <= (what(F, G) & eat(F, Q))
# for_each=F keeps two foods with the same quantity from being merged
(sumg[G] == sum_(Q, for_each=F)) <= (what(F, G) & eat(F, Q))
qtys(G, Q) <= (sumg[G] == Q)

We can play with these terms, if we like. For instance, we can ask in what quantities we have eaten meat:

>>> print(geat('meat', Q))

Our real question, though, is “give me the groups and quantities satisfying qtys”:

>>> print(qtys(G, Q))
G    | Q
veg  | 3
meat | 7

So, we now have the answer to our original question. I ate 3 portions of veg, and 7 of meat.
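As a cross-check, the same join and grouped sum can be written in plain Python. This is only a rough sketch for comparison, not a description of how pyDatalog evaluates the rules; the names what_facts, eat_facts and group_totals are my own and are not part of pyDatalog.

```python
from collections import defaultdict

# The same facts as in the post, held in ordinary dictionaries
# (what_facts and eat_facts are illustrative names, not pyDatalog objects).
what_facts = {'pork': 'meat', 'lamb': 'meat', 'peas': 'veg', 'beans': 'veg'}
eat_facts = {'pork': 3, 'lamb': 4, 'peas': 2, 'beans': 1}

def group_totals(what_facts, eat_facts):
    """Join foods to groups on the food name, then sum quantities per group."""
    totals = defaultdict(int)
    for food, quantity in eat_facts.items():
        group = what_facts[food]     # the join: what(F, G) & eat(F, Q)
        totals[group] += quantity    # the aggregate: sum of Q within each G
    return dict(totals)

print(group_totals(what_facts, eat_facts))  # {'meat': 7, 'veg': 3}
```

The loop has to spell out how to walk the data, whereas the Datalog rules only state which relationships must hold.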

We may want to take this data and manipulate it further using regular Python. Fortunately, this is straightforward:

>>> result = qtys(G, Q).data
>>> print(result)
[('meat', 7), ('veg', 3)]
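Because the result is an ordinary list of tuples, the usual Python tools apply from here on. A small sketch, assuming the list has the shape printed above (it is copied in by hand so that the snippet runs without pyDatalog; totals and grand_total are illustrative names):

```python
# The list of tuples shown above, copied here so the snippet runs on its own.
result = [('meat', 7), ('veg', 3)]

totals = dict(result)                # group -> quantity
grand_total = sum(totals.values())   # everything eaten, across all groups

# Report each group's quantity and its share of the total.
for group, quantity in sorted(totals.items()):
    share = quantity / grand_total
    print(f"{group}: {quantity} ({share:.0%})")
```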

Happy pythoning!


  • pyDatalog home page – contains instructions on using the package
  • PyPI page – where you can install it from, should you need a link
  • Code for the example above, available as a gist

About mcturra2000

Computer programmer living in Scotland.