What is graph theory and why should you care?

0 13

A graphic visualization from the  Opte Project , a provisional mapping of the Internet

Graph theory can seem like an intimidating and abstract subject to you, so why should you even spend your time reading an article about it? However, while this may not seem very applicable, there is actually an abundance of useful and important applications of graph theory! In this article, I will try to briefly explain what some of these apps are. In doing so, I will do my best to convince you that having at least some basic knowledge on this topic can be helpful in solving some interesting problems you might encounter.

In this article, I will go through a concrete example showing how a route planning / optimization task can be formulated and solved using graph theory. Specifically, I will be considering a large warehouse with thousands of different items in various locations / pickup points. The challenge here is, given a list of items, which path do you have to take through the warehouse to pick up all the items, but at the same time minimize the total distance traveled? For those of you who are familiar with these kinds of issues, it sounds a lot like the famous traveling salesperson  problem . (A well-known problem in  combinatorial optimization , important in  theoretical computer science  and operational research ).

As you may have understood, the aim of this article is not to give a complete introduction to graph theory (which would be a huge task). Through a real world example, I will rather try to convince you that knowing at least some  basics  of graph theory can be very useful!

I'll start with a brief historical introduction to the field of graph theory and highlight the importance and wide range of useful applications in many very different fields. Following this more general introduction, I will then focus on the warehouse optimization example discussed above.

The history of graph theory

The basic idea of ​​graphics was first introduced in the 18th century by the Swiss mathematician Leonhard Euler , one of the most prominent mathematicians of the 18th century (and of all time, really). His work on the famous " Problem of the seven Königsberg bridges ", are generally cited as the origin of  graph theory .

The town of Königsberg in Prussia (now Kaliningrad, Russia) was located on both sides of the Pregel River, and included two large islands - Kneiphof and Lomse - which were connected to each other or to both mainland parts of the city, by seven bridges (as shown in the figure below left). The problem was to design a walk through the city that would cross each of these bridges once and only once.

Euler, recognizing that the relevant constraints were the four bodies of land and the seven bridges, drew the first known visual representation of a modern graph. A modern graphic, as shown in the image at the bottom right, is represented by a set of points, called  v ertices or nodes, connected by a set of lines called edges.

This abstraction of a concrete problem concerning a city and bridges, etc. to a graph makes the problem mathematically treatable, because this abstract representation includes only the information important to solve the problem. Euler actually proved that this specific problem has no solution. However, the difficulty he encountered was the  development of a suitable analytical technique , and subsequent tests which established this assertion with mathematical rigor. From there, the branch of mathematics known as graph theory lay dormant for decades. In modern times, however, its applications are finally exploding.

Introduction to graph theory

As mentioned earlier, I do not aim to give a complete introduction to graph theory. The next section still contains some of the basics when it comes to different types of charts etc., which is relevant to the example we will discuss about optimizing paths later.

Graph theory  is ultimately the study of  relationships . Given a set of nodes and connections, which can sum up everything from city layouts to computer data, graph theory provides a useful tool for quantifying and simplifying the many moving parts of dynamic systems. Studying graphics through a framework provides answers to many layout, networking, optimization, mapping, and operation issues.

Graphics can be used to model many types of relationships and processes in physical, biological, social, and information systems, and have a wide range of useful applications such as e.g.

  1. Find communities in networks, such as social media (friend / connection recommendations), or in recent days for a possible spread of COVID19 in the community through contacts.

  2. Ranking / ranking of hyperlinks in search engines.

  3. GPS / Google maps to find the shortest way home.

  4. Study of molecules and atoms in chemistry.

  5. DNA sequencing

  6. Computer network security

  7. ….. and much more….

A simple example of a 6-node graph

A slightly more complex social media network. Credit:  Martin Grandjean  Wikimedia

As mentioned, there are several types of charts that describe different types of issues (and their constraints). A nice presentation of different types of graphics can also be found in a  previous article  by  Kelvin José,  and the section below is a subset of that article.

Types of charts

There are different types of graphs available and we need to make sure that we understand the type of graph we are working with when we are programmatically solving a problem that includes graphs.

  • Undirected graphics

As the name suggests, there will be no direction specified between nodes. Thus, an edge from node A to B would be  identical  to the edge from B to A.

In the graphic above, each node could represent different cities and the edges show the two-way roads.

  • Directed Graphics (DiGraphs)

Unlike unoriented graphics, oriented graphics have an orientation or  direction  between different nodes. This means that if you have an edge from node A to B, you can only move from A to B.

As in the previous example, if we consider the nodes as cities, we have a direction of city 1 to 2. This means that you can drive from city 1 to city 2 but not back to city 1, because there is no direction back to city 1 from 2. But if we look closely at the graph, we can see two-way cities. For example, cities 3 and 4 have directions on both sides.

  • Weighted charts

Many graphs can have edges containing associated weight to represent real world implication such as cost, distance, quantity, etc.

Weighted charts can be a directed graph or not. The one we have in this example is an undirected weighted graph. The cost (or distance) between the green node and the orange node (and vice versa) is 3. As in our previous example, if you want to travel between two cities, say green and orange city, we will have to travel 3 miles. These metrics are self-defined and can be modified depending on the situation. For a more elaborate example, consider that you need to go to the pink city of green. If you look at the city graph, we cannot find any direct road or border between the two cities. So what we can do is travel via another city. The most promising routes would be from green to pink via orange and blue. If the weights are costs between cities,

There can be multiple weights associated with each edge, including distance, travel time, or monetary cost. These weighted charts are commonly used to program GPS and travel planning search engines that compare flight times and costs.

Graph theory → Route optimization

Having (hopefully) convinced that graph theory is worth knowing, now is the time to focus on our route planning example when picking items from our warehouse.


The challenge here is that, given an input 'pick list', we should find the shortest route that passes all the pickup points, but also respects the restrictions on where it is possible / allowed to go. drive. The assumptions and constraints here are that crossing between warehouse lanes is only allowed at marked “turning points”. In addition, the direction of travel must follow the legal driving direction specified for each lane.


This problem can be formulated as an optimization problem in graph theory. All pickup points in the warehouse form a “node” in the graph, where the edges represent the allowed lanes / lanes and the distances between the nodes. To introduce the problem in a more formal way, let's start with a simplified example.

The graphic below shows 2 lanes with 5 shelves / collection points per lane. All tablets are shown here as a node in the graph, with an address between 1 and 10. The arrows indicate the permitted driving direction, where the double arrows indicate that you can drive in both directions. Pretty simple, right?

Being able to represent the allowed driving routes in the form of a graph, means that we can use mathematical techniques known from graph theory to find the optimal "driving route" between nodes (i.e. say stock shelves in our warehouse).

The above example graph can be described mathematically by a " adjacency matrix ". The adjacency matrix on the right in the figure below is therefore a representation of our “warehouse graph”, which indicates all the driving routes allowed between the different nodes.

  • 1 Example:  You are allowed to travel from node 2 → 3, but not from node 3 → 2. This is indicated by the “1” in the adjacency matrix to the right.

  • 2 Example:  You are allowed to go both from node 8 → 3 and from 3 → 8, again indicated by the "1's" in the adjacency matrix (which in this case is symmetrical with respect to the direction of travel ).

Back to our warehouse problem:

A real warehouse is of course larger and more complex than the example above. However, the fundamentals of representing the problem through a graph remain the same. To make the real problem a bit simpler (and more visually relevant to this article), I reduced the total number of shelves / pickup points (roughly every 50 shelves included, marked with black squares in the figure below). below). All pickup points are assigned an address (“node number”) from 1 to 74. Other relevant constraints mentioned previously, such as driving directions allowed in each of the lanes, as well as “turn points” and shortcuts allowed between lanes are also shown in the figure.

Graphic representation of our simplified warehouse

The next step then consists in representing this graph in the form of a contiguity matrix. Since we are here interested in finding both the optimal route and the total distance, we also need to include the driving distance between the different nodes in the matrix.

Adequacy matrix for the "warehouse graph"

This matrix indicates all the constraints concerning at the same time the authorized direction of displacement, which “shortcuts” are authorized, all the other restrictions as well as the distance traveled between the nodes (illustrated by the color). By way of example, the "shortcut" between nodes 21 and 41 shown in the graphic representation can be clearly identified also in the adjacency matrix. The “white areas” of the matrix represent the unauthorized paths, indicated by an “infinite” distance between these nodes.

From graphical representation to path optimization

Just having an abstract representation of our warehouse as a graph, of course, does not solve our real problem. The idea is rather that through this graphical representation, we can now use the mathematical framework and the algorithms of graph theory to solve it!

Since graph optimization is a well-known field in mathematics, there are several methods and algorithms to solve this type of problem. In this example case, I have based the solution on the " Floyd-Warshall Algorithm ", which is a well known  algorithm  for finding  shortest paths  in a  weighted graph . A single run of the algorithm will find the lengths (weights added) of the shortest paths between all pairs of nodes.Although it does not return the details of the paths themselves, it is possible to reconstruct the paths with simple modifications to the algorithm.

If you give this algorithm as input an "order list" where you cycle through a list of items that you want to pick up, then you should be able to get the optimal route that minimizes the total distance traveled to collect all the items. list items.

Example:  Let's start by visualizing the results of a (short) picklist as follows: Start with node '0', pick up items at locations / nodes 15, 45, 58 and 73 (where these locations are shown in the figure below). The algorithm finds the shortest possible route between these points by calculating the "distance matrix",  D , which can then be used to determine the total distance traveled between all locations / nodes in the pick list.

  • Step 1:  D [0] [15] → 90 m

  • Step 2:  D [15] [45] → 52 m

  • Step 3:  D [45] [58] → 34 m

  • Step 4:  D [58] [73] → 92 m

Total distance = 268m

Optimized driving route from the pick list

Having tested several input “pick lists” and checking the suggested driving routes and the calculated distance, the algorithm was able to find the optimal route in all cases. The algorithm respects all the constraints imposed, such as the authorized direction of travel, and uses all authorized “shortcuts” to minimize the total distance.

From path optimization to useful information

As the example above shows, we have developed an optimization algorithm that calculates the optimal driving route through all the points in a list of picking orders (for a simplified version of the warehouse). By providing a list of pickup orders as input, one can thus relatively easily calculate statistics on typical mileage per. preparation order. These statistics can then also be filtered on various information such as item type, customer, date, etc. In the next section, I have therefore chosen some examples on how one can extract interesting statistics from such a path optimization tool.

By doing this, I first generated 10,000 1 picking order lists where the number of items per list varies from 30 to 3 items, located at random pickup points in the warehouse (address 74 to XNUMX in figure above). By performing the path optimization procedure on all these pick lists, we can then extract some interesting statistics.

1 Example: Calculate the mileage based on the number of units per. list of preparation orders. Here you would naturally assume that the total mileage increases with the number of items you have to choose. But, at some level, it will start to fade. This is because we eventually have to stop in all the halls of the warehouse to retrieve the goods, which then prevents us from using clever "shortcuts" to minimize the total distance traveled. This trend can be illustrated in the figure below to the left, which shows that for more than about 15-20 units per pick order, adding additional items doesn't add much to the total mileage (as you have to cross all the halls of the warehouse anyway). Note that the numbers show a "density graph" of the typical mileage distribution by. list of preparation orders.

Another interesting statistic, which shows the same trend, is the distribution of the distance traveled by element selected in the figure on the right. Here we see that for pick lists with few items, the typical mileage per. the item is relatively high (with great variance, depending on how "lucky" we are with some items being located in the same lane, etc.). For pick lists with multiple items, the mileage per. the article gradually decreases. This type of statistic can therefore be interesting to study in more detail, in order to optimize the number of items that each picking order list must contain in order to minimize the mileage per item picked

Estimate of the distance traveled per list / item compared to the number of items per list.

2 Example:  Here I have used real world data which also contains additional information in the form of a customer id (here shown for only two customers). We can then take a closer look at the breakdown in mileage by. list of preparation orders for the two clients. For example, do you typically have to travel longer distances to pick one customer's goods over another? And, should you charge this customer extra for that extra cost?

The figure below on the left shows the breakdown in mileage for “Customer 1” and “Customer 2” respectively. One of the things we can deduce from this is that for customer 2, most picking order lists have a significantly shorter driving distance than customer 1. This can also be shown by calculating the average mileage per. list of orders for the two customers (figure on the right).

This type of information can for example be used to implement pricing models where the price of the product to the customer is also based on the mileage per order. For customers whose order involves more driving (and therefore also more time and higher costs), you may consider charging extra compared to orders involving short driving distances.


In the end, I hope I have convinced you that graph theory is not just an abstract mathematical concept, but actually has many useful and interesting applications! Hopefully, the above examples will be helpful to some of you in solving similar problems later, or at least satisfying some of your curiosity about graph theory and some of its applications.

$ 0.00