Introduction
Motivating Communication Complexity
As networks and data grow in complexity and size, it will become increasingly difficult for individual machines to compute functions on such data. For example, suppose we wish to fly from San Diego to Santorini. There is no single airline that has both flights out of San Diego or into Santorini. Therefore, we will need to consult the flight database of various airlines to construct such a route. Do we need to ask every airline provider to communicate their entire schedule of flights to us? Are there techniques that we can employ to reduce the amount of flights we have to search through?
Bounding the amount of communication necessary to solve such problems is the primary objective of communication complexity. By studying communication complexity, we find lower and upper bounds on the minimum amount of communication necessary to solve such problems. In practice, these bounds can inform us whether certain algorithms are optimal, or if it is possible to optimize them further.
Communication Complexity Basics
Communication complexity studies the communication requirements to solve problems where inputs are distributed among at least two parties, each with unbounded computing power. The two-party problem is typically stated as follows:
There are two players, Alice and Bob, who respectively have an -bit string, and , that may or may not be the same. The goal is to compute a boolean function, , that depends on both and . It suffices for one player to find the answer , since communicating this to the other player can always be done in constant communication.
Under the two party model, it is always possible to compute by sending all of to , since we assume each party has unbounded computing power. This approach of sending the entire input from one party to another is what is called the trivial solution. We say problems are maximally hard or maximally complicated when we cannot do asymptotically better than the trivial solution.
We call such problems maximally hard because they have a deterministic communication complexity of bits. We call it deterministic as we do not allow error. Thus in maximally hard problems, Alice cannot do asymptotically better than sending the bits of to Bob, and likewise for Bob. An example of a maximally hard problem is 2-player equality ():
Claim. Let . That is, and are -bit strings. Define equality as if , and if . The deterministic communication complexity of two-party equality, denoted , is .
Proof. We can represent all possible -bit inputs to the two parties using a boolean matrix , whose columns are indexed by and whose rows by . Then let if and otherwise.
Therefore, is the identity matrix. It is a well-known fact that the deterministic communication complexity of a problem is at least the base-2 logarithm of the rank of its boolean matrix. Since equality’s matrix is the identity matrix with rank , we have .
Communication matrix for the equality function on two-bit inputs. The entry at column and row is when and otherwise.
As we will soon see, problems like equality are surprisingly useful for proving communication lower bounds for other, more complicated problems. We can do this by embedding equality into a more complex problem, such that an efficient solution to the more complicated problem can also efficiently solve equality. More generally, we say problem is at least as hard as problem if we can embed in . The communication complexity of is then at least the communication complexity of .
The Graph Traversal Problem
In this report, we study the problem of finding a path in a graph whose vertices may be owned by various players. This is an abstract version of the motivating problem in the section Motivating Communication Complexity, which we choose to work in so that any bounds we prove will be applicable to a wide range of problems.
Informally, the graph traversal problem presents a graph, where players may own vertices. There is a start and end vertex. The players wish to know whether there exists a path from the start to the end through only vertices owned by some player. Each player knows what the graph looks like, and which vertices they own though not which vertices are owned by other players.
Formally, the graph traversal problem for two players, denoted , is defined as follows:
Definition. Let be an unweighted, undirected graph, where is the set of all nodes and is the set of edges. Alice and Bob respectively own a vertex subset of the graph, as well as all edges and similarly for . These are all edges connected to some vertex in or , respectively. Thus each player respectively owns subgraphs and of . Note that multiple players can own the same vertices and edges, and some vertices and edges may have no owner whatsoever.
Let the graph have vertices. Then each player’s subgraph can be represented by -bit strings and , where at index indicates the player owns vertex , and means the player does not own . This definition of and means that we can use and interchangeably with and respectively. Now, given a fixed , start vertex , and end vertex , an instance of is defined as follows: , where iff there exists a path from to through only vertices and edges in , and otherwise.
The following figures illustrate what may look like.