Logo

Introduction

Motivating Communication Complexity

As networks and data grow in complexity and size, it will become increasingly difficult for individual machines to compute functions on such data. For example, suppose we wish to fly from San Diego to Santorini. There is no single airline that has both flights out of San Diego or into Santorini. Therefore, we will need to consult the flight database of various airlines to construct such a route. Do we need to ask every airline provider to communicate their entire schedule of flights to us? Are there techniques that we can employ to reduce the amount of flights we have to search through?

Bounding the amount of communication necessary to solve such problems is the primary objective of communication complexity. By studying communication complexity, we find lower and upper bounds on the minimum amount of communication necessary to solve such problems. In practice, these bounds can inform us whether certain algorithms are optimal, or if it is possible to optimize them further.

Communication Complexity Basics

Communication complexity studies the communication requirements to solve problems where inputs are distributed among at least two parties, each with unbounded computing power. The two-party problem is typically stated as follows:

There are two players, Alice and Bob, who respectively have an nn-bit string, aa and bb, that may or may not be the same. The goal is to compute a boolean function, f(a,b)f(a, b), that depends on both aa and bb. It suffices for one player to find the answer f(a,b)f(a,b), since communicating this to the other player can always be done in constant communication.

Under the two party model, it is always possible to compute f(a,b)f(a,b) by sending all of aa to bb, since we assume each party has unbounded computing power. This approach of sending the entire input from one party to another is what is called the trivial solution. We say problems are maximally hard or maximally complicated when we cannot do asymptotically better than the trivial solution.

We call such problems maximally hard because they have a deterministic communication complexity of Θ(n)\Theta(n) bits. We call it deterministic as we do not allow error. Thus in maximally hard problems, Alice cannot do asymptotically better than sending the nn bits of aa to Bob, and likewise for Bob. An example of a maximally hard problem is 2-player equality (EQ\mathsf{EQ}):

Claim. Let a,b{0,1}na, b \in \{0,1\}^n. That is, aa and bb are nn-bit strings. Define equality as EQ(a,b)=1\mathsf{EQ}(a,b)=1 if a=ba=b, and EQ(a,b)=0\mathsf{EQ}(a,b)=0 if aba \ne b. The deterministic communication complexity of two-party equality, denoted D(EQ)D(\mathsf{EQ}), is Ω(n)\Omega(n).

Proof. We can represent all possible nn-bit inputs to the two parties using a 2n×2n2^n \times 2^n boolean matrix MM, whose columns are indexed by aa and whose rows by bb. Then let M(b,a)=1M(b,a)=1 if a=ba=b and M(b,a)=0M(b,a)=0 otherwise.

Therefore, MM is the identity matrix. It is a well-known fact that the deterministic communication complexity of a problem is at least the base-2 logarithm of the rank of its boolean matrix. Since equality’s matrix MM is the 2n×2n2^n \times 2^n identity matrix with rank 2n2^n, we have D(EQ)=Ω(log22n)=Ω(n)D(\mathsf{EQ})=\Omega(\log_2 2^n)=\Omega(n).

An example communication matrix for the equality function.

Communication matrix for the equality function on two-bit inputs. The entry at column aa and row bb is 11 when a=ba=b and 00 otherwise.

As we will soon see, problems like equality are surprisingly useful for proving communication lower bounds for other, more complicated problems. We can do this by embedding equality into a more complex problem, such that an efficient solution to the more complicated problem can also efficiently solve equality. More generally, we say problem AA is at least as hard as problem BB if we can embed BB in AA. The communication complexity of AA is then at least the communication complexity of BB.

The Graph Traversal Problem

In this report, we study the problem of finding a path in a graph whose vertices may be owned by various players. This is an abstract version of the motivating problem in the section Motivating Communication Complexity, which we choose to work in so that any bounds we prove will be applicable to a wide range of problems.

Informally, the graph traversal problem presents a graph, where players may own vertices. There is a start and end vertex. The players wish to know whether there exists a path from the start to the end through only vertices owned by some player. Each player knows what the graph looks like, and which vertices they own though not which vertices are owned by other players.

Formally, the graph traversal problem for two players, denoted TRAV2\mathsf{TRAV}_2, is defined as follows:

Definition. Let G=(V,E)G = (V,E) be an unweighted, undirected graph, where VV is the set of all nodes and EE is the set of edges. Alice and Bob respectively own a vertex subset VA,VBVV_A, V_B \subseteq V of the graph, as well as all edges EA={(u,v)E{u,v}VA}E_A = \{(u,v) \in E \mid \{u, v\} \cap V_A \neq \emptyset\} and similarly for EBE_B. These are all edges connected to some vertex in VAV_A or VBV_B, respectively. Thus each player respectively owns subgraphs GA=(VA,EA)G_A = (V_A, E_A) and GB=(VB,EB)G_B = (V_B, E_B) of GG. Note that multiple players can own the same vertices and edges, and some vertices and edges may have no owner whatsoever.

Let the graph GG have V=n|V|=n vertices. Then each player’s subgraph can be represented by nn-bit strings xx and yy, where 11 at index jj indicates the player owns vertex jj, and 00 means the player does not own jj. This definition of xx and yy means that we can use xx and yy interchangeably with GAG_A and GBG_B respectively. Now, given a fixed GG, start vertex vv, and end vertex ww, an instance of TRAV2\mathsf{TRAV}_2 is defined as follows: TRAV2:{0,1}n×{0,1}n{0,1}\mathsf{TRAV}_2 : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}, where TRAV2(x,y)=1\mathsf{TRAV}_2(x,y)=1 iff there exists a path from vv to ww through only vertices and edges in GAGBG_A \cup G_B, and TRAV2(x,y)=0\mathsf{TRAV}_2(x,y)=0 otherwise.

The following figures illustrate what TRAV2\mathsf{TRAV}_2 may look like.

Example where a valid path exists for TRAV2 Example where no valid path exists for TRAV2