Homework 10: Max flow programming
- This homework is due at 11:59.59pm on Wednesday, May 4, 2022 (it has been extended from Apr 27th). Please note that no late days can be used for this assignment.
Description and Deliverables
In this assignment, you will use your understanding of the max flow algorithms that we discussed in class to solve some programming problems. Start early because the semester end becomes hectic and we cannot extend deadlines in order to complete grading on time.
To receive full credit for this project, you will turn in the following files:
-
A file named
flows.py
that contains your solutions to this assignment. The solution should contain two functions,gold
androunding
that solves the two parts of this assignment. -
The autograder will run the example files for you, but like H7, your final grade will be determined by hidden test cases that we will set after the assignment is submitted.
You will need to program this assignment in Python for the autograder to work. We will provide several resources, and several starter files for you to begin your implementation. In particular, you can begin this assignment by downloading the following files:
Computing a max flow
The first task of this assignment is to write a routine that computes the maxflow of a graph. You will use this routine in both of the remaining parts of this assignment. Your solution should be based on the simplegraphs.py
library that we provide. Recall that graphs in that library allow edges to have arbitrary labels associated with them; here you can associate the edge capacity (instead of weight as we did in the previous assignment).
We have provided some basic code to get started.
-
Recall that the EK2 approach uses a breadth-first search to find augmenting paths. The
simplegraphs.py
library that we provide contains such a methodBFS(G,s)
which returns a dictionary of distances and parent pointers. You can use these distances and parent pointers in order to find short paths from the source to the sink. -
After determining if an augmenting path exists, the goal is to push flow across this path, while also adding the residual edges. Whenever you push flow along the edge $(x,y)$, you should add the same amount of capacity along the edge $(y,x)$. Similarly, if after pushing a flow, an edge has 0 remaining capacity, you should remove it from the graph so that the BFS routine does not use it. You can use the dictionary routines of python in this assignment to (a) check whether an edge exists, (b) check whether the remaining capacity is 0 and use
pop
to remove it if so, (c) add edges to teh graph, and increment their capacities as needed. -
Finally, you can compute a flow by simply comparing the initial edge capacity of each edge in the graph, with the edge capacities that remain in the final residual graph. This is why it is a good idea to copy the input graph in the first step of your maxflow solution.
-
Our solution consists of 3 functions,
maxflow
,augment
,find_path
and require 38 lines of Python code after removing comments.
Gold Bullion face
In the ruins of Pompeii one finds the House of the Tragic Poet which has a famous mosaic floor proclaiming visitors to ``Beware of the Dog." In Boston, a less tragic but wealthier poet has commissioned a mosaic using 1kg bars of solid gold, specifically the type CreditSuisse mints in the dimension 80mmx40mm.
The input will consist of a file that gives the coordinates of the squares on an imaginary grid that should be covered by goldbars. These coordinates will be in the range [1,2000] in both the x and y directions. The output should consists of a list of tiles, one per line. Each tile should be designated by one of its coordinates, followed by ‘–>’ followed by the second coordinate. The tiles can be presented in any order, as our autograder will ensure that the entire input is covered. If the input cannot be tiled, then your program should output ‘No solution exists’.
An example input file and ouptut:
$ cat gold.3.txt
2 3
2 4
2 5
3 5
4 5
4 6
4 7
4 8
4 9
5 5
$ python3 flows.py gold gold.3.txt
2 4 --> 2 3
3 5 --> 2 5
4 6 --> 4 7
4 8 --> 4 9
5 5 --> 4 5
Design an algorithm that takes such a specification as input and determines if the indicated squares in the design can be entirely covered with gold bullion bars. Note that gold bars can never be split in half! Each gold bar covers exactly two of the squares. As an example, consider the gold bars on the left, and the pixel art of a husky, and the tiling in gold bars. You can run this example using the husky.txt
file in the examples.
Ideas for solution
- We are going to make a bipartite graph out of the design and then check whether it has a maximal matching. If it has a maximal matching, then the design can be tiled. To construct the bipartite graph, imagine the design is inscribed in a checkerboard (i.e. the design fits into some $k \times k$ region). The left side of this bipartite graph will consist of all the white squares of the checkerboard that are part of the design. The right side consists of all of the black squares of the checkerboard that are part of the design. Draw an edge from every white square to the black squares that are adjacent. This is the bipartite graph.
Notice that each gold bar is going to cover one white square and one black square. So we are looking for a matching that pairs each white box of our graph with a black box—that pairing represents the placement of a gold bar. If the number of black and white squares is not equal in the $(L,R,E)$ graph, then no solution can exist because no matter how it is placed, each bar covers exactly one black and one white square. If they are equal, then we compute a bipartite matching. If it is a full matching, then for the same reason, the design can be tiled. Otherwise, some square will be left uncovered in every tiling. Conversely, if there is a tiling, then the tiling can easily be mapped to a bipartite matching in which each white square is matched with the black square that its tile covers.
As an example, consider the following overlay of a white-red checkerboard on a tiling of Bob. Every light-gray square (i.e., where a white square from the checkerboard overlays a black square in the image) corresponds to a left node in the bipartite graph, and every dark-red square (i.e., where a red square from the checkerboard overlays a black square in the image) corresponds to a right node in the bipartite graph.
-
To show this formally, we need to prove that the image can be tiled if and only if the $(L,R,E)$ instance has a matching of the appropriate size. Think about how you would go about proving that this statement is true, although that is not required for this assignment.
-
My solution takes roughly 25 lines to setup the graph, and 10 lines to compute and print a solution.
-
The simplegraphs library allows nodes to be named anything, including strings. You can consider using a string representation of a coordinate to make the graph creation process easier.
Matrix Rounding
Suppose we are given a large matrix $A[1 \ldots r][1 \ldots c]$ of population data (each entry is a non-negative number). We want to publish matrix $A$, but need to simplify it for the public by rounding the entries into tens by replacing each entry $x$ in $A$ with either $\lceil x/10 \rceil$ or $\lfloor x/10 \rfloor$ (the same trick can be used to round to thousands, etc). However, the matrix represents important statistical data, and we do not want to change the sums of entries in any row or column. For example:
$$ \left[ \begin{array}{ccc} 12 & 34 & 24 \\\ 39 & 40 & 21 \\\ 79 & 16 & 5 \end{array} \right] \rightarrow \left[ \begin{array}{ccc} 10 & 40 & 20 \\\ 40 & 40 & 20 \\\ 80 & 10 & 10 \end{array} \right] $$
Design an algorithm that takes as input a matrix, and outputs a rounded matrix or prints that the matrix cannot be rounded. The input will consist of the $r$ rows of the matrix on a separate line. Each row consists of the $c$ values that are separated by spaces. The output will consist of the printed matrix in the same form, one row per line, each element separated by spaces. Note that a given input matrix can have several valid roundings, we will check to make sure that yours is OK. For example, this is how we will run your program on the input above:
$ cat 3.3.txt
12 34 24
39 40 21
79 16 5
$ python3 flows.py rounding 3.3.txt
20 30 20
40 40 20
70 20 10
Hints
-
First, the algorithm should check if any row or column sums to a value that is not divisible by 10. If so, then there is no solution since it is not possible to round without changing that particular row/column sum and the program should output ‘No solution exists’. If all rows and columns have sums that are divisible by 10, then the algorithm proceeds to use max flow to check if rounding is possible.
-
Construct a graph similar to the bipartite matching problem. Create vertices $r_1, \ldots, r_m$, each representing a row of $A$ and vertices $c_1, \ldots, c_n$, each representing a column of $A$. We have a directed edge from $r_i$ to $c_j$ if and only if $A[i][j]$ is not divisible by 10. The capacity of the edge (if it exists) is $c(r_i, c_j) = 10$.
Additionally we add a source vertex $s$ and a sink vertex $t$. There is an edge from $s$ to every row vertex $r_i$ with capacity $c(s, r_i) = \sum_{j=1}^n (A[i][j] \bmod 10)$ i.e. sum of the remainders of entries in row $i$. There is also an edge from every column vertex $c_j$ to $t$ with capacity $c(c_j, t) = \sum_{i=1}^m (A[i][j] \bmod 10)$ i.e. sum of the remainders of entries in column $j$.
The algorithm computes the maximum flow on this network. If the maximum flow saturates all edges from $s$ then it declares rounding is possible. Otherwise, the algorithm declares rounding is not possible.
- How can you output an actual rounding? You will have to inspect the edges of the flow to determine how to round each matrix element.
The network constructed above has $m+n+2$ vertices and $O(mn)$ edges. The maximum flow value is no more than $mn$ (the maximum total capacities from $s$). Thus, the running time of the Ford-Fulkerson algorithm is $O(m^2 n^2)$.
- My solution to
rounding
takes 35 lines, and I have a few helper functions that span from 1 line to 10 lines.