I recently learned that a common interview question for software engineering positions is the Skyline problem, and people struggle a lot to solve it. It also took me a few tries before I had the final solution, but once I got the idea the problem became a lot easier than I thought it would be. Below I will present and explain the different approaches one can take to solve the problem. I am also going to give running time analysis for each of them. Before we go into details, let's start with the problem statement first.

### Problem statement

We have to design a program which helps drawing the skyline of a two-dimensional city given the locations of the rectangular buildings in the city. All of the buildings are built on a flat ground (that is they share a common bottom) and each building B_{i} is represented by a triplet of (l_{i}, r_{i}, h_{i}) where l_{i} and r_{i} are the left and right coordinates, respectively, of the ith building, and h_{i} is the height of the building. In the diagram below there are 8 buildings, represented from left to right by the triplets (1, 5, 11), (2, 7, 6), (3, 9, 13), (12, 16, 7), (14, 25, 3), (19, 22, 18), (23, 29, 13) and (24, 28, 4).

### Input

The input of the program is a sequence of building triplets. In some versions of the problem further restrictions are applied on the input, like all coordinates are non-negative integers less than 10,000, there is at least 1 and at most 5,000 buildings in the input sequence, and the triplets are sorted by l_{i} (the left x-coordinate of the building) in ascending order. Sometimes the input source is a file and the format of this file is also described, but we're going to ignore it because pre-processing the input sequence of buildings is completely irrelevant to the problem.

### Output

The output is a collection of points which describe the path of the skyline. In some versions of the problem this collection of points is represented by a sequence of numbers p_{1}, p_{2}, ..., p_{n}, such that the point p_{i} represents a horizontal line drawn at height p_{i} if i is even, and it represents a vertical line drawn at position p_{i} if i is odd. In our case the collection of points will be a sequence of P_{1}, P_{2}, ..., P_{n} pairs of (x_{i}, h_{i}) where P_{i}(x_{i}, h_{i}) represents the h_{i} height of the skyline at position x_{i}. Note that technically it is the exact same representation as the previous one, except having pairs of numbers representing the height at a given position is more meaningful than a parity-based interpretation of numbers. In the diagram above the skyline is drawn with a yellow line around the gray buildings and it is represented by the sequence of position-height pairs (1, 11), (3, 13), (9, 0), (12, 7), (16, 3), (19, 18), (22, 3), (23, 13) and (29, 0).

### The brute-force solution

Let's simplify the problem by not ignoring all the restrictions mentioned in the Input paragraph. The most important piece of information is that we know that the left and right coordinates of each and every building are non-negative integers less than 10,000. Now why is this important? Because we can assign a height-value to every distinct x_{i} coordinate where i is between 0 and 9,999. How do we do that? It's simple.

- Allocate an array for 10,000 elements and initialize all of the elements to 0. Let's call this array H.
- Iterate over all of the buildings and for every B
_{i}building iterate on the range of [l_{i}..r_{i}) where l_{i}is the left, r_{i}is the right coordinate of the building B_{i}. - For every x
_{j}element of this range check if h_{i}> H[x_{j}], that is if building B_{i}is taller than the current height-value at position x_{j}. If so, replace H[x_{j}] with h_{i}.

Notice that the result is not an *optimal skyline* (a skyline with minimum number of position-height pairs) because it will likely contain redundant information. What points in the skyline are considered redundant? Those points where the height didn't change compared to the preceding point's height. More formally, given the sequence of P_{1}, P_{2}, ..., P_{i-1}, P_{i}, ..., P_{n} pairs of (x, h), the point P_{i} can be left out from the skyline if h_{i} = h_{i-1}. To get an optimal skyline we need to map only those i indexes where the height changes. Here's a working implementation in Scala which - instead of allocating an array for 10,000 elements - creates an array that is exactly as big as needed to store all possible building coordinates.

### The implementation

First we're going to define some data structures to represent a **building**, a **point** in the skyline and the **skyline**:

```
1
2
3
4
5
6
7
8
9
```

```
case class Building(left: Int, right: Int, height: Int) {
require(left <= right, "left must not be greater than right")
require(height >= 0, "height must be non-negative")
}
case class Point(x: Int, height: Int)
class Skyline private (val points: List[Point])
```

Now we can implement our first solution:

```
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
```

```
object Skyline {
def apply(buildings: Building*): Skyline = if (buildings.length == 0) {
new Skyline(List.empty)
} else {
val (min, max) = buildings.foldLeft((Integer.MAX_VALUE, Integer.MIN_VALUE))((minMax: (Int, Int), b: Building) => (math.min(minMax._1, b.left), math.max(minMax._2, b.right)))
val heights = Array.ofDim[Int](max - min + 1)
buildings.foreach(b => Range(b.left, b.right).foreach(i => if (heights(i - min) < b.height) heights(i - min) = b.height))
val points = ListBuffer.empty[Point]
heights.foldLeft((0, 0)) { (res: (Int, Int), height: Int) => val (idx, prevHeight) = res
if (height != prevHeight) points += Point(idx + min, height)
(idx + 1, height)
}
new Skyline(points.toList)
}
}
```

- Computed skyline

### Time and space complexity

Now that we've solved the Skyline problem, let's have a look at the resources necessary to execute this algorithm. Even though the problem statement may define the maximum number of buildings and the biggest coordinate, we assume that the above algorithm is designed to work with inputs of arbitrary number of buildings and arbitrary (positive) integer coordinates. From now on n indicates the number of buildings in the input sequence and m indicates the maximum coordinate.

The best case is clearly when n = 0, that is the input sequence is empty (lines *3* and *4*). The running time is O(1) (we have to check if the input sequence is empty) and the used space is also O(1) (we create an empty skyline).

But what's the running time when there are actually some buildings in the input sequence? In lines *6* and *7* we first find the minimum and maximum coordinate values then allocate an array of size max - min + 1. To do so, we have to search the whole input sequence once, which means linear running time, that is O(n) for n buildings^{*}, and the allocated array uses O(m) space in the worst case (when min = 0 and max = m - 1). Next (in line *8*) we iterate over the input buildings and save the height value for the coordinates the actual building spans over. In the worst case we have n equal-size buildings, each having l = 0 left and r = m - 1 right coordinates, that is every building spans over the whole [0..m) interval. Thus the running time of setting the height of every position is O(n * m). Finally - in lines *9*-*13* - we go once more over the elements of this array to find those positions where the height changes and include them in the final result (in the `ListBuffer`

defined in line *9*). This is again O(n) running time. Therefore, the overall time-complexity is O(n) + O(n * m) + O(n) = O(n * (m + 2)), which is a lot larger than O(n^{2}) if m > n. It seems that our algorithm performs pretty badly (worse than quadratic running time) and it also uses a lot of space to calculate the final solution.

* If we know in advance that all coordinates are less than 10,000, we can skip this min-max search and just simply allocate an array for 10,000 elements. What this means is that there is no O(n) penalty on the running time but we may have a bigger array than needed.

Notice that even in this simplest solution we can completely ignore whether or not the buildings are sorted in the input sequence. But something is not quite right besides the very bad running time and used space. This solution only works if the building coordinates are positive integers (so that we can use an array to calculate the heights). And what if the maximum coordinate is way bigger than 10,000? Can we be sure that we can always allocate a big enough array to store all the heights? And if the coordinates are not even integers or perhaps negative? And finally, can we come up with a more time- and space-efficient algorithm? Can we do better?

### The inductive solution

Sure we can! It'd be a huge speed-up if somehow we could determine the skyline by calculating the height for those coordinates only where it matters, wouldn't it? Intuition tells us that if we can insert a building into an existing skyline then instead of all the coordinates the building spans over we only need to check the height at the left and right coordinates of the building plus those coordinates of the skyline the building overlaps with and may modify. Similarly to the mathematical induction we can construct an algorithm that does exactly that.

The base case is trivial, every building can be converted into a skyline: given B(l, r, h) the skyline is P(l, h), P(r, 0). Now the inductive step is that we already have a skyline into which we need to insert a new building. Obviously, if we can insert a building into the skyline then we can insert all the buildings to get the final solution. The algorithm is as follows:

- If the current skyline is empty add the two P(l
_{1}, h_{1}), P(r_{1}, 0) points representing B_{1}to the skyline. - Otherwise find the first P
_{i}(x, h) point of the skyline which immediately precedes l_{j}, the left coordinate of the building B_{j}, that is x_{i}< l_{j}≤ x_{i + 1}. Save h_{i}to H if there is such a P_{i}point, otherwise set H to 0. - Prepend the skyline with P(l
_{j}, h_{j}) if there is no such P_{i}point or insert P after P_{i}if h_{i}< h_{j}. - For every P
_{k}(x, h) point where l_{j}≤ x_{k}< r_{j}replace P_{k}with P(x_{k}, h_{j}) if h_{j}> h_{k}and h_{j}≠ h_{k - 1}, or remove P_{k}if h_{j}≥ h_{k}and h_{j}= h_{k - 1}. Save h_{k}to H. - Let P
_{l}(x, h) denote the first point in the skyline such that r_{j}≤ x_{l}. Append P(r_{j}, 0) to the skyline if there is no such point or insert P(r_{j}, H) before P_{l}if r_{j}< x_{l}and H < h_{j}.

This algorithm always produces an optimal skyline since in step 4 it removes those points which would be duplicates. This is done for every inserted building, so the final skyline is described with the minimum number of points. Notice that there is no need to store all the heights for all the coordinates of the tallest buildings in an array, so technically the coordinates can be negative or real numbers. Nevertheless, we are going to work with the previously defined `Building`

and `Point`

data structures which use integers. This, however, is not a requirement.

### The implementation

Now let's see some code, a functional, recursive implementation. There are some minor differences between those few lines of code below and the algorithm described above which are worth noting, even though they are only implementation details. While the algorithm above suggests an implementation where the actual skyline is being modified on the fly in every iteration by inserting and removing points, the Scala code below, however, emits a new skyline when a building is inserted and uses that as the input skyline for the next building. Another difference is that the described algorithm works with a building and a skyline, while the code below first converts the actual building into two `Point`

s as if it was a skyline. This is only done so that it's easier to work with the building's coordinates recursively.

```
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
```

```
object Skyline {
def apply(buildings: Building*): Skyline = new Skyline(
buildings.foldLeft(List.empty[Point]) { (points: List[Point], b: Building) =>
mergePoints(List(Point(b.left, b.height), Point(b.right, 0)), points, 0, 0, 0, ListBuffer.empty[Point])
}
)
@tailrec private def mergePoints(p1: List[Point], p2: List[Point], prevHgt1: Int, prevHgt2: Int, currHgt: Int, res: ListBuffer[Point]): List[Point] = if (p2.isEmpty) {
(res ++= p1).toList
} else if (p1.isEmpty) {
(res ++= p2).toList
} else {
val (firstPt1, firstPt2) = (p1.head, p2.head)
val (hgt1, hgt2) = if (firstPt1.x == firstPt2.x) (firstPt1.height, firstPt2.height) else if (firstPt1.x < firstPt2.x) (firstPt1.height, prevHgt2) else (prevHgt1, firstPt2.height)
val maxHgt = math.max(hgt1, hgt2)
val (points1, points2) = if (firstPt1.x == firstPt2.x) (p1.tail, p2.tail) else if (firstPt1.x < firstPt2.x) (p1.tail, p2) else (p1, p2.tail)
mergePoints(points1, points2, hgt1, hgt2, maxHgt, if (maxHgt != currHgt) res += (Point(math.min(firstPt1.x, firstPt2.x), maxHgt)) else res)
}
}
```

- Building
- Existing skyline
- New skyline

### Time and space complexity

How much did we improve the brute force solution with this more sophisticated approach? The main work is done inside the `mergePoints`

function. The trivial case, when the skyline is empty, is handled in lines *9*-*10* by simply adding all the points from the `p1`

list - which represents the building as skyline points and as such it contains maximum 2 points at any moment - to the result. This clearly runs in O(1) time. If the algorithm already inserted n - 1 buildings and it's about to insert the nth, things get a bit more complicated, but it'll become clear in a moment as I explain.

Since there is an existing skyline and there is a building to be inserted, the `mergePoints`

function skips the first few lines of code and continues in line *14* where it selects the first point (left-hand side) of the building and the first point of the skyline. Because of it always creates a new skyline from scratch when it inserts a building, the algorithm has to determine the height of the point to be added to the skyline (lines *15*-*16*). There are three different cases: the building starts right where the skyline does, or it is before or after the first skyline point. If the two coordinates are the same, then the taller one is selected. If the left-hand side of the building comes before the skyline, then the building's height is selected, otherwise the algorithm has not yet found the building's position in the skyline and the first skyline point's height is selected. In line *17* the algorithm decides how to proceed with the unprocessed points. If the building starts where the skyline does then both points are processed and only the right hand side of the building and the rest of the skyline points need to be merged in the next recursive call. If the building's left-hand side is before the skyline then only the building's left coordinate is processed, the skyline is still to be merged with the right-hand side of the building. Finally, if the whole building is positioned after the first skyline point then only this first skyline point is processed and the building still needs to be merged with the rest of the skyline. The last step is adding the point that comes first with the selected height to the newly built skyline (if its height is different than the previously added point) and recursively call the function with the unprocessed points, the actual heights of the building and the skyline and the current height of the newly built skyline. By doing so the algorithm finds the positions where the building's points are to be inserted and it also maintains the correct height throughout the recursion. If during the algorithm either the building or every point of the skyline is fully processed then it just appends the rest of the building / skyline points to the new skyline in lines *10* or *12*.

In the worst case none of the previous n buildings are overlapping so there is a gap between each building in the skyline, and the new building is also distinct. A building is described by 2 points in the skyline (P(l, h), and P(r, 0)) which means that the `p2`

list contains 2 * n elements. Since the new building is also not overlapping with any of the previous buildings and the above algorithm consumes one point at a time, its running time is O(2 * n + 2), where the +2 is coming from the two points representing the building to be inserted. For the sake of simplicity let's swap the number of buildings by the number of points such that we have n / 2 buildings and n points. For the analysis we can ignore the +2, since this is only a negligible constant. Therefore, the running time of the `mergePoints`

function is linear: O(n).

There is one more thing to consider, though: lines *4*-*6* where the algorithm iterates through all the buildings. Now that we know that `mergePoints`

is linear in terms of the number of buildings, we can say that when the algorithm inserts the (n + 1)th building, it has to scan through the skyline points of 0, then 1, then 2, ..., then n buildings. Hence the overall running time is 0 + 1 + 2 + ... + n = n * (n + 1) / 2 = O((n^{2} + n) / 2) ≈ O(n^{2}), which is still quadratic.

The used space is always linear, though, in terms of the number of buildings, and it does not depend on the size of the range the buildings span over. The actual skyline is stored in the `p2`

list and is garbage-collected at the end of every iteration where a new skyline is created from the points of the skyline and the inserted building. Even though the algorithm does not depend on m and the coordinates need not to be non-negative integers anymore, unfortunately it seems we didn't really improve the brute force solution. The question remains: can we do better?

### The divide and conquer solution

As a matter of fact, yes we can! We already represented a building in the previous algorithm as if it was a skyline so you might ask if merging two arbitrary skylines is a significantly different problem than merging a building and a skyline? No, it's not. Luckily, we already have everything in place! The above `mergePoints`

function has no idea about what the `p1`

list represents, be it only a building or a skyline of multiple buildings. The function merges the two skylines in linear time: if `p1`

contains n_{1}, `p2`

contains n_{2} points, then the time complexity is O(n_{1} + n_{2}). This is as good as it gets, it can't be improved because we need to check every point in both skylines to decide which one to keep and which one to drop. If we want to improve, we have to find it elsewhere.

Inserting the buildings one after the other is not the fastest way to solve this problem as we've seen it above. If, however, we first merge pairs of buildings into skylines, then we merge pairs of these skylines into bigger skylines, and then merge pairs of these bigger skylines into even bigger ones, then - since the problem size is halved in every step - after log(n) steps we can compute the final skyline. We can still use the `mergePoints`

function to merge two skylines, which eventually runs in O(n) time in every step (since in the worst case none of the buildings are overlapping), so the overall running time of the divide and conquer approach will be - as expected - O(n * log(n)), which is much-much faster than any of the previous two algorithms. And all this is achieved by merging partial skylines instead of inserting the buildings one by one. Let's see how.

### The implementation

Compared to the previous solution only the implementation of the `apply`

method changes, so I am not going to repeat the `mergePoints`

function here.

```
1
2
3
4
5
6
7
8
9
10
```

```
def apply(buildings: Building*): Skyline = {
def toSkyline(from: Int, to: Int): List[Point] =
if (to - from == 1) List(Point(buildings(from).left, buildings(from).height), Point(buildings(from).right, 0))
else {
val middle = (from + to) / 2
mergePoints(toSkyline(from, middle), toSkyline(middle, to), 0, 0, 0, ListBuffer.empty[Point])
}
new Skyline(if (buildings.length == 0) List.empty else toSkyline(0, buildings.length))
}
```

Unlike the `mergePoint`

function, `toSkyline`

is not tail recursive. This should not be a problem, though, for we don't expect this method to go very deeply by the recursive calls: it reaches maximum log(n) depth which is very small even for a very a large n.

- Skyline #1
- Skyline #2
- New skyline

### Final notes on the used Scala collections

In case you are not or less familiar with Scala, you may be interested in the performance characteristics of the different collection types used throughout the different approaches, namely the mutable `Array`

and `ListBuffer`

and the immutable `List`

and `Range`

. There is an implicit 5th type, the `WrappedArray`

, which is the actual type of the vararg method parameter `buildings: Building*`

, and is actually just an `Array`

, "victim" of Scala's implicit type-conversions.

The methods used on the arrays and the range are `foreach`

and `foldLeft`

. Both iterate through all the elements of the underlying collection, thus their running time is O(n).

The methods used on the lists are `isEmpty`

, `head`

and `tail`

. All of these methods are blazing fast, they run in O(1) time and basically have no impact on the performance of the algorithms.

Lastly, the methods used on the list buffer are `+=`

and `toList`

. Both of these methods require constant running time so they also don't change significantly the overall performance of the Skyline algorithm. We must be careful, though, with modifying a `ListBuffer`

after it has been exported by the `toList`

method: after an export, the next modifying method will run in O(n) time because all of the elements in the buffer must first be copied in order to not to modify the exported list. Fortunately, we don't have to be worried about it, because the `toList`

method is always the last operation after which the `ListBuffer`

will be garbage-collected.

You can download the complete source code and test cases from here.

## 4 comments:

Could implementation 2 not be run in n*logn time? Your implementation is n^2 because every building added to the skyline requires iterating over every existing coordinate in the skyline. If the skyline were stored as a sorted list or binary tree, you could find the closest coordinates to the new building in log(n) time.

What if every new building merged into the skyline makes it wider and wider (that is a new building overlaps with all the points of the Skyline)? Even if the skyline points stored in some sorted data structure so that we can look up the closest coordinates in linearithmic time, the height of all the existing skyline coordinates may need to be adjusted in which case the algorithm still runs in quadratic time.

Could you please explain why in the final version, a single combine step runs in O(n1+n2) and not O(n1*n2)? I'm not sure about steps 2 and 4 in the inductive solution:

2. Otherwise find the first Pi(x, h) point of the skyline which immediately precedes lj, the left coordinate of the building Bj, that is xi < lj ≤ xi + 1. Save hi to H if there is such a Pi point, otherwise set H to 0.

Wouldn't the search be at best O(logn1), if we use binary search? Then it would be repeated for each point in n2, so if it's O(logn1) repeated n2 times, wouldn't it contribute O(n2*logn1) to the final running time?

4.

For every Pk(x, h) point where lj ≤ xk < rj replace Pk with P(xk, hj) if hj > hk and hj ≠ hk - 1, or remove Pk if hj ≥ hk and hj = hk - 1. Save hk to H.

How many Pk points are there? Wouldn't the worst case be n1 points? Then wouldn't it be repeated for each point in n2, contributing O(n1*n2) to the final running time?

The combine step runs in linear time because the mergePoints function takes two lists and in each recursive step it either "consumes" the first element of the first list, or the first element of the second list, or if the first elements have the same x value, they are removed from both lists. The next recursive iteration is therefore called with lists where either one or both of them are shorter by one than in the previous iteration. This can be repeated n1 + n2 times at most.

As for your second question I believe that with a data structure that supports binary search the position of the build could indeed be found in log(n1) time. The inserted building's height, however, can adjust the points of the skyline that are to the right of the building, and in the worst case it changes all the points in the visible skyline.

And last but not least the number of Pk points depends on the number of buildings that are currently visible in the skyline and they overlap with the new building. I'm not sure if an exact number or formula can be given here. If, for example, every building inserted into the skyline so far got taller and narrower at the same time, then the shape of the skyline will look like a pyramid. If the nth building inserted into this skyline is a huge rectangle that overlaps with all the existing skyline points then the number of Pk points is going to be (n - 1) * 2. If, however, this huge rectangle building is inserted first and then the rest of the buildings then essentially there is no need to update any skyline points.

## Post a Comment