METU Mashup Volume 1

I prepared an editorial for the unofficial contest that was organized on Codeforces last weekend by the Computer Engineering Department of METU and decided to post it here as well. I am planning to post the editorial on the blog of the Codeforces group of METU, but people need to login first to view it. So, to provide easier access, I am posting it here as well.


The problem statements can be accessed through the links given below.

  1. Declined Finalists
  2. The Contest
  3. Dishonest Sellers
  4. Dijkstra?
  5. Lucky Pair
  6. Test
  7. Game of Stones
  8. Random Query
  9. Level Generation
  10. Nagini

A. Declined Finalists

There is not much to say about this problem. The basic idea is finding the maximum element in the array, say Amax. If Amax is greater than 25, then (Amax - 25) people would have to have declined. If Amax is smaller than or equal to 25, then nobody has to decline for the contestant with the rank Amax to participate in the finals.

B. The Contest

The problem is a typical greedy one, where you have to keep track of the problems you solved and the time you are at. The crux of the idea behind the solution is that it is optimal to submit the problems in the non-decreasing order of time it takes to solve them. It is usually hard to give official proof of greedy solutions, so I will omit the proof, but you may try to figure it out on your own. (I'd recommend proof by contradiction)

Then, you may just keep an event queue (i.e. a technique frequently used in computational geometry problems - commonly associated with plane/line sweep) of the starting and ending times of the submittable periods. You can keep the problem(s) you're working on (and the ones you have done) in some data structure( e.g. array, vector, set, etc.) and update it accordingly at each significant time. (i.e. when you enter or exit a period where you can submit a problem) Finally, you may record the times at which a submission is made, and return the last (or greatest) one of those.

C. Dishonest Sellers

The idea behind the idea is again, greedy, in a way. You need to buy k of n elements right now. You should basically purchase all the elements current prices of which are cheaper than their prices after the discount. Let's say you bought all those M elements that are more profitable to buy. If k > M, then you need to buy (k - M) more elements and have to suffer some losses. The strategy in buying those is trying to minimize your losses. To achieve that, you can basically sort the remaining items based on the difference between their current and post-discount prices, in non-decreasing order. Then just pick the first (k-M) elements, which you will buy now. The rest, you will purchase after the discount ends. The sum of the relevant prices based on what items you bought now yields the result.

D. Dijkstra?

This is a typical question that asks you to implement Dijkstra's algorithm to find the shortest path to a vertex on a weighted, undirected graph. There are two modifications to the typical Dijkstra's algorithm implementation. The first is the fact that the graph is not simple, meaning that there may be multiple edges between any pair of nodes u, v. This is relatively easy to handle. My implementation basically filters the adjacency list right after reading the input, and keeps only the edge with the least weight between any nodes u, v.

The second trick I used in my implementation was essential when I exceeded the memory limit in my first submission. Depending on your implementation, the priority queue used in Dijkstra's algorithm may swell, due to keeping multiple instances of a single vertex, only one of which is the current minimal route. It usually does not cause any issue in most problems where vertex and edge bounds are a little less, but apparently, a case in the tests is good enough to filter out bad implementations like mine. The trick I applied was using a std::set instead of a standard priority queue. For each node u, whenever I found a cheaper route to u, I took out the older optimal path inside that set if one existed. This causes a little more time burden on the algorithm. However, since insertion times of both the std::priority_queue and std::set are O(logN), and erasing an element from std::set also takes O(logN), the difference in performance was just a constant factor at each iteration. (N refers to the number of elements in the set)

E. Lucky Pair

To be added soon.

F. Test

My solution, unlike some others who used a KMP-based idea, turns the problem into a graph problem. Basically if each string is to be considered as a node in a graph, and the length of the maximal intersection are the directed edges between them, then the problem is reduced to finding the longest path in that graph. The length of the longest path can then be reduced from the sum of the length of all three strings, giving the result.

The challenge here was finding the maximal intersection of each pair of strings, which could naively be done in O(N2). I believe it would be possible do achieve this task using a method similar to the prefix-suffix trick used on KMP, but not knowing KMP, I tried to modify the naive approach to achieve O(NlogN) complexity. Sadly, techniques like binary searching on the maximum length of common suffix-prefix pair does not work in a problem like this, because the interval is not monotonic. For instance, for k=1 and k=3 there may be an intersection, but for k=2 there may be no intersection.

After a long while, I thought of precomputing a frequency array(i.e. an array each cell of which corresponds to a letter, and denotes how many times it appears in a given string) for each possible suffix and prefix of each string. Then, I modified my naive implementation to compare the substrings if only their frequency arrays are the same. Since the problem with the O(N2) approach was doing O(N) substring comparisons, each of which took O(N), reducing the number of times that comparison was made helped improve the complexity, and I got an accepted. Note that it is not sufficient to compare just the frequency arrays to consider two strings to be equal. As you do not compare the order the elements appear in, that test alone may give false-positives, leading you to an incorrect result.

Finally, you also need to take into account the cases where one (or two) of the given strings being subsets of the third. The strategy is to get the minimum of the results one of which was yielded by solving the graph problem, and the other was obtained by checking the subset cases.

G. Game of Stones

This problem is a modification of the Nim game, which is a rather popular game theory subject. To solve it, you need to familiarize yourself with finding the Grundy numbers of a game, and study how it is possible to combine the grundy numbers of multiple games to decide the winner. Instead of telling you in detail, let me provide you with some reading material that may help you internalize the intuition, and better understand the proofs regarding the idea. Impartial games notes by MIT is a relatively concise resource on the topic. Also, GeeksForGeeks website offers a narrative as well as an implementation of the idea.

H. Random Query

Expected value problems feel always tricky to me. In this one, I spent a considerable time on figuring out a way to query the unique elements in each range. However, even my best candidate for an idea, which was a segment-tree based one, failed for some of the possible ranges. It dawned on me, although rather lately, that as opposed to counting the unique elements for each edge, the trick is to count the number of intervals for which a particular element is unique. Then, basically, that element will have contributed that many times to the expected value. Iterating over each value v and keeping each occurrence of v in somewhere is a good way to start. Since the magnitude of the values are capped at 106, it is possible to achieve this in the given time.

One thing to keep in mind is taking into account the intervals where there are multiple instances of a single value v. Then, their combined contribution will be 1 for each such range. A neat way to find the number of intervals where there are multiple instances of the value v is subtracting from N2, which is the total amount of valid intervals, the number of intervals where there aren't any v values, and the number of intervals where v is unique.

Once the sum of each unique and combined contribution of each value is found, then that total may be divided to the number of available ranges N2, yielding the expected value.

I. Level Generation

Basically the problem asks to start with a Minimum Spanning Tree, where each edge is a bridge, and wants us to try adding as many new edges as possible with the constraint that the number of bridges is greater than or equal to the number of non-bridges.

Working on the sample cases, I initially thought that every new edge added to the graph would create a cycle, and every bridge included in that cyclic path would become non-bridges. So, a strategy with which it is optimal to add new edges could be by forming the most minimal cycles in the MST. Then, I noticed that, once you form a cycle C1, you need not switch to finding another one. You can keep adding new edges to C1 until it becomes fully connected. Once C1 becomes a fully connected graph with |V| vertices, you may then just introduce another vertex to your fully-connected graph, losing a single bridge, but gaining |V| possible edge additions. (i.e. until it becomes fully connected again)

It is easier to internalize this strategy by considering a sample case. Consider N = 10. Now, if we draw the most intuitive MST of an undirected graph with no weights, it would be a straight line. Then, introducing the shortest cycle is just connecting the last node with the 3rd from the last. Now, the graph has 7 bridges and 3 non-bridge edges. The difference between those two is large enough to let us add some more non-bridge edges. So, since the last 3 nodes already form a fully-connected graph, we have to introduce the 4th from the last node into that graph. That way, we can add up to 2 more new edges, and the only bridge we lost was the edge between 3rd node from the last and the 4th node from the last. Adding those edges, we end up with 6 bridges and 6 non-bridges. If we add any more edges, the bridges will not be the majority of the edges in the graph, so we need to stop there. So, for N = 10, the answer is 12.

Given that there are 105 queries, and the number of vertices may go up to 2.109, it is essential to find the number of vertices |V| of the fully (or almost fully) connected graph in less than linear time. If we loop through possible |V| values in linear fashion, we will exceed the time limit. The idea I implemented was using binary search on possible values, which has the complexity of O(logN). Once the range is reduced to less than 10, I halted the binary search and linearly went through the range to find the maximum number of edges it was possible to add. Since the length of the reduced range is negligible, the linear searching portion did not exceed the time limit.

J. Nagini

To be added soon.