Not many programmers are aware of this "small-to-large" trick. I've picked up this problem from CsAcademy Round 31: you are given a tree rooted at 1 where every node $i$ has some value, $val[i]$. You need to find the number of ways of selecting nodes so that the induced tree is uniform.
1. A uniform tree is one where for every node, its direct child have same value.
2. Induced tree from a subset of nodes $X$ is one where:
a. There exists exactly one node that doesn't have any ancestor present in $X$.
b. For every other node, its parent is its lowest ancestor that is present in $X$.
The DP solution is as follows:
Let dp[i][j] be the number of uniform forests(i.e it can be a single uniform tree, or a collection of uniform trees) that exist in the subtree of $i^{th}$ node such that the root of every tree in the forest has value $j$.
Now $dp[i][j] = \left(\prod_{x\in child[i]}(dp[x][j]+1)\right)-1$ , if we do not include node $i$
This is because from every child $x$, it can contribute forests with root value $j$ in $dp[x][j]$ ways, plus $1$ extra way, when we do not select any forest from $x^{th}$ child.
So we need to subtract $-1$ from the total because we have also counted a way where all the children of node $i$ contribute no forest.
Note that we can also include node $i$. This will add more ways in $dp[i][val[i]]$.
If this is not clear, think more by drawing on paper. Also you can refer to editorial here.
So this takes $O(n^2)$ time as well as space, and we need to do better.
Note that for given subtree of node $i$, if none of the nodes in its subtree has $v$ value, then $dp[i][v]$ will be $0$.
So we only iterate on the required values for given subtree using a map.
Now here is the small-to-large technique part. When we are combining all the values for childs of node $i$, we can do that in a smarter way. We always iterate on the smaller sized map, and put its entries in the larger sized map.
So whenever an item is being moved, it is always moved from smaller map to bigger map.
This makes the size of resultant map greater than twice the size of smaller map. So each item can be moved upto max of $log_2(n)$ times, and we have $N$ items in total (one value per node). So we get $O(NlogN)$ time complexity as well as space.
As usual, here is the code implementing the above idea.
I hope you've learnt a bit about this small-to-large technique. This is another problem based on it.
You can compare my 2 solutions for the same: based on Mo's, based on small-to-large. See how shorter and cleaner the code becomes using this trick.
Comment if you have any doubts. I'm always ready to help.