Starting Algorithms1538 words. Time to Read: About 15 minutes.
This week I am starting a class on algorithms (as part of the OSSU cousework). I currently know three things. Thing one: I am actually (pretty much) on schedule for blog posts! Booyah for every two weeks! It works great, because I tell myself that I’m actually going to be a go-getter and post an extra one in my off week, but by the time that I get to doing that, it’s the second week and I’m just regularly on time. So that works out. Thing two: I have a really hard time focusing on just one thing for very long, especially when learning coding things. Thing three: I’m trying really hard to limit the number of classes I’m doing. Currently, I’m only working on
- Program Design
- Discrete Mathematics
And on the side:
- Improving Ruby-ness
- More practice with idiomatic Python
- Wrestling Jekyll to make this blog look like how I want
- The in-house Django apps for work
Anyways. To what I wanted to talk about: this algorithms class! It’s really, really interesting! I think the first couple lessons are just warm-ups to get your feet wet and your mind thinking in algorithm mode, but they are still making me think really hard. The main thing I’ve learned about so far is the “Dynamic Connectivity Problem.” This is the idea that, given some number of nodes, is there some efficient way of checking if two are connected? Yes. Yes there is.
It’s called union find. You lump all of the connected entities into a blob, and then when you connect two more entities, you combine the blobs that they are in. Here’s the basic setup:
So essentially, we lay each node out into a single array. Here’s one way to connect them called quick-find. It makes finding out whether or not two nodes are connected really easy, but both setup and unions are linear (N) time, meaning that if you have 10x as many inputs it will take ~10x as long to run. I’ll talk more about this N notation in a different post. The take away is that loading is slow, but accessing is fast later.
This is great, but can we improve performance? The next one is called Quick-Union.
The way to fix the problems with Quick Union is to provide some weighting! If we make sure that we always make the bigger tree the root, then we’ll end up never placing an already tall tree below a smaller one. We should have a much flatter structure. In order to do this, we have to keep track of a size or depth for each connected blobtree. This is just a second array that keeps track of the depth. Since each union (by definition) makes that tree one step deeper, you just increment the size of the new root by 1!
Through some math, it turns out that this gets us from
N^2 time to setup and union things to
M*lg(N) time, which is an awesome improvement. I know this was kind of heavy, but I think it was good for me to do it out and explain it some. More in the next post.