While building a website, understanding what changes were made helps with advanced editing and Version Control.
The problem? How do we know what was changed between one publish of a world and the next?
My solution? Our very own diff checker algorithm tailored specifically to our worlds. I wrote a function called diff that would run every time a creator published their world. What this would do is find which parts of the world were added and removed, and save those id’s respectively.
My first challenge was trying to understand how I would be able to efficiently find the difference between two arrays. The easiest way to find the difference is to first figure out what’s the same- and that sounded like a popular programming problem!
I read up on diff algorithms, and came across Longest Common Subsequence- LCS.
LCS works by recursively using a hash map to keep a running track of similarities between two strings, so I figured that I could use that with our worlds (which were arrays) as well.
# frozen_string_literal: true
module TreeUtil
def self.max(arg1, arg2)
arg1 > arg2 ? arg1 : arg2
end
def self.lcs(old_tree, new_tree, m, n, hash, res)
return 0 if m.negative? || n.negative?
if hash[[m, n]] == -1
result = -1
if old_tree[m] == new_tree[n]
result = 1 + lcs(old_tree, new_tree, m - 1, n - 1, hash, res)
res << old_tree[m]
else
result = max(lcs(old_tree, new_tree, m - 1, n, hash, res), lcs(old_tree, new_tree, m, n - 1, hash, res))
end
hash[[m, n]] = result
else
result = hash[[m, n]]
end
result
end
After finding the LCS, I knew that to find the diff it would be as simple as doing a set difference from the old and new versions of the world.
In short:
removed = old - LCS