This commit is here for the history of my madness. Rather than trying
to teach the iterator API to be branching, I should add a management
wrapper around it, which can decide for each commit it sees whether it
would like to follow a branch or not. This way it's also much easier
to terminate a branch when we realise that we've seen it before (or
disable that functionality), without having to cram more features into
the same abstraction.
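
A minimal sketch of such a management wrapper, assuming an in-memory commit graph instead of a real repository. The names `Graph`, `FirstParent`, `Walker`, `sample_graph` and the `follow` callback are hypothetical and not taken from this codebase. The point is only that the inner iterator stays linear while the wrapper owns the branching policy and the "seen before" check.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical commit graph: commit id -> parent ids, first parent first.
type Graph = HashMap<&'static str, Vec<&'static str>>;

/// Plain first-parent iterator. It knows nothing about branching.
struct FirstParent<'g> {
    graph: &'g Graph,
    next: Option<&'static str>,
}

impl<'g> Iterator for FirstParent<'g> {
    type Item = &'static str;

    fn next(&mut self) -> Option<Self::Item> {
        let current = self.next?;
        // Move the position to the first parent, if there is one.
        self.next = self.graph.get(current).and_then(|p| p.first().copied());
        Some(current)
    }
}

/// Management wrapper: drives the simple iterators and holds the policy.
struct Walker<'g> {
    graph: &'g Graph,
    seen: HashSet<&'static str>,
    /// Set to false to disable the "seen before" termination.
    dedupe: bool,
}

impl<'g> Walker<'g> {
    /// Walk from `start`; `follow` decides for each extra parent of a
    /// merge whether that branch should be walked as well.
    fn walk(&mut self, start: &'static str, follow: impl Fn(&str) -> bool) -> Vec<&'static str> {
        let mut out = Vec::new();
        let mut pending = vec![start];
        while let Some(tip) = pending.pop() {
            let branch = FirstParent { graph: self.graph, next: Some(tip) };
            for id in branch {
                if self.dedupe && !self.seen.insert(id) {
                    break; // seen before: terminate this branch
                }
                out.push(id);
                // Every parent after the first is a candidate branch.
                for &parent in self.graph.get(id).into_iter().flatten().skip(1) {
                    if follow(parent) {
                        pending.push(parent);
                    }
                }
            }
        }
        out
    }
}

/// Small example history: `a` merges `x` into the line a -> b -> c.
fn sample_graph() -> Graph {
    HashMap::from([
        ("a", vec!["b", "x"]),
        ("b", vec!["c"]),
        ("x", vec!["c"]),
        ("c", vec![]),
    ])
}
```

With `follow` always returning true, the wrapper walks the first-parent line and then the merged branch, stopping that branch as soon as it reaches a commit it has already seen. With `follow` always returning false it degrades to a plain first-parent walk.
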
File history is found by stepping through a branch iterator and diffing
each commit against its parent to see which paths changed. This is
somewhat computationally intensive,
and also doesn't work well yet because iterator stepping is done
internally and there is no API to extend iterators before they are
run. That means that histories can only be read from first-parent
iterators.

Ideally we would do two things here:

1. build an API to let iterators extend themselves, either breadth
   first, or depth first. But maybe that would be too much
   complexity for the iterator module?

2. figure out a better way to get the history of a file. At the
   moment we are stepping through commits and diffing them with
   parents to find the set of changed paths. I don't _think_ there is
   a way to simply compare refs, but maybe there is.
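
The current approach from point 2 (step through commits, diff each one against its parent, collect the changed paths) can be sketched as follows. This is only an illustration on an in-memory model: `Tree`, `Commit`, `changed_paths`, `file_history` and `sample_history` are hypothetical names, a tree is assumed to be a flat map from path to blob id, and the history is assumed to be a first-parent line given newest first.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical flat tree: path -> blob id.
type Tree = BTreeMap<&'static str, u32>;

struct Commit {
    id: &'static str,
    tree: Tree,
}

/// Paths that were added, removed or modified between parent and child.
/// A missing parent (root commit) is treated as an empty tree.
fn changed_paths(parent: Option<&Tree>, child: &Tree) -> BTreeSet<&'static str> {
    let empty = Tree::new();
    let parent = parent.unwrap_or(&empty);
    let mut changed = BTreeSet::new();
    for (path, blob) in child {
        if parent.get(path) != Some(blob) {
            changed.insert(*path); // added or modified
        }
    }
    for path in parent.keys() {
        if !child.contains_key(path) {
            changed.insert(*path); // removed
        }
    }
    changed
}

/// Ids of all commits that touched `path`. `history` is newest first,
/// so the parent of entry `i` is entry `i + 1`.
fn file_history(history: &[Commit], path: &str) -> Vec<&'static str> {
    history
        .iter()
        .enumerate()
        .filter(|(i, commit)| {
            let parent = history.get(i + 1).map(|p| &p.tree);
            changed_paths(parent, &commit.tree).contains(path)
        })
        .map(|(_, commit)| commit.id)
        .collect()
}

/// Three commits: c1 adds README, c2 adds src/lib.rs, c3 edits README.
fn sample_history() -> Vec<Commit> {
    vec![
        Commit { id: "c3", tree: Tree::from([("README", 2), ("src/lib.rs", 10)]) },
        Commit { id: "c2", tree: Tree::from([("README", 1), ("src/lib.rs", 10)]) },
        Commit { id: "c1", tree: Tree::from([("README", 1)]) },
    ]
}
```

Every commit in the walk pays for a full tree comparison even when the file of interest is untouched, which is where the computational cost mentioned above comes from.
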
This implementation is a bit weird, especially because it changes the
API from what it was previously. This works, for now, but some of the
relationships between types feel awkward, especially that all
queries have to go via the FileTree, and that we can't just give out
objects that represent some part of the tree that are then loaded when
needed.

For now this will work though. What's still missing is to turn a
Yield::Dir into a new FileTree.

The previous implementation tried to be an iterator over parents,
which ultimately was the wrong design choice and caused a lot of
headaches. Instead, this implementation is a simple branch iterator
that will update its position to the next parent, if one exists.

Not really sure why this wasn't my first approach, but I think I was
thinking about this in terms of parents, not actually commits. So I
didn't really see the forest for the trees.

This code implements a parsing strategy that uses lazy iterators to
traverse a branch graph. An iterator is constructed for a starting
point on a branch, and new iterators are spawned for every merge that
is encountered. To get all commits in a repository, simply do as
test.rs does: queue new work on a channel, and poll it until no more
branches are discovered.
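
The polling loop can be sketched like this. It is not the code from test.rs, only an assumption of its general shape: an in-memory `Graph` stands in for the repository, and `all_commits` and `sample_graph` are hypothetical names. The channel is the standard library's `std::sync::mpsc`, used from a single thread as a work queue.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::mpsc;

/// Hypothetical commit graph: commit id -> parent ids, first parent first.
type Graph = HashMap<&'static str, Vec<&'static str>>;

/// Collect every commit reachable from `head`.
fn all_commits(graph: &Graph, head: &'static str) -> Vec<&'static str> {
    let (tx, rx) = mpsc::channel();
    tx.send(head).expect("receiver is still alive");

    let mut seen = HashSet::new();
    let mut out = Vec::new();

    // All sends happen inside this loop on the same thread, so an empty
    // queue means that no more branches have been discovered.
    while let Ok(tip) = rx.try_recv() {
        let mut cursor = Some(tip);
        while let Some(id) = cursor {
            if !seen.insert(id) {
                break; // this branch joined history we already walked
            }
            out.push(id);
            let parents = graph.get(id).map(Vec::as_slice).unwrap_or(&[]);
            // Each extra parent of a merge is queued as new work.
            for &branch in parents.iter().skip(1) {
                tx.send(branch).expect("receiver is still alive");
            }
            // Continue along the first parent.
            cursor = parents.first().copied();
        }
    }
    out
}

/// `a` merges the branch x -> y into the line a -> b -> c.
fn sample_graph() -> Graph {
    HashMap::from([
        ("a", vec!["b", "x"]),
        ("b", vec!["c"]),
        ("c", vec![]),
        ("x", vec!["y"]),
        ("y", vec!["c"]),
    ])
}
```

A plain `VecDeque` would do the same job in a single thread. A channel only pays off once branches are walked by separate workers, and in that case an empty queue is no longer enough to tell that the work is finished.
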
This code is somewhat suboptimal. For one, get_parent() is way too
complex, and could use some refactoring. Secondly, the semantics of
`BranchCommit::Branch(...)` are unclear from the outside, and the fact
that simple merge commits will be returned via `BranchCommit::Commit`,
while subsequent merge commits need to use `BranchCommit::Merge(...)`
is inconsistent and should be fixed before doing any sort of public
release!

This code is work-in-progress, and doesn't work on a repo that has a
branched history. The issue here is that after handling a merge
commit, keeping track of which commit to look at next is non-trivial.
This solution tries to issue a "skip" command on the walker, but this
can accidentally skip commits when two merges have happened in
succession (maybe a bug with the impl, not the concept).

But also, the actual merge commit seems to already be part of the
normal history? So maybe we can omit the merge commit explicitly, and
simply return a new branch handle instead.