- For a situation where `authored = 3`, `committed = 4`, the previous
calculation was `3 - 4` which meant that `committed = -1` in the end.
- This was incorrect, since a user can't have negative contributions!
- Instead, only do the subtraction to get the deduplicated `committed`
count if the number of authored commits is higher than the number of
committed commits. This approach should achieve the desired "don't
double count things that the user authored and committed, but do count
things that another person authored that the user committed".
- Double counting is artificially inflating folks' contributions (sadly ;-)).
- Since I'm not going to enumerate every possible author to filter by *both*
fields via the API, let's do some arithmetic to figure out the unique
committer numbers for a user.
- The GitHub list commits API now supports this filtering
(https://docs.github.com/en/rest/commits/commits?apiVersion=2022-11-28#list-commits--parameters),
because I wrote it. :-)
- Authoring a commit and committing a commit are two separate concepts: author
is the person who wrote the code and, in old parlance, the committer is the
person who applied the patch (remember when we sent patches to mailing lists?).
- In practice for us in Homebrew, this occurs when we make a change in GitHub's
web editor, or, more obviously, when BrewTestBot pushes `homebrew-core`
commits from users (then, `BrewTestBot` is the `committer`).
I know that we're outside our normal deprecation cycle but: these are
totally broken with the API and it doesn't make sense to support them
only for non-core formulae.
- The `reviewed-by` filter retrieved all reviews for a user, including
those they'd added to their own PRs. Since it's impossible to click
the "approve" button on one's own PR, filter this to `review:approved`
to get "further project goals" kinds of reviews.
- Suggested in https://github.com/Homebrew/brew/pull/14813#discussion_r1118696385.
- Signoffs were just a stopgap until we implemented getting "real"
reviews for a user via the GitHub API. They were a suboptimal way of getting
reviews because they only really exist in Homebrew/homebrew-core where
BrewTestBot adds signoffs for each maintainer who reviewed the PR.
- Now `brew contributions --from=2023-02-23 --to=2023-02-26` works to limit the
results for reviews. I forgot this in the original implementation, again,
ugh.
- The search APIs don't have that high a rate limit but we shouldn't need to
worry about that too much because, to get counts, the JSON response comes
with a `total_count` number.
- Turns out in my head a few days ago I was overcomplicating this. I had a
brainwave while in the shower.
- Some refactoring so that we call `totals` to sum up the hash of hashes less,
since the grand total numbers are now used in multiple places.
- `brew contributions --user=issyl0` was taking forever because it went
through all maintainers first, because the conditionals were in the
wrong order.
- This was too quiet, far too quiet, for something that takes so long.
- Now verbose mode tells you what repos it's scanning for a user.
- With `brew contributions`, this will output a list of stats
(across the specified time period, or all time) for people in the
"maintainers" team on GitHub.
- Add a `--user` flag for getting stats for a specific user (either
username, name or email address).
- This assumes that their Git committer details are the same as their name is
set to on GitHub.
- Show an error message if trying to generate a CSV for the full maintainer
list, since I haven't worked out how to best show all of that info yet (or
even how best to show only the totals across everything for every user) in
that format.
- These are arbitrary length limits that had a load of disables in code.
- The limits were only increasing over time rather than decreasing.
- Fixing the problematic code to be shorter would take a long time for
questionable gain since the problem has been around so long.
- Using `git log` was brittle with name changes and email address changes for
contributors over the years unless we made a Git `mailmap` file which brings
with it its own updatedness overhead.
- Let's use the GitHub commits API (importantly _not_ the search API) so that
we can give it a username and it will return contributions associated with
every email address on that user's account:
https://docs.github.com/en/rest/commits/commits?apiVersion=2022-11-28#list-commits--parameters.
- This is quite significantly slower, but it's worth it for correctness
especially when we get to all maintainers' contributions (in a separate PR).
- The commits API does not (yet?) support trailers or commit "committer"s, just
authors.