Sam Ford b6eb945320
livecheck: Add Options class
This adds a `Livecheck::Options` class, which is intended to house
various configuration options that are set in `livecheck` blocks,
conditionally set by livecheck at runtime, etc. The general idea is
that when we add features involving configurations options (e.g., for
livecheck, strategies, curl, etc.), we can make changes to `Options`
without needing to modify parameters for strategy `find_versions`
methods, `Strategy` methods like `page_headers` and `page_content`,
etc. This is something that I've been trying to improve over the years
and `Options` should help to reduce maintenance overhead in this area
while also strengthening type signatures.

`Options` replaces the existing `homebrew_curl` option (which related
strategies pass to `Strategy` methods and on to `curl_args`) and the
new `url_options` (which contains `post_form` or `post_json` values
that are used to make `POST` requests). I recently added `url_options`
as a temporary way of enabling `POST` support without `Options` but
this restores the original `Options`-based implementation.

Along the way, I added a `homebrew_curl` parameter to the `url` DSL
method, allowing us to set an explicit value in `livecheck` blocks.
This is something that we've needed in some cases but I also intend
to replace implicit/inferred `homebrew_curl` usage with explicit
values in `livecheck` blocks once this is available for use. My
intention is to eventually remove the implicit behavior and only rely
on explicit values. That will align with how `homebrew_curl` options
work for other URLs and makes the behavior clear just from looking at
the `livecheck` block.

Lastly, this removes the `unused` rest parameter from `find_versions`
methods. I originally added `unused` as a way of handling parameters
that some `find_versions` methods have but others don't (e.g., `cask`
in `ExtractPlist`), as this allowed us to pass various arguments to
`find_versions` methods without worrying about whether a particular
parameter is available. This isn't an ideal solution and I originally
wanted to handle this situation by only passing expected arguments to
`find_versions` methods but there was a technical issue standing in
the way. I recently found an answer to the issue, so this also
replaces the existing `ExtractPlist` special case with generic logic
that checks the parameters for a strategy's `find_versions` method
and only passes expected arguments.

Replacing the aforementioned `find_versions` parameters with `Options`
ensures that the remaining parameters are fairly consistent across
strategies and any differences are handled by the aforementioned
logic. Outside of `ExtractPlist`, the only other difference is that
some `find_versions` methods have a `provided_content` parameter but
that's currently only used by tests (though it's intended for caching
support in the future). I will be renaming that parameter to `content`
in an upcoming PR and expanding it to the other strategies, which
should make them all consistent outside of `ExtractPlist`.
2025-02-25 10:56:31 -05:00

97 lines
3.5 KiB
Ruby

# typed: strict
# frozen_string_literal: true
module Homebrew
module Livecheck
module Strategy
# The {Hackage} strategy identifies versions of software at
# hackage.haskell.org by checking directory listing pages.
#
# Hackage URLs take one of the following formats:
#
# * `https://hackage.haskell.org/package/example-1.2.3/example-1.2.3.tar.gz`
# * `https://downloads.haskell.org/~ghc/8.10.1/ghc-8.10.1-src.tar.xz`
#
# The default regex checks for the latest version in an `h3` heading element
# with a format like `<h3>example-1.2.3/</h3>`.
#
# @api public
class Hackage
# A `Regexp` used in determining if the strategy applies to the URL and
# also as part of extracting the package name from the URL basename.
PACKAGE_NAME_REGEX = /(?<package_name>.+?)-\d+/i
# A `Regexp` used to extract the package name from the URL basename.
FILENAME_REGEX = /^#{PACKAGE_NAME_REGEX.source.strip}/i
# A `Regexp` used in determining if the strategy applies to the URL.
URL_MATCH_REGEX = %r{
^https?://(?:downloads|hackage)\.haskell\.org
(?:/[^/]+)+ # Path before the filename
#{PACKAGE_NAME_REGEX.source.strip}
}ix
# Whether the strategy can be applied to the provided URL.
#
# @param url [String] the URL to match against
# @return [Boolean]
sig { params(url: String).returns(T::Boolean) }
def self.match?(url)
URL_MATCH_REGEX.match?(url)
end
# Extracts information from a provided URL and uses it to generate
# various input values used by the strategy to check for new versions.
# Some of these values act as defaults and can be overridden in a
# `livecheck` block.
#
# @param url [String] the URL used to generate values
# @return [Hash]
sig { params(url: String).returns(T::Hash[Symbol, T.untyped]) }
def self.generate_input_values(url)
values = {}
match = File.basename(url).match(FILENAME_REGEX)
return values if match.blank?
# A page containing a directory listing of the latest source tarball
values[:url] = "https://hackage.haskell.org/package/#{match[:package_name]}/src/"
regex_name = Regexp.escape(T.must(match[:package_name])).gsub("\\-", "-")
# Example regex: `%r{<h3>example-(.*?)/?</h3>}i`
values[:regex] = %r{<h3>#{regex_name}-(.*?)/?</h3>}i
values
end
# Generates a URL and regex (if one isn't provided) and passes them
# to {PageMatch.find_versions} to identify versions in the content.
#
# @param url [String] the URL of the content to check
# @param regex [Regexp] a regex used for matching versions in content
# @param options [Options] options to modify behavior
# @return [Hash]
sig {
params(
url: String,
regex: T.nilable(Regexp),
options: Options,
block: T.nilable(Proc),
).returns(T::Hash[Symbol, T.untyped])
}
def self.find_versions(url:, regex: nil, options: Options.new, &block)
generated = generate_input_values(url)
PageMatch.find_versions(
url: generated[:url],
regex: regex || generated[:regex],
options:,
&block
)
end
end
end
end
end