Sam Ford b6eb945320
livecheck: Add Options class
This adds a `Livecheck::Options` class, which is intended to house
various configuration options that are set in `livecheck` blocks,
conditionally set by livecheck at runtime, etc. The general idea is
that when we add features involving configuration options (e.g., for
livecheck, strategies, curl, etc.), we can make changes to `Options`
without needing to modify parameters for strategy `find_versions`
methods, `Strategy` methods like `page_headers` and `page_content`,
etc. This is something that I've been trying to improve over the years
and `Options` should help to reduce maintenance overhead in this area
while also strengthening type signatures.

`Options` replaces the existing `homebrew_curl` option (which related
strategies pass to `Strategy` methods and on to `curl_args`) and the
new `url_options` (which contains `post_form` or `post_json` values
that are used to make `POST` requests). I recently added `url_options`
as a temporary way of enabling `POST` support without `Options` but
this restores the original `Options`-based implementation.
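
To illustrate the idea, here is a minimal sketch of passing one options object instead of separate parameters. `SketchOptions` and `sketch_curl_args` are hypothetical names used only for this example; they are not the real `Livecheck::Options` class or any Homebrew method, and the real implementation may differ.

```ruby
require "json"
require "uri"

# Hypothetical stand-in for `Livecheck::Options`; the real class may differ.
SketchOptions = Struct.new(:homebrew_curl, :post_form, :post_json, keyword_init: true)

# Hypothetical helper: builds curl arguments from a single options object,
# so adding a new option does not change this method's parameter list.
def sketch_curl_args(url, options: SketchOptions.new)
  args = []
  if options.post_form
    args += ["--request", "POST", "--data", URI.encode_www_form(options.post_form)]
  elsif options.post_json
    args += ["--request", "POST", "--json", JSON.generate(options.post_json)]
  end
  args << url
end

form_options = SketchOptions.new(post_form: { name: "value" })
sketch_curl_args("https://example.com", options: form_options)
```

With this shape, a future option only needs a new member on the options object, and callers that don't use it are unaffected.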

Along the way, I added a `homebrew_curl` parameter to the `url` DSL
method, allowing us to set an explicit value in `livecheck` blocks.
This is something that we've needed in some cases but I also intend
to replace implicit/inferred `homebrew_curl` usage with explicit
values in `livecheck` blocks once this is available for use. My
intention is to eventually remove the implicit behavior and only rely
on explicit values. That will align with how `homebrew_curl` options
work for other URLs and make the behavior clear just from looking at
the `livecheck` block.
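
As a rough illustration of an explicit value in a block, the following sketch uses a toy DSL. `SketchLivecheck` is a hypothetical stand-in written for this example, not Homebrew's `Livecheck` class, and the URL is a placeholder.

```ruby
# Toy stand-in for the `livecheck` block DSL (hypothetical).
class SketchLivecheck
  attr_reader :options

  def initialize(&block)
    @url = nil
    @options = { homebrew_curl: nil }
    instance_eval(&block) if block
  end

  # With an argument, stores the URL and any explicit `homebrew_curl` value;
  # without one, returns the stored URL.
  def url(value = nil, homebrew_curl: nil)
    return @url if value.nil?

    @options[:homebrew_curl] = homebrew_curl unless homebrew_curl.nil?
    @url = value
  end
end

check = SketchLivecheck.new do
  url "https://example.com/downloads/", homebrew_curl: true
end
check.options[:homebrew_curl]
```

The point of the explicit form is that a reader can tell from the block alone which curl will be used.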

Lastly, this removes the `unused` rest parameter from `find_versions`
methods. I originally added `unused` as a way of handling parameters
that some `find_versions` methods have but others don't (e.g., `cask`
in `ExtractPlist`), as this allowed us to pass various arguments to
`find_versions` methods without worrying about whether a particular
parameter is available. This isn't an ideal solution and I originally
wanted to handle this situation by only passing expected arguments to
`find_versions` methods but there was a technical issue standing in
the way. I recently found an answer to the issue, so this also
replaces the existing `ExtractPlist` special case with generic logic
that checks the parameters for a strategy's `find_versions` method
and only passes expected arguments.
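
One way such a check could work is to inspect `Method#parameters` and keep only the keyword arguments a method declares. This is a sketch under that assumption, not necessarily the logic used in this commit; `sketch_filter_kwargs` and the two toy strategy modules are hypothetical.

```ruby
# Hypothetical helper: keeps only the keyword arguments that `callable`
# declares, unless it accepts arbitrary keywords (`**rest`).
def sketch_filter_kwargs(callable, kwargs)
  params = callable.parameters
  return kwargs if params.any? { |type, _name| type == :keyrest }

  accepted = params.filter_map { |type, name| name if [:key, :keyreq].include?(type) }
  kwargs.slice(*accepted)
end

# Two toy strategies with different `find_versions` signatures.
module SketchPageMatch
  def self.find_versions(url:, regex: nil)
    { url: url, regex: regex }
  end
end

module SketchExtractPlist
  def self.find_versions(cask:, url: nil)
    { cask: cask, url: url }
  end
end

available = { url: "https://example.com", regex: /v(\d+)/, cask: "example-cask" }
[SketchPageMatch, SketchExtractPlist].each do |strategy|
  finder = strategy.method(:find_versions)
  # Each call receives only the keywords its signature lists.
  finder.call(**sketch_filter_kwargs(finder, available))
end
```

Because the filter is driven by each method's own signature, no strategy needs a catch-all rest parameter and no caller needs a per-strategy special case.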

Replacing the aforementioned `find_versions` parameters with `Options`
ensures that the remaining parameters are fairly consistent across
strategies and any differences are handled by the aforementioned
logic. Outside of `ExtractPlist`, the only other difference is that
some `find_versions` methods have a `provided_content` parameter but
that's currently only used by tests (though it's intended for caching
support in the future). I will be renaming that parameter to `content`
in an upcoming PR and expanding it to the other strategies, which
should make them all consistent outside of `ExtractPlist`.
2025-02-25 10:56:31 -05:00


# typed: strict
# frozen_string_literal: true
require "utils/curl"
require "livecheck/options"
module Homebrew
module Livecheck
# The `Livecheck::Strategy` module contains the various strategies as well
# as some general-purpose methods for working with them. Within the context
# of the `brew livecheck` command, strategies are established procedures
# for finding new software versions at a given source.
module Strategy
extend Utils::Curl
# {Strategy} priorities informally range from 1 to 10, where 10 is the
# highest priority. 5 is the default priority because it's roughly in
# the middle of this range. Strategies with a priority of 0 (or lower)
# are ignored.
DEFAULT_PRIORITY = 5
# cURL's default `--connect-timeout` value can be up to two minutes, so
# we need to use a more reasonable duration (in seconds) to avoid a
# lengthy wait when a connection can't be established.
CURL_CONNECT_TIMEOUT = 10
# cURL does not set a default `--max-time` value, so we provide a value
# to ensure cURL will time out in a reasonable amount of time.
CURL_MAX_TIME = T.let(CURL_CONNECT_TIMEOUT + 5, Integer)
# The `curl` process will sometimes hang indefinitely (despite setting
# the `--max-time` argument) and it needs to be quit for livecheck to
# continue. This value is used to set the `timeout` argument on
# `Utils::Curl` method calls in {Strategy}.
CURL_PROCESS_TIMEOUT = T.let(CURL_MAX_TIME + 5, Integer)
# The maximum number of redirections that `curl` should allow.
MAX_REDIRECTIONS = 5
# This value is passed to `#parse_curl_output` to ensure that the limit
# for the number of responses it will parse corresponds to the maximum
# number of responses in this context. The `+ 1` here accounts for the
# situation where there are exactly `MAX_REDIRECTIONS` number of
# redirections, followed by a final `200 OK` response.
MAX_PARSE_ITERATIONS = T.let(MAX_REDIRECTIONS + 1, Integer)
# Baseline `curl` arguments used in {Strategy} methods.
DEFAULT_CURL_ARGS = T.let([
# Follow redirections to handle mirrors, relocations, etc.
"--location",
"--max-redirs", MAX_REDIRECTIONS.to_s,
# Avoid progress bar text, so we can reliably identify `curl` error
# messages in output
"--silent"
].freeze, T::Array[String])
# `curl` arguments used in the `Strategy.page_content` method.
PAGE_CONTENT_CURL_ARGS = T.let(([
"--compressed",
# Return an error when the HTTP response code is 400 or greater but
# continue to return body content
"--fail-with-body",
# Include HTTP response headers in output, so we can identify the
# final URL after any redirections
"--include",
] + DEFAULT_CURL_ARGS).freeze, T::Array[String])
# Baseline `curl` options used in {Strategy} methods.
DEFAULT_CURL_OPTIONS = T.let({
print_stdout: false,
print_stderr: false,
debug: false,
verbose: false,
timeout: CURL_PROCESS_TIMEOUT,
connect_timeout: CURL_CONNECT_TIMEOUT,
max_time: CURL_MAX_TIME,
retries: 0,
}.freeze, T::Hash[Symbol, T.untyped])
# A regex used to identify a tarball extension at the end of a string.
TARBALL_EXTENSION_REGEX = /
\.t
(?:ar(?:\.(?:bz2|gz|lz|lzma|lzo|xz|Z|zst))?|
b2|bz2?|z2|az|gz|lz|lzma|xz|Z|aZ|zst)
$
/ix
# An error message to use when a `strategy` block returns a value of
# an inappropriate type.
INVALID_BLOCK_RETURN_VALUE_MSG = "Return value of a strategy block must be a string or array of strings."
# Creates and/or returns a `@strategies` `Hash`, which maps a snake
# case strategy name symbol (e.g. `:page_match`) to the associated
# strategy.
#
# At present, this should only be called after tap strategies have been
# loaded, otherwise livecheck won't be able to use them.
# @return [Hash]
sig { returns(T::Hash[Symbol, T.untyped]) }
def self.strategies
@strategies ||= T.let(Strategy.constants.sort.each_with_object({}) do |const_symbol, hash|
constant = Strategy.const_get(const_symbol)
next unless constant.is_a?(Class)
key = Utils.underscore(const_symbol).to_sym
hash[key] = constant
end, T.nilable(T::Hash[Symbol, T.untyped]))
end
private_class_method :strategies
# Returns the strategy that corresponds to the provided `Symbol` (or
# `nil` if there is no matching strategy).
#
# @param symbol [Symbol, nil] the strategy name in snake case as a
# `Symbol` (e.g. `:page_match`)
# @return [Class, nil]
sig { params(symbol: T.nilable(Symbol)).returns(T.untyped) }
def self.from_symbol(symbol)
strategies[symbol] if symbol.present?
end
# Returns an array of strategies that apply to the provided URL.
#
# @param url [String] the URL to check for matching strategies
# @param livecheck_strategy [Symbol] a strategy symbol from the
# `livecheck` block
# @param regex_provided [Boolean] whether a regex is provided in the
# `livecheck` block
# @param block_provided [Boolean] whether a `strategy` block is provided
# in the `livecheck` block
# @return [Array]
sig {
params(
url: String,
livecheck_strategy: T.nilable(Symbol),
regex_provided: T::Boolean,
block_provided: T::Boolean,
).returns(T::Array[T.untyped])
}
def self.from_url(url, livecheck_strategy: nil, regex_provided: false, block_provided: false)
usable_strategies = strategies.select do |strategy_symbol, strategy|
if strategy == PageMatch
# Only treat the strategy as usable if the `livecheck` block
# contains a regex and/or `strategy` block
next if !regex_provided && !block_provided
elsif [Json, Xml, Yaml].include?(strategy)
# Only treat the strategy as usable if the `livecheck` block
# specifies the strategy and contains a `strategy` block
next if (livecheck_strategy != strategy_symbol) || !block_provided
elsif strategy.const_defined?(:PRIORITY) &&
!strategy.const_get(:PRIORITY).positive? &&
livecheck_strategy != strategy_symbol
# Ignore strategies with a priority of 0 or lower, unless the
# strategy is specified in the `livecheck` block
next
end
strategy.respond_to?(:match?) && strategy.match?(url)
end.values
# Sort usable strategies in descending order by priority, using the
# DEFAULT_PRIORITY when a strategy doesn't contain a PRIORITY constant
usable_strategies.sort_by do |strategy|
(strategy.const_defined?(:PRIORITY) ? -strategy.const_get(:PRIORITY) : -DEFAULT_PRIORITY)
end
end
# Creates `curl` `--data` or `--json` arguments (for `POST` requests)
# from related `livecheck` block `url` options.
#
# @param post_form [Hash, nil] data to encode using `URI::encode_www_form`
# @param post_json [Hash, nil] data to encode using `JSON::generate`
# @return [Array]
sig {
params(
post_form: T.nilable(T::Hash[Symbol, String]),
post_json: T.nilable(T::Hash[Symbol, String]),
).returns(T::Array[String])
}
def self.post_args(post_form: nil, post_json: nil)
if post_form.present?
require "uri"
["--data", URI.encode_www_form(post_form)]
elsif post_json.present?
require "json"
["--json", JSON.generate(post_json)]
else
[]
end
end
# Collects HTTP response headers, starting with the provided URL.
# Redirections will be followed and all the response headers are
# collected into an array of hashes.
#
# @param url [String] the URL to fetch
# @param options [Options] options to modify behavior
# @return [Array]
sig { params(url: String, options: Options).returns(T::Array[T::Hash[String, String]]) }
def self.page_headers(url, options: Options.new)
headers = []
if options.post_form || options.post_json
curl_post_args = ["--request", "POST", *post_args(
post_form: options.post_form,
post_json: options.post_json,
)]
end
[:default, :browser].each do |user_agent|
begin
parsed_output = curl_headers(
*curl_post_args,
"--max-redirs",
MAX_REDIRECTIONS.to_s,
url,
wanted_headers: ["location", "content-disposition"],
use_homebrew_curl: options.homebrew_curl || false,
user_agent:,
**DEFAULT_CURL_OPTIONS,
)
rescue ErrorDuringExecution
next
end
parsed_output[:responses].each { |response| headers << response[:headers] }
break if headers.present?
end
headers
end
# Fetches the content at the URL and returns a hash containing the
# content and, if there are any redirections, the final URL.
# If `curl` encounters an error, the hash will contain a `:messages`
# array with the error message instead.
#
# @param url [String] the URL of the content to check
# @param options [Options] options to modify behavior
# @return [Hash]
sig { params(url: String, options: Options).returns(T::Hash[Symbol, T.untyped]) }
def self.page_content(url, options: Options.new)
if options.post_form || options.post_json
curl_post_args = ["--request", "POST", *post_args(
post_form: options.post_form,
post_json: options.post_json,
)]
end
stderr = T.let(nil, T.nilable(String))
[:default, :browser].each do |user_agent|
stdout, stderr, status = curl_output(
*curl_post_args,
*PAGE_CONTENT_CURL_ARGS, url,
**DEFAULT_CURL_OPTIONS,
use_homebrew_curl: options.homebrew_curl ||
!curl_supports_fail_with_body? ||
false,
user_agent:
)
next unless status.success?
# stdout contains the header information followed by the page content.
# We use #scrub here to avoid "invalid byte sequence in UTF-8" errors.
output = stdout.scrub
# Separate the head(s)/body and identify the final URL (after any
# redirections)
parsed_output = parse_curl_output(output, max_iterations: MAX_PARSE_ITERATIONS)
final_url = curl_response_last_location(parsed_output[:responses], absolutize: true, base_url: url)
data = { content: parsed_output[:body] }
data[:final_url] = final_url if final_url.present? && final_url != url
return data
end
error_msgs = stderr&.scan(/^curl:.+$/)
{ messages: error_msgs.presence || ["cURL failed without a detectable error"] }
end
# Handles the return value from a `strategy` block in a `livecheck`
# block.
#
# @param value [Object, nil] the return value from a `strategy` block
# @return [Array]
sig { params(value: T.untyped).returns(T::Array[String]) }
def self.handle_block_return(value)
case value
when String
[value]
when Array
value.compact.uniq
when nil
[]
else
raise TypeError, INVALID_BLOCK_RETURN_VALUE_MSG
end
end
end
end
end
require_relative "strategy/apache"
require_relative "strategy/bitbucket"
require_relative "strategy/cpan"
require_relative "strategy/crate"
require_relative "strategy/electron_builder"
require_relative "strategy/extract_plist"
require_relative "strategy/git"
require_relative "strategy/github_latest"
require_relative "strategy/github_releases"
require_relative "strategy/gnome"
require_relative "strategy/gnu"
require_relative "strategy/hackage"
require_relative "strategy/header_match"
require_relative "strategy/json"
require_relative "strategy/launchpad"
require_relative "strategy/npm"
require_relative "strategy/page_match"
require_relative "strategy/pypi"
require_relative "strategy/sourceforge"
require_relative "strategy/sparkle"
require_relative "strategy/xml"
require_relative "strategy/xorg"
require_relative "strategy/yaml"