-
-
Notifications
You must be signed in to change notification settings - Fork 9.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Slim down formula API JSON #16433
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,7 @@ | |
|
||
require "extend/cachable" | ||
require "api/download" | ||
require "utils/collection" | ||
|
||
module Homebrew | ||
module API | ||
|
@@ -101,6 +102,71 @@ def write_names_and_aliases(regenerate: false) | |
end | ||
end | ||
end | ||
|
||
REQUIRED_FIELDS = %w[ | ||
aliases | ||
bottle | ||
build_dependencies | ||
caveats | ||
conflicts_with | ||
conflicts_with_reasons | ||
dependencies | ||
deprecation_date | ||
deprecation_reason | ||
desc | ||
disable_date | ||
disable_reason | ||
full_name | ||
homepage | ||
keg_only_reason | ||
license | ||
link_overwrite | ||
name | ||
oldnames | ||
optional_dependencies | ||
options | ||
post_install_defined | ||
pour_bottle_only_if | ||
recommended_dependencies | ||
requirements | ||
revision | ||
ruby_source_checksum | ||
ruby_source_path | ||
service | ||
tap | ||
tap_git_head | ||
test_dependencies | ||
urls | ||
uses_from_macos | ||
uses_from_macos_bounds | ||
variations | ||
version_scheme | ||
versioned_formulae | ||
versions | ||
].freeze | ||
private_constant :REQUIRED_FIELDS | ||
|
||
sig { params(formula_hash: Hash).returns(Hash) } | ||
def slim_hash(formula_hash) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again: don't need this yet but probably want to have tests ensuring that formulae are loaded as expected from this JSON. |
||
hash = formula_hash.slice(*REQUIRED_FIELDS) | ||
|
||
if stable_bottle = hash.dig("bottle", "stable") | ||
stable_bottle["files"] = stable_bottle["files"].transform_values do |file| | ||
# The "url" is not used here and takes up a lot of space. | ||
file.slice("cellar", "sha256") | ||
end | ||
end | ||
|
||
# This array can include empty hashes but they should not be omitted since | ||
# array index position matters here. It corresponds to "uses_from_macos". | ||
uses_from_macos_bounds = hash["uses_from_macos_bounds"] | ||
|
||
hash = Utils::Collection.recursive_compact_hash(hash) | ||
raise ArgumentError, "Invalid formula hash" if hash.nil? | ||
|
||
hash["uses_from_macos_bounds"] = uses_from_macos_bounds if uses_from_macos_bounds | ||
hash | ||
end | ||
end | ||
end | ||
end | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -58,10 +58,14 @@ def generate_formula_api | |
Formulary.enable_factory_cache! | ||
Formula.generating_hash! | ||
|
||
api_json = [] | ||
|
||
tap.formula_names.each do |name| | ||
formula = Formulary.factory(name) | ||
name = formula.name | ||
json = JSON.pretty_generate(formula.to_hash_with_variations) | ||
formula_hash = formula.to_hash_with_variations | ||
json = JSON.pretty_generate(formula_hash) | ||
api_json << API::Formula.slim_hash(formula_hash) | ||
html_template_name = html_template(name) | ||
|
||
unless args.dry_run? | ||
|
@@ -74,6 +78,7 @@ def generate_formula_api | |
raise | ||
end | ||
|
||
File.write("formula.json", JSON.generate(api_json)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure where this file should go. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Location seems fine. Filename could be something like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe they should both go in the same directory? |
||
canonical_json = JSON.pretty_generate(tap.formula_renames.merge(tap.alias_table)) | ||
File.write("_data/formula_canonical.json", "#{canonical_json}\n") unless args.dry_run? | ||
end | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
# typed: true | ||
# frozen_string_literal: true | ||
|
||
module Utils | ||
module Collection | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is expectation that this will be used for Casks too? Obviously no rush yet but: good to have tests for this. |
||
def self.recursive_compact(value) | ||
case value | ||
when Array | ||
recursive_compact_array(value) | ||
when Hash | ||
recursive_compact_hash(value) | ||
else | ||
value | ||
end | ||
end | ||
|
||
sig { params(array: Array).returns(T.nilable(Array)) } | ||
def self.recursive_compact_array(array) | ||
array.map do |value| | ||
recursive_compact(value) | ||
end.compact.presence | ||
end | ||
|
||
sig { params(hash: Hash).returns(T.nilable(Hash)) } | ||
def self.recursive_compact_hash(hash) | ||
hash.transform_values do |value| | ||
recursive_compact(value) | ||
end.compact.presence | ||
end | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this blows up the scope a bit but: I'd instead be tempted to actually have a new
Formula#to_hash
(and Cask equivalent) or pass a versioned argument there and generate an entirely new hash format with e.g. less nesting/repetition that better meets our needs.May be a good time to actually have some sort of stricter versioning or e.g. specifying a minimum Homebrew version in the JSON itself here, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be ideal but I'm a bit worried about backwards compatibility. I'm not sure how many people depend on the output of
brew info --json=v2
though. I guess that could just be json v3 and we could just keep the old format at json v2.Also, changing the format entirely would add a lot of complexity to the load from API methods which rely on the current format. This is especially important for formula that get removed since we'll have to load them from the original JSON definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could have multiple loaders and just have a key like
json_version
that specifies which one we should try and load it with.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a sample of the current slimmed-down JSON representation. Let me know if there's anything that looks like it can be removed and/or simplified.
https://gist.github.com/apainintheneck/882bb75d29394f7bdd5f8b5efe482038
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"versions" -> "head"
and"versions" -> "bottle"
can probably be trimmed."full_name"
and"tap"
too probably"versions" -> "head"
is quite an interesting one. In theory we might support:In practice: does it actually work? Not that it's a concern for Homebrew/core, but maybe something we could make error on a Ruby DSL level if it behaves weirdly. If it just makes it behave like the old
devel
spec then it's probably ok, even if weird.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah in my mind a theoretical tap agnostic format would have been:
Which is basically the same thing you suggest.
The tap_migrations JSON being a separate download is a hack indicative of this problem. All of this is backwards incompatible changes however. We're already going to break backwards compatibility here in terms of old brew versions (given 984dcf8), but this would be a compatibility break in terms of old cached or mirrored JSON data (though we can do what we did when moving to JWS if we want to go this direction).
Though a compatibility layer is also possible without too significant effort.
If we go this direction at some point there would also be room to revive the dependency redo code shelved for a theoretical v2 (which might(?) also help our goal of reducing overall object footprint as it deduplicates some things, but it also adds more subkeys so not necessarily): Bo98@25aec4b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand your comment about 984dcf8. Those changes should be backwards compatible.
I like your idea about the tap agnostic format. I think it should be possible to implement that in a backwards compatible way though. Basically, we could add additional logic to build the
Homebrew::API::Formula.all_formulae
hash so that it stays the same as before minus the missing hash keys despite the changes to theformula.jws.json
file. Whether that's worth the effort is another question entirely. This would mean that the JSON files saved during the installation should hold the same amount of information as before.What was the goal of the dependency redo code you mention here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we'd have two different branches of parsing logic for the old and new
formula.jws.json
files but the same resultingHomebrew::API::Formula.all_formulae
hash if that makes sense.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean if we switch to generate JSONs that strip those previously not-nilable values then older brew versions can't read newer JSON (typically wasn't a concern for git taps, but API downloads can be detached from update.sh).
Idea at the time was to make a formula that does:
which currently produces
be:
but thinking more about more how the Ruby JSON parser works while it might make it physically smaller, it won't necessarily make it more performant so still very much a "to investigate" for the future.
Yeah that was what I was thinking yeah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry, this was my thinking 👍🏻. Might even be something like
vHomebrewCore
or something as well as a separate v3 which could have more information in a nicer structure.Yeh, this is my thinking: we remove anything that we won't use/need in core.
These seem like good things to omit.
I think we should aim to just force a new API download when we release a new enough Homebrew version. I can see an argument for a couple of point releases where we have branching logic here but it's a small enough download that needs refetched/updated often enough I don't think it's worth keeping around for long at all beyond testing/gradual rollout.
It is not 😁