Skip to content

Conversation

@cruessler
Copy link
Contributor

This is a draft that adds ObjectId::Sha256 and Kind::Sha256 to gix-hash. I’m opening this to get early feedback, in particular from a CI run.

Some remarks:

  • It feels wrong to put empty_blob_sha256 and empty_tree_sha256 into oid as the comment says oid is related to Sha1, but I’m unsure how to best resolve this tension.
  • There’s inconsistencies with respect to spelling: SHA256, SHA-256, Sha256.
  • ObjectId::null_sha1 hardcodes 20. Should this be SIZE_OF_SHA1_DIGEST? oid::null_sha1 uses SIZE_OF_SHA1_DIGEST.
  • Kind::len_in_hex and Kind::len_in_bytes also hardcode values.
  • Hardcoded empty trees and empty blobs appear in more than place (for both SHA-1 and SHA-256). I feel like we should have a single source of truth here.
  • Some comments in ObjectId would probably benefit from being rephrased. See e. g. ObjectId::new_sha1 or ObjectId::from_20_bytes.

Sources for the hardcoded values related to SHA-256:

https://github.com/git/git/blob/e85ae279b0d58edc2f4c3fd5ac391b51e1223985/hash.h
https://github.com/git/git/blob/e85ae279b0d58edc2f4c3fd5ac391b51e1223985/hash.c

@Byron
Copy link
Member

Byron commented Dec 11, 2025

You are on fire :)! Let me share some early thoughts, without having looked at the code at all.

  • It feels wrong to put empty_blob_sha256 and empty_tree_sha256 into oid as the comment says oid is related to Sha1, but I’m unsure how to best resolve this tension.

And indeed, &oid is supposed to represent any hash. if empty_blob somehow is hardcoded to always be SHA1, it's a bug and it needs to take a gix-hash::Kind. The empty_something could also be implemented on gix-hash::Kind while at it.

  • There’s inconsistencies with respect to spelling: SHA256, SHA-256, Sha256.

Should we fix them in favor of what Git seems to do: SHA256?

Screenshot 2025-12-11 at 04 59 53

I am definitely open to your suggestions.

  • ObjectId::null_sha1 hardcodes 20. Should this be SIZE_OF_SHA1_DIGEST? oid::null_sha1 uses SIZE_OF_SHA1_DIGEST.

Yes, that sounds like an oversight.

  • Kind::len_in_hex and Kind::len_in_bytes also hardcode values.

Yes, let's use constants where we have them. And since there is already a precedent with SIZE_OF_SHA1_DIGEST we might also have the respective SHA256 versions and use them.

  • Hardcoded empty trees and empty blobs appear in more than place (for both SHA-1 and SHA-256). I feel like we should have a single source of truth here.

I am open to suggestions as to where the empty-* methods should be, but would also not lightheartedly introduce breaking changes. If the new way feels better, we should have it. By now, I always think in terms of gix-hash::Kind::null()/empty_*() by the way, which aligns well with repo.object_hash().…()

  • Some comments in ObjectId would probably benefit from being rephrased. See e. g. ObjectId::new_sha1 or ObjectId::from_20_bytes.

Yes, it's a good time to unify this and make it less 'hardcoded'.

Thanks again 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants