Internally, Git is a content-addressable object store — it stores everything as objects (blobs, trees, commits, tags) identified by their SHA-1 hash. Understanding this model demystifies Git and explains why its operations behave as they do.
Git is a content-addressable object store
Git stores 4 types of OBJECTS, each identified by the SHA-1 HASH of its content (prefixed with a short "type size" header):
BLOB → file CONTENTS (just the data, no filename)
TREE → a directory: maps names → blobs (files) and trees (subdirs) + file modes (e.g. regular, executable, symlink)
COMMIT → a snapshot: points to a TREE (the root) + parent commit(s) + author/message
TAG → an annotated tag object (points to another object, usually a commit, with tagger/message metadata)
→ Content-addressable: an object's ID IS the hash of its content (same content = same hash).
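To make the "ID is the hash of the content" idea concrete, here is a minimal sketch of how an object ID can be computed for a SHA-1 repository. The function name git_object_id is hypothetical (it is not part of Git); the sketch assumes the loose-object framing, where Git hashes a header of the form "<type> <size>" followed by a NUL byte, then the raw content.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the SHA-1 object ID for the given object type and raw content.

    Hypothetical helper for illustration: the hash covers
    "<type> <size-in-bytes>\\0" followed by the content itself.
    """
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Same content always yields the same ID, regardless of any filename.
    first = git_object_id("blob", b"hello world\n")
    second = git_object_id("blob", b"hello world\n")
    print(first)            # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
    print(first == second)  # True

    # Changing a single byte yields a completely different ID.
    print(git_object_id("blob", b"hello world!\n") == first)  # False
```

The blob ID can be compared against Git itself: printf 'hello world\n' | git hash-object --stdin should print the same value. Because the filename is not part of the hashed data, two files with identical contents share a single blob.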
