How Does Git Work Internally?
Blob, Tree, Commit — The Three Git Objects
The claim that 'Git stores diffs' is a myth.
Conceptually, Git stores a full snapshot of every file at every commit. But objects are named by a hash of their content, so identical content always produces the same hash and unchanged files aren't duplicated.
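To make the deduplication concrete, here is a minimal sketch of a content-addressed store. It is a toy, not Git's storage code: `store`, `put`, and `snapshot` are hypothetical names, and it hashes the raw content only, which is a simplification of what Git actually hashes.

```python
import hashlib

# Toy content-addressed store: the key is a hash of the content itself.
# Simplification: Git hashes a small header plus the content, not the raw bytes.
store = {}

def put(content: bytes) -> str:
    """Store content under its SHA-1 and return that key."""
    key = hashlib.sha1(content).hexdigest()
    store[key] = content  # storing identical content again changes nothing
    return key

def snapshot(files: dict) -> dict:
    """Record a full snapshot: every filename mapped to its content key."""
    return {name: put(data) for name, data in files.items()}

# Two full snapshots; only main.py differs between them.
snap1 = snapshot({"README": b"hello\n", "main.py": b"print(1)\n"})
snap2 = snapshot({"README": b"hello\n", "main.py": b"print(2)\n"})

print(len(store))                           # distinct contents actually stored
print(snap1["README"] == snap2["README"])   # unchanged file shares one key
```

Both snapshots list every file, yet the store holds one entry per distinct content, so the unchanged README costs nothing the second time.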
The three objects
Blob: File content. Identified by the SHA-1 of a short header (`blob <size>` plus a NUL byte) followed by the content. The filename is not part of the hash, so identical content is always one blob.
Tree: Directory. Maps each filename -> file mode + blob hash, and each subdirectory -> tree hash.
Commit: Top-level tree hash + parent commit hashes + author + committer (each with a timestamp) + message. That's all.
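The three object types can be sketched in a few lines of Python. This is an illustrative reconstruction of how object IDs are derived, assuming SHA-1 repositories. The function names (`hash_object`, `blob`, `tree`, `commit`) and the author string are hypothetical, and the tree sort is simplified (Git orders directory entries slightly differently).

```python
import hashlib

def hash_object(kind: str, body: bytes) -> str:
    """SHA-1 over '<kind> <size>\\0' followed by the body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

def blob(content: bytes) -> str:
    """A blob is just file content; no filename is involved."""
    return hash_object("blob", content)

def tree(entries: dict) -> str:
    """entries maps name -> (mode, hex hash of a blob or subtree).

    Each entry is '<mode> <name>\\0' plus the 20 raw hash bytes.
    Simplification: plain sort by name, which is fine for files only.
    """
    body = b""
    for name in sorted(entries):
        mode, hex_hash = entries[name]
        body += f"{mode} {name}\0".encode() + bytes.fromhex(hex_hash)
    return hash_object("tree", body)

def commit(tree_hash: str, parents: list, who: str, message: str) -> str:
    """A commit is text: tree, parents, author, committer, blank line, message."""
    lines = [f"tree {tree_hash}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return hash_object("commit", "\n".join(lines).encode())

empty = blob(b"")  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the well-known empty-blob ID
readme = blob(b"hello\n")
root = tree({"README": ("100644", readme)})
# Hypothetical identity and timestamp, for illustration only.
first = commit(root, [], "A U Thor <author@example.com> 0 +0000", "initial\n")
print(readme, root, first, sep="\n")
```

Because each layer embeds the hashes of the layer below, changing one byte in a file changes its blob hash, then every tree above it, then the commit.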
Branches are just pointers
.git/refs/heads/main is a text file with one commit hash in it. Creating a branch means creating one file. That's why Git branches are cheap.
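HEAD is another pointer that tracks which branch you're currently on. Normally .git/HEAD holds the name of a ref (for example `ref: refs/heads/main`) rather than a commit hash.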
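The pointer chain is easy to mimic with plain files. The sketch below builds a throwaway directory shaped like `.git` and resolves HEAD by hand. `resolve_head` and `create_branch` are hypothetical helpers, the commit hash is a placeholder, and the sketch assumes loose ref files only (it ignores packed refs).

```python
import tempfile
from pathlib import Path

def resolve_head(git_dir: Path) -> str:
    """Follow HEAD to a commit hash."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Usual case: HEAD names a branch file, which holds the hash.
        return (git_dir / head[len("ref: "):]).read_text().strip()
    return head  # detached HEAD: the file holds a commit hash directly

def create_branch(git_dir: Path, name: str, commit_hash: str) -> None:
    """Creating a branch is writing one small file."""
    ref = git_dir / "refs" / "heads" / name
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(commit_hash + "\n")

# Demo in a temporary directory; "aaaa..." is a placeholder, not a real commit.
git_dir = Path(tempfile.mkdtemp()) / ".git"
create_branch(git_dir, "main", "a" * 40)
(git_dir / "HEAD").write_text("ref: refs/heads/main\n")

# Branching off the current commit copies 40 hex characters, nothing more.
create_branch(git_dir, "feature", resolve_head(git_dir))
print(resolve_head(git_dir))
```

No file contents are copied when the branch is created, which is the whole reason branching is cheap.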
How It Works
`git add` -> creates blob from file content (.git/objects/) and records its hash in the index (.git/index)
`git commit` -> builds tree from index, then commit = tree + parent + message
Branch pointer (.git/refs/heads/xxx) updated to new commit hash
`git log` walks the chain backwards: HEAD -> commit -> parent -> parent...
`git cat-file -p <hash>` lets you inspect any object directly
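The steps above can be tied together in a small in-memory model. This is a teaching sketch, not Git's implementation: `ToyRepo` and its methods are hypothetical, trees are stored as plain text rather than Git's binary format, commits omit author data, and only a single parent is followed.

```python
import hashlib

class ToyRepo:
    def __init__(self):
        self.objects = {}            # hash -> (kind, body); stands in for .git/objects/
        self.index = {}              # filename -> blob hash; stands in for the index
        self.refs = {"main": None}   # branch name -> commit hash
        self.head = "main"           # HEAD names the current branch

    def _store(self, kind: str, body: bytes) -> str:
        key = hashlib.sha1(f"{kind} {len(body)}\0".encode() + body).hexdigest()
        self.objects[key] = (kind, body)
        return key

    def add(self, name: str, content: bytes) -> None:
        """Like `git add`: write a blob, record it in the index."""
        self.index[name] = self._store("blob", content)

    def commit(self, message: str) -> str:
        """Like `git commit`: tree from index, commit object, move the branch."""
        listing = "".join(f"{h} {n}\n" for n, h in sorted(self.index.items()))
        tree = self._store("tree", listing.encode())
        parent = self.refs[self.head]
        lines = [f"tree {tree}"]
        if parent:
            lines.append(f"parent {parent}")
        lines += ["", message]
        new = self._store("commit", "\n".join(lines).encode())
        self.refs[self.head] = new   # the branch pointer now names the new commit
        return new

    def log(self) -> list:
        """Like `git log`: start at HEAD's commit and follow parent links."""
        messages, current = [], self.refs[self.head]
        while current:
            _, body = self.objects[current]
            header, _, message = body.decode().partition("\n\n")
            messages.append(message)
            parents = [l.split()[1] for l in header.splitlines() if l.startswith("parent ")]
            current = parents[0] if parents else None
        return messages

    def cat_file(self, key: str) -> str:
        """Like `git cat-file -p`: show the stored body of any object."""
        return self.objects[key][1].decode()

repo = ToyRepo()
repo.add("a.txt", b"one\n")
repo.commit("first")
repo.add("a.txt", b"two\n")
tip = repo.commit("second")
print(repo.log())          # newest first
print(repo.cat_file(tip))  # tree line, parent line, blank line, message
```

Nothing in `log` needs a list of commits stored anywhere. History is recovered purely by following each commit's parent hash.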