Git Objects

There are four types of Git objects: blobs, trees, commits and tags. For each one, pygit2 has a type, and all four types inherit from the base Object type.

Object lookup

In the previous chapter we learnt about Object IDs. With an Oid we can ask the repository to get the associated object. To do that, the Repository class implements a subset of the mapping interface.

  • Repository.get(key, default=None)
  • Return the Git object for the given id, or the default value if there’s no object in the repository with that id. The id can be an Oid object or a hexadecimal string.

Example:

    >>> from pygit2 import Repository
    >>> repo = Repository('path/to/pygit2')
    >>> obj = repo.get("101715bf37440d32291bde4f58c3142bcf7d8adb")
    >>> obj
    <_pygit2.Commit object at 0x7ff27a6b60f0>

  • Repository.__getitem__(id)
  • Return the Git object for the given id, raise KeyError if there’s no object in the repository with that id. The id can be an Oid object or a hexadecimal string.

  • Repository.__contains__(id)

  • Return True if there is an object in the Repository with that id, False if there is not. The id can be an Oid object or a hexadecimal string.

The Object base type

The Object type is a base type; it is not possible to create instances of it in any way.

It is the base type of the Blob, Tree, Commit and Tag types, so it is possible to check whether a Python value is an Object or not:

    >>> from pygit2 import Object
    >>> commit = repo.revparse_single('HEAD')
    >>> print(isinstance(commit, Object))
    True

All Objects are immutable; they cannot be modified once they are created:

    >>> commit.message = u"foobar"
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: attribute 'message' of '_pygit2.Commit' objects is not writable

Derived types (blobs, trees, etc.) don’t have a constructor, which means they cannot be created with the common idiom:

    >>> from pygit2 import Blob
    >>> blob = Blob("data")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: cannot create '_pygit2.Blob' instances

New objects are created using a specific API that we will see later.

This is the common interface for all Git objects:

  • Object.id
  • The object id, an instance of the Oid type.

  • Object.type

  • One of the GIT_OBJ_COMMIT, GIT_OBJ_TREE, GIT_OBJ_BLOB or GIT_OBJ_TAG constants.

  • Object.short_id

  • An unambiguous short (abbreviated) hex Oid string for the object.

  • Object.read_raw()

  • Returns the byte string with the raw contents of the object.

  • Object.peel(target_type) → Object

  • Peel the current object and return the first object of the given type.

  • Object.__eq__(Object)

  • Return self==value.

  • Object.__ne__(Object)

  • Return self!=value.

  • Object.__hash__()

  • Return hash(self).

Blobs

A blob is just a raw byte string. Blobs are the Git equivalent of files in a filesystem.

This is their API:

  • Blob.data
  • The contents of the blob, a bytes string. This is the same as Blob.read_raw().

Example, print the contents of the .gitignore file:

    >>> blob = repo["d8022420bf6db02e906175f64f66676df539f2fd"]
    >>> print(blob.data.decode())
    MANIFEST
    build
    dist

  • Blob.size
  • Size of the blob in bytes.

Example:

    >>> print(blob.size)
    130

  • Blob.is_binary
  • True if binary data, False if not.

  • Blob.diff([blob, flag, old_as_path, new_as_path])

  • Directly generate a pygit2.Patch from the difference between two blobs.

Returns: Patch.

Parameters:

  • blob : Blob
  • The Blob to diff.
  • flag
  • A GIT_DIFF_* constant.
  • old_as_path : str
  • Treat old blob as if it had this filename.
  • new_as_path : str
  • Treat new blob as if it had this filename.

  • Blob.diff_to_buffer([buffer, flag, old_as_path, buffer_as_path])
  • Directly generate a Patch from the difference between a blob and a buffer.

Returns: Patch.

Parameters:

  • buffer : bytes
  • Raw data for new side of diff.
  • flag
  • A GIT_DIFF_* constant.
  • old_as_path : str
  • Treat old blob as if it had this filename.
  • buffer_as_path : str
  • Treat buffer as if it had this filename.

Creating blobs

There are a number of methods in the repository to create new blobs and add them to the Git object database:

  • Repository.create_blob(data) → Oid
  • Create a new blob from a bytes string. The blob is added to the Git object database. Returns the oid of the blob.

Example:

    >>> id = repo.create_blob(b'foo bar')  # Creates blob from bytes string
    >>> blob = repo[id]
    >>> blob.data
    b'foo bar'

  • Repository.create_blob_fromworkdir(path) → Oid
  • Create a new blob from a file within the working directory. The given path must be relative to the working directory; if it is not, an error is raised.

  • Repository.create_blob_fromdisk(path) → Oid

  • Create a new blob from a file anywhere (no working directory check).

  • Repository.create_blob_fromiobase(io.IOBase) → Oid

  • Create a new blob from an IOBase object.

There are also some functions to calculate the id for a byte string without creating the blob object:

  • pygit2.hash(data) → Oid
  • Returns the oid of a new blob from a string without actually writing to the odb.

  • pygit2.hashfile(path) → Oid

  • Returns the oid of a new blob from a file path without actually writing to the odb.
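
To make it concrete what such an id is, here is a sketch that reproduces the computation with the standard library only. It assumes a SHA-1 repository, where Git hashes a short header (the word "blob", the length and a NUL byte) followed by the contents; `git_blob_id` is a hypothetical helper name, not part of pygit2.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the hex SHA-1 id Git assigns to a blob with these contents."""
    header = b"blob %d\0" % len(data)   # object type, size in bytes, NUL separator
    return hashlib.sha1(header + data).hexdigest()


print(git_blob_id(b""))                # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_id(b"hello world\n"))   # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Because the id depends only on the contents, it can be computed without a repository, which is why no object needs to be written.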

Trees

A tree is a sorted collection of tree entries. It is similar to a folder or directory in a file system. Each entry points to another tree or a blob. A tree can be iterated, and partially implements the sequence and mapping interfaces.

  • Tree.__getitem__(name)
  • Return the TreeEntry object for the given name. Raise KeyError if there is no tree entry with that name.

  • Tree.__contains__(name)

  • Return True if there is a tree entry with the given name, False otherwise.

  • Tree.__len__()

  • Return the number of entries in the tree.

  • Tree.__iter__()

  • Return an iterator over the entries of the tree.

  • Tree.diff_to_tree([tree, flags, context_lines, interhunk_lines, swap]) → Diff

  • Show the changes between two trees.

Parameters:

  • tree: Tree
  • The tree to diff. If no tree is given, the empty tree will be used instead.
  • flags
  • A GIT_DIFF_* constant.
  • context_lines
  • The number of unchanged lines that define the boundary of a hunk (and to display before and after).
  • interhunk_lines
  • The maximum number of unchanged lines between hunk boundaries before the hunks will be merged into one.
  • swap
  • Instead of diffing a to b, diff b to a.

  • Tree.diff_to_workdir([flags, context_lines, interhunk_lines]) → Diff
  • Show the changes between the Tree and the workdir.

Parameters:

  • flags
  • A GIT_DIFF_* constant.
  • context_lines
  • The number of unchanged lines that define the boundary of a hunk (and to display before and after).
  • interhunk_lines
  • The maximum number of unchanged lines between hunk boundaries before the hunks will be merged into one.

  • Tree.diff_to_index(index[, flags, context_lines, interhunk_lines]) → Diff
  • Show the changes between the index and a given Tree.

Parameters:

  • index : Index
  • The index to diff.
  • flags
  • A GIT_DIFF_* constant.
  • context_lines
  • The number of unchanged lines that define the boundary of a hunk (and to display before and after).
  • interhunk_lines
  • The maximum number of unchanged lines between hunk boundaries before the hunks will be merged into one.

Tree entries

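  • TreeEntry.name
  • The name of the entry.

  • TreeEntry.id

  • The id of the object the entry points to.

  • TreeEntry.hex

  • The object id as a hexadecimal string.

  • TreeEntry.filemode

  • The file mode of the entry.

  • TreeEntry.type

  • The type of the object the entry points to.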

  • TreeEntry.__cmp__(TreeEntry)

  • Rich comparison between tree entries.

Example:

    >>> tree = commit.tree
    >>> len(tree)  # Number of entries
    6

    >>> for entry in tree:  # Iteration
    ...     print(entry.id, entry.type, entry.name)
    ...
    7151ca7cd3e59f3eab19c485cfbf3cb30928d7fa blob .gitignore
    c36f4cf1e38ec1bb9d9ad146ed572b89ecfc9f18 blob COPYING
    32b30b90b062f66957d6790c3c155c289c34424e blob README.md
    c87dae4094b3a6d10e08bc6c5ef1f55a7e448659 blob pygit2.c
    85a67270a49ef16cdd3d328f06a3e4b459f09b27 blob setup.py
    3d8985bbec338eb4d47c5b01b863ee89d044bd53 tree test

    >>> entry = tree['pygit2.c']  # Get an entry by name
    >>> entry
    <pygit2.TreeEntry object at 0xcc10f0>

    >>> blob = repo[entry.id]  # Get the object the entry points to
    >>> blob
    <pygit2.Blob object at 0xcc12d0>

Creating trees

  • Repository.TreeBuilder([tree]) → TreeBuilder
  • Create a TreeBuilder object for this repository.

  • TreeBuilder.insert(name, oid, attr)

  • Insert or replace an entry in the treebuilder.

Parameters:

  • attr
  • Available values are GIT_FILEMODE_BLOB, GIT_FILEMODE_BLOB_EXECUTABLE, GIT_FILEMODE_TREE, GIT_FILEMODE_LINK and GIT_FILEMODE_COMMIT.

  • TreeBuilder.remove(name)

  • Remove an entry from the builder.

  • TreeBuilder.clear()

  • Clear all the entries in the builder.

  • TreeBuilder.write() → Oid

  • Write the tree to the given repository.

  • TreeBuilder.get(name) → TreeEntry

  • Return the TreeEntry for the given name, or None if there is no such entry.

Commits

A commit is a snapshot of the working directory with metadata such as the author, the committer and others.

  • Commit.author
  • The author of the commit.

  • Commit.committer

  • The committer of the commit.

  • Commit.message

  • The commit message, a text string.

  • Commit.message_encoding

  • Message encoding.

  • Commit.raw_message

  • Message (bytes).

  • Commit.tree

  • The tree object attached to the commit.

  • Commit.tree_id

  • The id of the tree attached to the commit.

  • Commit.parents

  • The list of parent commits.

  • Commit.parent_ids

  • The list of parent commits’ ids.

  • Commit.commit_time

  • Commit time.

  • Commit.commit_time_offset

  • Commit time offset.

  • Commit.gpg_signature

  • A tuple with the GPG signature and the signed payload.

Signatures

The author and committer attributes of commit objects are Signature objects:

    >>> commit.author
    <pygit2.Signature object at 0x7f75e9b1f5f8>

Signatures can be compared for (in)equality.

  • Signature.name
  • Name.

  • Signature.raw_name

  • Name (bytes).

  • Signature.email

  • Email address.

  • Signature.raw_email

  • Email (bytes).

  • Signature.time

  • Unix time.

  • Signature.offset

  • Offset from UTC in minutes.

Creating commits

  • Repository.create_commit(reference_name, author, committer, message, tree, parents[, encoding]) → Oid
  • Create a new commit object, return its oid.

Commits can be created by calling the create_commit method of the repository with the following parameters:

    >>> from pygit2 import Signature
    >>> author = Signature('Alice Author', 'alice@authors.tld')
    >>> committer = Signature('Cecil Committer', 'cecil@committers.tld')
    >>> tree = repo.TreeBuilder().write()
    >>> repo.create_commit(
    ...     'refs/heads/master',  # the name of the reference to update
    ...     author, committer, 'one line commit message\n\ndetailed commit message',
    ...     tree,  # the Oid of the tree object
    ...     []     # list of Oids of the parents of the new commit
    ... )
    '#\xe4<u\xfe\xd6\x17\xa0\xe6\xa2\x8b\xb6\xdc35$\xcf-\x8b~'

Tags

A tag is a static label for a commit. See references for more information.

  • Tag.name
  • Tag name.

  • Tag.raw_name

  • Tag name (bytes).

  • Tag.target

  • Tagged object.

  • Tag.tagger

  • Tagger.

  • Tag.message

  • Tag message.

  • Tag.raw_message

  • Tag message (bytes).

  • Tag.get_object() → object

  • Retrieves the object the current tag is pointing to.

Creating tags

  • Repository.create_tag(name, oid, type, tagger, message) → Oid
  • Create a new tag object, return its oid.