Muddle patch

muddle_patch.py is an independent program in the same directory as the normal muddle program.

Summary

This program was written to address issue 111: Implement “diff build tree against version stamp” and corresponding patch command.

Specifically, given an up-to-date muddle build tree, and the stamp file for a less up-to-date build tree, it attempts to work out the patches to make the latter into a copy of the former. When it can, it tries to do this in such a way that the underlying VCSs will really believe it.

  • In this document, the local (most up-to-date) build tree is “our” build tree, and the other (to be updated) build tree is the “far” build tree.
  • If a checkout is present in our build tree, but absent in the far build tree, then it is NEW, and a copy of the current state of the checkout directory will be transferred. Note that this script will not mark a NEW checkout as checked out (see below).
  • If a checkout is absent in our build tree, but present in the far build tree, then it is DELETED. muddle_patch.py does nothing about this - it is up to the user to delete the checkout at the far end if necessary.
  • If a checkout has changed (just) its revision, then it is CHANGED, and an attempt will be made to determine the differences in the best way possible for the VCS concerned. Again, see below.
  • If a checkout has changed anything else (its location in the build tree, its domain, or the VCS it is using), then it will be reported as a problem, and will not have a patch generated for it. It may be that this is overkill, and some of these should be coped with - if you have a good use-case, please raise it as an issue.

Any checkouts with problems in either stamp file will not be dealt with - in other words, both build trees should really be in a state suitable for muddle stamp version before using muddle_patch.py.

Writing a patch directory

The command line is:

muddle_patch.py write [-f[orce]] <our_stamp> <far_stamp> <patch_dir>

<patch_dir> is the directory to which the patch data will be written. If -force is given, then an existing <patch_dir> will be deleted, otherwise <patch_dir> may not already exist.

<our_stamp> is either a stamp file describing the current build tree, or -. In the latter case, a “virtual” stamp file will be built in memory and used as “our” stamp file. This virtual stamp file is equivalent to that which would be produced by muddle stamp save -f.

<far_stamp> is the stamp file for the far build tree.

muddle_stamp.py write must be run in the top-level directory of “our” build tree - i.e,, the directory containing the .muddle/ and src/ directories. It is not supported to run the program in a sub-domain.

For instance:

muddle_patch.py  write  -f  -  ../far.v8_missing.stamp  ../v8-patches/

Using a patch directory

The command line is:

muddle_patch.py read <patch_dir>

<patch_dir> is the directory containing the patch data.

muddle_stamp.py read must be run in the top-level directory of the “far” build tree - i.e,, the directory containing the .muddle/ and src/ directories. It is not supported to run the program in a sub-domain.

Note

If some part of muddle_patch.py read fails (for instance, if an attempt is made to write a NEW checkout but the target directory already exists) then muddle_patch.py will give up immediately.

The current version of the code does not know how to continue from a partial “read”, so will typically fail if run again (since the first amendment it tries to make will presumably now fail because the relevant patch has already been applied).

The simplest thing to do in this case is probably to edit the MANIFEST.txt file in the <patch_dir> and comment out (using ‘#’) the lines for those checkouts that have already been patched successfully.

Some future version of the program may be more robust.

How it works (or doesn’t)

The MANIFEST.txt file

In the <patch_dir>, muddle_patch.py creates a file called MANIFEST.txt. This is in the traditional INI format, and contains an entry for each checkout described in the <patch_dir>, indicating its VCS and other useful information.

Typical entries look like:

[BZR helpers]
checkout=helpers
directory=None
patch=helpers.bzr_send
old=1
new=3
[TAR v8]
checkout=v8
directory=platform
patch=v8.tgz

There is nothing very magical about this file, and it may sometimes be useful to edit it. Lines starting with ‘#’ or ‘;’ are commented out, and will be ignored. Lines may not start with whitespace.

Entries are of the form:

  • [<vcs> <name>] – where <vcs> is one of BZR, SVN, GIT or NEW, and <name> is the checkout name.
  • checkout=<name> – again, <name> is the checkout name.
  • directory=<subdirectory> – <subdirectory> is either None, or the name of the subdirectory of src/ where the checkout directory may be found (so if the checkout is src/platforms/v8, then <subdirectory> would be platforms).
  • patch=<filename> names the patch file (or, for GIT, the patch directory) in <patch_dir>
  • old=<revision> and new=<revision> give the old and new revision ids (or equivalent) for the two checkouts. “new” corresponds to “our” checkout, and “old” to the “far” checkout. These are not specified for NEW patches.

Changed bzr checkouts

The problem

The original version of this program was tested, and seemed to work as described in How it should be below.

Unfortunately, on retesting with bzr` 2.1.1 and 2.2.1, it no longer seem to work, instead exploding with a bzrlib.errors.NoSuchRevision exception. At time of writing I am still investigating this, in the fond hope that it is something I am doing wrong, rather than a bug in Bazaar (although an exception rather than an error report still seems rather nasty).

Thus for the moment, the program is instead doing What we do instead.

If you wish to change the behaviour back to the original, then edit the muddle_patch.py file, and change:

BZR_DO_IT_PROPERLY = False

to:

BZR_DO_IT_PROPERLY = True

Note

The BZR_DO_IT_PROPERLY flag actually only affects the “write” operation. The “read” operation actually decides what to do based on the extension of the <checkout>.bzr_send or <checkout>.diff file in the patch directory.

How it should be

When writing a changed Bazaar checkout, muddle_patch.py uses:

bzr send

to create a file (in <patch_dir>) containing the differences between the two revisions of the checkout. The file will be named <checkout>.bzr_send, where <checkout> is the muddle checkout name.

When reading the <patch_dir>, muddle_patch.py uses:

bzr merge --pull

to update the checkout. This should produce a result identical to having done an appropriate merge/pull. The use of merge --pull means that Bazaar should attempt to do a pseudo-pull of the changes, and only revert to “merge” behaviour if it has to.

Note

Determine if a bzr commit is still required at this stage.

What we do instead

A simple difference file is generated, using:

bzr diff -p1

to create a file (in <patch_dir>) containing the differences between the two revisions of the checkout. The file will be named <checkout>.diff, where <checkout> is the muddle checkout name.

When reading the <patch_dir>, muddle_patch.py uses:

patch -p1

to update the checkout. This will produce a result similar to that in the “near” checkout, with two important exceptions:

  1. History is not propagated, and
  2. Any new but empty files in the “near” checkout will not be created in the “far” checkout. This is a limitation of the diff/patch sequence we have available.

After the patch has succeeded, it will be necessary to do a bzr add (to catch any new files), followed by a bzr commit to actually commit the result.

If any files have been removed by the patch process, these will have to be manually removed from bazaar.

Changed svn checkouts

When writing a changed Subversion checkout, muddle_patch.py uses:

svn diff

to create a file (in <patch_dir> containing the differences between the two revisions of the checkout.

When reading the <patch_dir>, muddle_patch.py uses:

patch

to update the checkout. This should result in a checkout with the same source file content as wished, but of course the Subversion metadata will not have been changed, so Subversion will think it has a different revision number.

Changed git checkouts

When writing a changed git checkout, muddle_patch.py uses:

git format-patch

to create a diretory (in <patch_dir> containing the differences between the two revisions of the checkout.

When reading the <patch_dir>, muddle_patch.py uses:

git am

to update the checkout.

git am leaves HEAD detached, so this needs fixing – for instance, if one was on branch master, one might do:

git branch post-update-branch   # to reattach HEAD to a branch
git checkout master             # to go back to master, if that's correct
git merge post-update-branch    # to merge in our new stuff to master

Note

The writer of this document is not a git expert, so please treat the above with caution.

New checkouts

When writing a NEW checkout, muddle_patch.py uses tar to create a gzipped tarfile for the checkout source directory, in <patch_dir>. This contains all of the content of that directory, whether under version control or not (and it also includes any “magic” VCS directories, such as .bzr/).

When reading the <patch_dir>, muddle_patch.py simply uses tar to unzip and unpack the checkout in the appropriate place. It will refuse to do this if a directory of the right name already exists (i.e., it does not overwrite an existing directory).

It will not attempt to add the new checkout to any version control system.