Distributions, licenses and muddle distribute

What we want to do

We want to be able to:

  • Distribute build trees to other users, without their needing to clone stuff from the build’s repositories.
  • Indicate the licensing for checkouts, and thus
  • Distribute partial build trees, especially partial build trees whose content is determined by that licensing.
  • In particular, distribute partial build trees that satisfy GPL compliance rules, but are still maximally useful as build trees.

The muddle distribute command

The muddle distribute command is intended to help with the following use cases:

  1. I want to distribute all of my source code as an archive (typically, a .tgz (gzipped TAR file) or .zip file). How do I extract the relevant directories?
  2. I want to prepare a binary release for passing to someone.
  3. I want to prepare a distribution where some packages are distributed as source code, some as binary, and some not at all. This should be doable by:
    1. specifying checkouts and packages explicitly, or
    2. specifying checkout licenses, and selecting based on licenses.

That third case was actually the driving force behind the development of muddle distribution support, as we wanted to be able to produce a build tree that contains source code for any open-source packages and binary blobs for proprietary packages whose source code is private, and that omits entirely any packages that are private (typically for proprietary reasons).

The desired intent is that, whatever means (1 through 3) is used, the result should be a functioning muddle build tree. Clearly this might not always be possible for some of the more extreme values of case 3 (if too much of the build tree is not distributed because it is private), but even in such cases it should ideally be possible to write the build description(s) in such a manner that something useful can be built.

(For instance, GPL v2 says “For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable.” (I hope that counts as fair quoting!). At the individual checkout/package level, the Makefiles are simple to distribute. Quite often the “scripts used to ...” for the entire system, though, are indeed those used to build the system, but don’t actually function outside the original context in which they were used. We’d like to make it possible, in this sort of situation, to distribute muddle build trees that actually work and produce something useful.)

Warning

The muddle distribute command is new, and the details of its operation, and especially of the -no-muddle-makefile and -copy-vcs switches, may change.

Note

The muddle distribute _deployment command is bundled as part of muddle distribute, but works rather differently, and does not distribute any part of the source tree. See Deployment distribution below.

What is a distribution?

A distribution is two things:

  1. a copy of all or part of the muddle build tree (the result of distributing)
  2. a description of what to copy, for this purpose

The muddle distribute command uses an instance of (2) to produce (1).

Distributions are identified by a name, and each is also tagged with which license categories it can distribute.

The name is just a string. As normal, we reserve names starting with underscore (_) for muddle itself to define. These are termed “standard distributions”, and new standard distributions may be introduced as time goes on, so avoid defining new distributions with names starting with an underscore.

Standard distributions include:

  • _source_release - discussed in Case 1: Source distribution
  • _binary_release - discussed in Case 2: Binary distribution
  • _for_gpl - discussed in Case 3: The other standard distributions
  • _all_open - ditto
  • _by_license - ditto

License categories are the broad classification of license type, and are discussed below.

How does it work? (you can safely ignore this bit)

More-or-less as you might expect, with labels and tags and actions.

For those checkouts or packages that are to be distributed, a rule is added to “build” label checkout:<name>/distributed from checkout:<name>/checked_out using the DistributeCheckout action, or package:<name>{<role>}/distributed from package:<name>{<role>}/postinstalled using the DistributePackage action.

Note

Technically doing a DistributePackage of just the obj directories doesn’t require that the package(s) be /postinstalled, as /built would suffice. I don’t think that’s worth worrying about at the moment, especially as muddle always builds through to /postinstalled if at all possible (i.e., that’s what muddle build does, and see what it says in muddle help build about depending on /postinstalled being the norm).

For source and binary releases (cases 1 and 2) the muddle distribute command itself organises this. For case 3, appropriate code in the build description causes the rules to be added, or sets the conditions that allow muddle distribute to do it.

You won’t, however, see any /distributed tags in the .muddle tags directories, as the /distributed tag is treated as “transient” - i.e., it doesn’t get recorded on disk. This avoids problems if one (for instance) does a source distribution with VCS to one directory, and then without to another, followed by a partial distribution to yet another - it would not be clear what the /distributed tags meant, and it would always be necessary to clear them before each muddle distribute.

Source distribution

In a simple build tree, where there are no subdomains, a .tgz file can be produced with a command like:

$ tar -zcvf archive.tgz build/src build/.muddle build/versions

where build is the directory containing the build tree.

If VCS directories (.git/ and the like) are not wanted, then something like:

$ tar --exclude-vcs -zcvf archive.tgz build/src build/.muddle build/versions

may suffice.

However:

  1. Both of those copy over the entirety of the .muddle directory, not just the parts needed (so one would typically get tags set for packages indicating that they have been built, which will not be true of the archived build tree).
  2. If there are subdomains, the command would need to be rather more complex. And if you’re doing something weird, like having a VCS directory (perhaps .git) for the tree as stored in the local repositories, but also a historical VCS directory (perhaps .bzr) that came with the checkout when it was imported, then it all gets rather difficult.

So instead the standard distribution “_source_release” is provided. The command:

$ pushd build
$ muddle distribute -with-version _source_release ../release_1.0
$ popd
$ tar -zcvf release_1.0.tgz release_1.0

will copy the sources to a directory called release_1.0, and then archive that directory as release_1.0.tgz.

Specifically, it will copy:

  • each checkout directory, including the build description, in all domains
  • the versions directory, if it exists
  • the necessary parts of the .muddle directory:
    • the Description, RootRepository and VersionsRepository files
    • the tags/checkout directory and the tags for the copied checkouts

It does not copy VCS “special” files (as determined by the build description, .git, .gitignore, etc.) for the checkouts, or for the versions directory.
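The effect of leaving out the VCS “special” files can be illustrated with a short standard-library sketch. This is not muddle’s own code: the function name copy_without_vcs and the fixed VCS_PATTERNS list are assumptions made for the illustration, whereas muddle works out the real set of “special” files from the build description.

```python
import shutil

# Assumed list of VCS "special" names, for illustration only.
# Muddle determines the real set from the build description.
VCS_PATTERNS = ('.git', '.gitignore', '.gitmodules',
                '.bzr', '.bzrignore', '.svn', '.hg', '.hgignore')


def copy_without_vcs(source_dir, target_dir):
    """Copy source_dir to a new target_dir, leaving out VCS "special" files.

    target_dir must not already exist.
    """
    shutil.copytree(source_dir, target_dir,
                    ignore=shutil.ignore_patterns(*VCS_PATTERNS))
```

Calling copy_without_vcs('build/src/zlib', '../release_1.0/src/zlib') on a git checkout would then give a copy with no .git directory and no .gitignore file.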

If the versions directory is not wanted, then the -with-versions switch can be left off.

If the VCS “special” files in the checkouts (and versions directory) are wanted, then the -with-vcs switch should be specified:

$ muddle distribute -with-version -with-vcs _source_release ../release_1.0.vcs

In either case, because the necessary parts of the .muddle directory have been copied, it should be possible to build the build tree in the normal manner:

$ cd release_1.0
$ muddle

Binary distribution

Sometimes someone wants a binary release. A binary release is taken to be the “end product” of doing a muddle build, and is thus the content of the install directories, plus the build descriptions and muddle Makefiles (so the recipient hopefully has enough context to muddle deploy if necessary).

For this purpose, the standard distribution “_binary_release” is provided. The command:

$ muddle distribute _binary_release ../release_1.0-bin

performs a binary release. It copies:

  • the install/<role> directory for each package (this is, of course, the same directory for all packages in the same role)

  • the build description (i.e., its checkout) for the top-level and for any subdomains that have distributed packages (and any “intermediate” subdomains - i.e., if we distribute package:(a(b))name/* then we will distribute the build description for subdomain “a(b)”, but also for “a”).

  • the necessary parts of the .muddle directory:

    • the Description, RootRepository and VersionsRepository files
    • the tags/package directory, and the tags for each package and its role therein
    • the tags/checkout directory and the tags for the checkouts used by the distributed packages
  • the muddle Makefile (alone in its source checkout directory) for each package.

    For instance, if package:zlib{x86}/* is built from checkout:zlib-3.0/checked_out, using muddle Makefile Makefile.muddle in src/libs/zlib-3.0, then the user will receive a directory called src/libs/zlib-3.0 that just contains Makefile.muddle.

    Note that this is not always enough to allow deployment, and sometimes other support files will need to be (explicitly) added via the build description. See Adding extra checkout files in a binary distribution below.
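The rule about “intermediate” subdomains in the list above can be sketched as a small function. It assumes only the domain naming convention shown in the text (“a(b)” is subdomain “b” of “a”); the function name domains_to_distribute is hypothetical and is not part of muddle.

```python
def domains_to_distribute(domain):
    """Return the given subdomain and every domain that encloses it.

    The top-level domain is represented by the empty string, and adds
    nothing to the list (its build description is distributed anyway).
    """
    if not domain:
        return []
    # 'a(b(c))' -> ['a', 'b', 'c']
    parts = domain.replace(')', '').split('(')
    result = []
    for depth in range(1, len(parts) + 1):
        # Rebuild 'a', then 'a(b)', then 'a(b(c))', and so on
        result.append('('.join(parts[:depth]) + ')' * (depth - 1))
    return result
```

So a package in subdomain “a(b)” needs the build descriptions for “a” and for “a(b)”, as well as the top-level one.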

As you might guess, if the -with-version switch is given, then it will also distribute the versions directory, if it exists.

Also, if -with-vcs is specified, the VCS “special” files for the build descriptions and the versions directory (if distributed) will also be included.

Deployment distribution

Sometimes, a customer just wants the content of the deploy/ directory. This can happen because they are not set up to deal with muddle, or because they really want to have just the results of a build, quickly.

In some situations, it would then be sufficient just to zip up the deploy/ directory directly, and give that to the customer, but it is better practice to also include a stamp file indicating the status of the source tree from which the deployment was generated, and also some indication of what command was used to build the distribution.

Thus muddle distribute _deployment is provided as a special case. It does not work with the same mechanisms as the other distribute commands (which, in particular, means that none of the switches are relevant). Instead:

muddle distribute _deployment <target_directory> <labels>

acts more or less as if the user had done:

  • muddle deploy <labels>
  • mkdir -p <target_directory>
  • cp -a deploy <target_directory>
  • muddle stamp save <target_directory>/`muddle query name`.stamp
  • echo "muddle distribute _deployment <target_directory> <labels>" > <target_directory>/MANIFEST.txt

This results in a <target_directory> which contains:

  1. A copy of the deploy directory
  2. A version stamp
  3. A MANIFEST.txt file, reproducing the command line used

which the command user can then zip or tar up as appropriate.
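The copy and MANIFEST.txt steps above can be sketched in Python as follows. This is only an approximation of what has been described: the function name write_deployment is hypothetical, the exact content of MANIFEST.txt is assumed, and the muddle deploy and muddle stamp save steps are left out because they need muddle itself.

```python
import os
import shutil


def write_deployment(build_root, target_dir, labels):
    """Copy <build_root>/deploy into target_dir and record the command used.

    Assumes the deployment has already been done, so that the deploy
    directory exists. Returns the command line written to MANIFEST.txt.
    """
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    # Equivalent of 'cp -a deploy <target_directory>'
    shutil.copytree(os.path.join(build_root, 'deploy'),
                    os.path.join(target_dir, 'deploy'))
    command = 'muddle distribute _deployment %s %s' % (target_dir,
                                                       ' '.join(labels))
    with open(os.path.join(target_dir, 'MANIFEST.txt'), 'w') as manifest:
        manifest.write(command + '\n')
    return command
```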

Note

It is recommended that the user ensures that the deploy directory is created from scratch (so it doesn’t have any previous build artefacts left in it). The recommended way is either to make a new muddle init in a clean directory, or to do a muddle veryclean, before using this command.

So, for instance:

$ muddle veryclean
$ muddle distribute _deployment ../deploy-20130902 _all
$ cd ..
$ tar -zcf deploy-20130902.tgz deploy-20130902

Filtering the labels distributed

Sometimes it is not desirable to distribute all of the build tree. For instance, only one deployment might be appropriate to distribute, or only a subset of packages and checkouts.

If the muddle distribute command line ends with one or more labels, then they will be used to determine a “filter” for the distribution.

If a deployment label is given, then all packages in that deployment will be added to the filter, and also all checkouts used directly by each of those packages.

If a package label is given, then that package and all checkouts used directly by it will be added to the filter.

Finally, if a checkout label is given, then that checkout will be added to the filter.

In all cases, the actual label tag is ignored.

If a filter (i.e., a set of package and checkout labels) has been specified, then after the labels for the distribution, taken from the entire build tree, have been determined, they will be compared to the filter set, and only those labels which occur in the filter set will be kept in the distribution. Note that this filtering is done before build descriptions are added.

So, for instance:

$ muddle distribute _binary_release ../release_fred-bin deployment:fred

will only consider packages and checkouts in deployment fred for distribution.
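The filtering rules in this section amount to building a set of labels and then intersecting it with the labels chosen for the distribution. The sketch below models labels as plain strings and takes the build tree information as two dictionaries; the function names and both dictionaries are hypothetical stand-ins, not muddle’s data structures.

```python
def expand_filter(filter_labels, packages_in_deployment, checkouts_used_by):
    """Turn the labels from the command line into a filter set.

    packages_in_deployment maps a deployment label to its package labels.
    checkouts_used_by maps a package label to the checkouts it uses directly.
    """
    wanted = set()
    for label in filter_labels:
        label = label.split('/', 1)[0]      # the label tag is ignored
        kind = label.split(':', 1)[0]
        if kind == 'checkout':
            wanted.add(label)
            continue
        if kind == 'deployment':
            packages = packages_in_deployment.get(label, [])
        else:                               # a package label
            packages = [label]
        for package in packages:
            wanted.add(package)
            wanted.update(checkouts_used_by.get(package, []))
    return wanted


def apply_filter(distribution_labels, wanted):
    """Keep only those distribution labels that occur in the filter set."""
    return [label for label in distribution_labels if label in wanted]
```

With deployment fred containing just package app in role x86, built from checkout app-1.0, a filter of deployment:fred keeps package:app{x86} and checkout:app-1.0 and drops everything else.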

Basic setting up of Licenses

Any piece of source code has a license, whether implicit or explicit. Muddle does not try to capture the whole essence of what a “license” means, but simply allows the association of a checkout with a license name and some basic properties.

DISCLAIMER: The allocation of licenses to particular checkouts within a build description is purely for programmatic purposes for use with the “muddle” build tool, and is not necessarily intended to reflect the actual licensing state of a particular checkout - for that, please consult the checkout source code directly.

The licenses module

Stuff to do with licensing is conveniently packaged in the muddled.licenses module, so one might typically do:

from muddled.licenses import set_license

(and so on for any other items needed from it), or just:

import muddled.licenses

or:

import muddled.licenses as licenses

The muddle documentation command can be used to get docstrings from the source code - for instance:

muddle doc licenses
muddle doc licenses.LicenseLGPL
muddle doc licenses.set_license

License categories

For this purpose, licenses come in five broad categories:

  • gpl. This category includes all of the FSF GPL licenses.

    For our purposes, they are distinguished by two properties:

    1. There is a requirement to be able to produce the source code on request. That “source code” includes the checkout itself and also the build infrastructure needed to build it (although there is no requirement that this must necessarily work outside the build-as-a-whole).

    2. Some (but not all) GPL licenses “propagate” to other checkouts.

      This means that if package:A is built from GPL checkout:A, and package:B links with package:A, then checkout:B may also need to be distributed.

      There are two considerations here, though:

      1. Some GPL-licensed checkouts will have licenses with exemptions that specifically allow certain sorts of linking, without causing this GPL “propagation”. The best known example is probably the C libraries, which one may explicitly link against, and whose header files can be freely included.

        To allow for this, when defining a GPL license, muddle allows the license to be marked with or without such exemption. There are several examples in the standard licenses - for instance, GPL-2.0-with-GCC-exception.

      2. When muddle is told that package:B depends on package:A, there may be several reasons for this, not all of which trigger GPL “propagation”.

        It may be that package:B depends on a file having been created by package:A, and that file itself is not GPL. Or the checkout for package:A may be LGPL, and package:B is linking against package:A dynamically (which is allowed), and not statically.

        A function is thus provided to allow a build description to say that this is in fact the case.

      In either case, note that muddle is not trying to describe the fine details of how GPL licenses work, and nor is the above meant to be an authoritative explanation. It is still up to the individual writing the build description to use the facilities available to do what is required by law.

  • open-source. This category includes all open-source licenses which are not GPL-like - i.e., they do not require source code distribution, nor do they propagate.

  • prop-source. Proprietary source. This is source code that is proprietary, but still distributed as source. Examples might include /etc files, configuration files, and Python scripts.

  • binary. This category is used for checkouts (and thus packages using them) that are only to be distributed as binary.

    Note that this will still (normally) distribute the appropriate muddle Makefile, and possibly other user-nominated files.

  • private. This category is used for checkouts (and thus packages using them) that should not, in general, be distributed. Some provision is also made for not distributing the build description for such packages, as well.

There is also a “meta” type, not licensed. Any checkout that has not had its license specified is regarded as not licensed. In many cases, such checkouts will be treated as if they had open-source licenses.

Standard licenses

Muddle predefines some of the more common licenses, and gives them useful short names (such as “GPL-3.0”, “BSD-3-Clause” and “MPL-2.0”). A list of the standard licenses can be printed out with:

$ muddle query licenses

For instance:

$ muddle query licenses
Standard licenses are:

GPL-2.0                          LicenseGPL('GPL', version='v2.0 only')
GPL-2.0+                         LicenseGPL('GPL', version='v2.0 or later')
GPL-2.0-linux                    LicenseGPL('GPL', version='v2.0', with_exception=True)
GPL-2.0-with-GCC-exception       LicenseGPL('GPL with GCC Runtime Library exception', version='v2.0', with_exception=True)
GPL-2.0-with-autoconf-exception  LicenseGPL('GPL with Autoconf exception', version='v2.0', with_exception=True)
GPL-2.0-with-bison-exception     LicenseGPL('GPL with Bison exception', version='v2.0', with_exception=True)
GPL-2.0-with-classpath-exception LicenseGPL('GPL with Classpath exception', version='v2.0', with_exception=True)
GPL-2.0-with-font-exception      LicenseGPL('GPL with Font exception', version='v2.0', with_exception=True)
GPL-3.0                          LicenseGPL('GPL', version='v3.0 only')
GPL-3.0+                         LicenseGPL('GPL', version='v3.0 or later')
GPL-3.0-with-GCC-exception       LicenseGPL('GPL with GCC Runtime Library exception', version='v3.0', with_exception=True)
GPL-3.0-with-autoconf-exception  LicenseGPL('GPL with Autoconf exception', version='v3.0', with_exception=True)
GPL-3.0-with-bison-exception     LicenseGPL('GPL with Bison exception', version='v3.0', with_exception=True)
GPL-3.0-with-classpath-exception LicenseGPL('GPL with Classpath exception', version='v3.0', with_exception=True)
GPL-3.0-with-font-exception      LicenseGPL('GPL with Font exception', version='v3.0', with_exception=True)
LGPL-2.0                         LicenseLGPL('Lesser GPL', version='v2.0 only')
LGPL-2.0+                        LicenseLGPL('Lesser GPL', version='v2.0 or later')
LGPL-2.1                         LicenseLGPL('Lesser GPL', version='v2.1 only')
LGPL-2.1+                        LicenseLGPL('Lesser GPL', version='v2.1 or later')
LGPL-3.0                         LicenseLGPL('Lesser GPL', version='v3.0 only')
LGPL-3.0+                        LicenseLGPL('Lesser GPL', version='v3.0 or later')

APL-1.0                          LicenseOpen('Adaptive Public License', version='1.0')
Apache-2.0                       LicenseOpen('Apache', version='2.0')
Artistic-2.0                     LicenseOpen('Artistic License', version='2.0')
BSD-2-Clause                     LicenseOpen('BSD 2-clause "Simplified" or "FreeBSD"')
BSD-3-Clause                     LicenseOpen('BSD 3-clause "New" or "Revised" license')
BSD-4-Clause                     LicenseOpen('BSD 4-clause "Original" license ("with advertising")')
BSL-1.0                          LicenseOpen('Boost Software License', version='1.0')
CDDL-1.0                         LicenseOpen('Common Development and Distribution License')
EPL-1.0                          LicenseOpen('Eclipse Public License', version='1.0')
IPA                              LicenseOpen('IPA Font License')
Libpng                           LicenseOpen('libpng license')
MIT                              LicenseOpen('MIT License')
MPL-1.1                          LicenseOpen('Mozilla Public License', version='1.1')
MPL-2.0                          LicenseOpen('Mozilla Public License', version='2.0')
OFL-1.1                          LicenseOpen('Open Font License', version='1.1')
OSL-3.0                          LicenseOpen('Open Software License', version='3.0')
Python-2.0                       LicenseOpen('Python License', version='2.0')
QPL-1.0                          LicenseOpen('Q Public License', version='1.0')
UKOGL                            LicenseOpen('UK Open Government License')
Zlib                             LicenseOpen('zlib license')

Proprietary                      LicenseProprietarySource('Proprietary Source')

Private                          LicensePrivate('Private')
CODE NIGHTMARE GREEN             LicensePrivate('Code Nightmare Green')

The list of standard licenses has been inspired by some of the content at the Open Source Initiative, specifically by the lists at http://opensource.org/licenses/alphabetical and http://opensource.org/licenses/category.

The key for each license is its SPDX short-form identifier, as described at http://www.spdx.org/licenses/.

Licensing a checkout

There are several ways to assign a license to a checkout.

The first and simplest is directly:

set_license(builder, co_label, license)

In this call, license is either the short name for a standard license, or an instance of a License subclass (see Creating a new license below).

co_label is the label for the checkout to which this license should apply.

Note

For convenience, we also allow the name of a checkout to be used in the call of set_license. This seems worth it because this is the function that is used most often in setting up licenses, and constructing labels just for this purpose can be a chore. However, remember that using a checkout name does not allow specifying a subdomain (for those who are using subdomains).

It may be simpler to specify a license for more than one checkout at the same time. For this, you can use:

set_license_for_names(builder, co_names, license)

license is the same as before, but co_names is a sequence of checkout names. This is normally all that’s needed in a build description, but note that it does mean that the checkouts cannot be in a different domain.

So, for instance:

set_license(builder, checkout('zlib'), 'Zlib')

set_license_for_names(builder, ['docs', 'specs'],
                      LicenseOpen('Creative Commons Attribution 2.0'
                                  ' UK: England & Wales License.'))

Some licenses, notably the BSD licenses, require that a file containing the license be distributed, even in binary distributions. In this case, the checkout license is set with:

set_license(builder, co_label, license, license_file)

where license_file is the name of the file (its path relative to the checkout directory). For instance:

set_license(builder, 'strace-4.5.20', 'BSD-3-Clause', license_file='COPYRIGHT')

This implicitly adds the license file COPYRIGHT as a required file for all distributions of checkout strace-4.5.20.

Note

You may feel that you need to indicate the appropriate license file for each GPL licensed checkout as well. The GPL licenses require clear indication of the license being used in a binary distribution, and for many purposes the indication in the build description may be enough. However, for absolute safety you might want to force distribution of the license file, which is normally in a file called “COPYING”.

(Clearly this is not a problem with a source distribution.)

Creating a new license

There are license classes corresponding to each of the license categories. They are all subclasses of License, but License itself is not intended to be used directly.

  • LicenseGPL(name, with_exception=False)

    This creates a GPL license.

    If with_exception is True, then the license does not automatically “propagate” to other packages/checkouts that depend on it.

  • LicenseLGPL(name, with_exception=False)

    This creates an LGPL license. If with_exception is True, then the license does not automatically “propagate” to other packages/checkouts that depend on it.

    The default, however, is still False because muddle cannot tell whether “depends on” means linking to dynamically (which would be OK), statically (which would cause “propagation”), or not at all (which would also be OK).

  • LicenseOpen(name)

  • LicenseProprietarySource(name)

  • LicenseBinary(name)

  • LicensePrivate(name)

All License subclass instances have the following values and methods:

  • category is a string containing the category, one of gpl, open-source, prop-source, binary or private.

  • version is None or a string containing the license version.

    All licenses can be given a version, using the version=<string> argument when creating the instance.

    If no version is specified for a license, then it has no specific version, and the version will not be shown when it is printed.

    Most, but not all, of the standard licenses have versions.

  • is_gpl() returns True if this license has category gpl.

  • is_lgpl() returns True if this license has category gpl and is LGPL. This is true for the LicenseLGPL class.

  • is_open() returns True if this license has category open-source or gpl.

  • is_open_not_gpl() returns True if this license has category open-source (but not if it has category gpl)

  • is_proprietary_source() returns True if this license has category prop-source.

  • is_binary() returns True if this license has category binary

  • is_private() returns True if this license has category private

  • propagates() returns True if the license “propagates” to other checkouts. This will be True for licenses in category gpl which have with_exception set to False.

  • distribute_as_source() returns True for licenses that relate to source code distribution - so gpl, open-source and prop-source.

  • copy_with_version(version) returns a copy of the license object, but with its version set to the given string.

So, for instance:

top_secret = LicensePrivate('Top Secret', version='3.14')
only_works_on_my_machine = LicensePrivate('Local')
cc_asa = LicenseOpen('Creative Commons Attribution-ShareAlike'
                     ' Unported License', version='3.0')

Note that str of a License gives back the license name and version (if any):

>>> str(top_secret)
'Top Secret 3.14'

whilst repr will give back how it was created:

>>> repr(top_secret)
"LicensePrivate('Top Secret', version='3.14')"

The standard licenses are stored in a dictionary:

licenses.standard_licenses

Please do not alter the contents of this dictionary.

Obviously the standard licenses do not include all versions of even the common licenses, and sometimes a checkout is licensed with an older license. A variant version of an existing standard license can be produced using the copy_with_version method - for instance:

>>> from muddled.licenses import standard_licenses
>>> mpl11 = standard_licenses['MPL-1.1']
>>> mpl10 = mpl11.copy_with_version('1.0')
>>> print mpl10
Mozilla Public License 1.0
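To make the behaviour described in this section concrete, here is a small stand-in model. It is emphatically not the muddled.licenses source: it reuses the class names only so that repr gives the same text as the examples above, it covers just three of the classes, and every detail of the implementation is an assumption based on the description.

```python
import copy


class License(object):
    """Stand-in base class. Not intended to be used directly."""
    category = None

    def __init__(self, name, version=None):
        self.name = name
        self.version = version

    def __str__(self):
        # The version is only shown if there is one
        if self.version is None:
            return self.name
        return '%s %s' % (self.name, self.version)

    def __repr__(self):
        if self.version is None:
            return '%s(%r)' % (type(self).__name__, self.name)
        return '%s(%r, version=%r)' % (type(self).__name__,
                                       self.name, self.version)

    def propagates(self):
        return False

    def distribute_as_source(self):
        return self.category in ('gpl', 'open-source', 'prop-source')

    def copy_with_version(self, version):
        new = copy.copy(self)
        new.version = version
        return new


class LicenseGPL(License):
    category = 'gpl'

    def __init__(self, name, version=None, with_exception=False):
        License.__init__(self, name, version)
        self.with_exception = with_exception

    def propagates(self):
        # Only GPL licenses without an exception "propagate"
        return not self.with_exception


class LicenseOpen(License):
    category = 'open-source'


class LicensePrivate(License):
    category = 'private'
```

In this model, LicenseGPL('GPL', version='v3.0 only').propagates() is True, while the same call on a license created with with_exception=True gives False.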

The other standard distributions

We’ve already met the standard distributions that do not take account of checkout licenses, “_source_release” and “_binary_release”. There are also some standard distributions to help with common cases where licenses should be observed. These are:

  • “_for_gpl”
  • “_all_open”
  • “_by_license”

Warning

None of these distribute private licensed checkouts. This means that if there are such checkouts in the build tree, and the build description is meant to work when distributed, then some care must be taken in its construction - see Keeping parts of the build description private below.

GPL source distribution

The command:

$ muddle distribute _for_gpl ../gpl_release_1.0

is meant to help with creating a GPL-compliant source code release.

It distributes

  • all checkouts that have a GPL license (i.e., in category gpl)

  • all checkouts to which a GPL license has “propagated” (and which have not explicitly said they are not affected - see Avoiding unnecessary GPL “propagation” below)

  • all open source checkouts on which one of the GPL licensed checkouts depends

    Actually: if the package built from the GPL checkout depends directly on any non-GPL open source checkouts, or if it depends directly on any packages that are built from non-GPL open source checkouts, then those open source checkouts will also be included.

    If any of the selected dependent checkouts are not open-source, then muddle will give a warning, and will not distribute the offending checkouts. Occasionally, such a warning may be spurious, as muddle does not know why a package depends on a checkout, and the writer of the build description may have done something non-standard. The correct behaviour, though, is still not to distribute the offending checkout.

  • any necessary build descriptions

It will fail if GPL “propagation” affects checkouts that have declared binary or private licenses.
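The selection rules above can be summarised in a short sketch. It works on checkout names rather than real labels, takes the result of GPL “propagation” as an input, and treats only explicitly declared categories; the function name select_for_gpl and its arguments are hypothetical, and muddle’s own selection works on labels and packages and is more involved.

```python
def select_for_gpl(categories, propagated_to, depends_on):
    """Choose the checkouts for a GPL-style source distribution.

    categories    maps checkout name -> license category string
    propagated_to lists the checkouts a GPL license has "propagated" to
    depends_on    maps checkout name -> checkouts it depends on directly

    Returns (selected checkouts, warning messages).
    """
    gpl = sorted(name for name, cat in categories.items() if cat == 'gpl')
    selected = set(gpl)

    for name in propagated_to:
        if categories.get(name) in ('binary', 'private'):
            # "Propagation" to binary or private checkouts is an error
            raise ValueError('GPL license propagates to %s checkout %s'
                             % (categories[name], name))
        selected.add(name)

    warnings = []
    for name in gpl:
        for dep in depends_on.get(name, ()):
            if categories.get(dep) in ('gpl', 'open-source'):
                selected.add(dep)
            else:
                # Warn, and do not distribute the offending checkout
                warnings.append('%s depends on %s, which is not open source'
                                % (name, dep))
    return selected, warnings
```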

Open source distribution

The command:

$ muddle distribute _all_open ../open_release_1.0

is meant to help with creating a release of all the open-source checkouts in a build tree. As such, it distributes everything that “_for_gpl” does, plus any checkouts that have licenses in the open-source category.

Since it includes the “_for_gpl” distribution, it will also fail if GPL “propagation” affects checkouts that have declared binary or private licenses.

By license source/binary distribution

The command:

$ muddle distribute _by_license ../mixed_release_1.0

attempts to distribute as much of the build tree as it can in a license-appropriate manner. Thus it attempts to distribute:

  • source code for gpl, open-source and prop-source licensed checkouts
  • binaries for binary licensed checkouts
  • nothing at all for private licensed checkouts

It starts by including the source code content of an “_all_open” distribution (and can fail for the same reason).

For checkouts with a binary license, it determines which packages are built (directly) from them, and then which role those packages are in. It then distributes the entirety of the install/<role> directory for each such role.

(See Not distributing too many binaries below for how to write a build description that does not distribute binaries you were not expecting.)
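The way “_by_license” maps license categories onto what is distributed can be sketched as below. The function name plan_by_license and its two dictionary arguments are hypothetical, and the GPL “propagation” checks inherited from “_all_open” are left out.

```python
SOURCE_CATEGORIES = ('gpl', 'open-source', 'prop-source')


def plan_by_license(categories, packages_built_from):
    """Work out what a by-license distribution would contain.

    categories          maps checkout name -> license category
    packages_built_from maps checkout name -> list of (package, role)
                        pairs built directly from that checkout

    Returns (checkouts distributed as source, install directories).
    """
    sources = set()
    install_dirs = set()
    for checkout, category in categories.items():
        if category in SOURCE_CATEGORIES:
            sources.add(checkout)
        elif category == 'binary':
            # The whole install directory for each role is distributed
            for _package, role in packages_built_from.get(checkout, ()):
                install_dirs.add('install/%s' % role)
        # 'private' checkouts contribute nothing at all
    return sources, install_dirs
```

Because a binary checkout brings in the entire install/<role> directory, any other package in the same role is distributed in binary form as well, which is why the choice of roles matters.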

Avoiding unnecessary GPL “propagation”

As we’ve said, muddle has to assume that “depends on” means “links to” (or the equivalent) when calculating GPL “propagation”. Since this is not always so, we need a way of telling muddle when it is not.

There are three cases:

  1. The GPL license being used is marked as “with exemption” or “with exception”, in which case there is no “propagation”.
  2. The checkout is never actually linked against (whatever action is needed to “propagate” the license is not done). An example would be busybox, which only provides executables (programs). Sometimes, even a checkout which provides libraries might not actually be linked against by any other checkouts (perhaps it was included in the build for another reason).
  3. Particular packages/checkouts do depend on the checkout, but not in a way that is significant for GPL “propagation”.

If we know that nothing “builds against” a particular checkout (whatever that means in the relevant context), then we can declare this with:

set_nothing_builds_against(builder, co_label)

where co_label is the checkout that no-one builds against. It is sometimes worth adding a comment to explain what “builds against” means for this particular checkout.

For convenience, this can also be stated in the call of set_license - for instance:

set_license(builder, 'busybox-1.19.3', 'GPL', not_built_against=True)

If we know that a particular label depends upon our checkout, but does not “build against” it, then we can use:

set_license_not_affected_by(builder, this_label, co_label)

This says that the checkout with label co_label does not in fact affect the license of this_label, despite implicit GPL “propagation”. this_label may be a package or a checkout.

So, for instance, we might have:

muddled.pkgs.make.medium(builder, 'libProp', ['x86'], 'libProp-2.3')
muddled.pkgs.make.medium(builder, 'program', ['x86'], 'program-4.9',
                         deps=['libProp'])

set_license(builder, checkout('libProp-2.3'), 'LGPL-3.0')
set_license(builder, checkout('program-4.9'), 'CODE NIGHTMARE GREEN')

# 'program' links dynamically to 'libProp', and thus doesn't need
# to be distributed as source
set_license_not_affected_by(builder, package('program', 'x86'), checkout('libProp-2.3'))

This function does not check that:

  • this_label actually depends on co_label
  • co_label is GPL licensed
  • co_label is GPL licensed with with_exception set to False (i.e., that it is a “propagating” license)

but none of these will hurt if incorrect.

Note

If checkout A depends on B, which depends on C, which in turn depends on a GPL checkout D, then saying that B is “not built against” D says nothing about A or C. You will have to address A and C with separate function calls, if appropriate.
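
The point of that note can be seen in a small model. The sketch below is only a toy, assuming that "propagation" reaches every label that depends (directly or indirectly) on the GPL checkout, and that each exemption applies to exactly one (this_label, co_label) pair. The function name implicitly_gpl and the dictionary layout are hypothetical, and this is not how muddle itself stores dependencies.

```python
def implicitly_gpl(depends_on, gpl_checkout, not_affected_by=()):
    """Return the labels that GPL "propagation" from 'gpl_checkout' reaches.

    'depends_on' maps each label to the labels it depends on directly.
    'not_affected_by' is a collection of (this_label, co_label) pairs,
    mirroring the arguments of set_license_not_affected_by.
    """
    exempt = set(not_affected_by)
    affected = set()

    def reaches(label, seen):
        # True if 'label' depends, directly or indirectly, on 'gpl_checkout'
        for dep in depends_on.get(label, ()):
            if dep == gpl_checkout:
                return True
            if dep not in seen:
                seen.add(dep)
                if reaches(dep, seen):
                    return True
        return False

    for label in depends_on:
        if label == gpl_checkout:
            continue
        if reaches(label, set()) and (label, gpl_checkout) not in exempt:
            affected.add(label)
    return affected

# A depends on B, B depends on C, C depends on the GPL checkout D
chain = {'A': ['B'], 'B': ['C'], 'C': ['D'], 'D': []}
without_exemption = implicitly_gpl(chain, 'D')
with_exemption = implicitly_gpl(chain, 'D', not_affected_by=[('B', 'D')])
```

In this model, exempting B removes only B from the result: A and C are still affected, and each would need its own exemption.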

Build descriptions

Build description licensing

Build descriptions are a little awkward in licensing terms, as all distributions wish to include them.

There are three obvious ways to handle build description licenses:

  1. Do not declare any license in the muddle build description. This is, of course, not the same as not actually having a license for the build description; it just means not telling muddle what it is (muddle has no way of reading a license statement in the comments or docstring of the build description, or held in a LICENSE file, and so on).

    This is perhaps the simplest option, as not-licensed checkouts are allowed in any distribution.

  2. Declare a gpl or open-source license. This is obviously only a solution if that form of license is actually applicable to the particular build description.

    Again, such a build description is clearly allowed in any distribution.

  3. Declare a prop-source license. Muddle assumes that proprietary source is not to be distributed in _for_gpl or _all_open distributions, but in the case of a build description, will just output a warning and carry on:

    WARNING: DISTRIBUTING BUILD DESCRIPTION DESPITE LICENSE CLASH
      Checkout checkout:builds/distributed is not allowed in distribution "_for_gpl"
      Checkout has license "Wombat & co. licensed, may be distributed as source",
        which is "prop-source", distribution allows "gpl", "open-source"
    END OF WARNING
    

If you tell muddle that your build description has a binary or private license, then it will not be possible to do a _for_gpl or _all_open distribution.

There is also another solution, though:

Distribution specific build descriptions

In some cases, the full build description is just not appropriate.

A typical case is when doing a _for_gpl distribution of a large and complex build tree, where much of it is not GPL. If the full build description is distributed, the recipient can only muddle build those checkouts that have actually been distributed, and these are likely to be buried in an unobvious way in the rest of the build description.

The appropriate solution in this case is to provide a replacement (simpler) build description.

Let us consider doing muddle distribute _for_gpl of a build tree that has its normal build description in src/builds/01.py.

To provide a build description just for that distribution, add a file called _distribution/_for_gpl.py in the build description checkout - so the full path is:

src/builds/_distribution/_for_gpl.py

If muddle distribute <name> sees a build description file called _distribution/<name>.py, that is, in directory _distribution and named after the distribution (don’t forget the .py), then instead of distributing the “normal” contents of the checkout, it will instead just distribute that file.

So let us assume we have a build description checkout that contains:

.git/
01.py
01.pyc
kernel.py
userspace.py
_distribution/
    _for_gpl.py

then muddle distribute _for_gpl will produce a target build description checkout that looks like:

01.py

where 01.py contains the text from _distribution/_for_gpl.py.

Note

When such a substitution is performed, the VCS information for the build description checkout is never copied, since it would be misleading at best. Similarly, “private” files (see Keeping parts of the build description private) are not relevant, as the normal build description is not being copied.

In comparison, because there is no file called _source_release.py in _distribution/, muddle distribute _source_release would copy all of the .py files from the directory (and from _distribution as well), and might or might not copy the VCS information.

Note

It is not an error if the _distribution directory does not exist - it only needs to be created if you need it.
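
The substitution rule described in this section can be sketched as follows. This is a simplified model, not muddle's code: the function name build_desc_files is hypothetical, the build description is assumed to be called 01.py as in the example above, and VCS handling is not modelled at all.

```python
import posixpath

def build_desc_files(checkout_files, dist_name, build_desc='01.py'):
    """Return a {target: source} mapping of build description files to copy.

    'checkout_files' lists paths relative to the build description checkout.
    """
    replacement = posixpath.join('_distribution', dist_name + '.py')
    if replacement in checkout_files:
        # Only the replacement is distributed, under the normal name
        return {build_desc: replacement}
    # Otherwise copy the Python sources as they are (.pyc files are skipped)
    return dict((name, name) for name in checkout_files
                if name.endswith('.py'))

files = ['01.py', '01.pyc', 'kernel.py', 'userspace.py',
         '_distribution/_for_gpl.py']
for_gpl = build_desc_files(files, '_for_gpl')
source_release = build_desc_files(files, '_source_release')
```

Here for_gpl names a single target file, 01.py, whose content comes from _distribution/_for_gpl.py, whereas source_release keeps every .py file under its own name.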

Keeping parts of the build description private

There are two reasons to keep part of the build description private:

  • describing how to build private checkouts/packages may itself be something that should not be advertised, and
  • once the build tree has been distributed without private checkouts, any part of the build description that relies on them won’t work.

The solution to this is to construct the build description so that private checkouts and packages are kept separate:

  1. separate in role
  2. separate in build description

The first is necessary so that a binary distribution won’t try to distribute a role that includes binary and private files (which would cause a clash, and thus won’t work).

The second essentially means organising the build description so that the private parts are in a separate file, coded to a particular API.

This is best illustrated with an example.

Assuming that our build description lives in src/builds, then we put our private description into a separate Python file, src/builds/private.py:

# How to build our private code

import muddled.deployments.collect as collect
import muddled.pkgs.make as make

from muddled.depend import checkout
from muddled.licenses import LicensePrivate, set_license

def describe_private(builder, *args, **kwargs):
    """This describes how to do the private part of our build.

    We require it to be called as:

        describe_private(builder, deployment='<deployment-name>')
    """
    deployment = kwargs['deployment']

    make.medium(builder, 'secret-thing', ['x86-private'], 'secretThing-0.1')

    set_license(builder, checkout('secretThing-0.1'),
                LicensePrivate('Do Not Distribute'))

    collect.copy_from_role_install(builder, deployment, role='x86-private',
                                   rel='', dest='')

and then in the “normal” build description (typically src/builds/01.py), import the function from that file:

from private import describe_private

and call it in the body of the build description describe_to - for instance:

# We also have some private stuff, described elsewhere
describe_private(builder, deployment=deployment)

and also tell muddle that private.py is not to be distributed as-is:

# So that "elsewhere" is private - i.e., private.py
# and we should never distribute it in non-private distributions
for name in get_distributions_not_for(builder, ['private']):
    set_private_build_files(builder, name, ['private.py'])

This last tells muddle that when muddle distribute is called to distribute something that does not distribute private licensed checkouts, the content of private.py is to be replaced by a “stub” file, which just contains:

def describe_private(builder, *args, **kwargs):
    pass

This means that the build description will continue to work, but when it calls the describe_private() function, nothing will happen, and no information on the original content of private.py is leaked.

In summary, the name of the private file (or files) does not matter, but their API must be a single function with signature:

describe_private(builder, *args, **kwargs)
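
To see why the fixed signature matters, here is a small self-contained sketch of the stub replacement. It assumes nothing about muddle's internals: content_to_distribute is a hypothetical helper, the "original" text is a made-up stand-in for private.py, and a plain object() stands in for the builder.

```python
# The stub text described above, as it would be written to private.py
STUB = "def describe_private(builder, *args, **kwargs):\n    pass\n"

def content_to_distribute(filename, original_text, private_files):
    """Return the text to write for 'filename' in the distributed tree."""
    if filename in private_files:
        return STUB
    return original_text

# Pretend this is the real (secret) content of private.py
original = "SECRET = 1\ndef describe_private(builder, *args, **kwargs):\n    return SECRET\n"

# "Import" the distributed version and call it as the build description would
namespace = {}
exec(content_to_distribute('private.py', original, ['private.py']), namespace)
result = namespace['describe_private'](object(), deployment='everything')
```

Because the stub accepts any positional and keyword arguments, the call in the main build description does not need to change, and nothing from the original file ends up in the distributed one.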

Refining binary distributions

Adding extra checkout files in a binary distribution

When doing binary distributions, “muddle distribute” will generally include the necessary muddle Makefiles, so that the data in install/<role>/ can be processed to give a deployment in the normal manner.

Sometimes, however, the muddle Makefile is not sufficient - typically because it calls out to the “original” Makefile or files from a checkout.

Thus there is a way to say that extra files should be included, for a given checkout, in a particular distribution:

distribute_checkout_files(builder, name, label, source_files)

In this:

  • name is the name of a distribution, or a wildcard string matching one or more distributions (see Distribution names and wildcards below).
  • label is a checkout label. The label tag is ignored.
  • source_files is a list of extra source files, relative to the checkout directory.

So, for instance:

muddled.pkgs.make.medium(builder, 'binapp', ['x86'], 'binapp-1.2')
distribute_package(builder, 'marmalade', package('binapp', 'x86'))
distribute_checkout_files(builder, 'marmalade', checkout('binapp-1.2'),
                          ['Makefile', 'src/Makefile', 'src/rules'])

causes the distribution “marmalade” to include the extra files as named.

In fact, sometimes one wants to assert this for a whole range of distributions, and so it is perhaps more likely that one would write:

for name in get_distributions_for(builder, ['binary']):
    distribute_checkout_files(builder, name, checkout('binapp-1.2'),
                              ['Makefile', 'src/Makefile', 'src/rules'])

which loops over all the distributions that distribute anything with a binary license, and sets them to distribute the extra files.

Not distributing too many binaries

Binary distributions distribute the content of install/<role> directories. Muddle has no way of telling which packages have actually put content into a particular install/<role> directory - just because a package is in role <role> does not necessarily mean that it does so.

So when a particular package is distributed as binary, and causes its install/<role> directory to be distributed, it may “drag along” unwanted other content, from other packages in the same role.

This can only really be avoided by writing a build description which does not use the same role for incompatible (licensing) purposes.

When first creating a new distribution, it is always worth looking hard at the content of any install directories that are being distributed, to check for unexpected files.

The commands:

$ muddle query role-licenses
$ muddle -n distribute <name> ../fred

may be useful in working out what is going on.

Note

It is arguable that a role that contains binaries from binary and open-source or binary and gpl checkouts should be regarded as a clash in the same way that binary and private is. This may be implemented in a future version of muddle.
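
When reviewing roles by hand, the check being made is essentially the one sketched below. This is a toy, not the implementation of muddle query role-licenses: the function name find_role_clashes and the input format (a role name mapped to the set of license categories found in it) are both hypothetical, and only the binary-and-private clash described above is detected.

```python
def find_role_clashes(role_categories):
    """Return the roles that mix 'binary' and 'private' licensed checkouts.

    'role_categories' maps a role name to the set of license categories of
    the checkouts whose packages are in that role.
    """
    return sorted(role for role, cats in role_categories.items()
                  if 'binary' in cats and 'private' in cats)

roles = {
    'x86': {'gpl', 'binary'},
    'x86-private': {'private'},
    'arm': {'binary', 'private'},
}
clashes = find_role_clashes(roles)
```

In this example only the arm role clashes. The x86 role mixes gpl and binary, which is the arguable case mentioned in the note and is not reported.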

Creating your own distributions

It is also possible to create new distributions, by specifying them in the build description.

The distribution module

Stuff to do with distributions is conveniently packaged in the muddled.distribute module, so one might typically do:

from muddled.distribute import name_distribution

(and so on for any other items needed from it), or just:

import muddled.distribute

or:

import muddled.distribute as distribute

The muddle documentation command can be used to get docstrings from the source code:

muddle doc distribute
muddle doc distribute.name_distribution

Name a distribution first

Before you can use a new distribution, it must be “named”:

name_distribution(builder, name)
name_distribution(builder, name, categories)

name is the name for the new distribution. It should not start with an underscore, as such names are reserved for the standard distributions (which muddle has already defined, before reading the build description).

If categories is given, then it is a sequence of license categories (so, taken from gpl, open-source, prop-source, binary and private). Only checkouts with those categories will be allowed to be distributed with this new distribution.

Warning

When naming the license categories that a distribution uses, gpl and open-source are distinct and separate - there is no assumption that open-source includes gpl. In this instance open-source means “open but not GPL”.

If categories is not given (or is None), then all license categories are allowed.

All non-standard distributions must be named before they can be used. It is not an error to name a distribution more than once, but the categories must match each time it is named.

Note

If a subdomain names distribution “Fred”, and then the parent domain refers to distribution “Fred” after including that subdomain, that will work, as the naming of “Fred” (in the subdomain) did come before its use (in the parent domain). However, we don’t recommend writing this sort of code in a build description - it would generally be better to re-name the distribution in the parent domain as well, so anyone reading the description can easily tell how it is defined.
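
The naming rules above can be modelled in a few lines. The sketch below is a toy registry, not muddle's own data structure. The class ToyDistributions is hypothetical, ValueError stands in for whatever error muddle actually reports, the list of categories is an assumption based on the categories mentioned in this document, and treating the categories as an unordered set is also an assumption.

```python
# Assumed list of license categories (see the lead-in)
ALL_CATEGORIES = ('gpl', 'open-source', 'prop-source', 'binary', 'private')

class ToyDistributions(object):
    """A toy registry of named distributions."""

    def __init__(self):
        self.named = {}

    def name_distribution(self, name, categories=None):
        # No categories (or None) means all categories are allowed
        if categories is None:
            categories = ALL_CATEGORIES
        categories = frozenset(categories)
        previous = self.named.get(name)
        # Re-naming is fine, but only with the same categories
        if previous is not None and previous != categories:
            raise ValueError('Distribution %r already named with different'
                             ' categories' % name)
        self.named[name] = categories

registry = ToyDistributions()
registry.name_distribution('marmalade', ['gpl', 'open-source'])
registry.name_distribution('marmalade', ['open-source', 'gpl'])  # same again
registry.name_distribution('jam')                                # everything
try:
    registry.name_distribution('marmalade', ['binary'])
    clash_detected = False
except ValueError:
    clash_detected = True
```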

Distribution names and wildcards

The functions that add checkouts, packages and checkout files to distributions can all take a shell-style wildcard instead of a specific distribution name. They will then add the relevant entity to all distributions with names that match that wildcard.

Only distributions that have already been named will be considered in “expanding” the wildcard.

Wildcarding is done with fnmatchcase from the Python fnmatch module, so:

*       matches everything
?       matches any single character
[seq]   matches any character in seq
[!seq]  matches any char not in seq
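
For instance, the following sketch shows how a wildcard might be expanded against the distributions that have already been named. fnmatchcase is the real function from the Python fnmatch module, but expand_distribution_name and the list of names (including the made-up "Marmite") are hypothetical.

```python
from fnmatch import fnmatchcase

def expand_distribution_name(pattern, named_distributions):
    """Return the already-named distributions whose names match 'pattern'."""
    return sorted(name for name in named_distributions
                  if fnmatchcase(name, pattern))

named = ['_source_release', '_binary_release', '_for_gpl',
         'marmalade', 'Marmite']

releases = expand_distribution_name('_*_release', named)
lower_mar = expand_distribution_name('mar*', named)
non_standard = expand_distribution_name('[!_]*', named)
```

Since fnmatchcase is case-sensitive, 'mar*' matches "marmalade" but not "Marmite". The pattern '[!_]*' is a convenient way of selecting only the distributions that do not start with an underscore.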

Distribute a checkout

distribute_checkout(builder, name, label, copy_vcs=False)

This is used to say that a particular checkout (identified by label) is to be part of the distribution called name. If name is wildcarded (see Distribution names and wildcards), then the checkout will be added to each distribution that matches the wildcard.

If copy_vcs is true, then the checkout’s VCS “special” files should be distributed. The default is not to do so.

Note that:

  1. the label name may also be a wildcard (‘*’), in which case all matching checkouts will be distributed.
  2. the label tag is ignored.

The function creates a rule saying that checkout:<name>/distributed is built from checkout:<name>/checked_out using the DistributeCheckout action.

Adding a checkout to the same distribution more than once has no special effect, except that it is the last call that sets the value of copy_vcs that will be used.

Note

All checkouts in the build description are implicitly part of distribution _source_release. The muddle distribute command itself calls distribute_checkout to add them to this distribution, after the build description has been read. Thus there is never any point in explicitly adding a checkout to _source_release in the build description itself, as that will be ignored.

Distribute specific files from a checkout

distribute_checkout_files(builder, name, label, source_files)

This is used to say that the files in sequence source_files, named relative to the source directory of checkout label, should be distributed as part of distribution name.

If name is wildcarded (see Distribution names and wildcards), then the files will be added to each distribution that matches the wildcard.

Note that the label name may not be wildcarded, but the label tag is still ignored.

This is the function that is used internally to add muddle Makefiles for packages. It can also be used directly to specify that other files must also be distributed.

Multiple calls with the same builder, name and label, but different source_files, will just add the new files to the same distribution.

Calling distribute_checkout_files after calling distribute_checkout for the same distribution and checkout has no effect - the earlier distribute_checkout call has already selected all files.

Calling distribute_checkout after calling distribute_checkout_files for the same distribution and checkout means that the earlier distribute_checkout_files calls are essentially ignored, because distribute_checkout selects all files.
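
The interaction between the two functions can be summarised with a toy model. The class ToySelection below is hypothetical and only tracks which files are selected for each (distribution, checkout) pair. It does not create any muddle rules, and the file names are just examples.

```python
ALL_FILES = object()   # marker meaning "the whole checkout"

class ToySelection(object):
    """Toy record of what is selected per (distribution, checkout) pair."""

    def __init__(self):
        self.selected = {}   # (name, co_name) -> ALL_FILES or a set of files
        self.copy_vcs = {}   # (name, co_name) -> value from the last call

    def distribute_checkout(self, name, co_name, copy_vcs=False):
        key = (name, co_name)
        # Selecting the checkout overrides any earlier per-file selection
        self.selected[key] = ALL_FILES
        self.copy_vcs[key] = copy_vcs

    def distribute_checkout_files(self, name, co_name, source_files):
        key = (name, co_name)
        current = self.selected.setdefault(key, set())
        # If the whole checkout is already selected, there is nothing to add
        if current is not ALL_FILES:
            current.update(source_files)

selection = ToySelection()
key = ('marmalade', 'binapp-1.2')
selection.distribute_checkout_files('marmalade', 'binapp-1.2', ['Makefile'])
selection.distribute_checkout_files('marmalade', 'binapp-1.2', ['src/rules'])
files_only = set(selection.selected[key])

selection.distribute_checkout('marmalade', 'binapp-1.2')
selection.distribute_checkout_files('marmalade', 'binapp-1.2', ['README'])
whole_checkout = selection.selected[key] is ALL_FILES
```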

Build description checkouts

Build description checkouts are treated specially.

The muddle distribute command adds the necessary rules itself, by calling distribute_build_description, and saying that each checkout:<build-desc>/distributed is built from checkout:<build-desc>/checked_out using the DistributeBuildDescription action (which is actually very similar to the DistributeCheckout action, of course).

There is never any reason to call distribute_build_description directly, as muddle distribute will always override it.

Similarly, there is never a reason to call distribute_checkout() on a build description checkout, as muddle distribute will always override it, too.

Distribute a package

distribute_package(builder, name, label, obj=False, install=True,
                   with_muddle_makefile=True)

This is used to say that a particular package (identified by label) is to be part of the distribution called name. If name is wildcarded (see Distribution names and wildcards), then the package will be added to each distribution that matches the wildcard.

Note that:

  1. the label name and role may also be wildcards (‘*’), in which case all matching packages will be distributed.
  2. the label tag is ignored.

If obj is true, then the obj/<package-name>/<role> directory for label will be distributed.

If install is true, then the install/<role> directory for label will be distributed.

If with_muddle_makefile is true, then the muddle Makefile for this package will also be distributed. For instance, if package:zlib{x86}/* is built from checkout:zlib-3.0/checked_out, using muddle Makefile Makefile.muddle in src/libs/zlib-3.0, then the user will receive a directory called src/libs/zlib-3.0 that just contains Makefile.muddle. Note that this is not always enough to allow deployment, and sometimes other support files will need to be (explicitly) added via the build description.

If both obj and install are false, nothing much is going to be done for this package. The current implementation doesn’t grumble about that.

The function creates a rule saying that package:<name>{<role>}/distributed is built from package:<name>{<role>}/postinstalled using the DistributePackage action.

Adding a package to the same distribution more than once has no special effect, except that it is the last call that sets the values of obj, install and with_muddle_makefile that will be used for that distribution.

Note

All packages in the build description are implicitly part of distribution _binary_release. The muddle distribute command itself calls distribute_package to add them to this distribution, after the build description has been read. Thus there is never any point in explicitly adding a package to _binary_release in the build description itself, as that will be ignored.
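
As a quick reference for the obj and install flags, the sketch below lists the directories that would be distributed for a package. The helper package_directories is hypothetical and only builds path names following the obj/<package-name>/<role> and install/<role> layout described above. It does not model the muddle Makefile.

```python
import posixpath

def package_directories(package, role, obj=False, install=True):
    """Return the directories distributed for package 'package' in 'role'."""
    dirs = []
    if obj:
        dirs.append(posixpath.join('obj', package, role))
    if install:
        dirs.append(posixpath.join('install', role))
    return dirs

default_dirs = package_directories('zlib', 'x86')
with_obj = package_directories('zlib', 'x86', obj=True)
nothing = package_directories('zlib', 'x86', obj=False, install=False)
```

The last call shows the "nothing much is going to be done" case: with both flags false, no directories are selected.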

Finding out about distributions

A variety of useful functions are provided by the muddled.distribute module.

You can use:

get_distribution_names(builder)

to return the names of all the distributions that are currently defined, or:

get_distributions_for(builder, categories)

to return the names of any distributions that have all of the license categories in categories. So, for instance:

for name in get_distributions_for(builder, ['gpl', 'open-source']):
    distribute_checkout_files(builder, name, co_label, ['Makefile'])

Contrariwise:

get_distributions_not_for(builder, categories)

returns the names of any distributions that do not have any of the license categories in categories. So, for instance:

for name in get_distributions_not_for(builder, ['private']):
    set_private_build_files(builder, name, ['private_app.py'])

The function:

get_distributions_by_category(builder)

returns a dictionary with category names as keys, and sets of distribution names as values - so it allows one to look up which distributions can distribute particular license categories.

Finally:

get_used_distribution_names(builder)

returns the names of all the distributions that are “in use”, i.e., referenced by some action in the dependency tree. This will not normally return any of the “standard” distributions, because they are set up by “muddle distribute” after the build description has been read.
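
The difference between "all of the categories" and "none of the categories" is easy to get wrong, so here is a toy model of the three category queries. It works on a plain dictionary rather than a builder. The function names, the dictionary layout and the distribution names are all hypothetical, chosen only to mirror the descriptions above.

```python
def distributions_for(distributions, categories):
    """Names of distributions that allow all of 'categories'."""
    wanted = set(categories)
    return set(name for name, allowed in distributions.items()
               if wanted.issubset(allowed))

def distributions_not_for(distributions, categories):
    """Names of distributions that allow none of 'categories'."""
    unwanted = set(categories)
    return set(name for name, allowed in distributions.items()
               if not unwanted.intersection(allowed))

def distributions_by_category(distributions):
    """Map each category to the set of distributions that allow it."""
    by_category = {}
    for name, allowed in distributions.items():
        for category in allowed:
            by_category.setdefault(category, set()).add(name)
    return by_category

toy_distributions = {
    'open-only': {'gpl', 'open-source'},
    'mixed': {'gpl', 'open-source', 'binary'},
    'everything': {'gpl', 'open-source', 'prop-source', 'binary', 'private'},
}

open_capable = distributions_for(toy_distributions, ['gpl', 'open-source'])
binary_capable = distributions_for(toy_distributions, ['binary'])
never_private = distributions_not_for(toy_distributions, ['private'])
by_category = distributions_by_category(toy_distributions)
```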

Muddle commands

  • muddle distribute handles distribution.
  • muddle help distribute gives the help text for muddle distribute, which summarises some of this document.
  • muddle query distributions lists the names of the distributions that exist, either because they are defined in the build description, or because they are predefined.
  • muddle query licenses prints out the standard licenses.
  • muddle query checkout-licenses prints the licenses for the checkouts in the dependency tree, and other related information it has deduced. This is deliberately verbose, giving as much information as possible.
  • muddle query role-licenses prints the licenses used in each role, and whether there are any obvious clashes.

More license stuff

The licenses module provides more functions, mostly aimed at the muddle developer. See their docstrings for details on how to use them.

A tuple containing all of the license categories is provided:

ALL_LICENSE_CATEGORIES

There are some useful query functions:

get_gpl_checkouts(builder)
get_implicit_gpl_checkouts(builder)
get_open_checkouts(builder)
get_open_not_gpl_checkouts(builder)
get_prop_source_checkouts(builder)
get_binary_checkouts(builder)
get_private_checkouts(builder)
get_not_licensed_checkouts(builder)

for finding out which checkouts have particular categories of licenses.

You can find out if a checkout has a license using:

builder.db.checkout_has_license(co_label)

You can retrieve the particular license using:

get_license(builder, co_label)

which will return None if the checkout does not have a license. If you’d prefer an exception in that case, you can call it as:

get_license(builder, co_label, absent_is_None=False)

You can find out if a checkout has a license that is in a particular category (or categories) with:

checkout_license_allowed(builder, co_label, categories)

Note

Some of the functions in the licenses module are just wrappers around calls of builder.db methods. Specifically:

  • set_license is partly a wrapper for builder.db.set_checkout_license, but it (set_license) may also store the license file information, which the low-level set_checkout_license method does not do.
  • get_license is a wrapper for builder.db.get_checkout_license (but beware that the latter swaps the default for absent_is_None, so that it defaults to raising an exception if there is no license)
  • set_license_not_affected_by is a wrapper for builder.db.set_license_not_affected_by

There is also a boolean query:

builder.db.checkout_has_license(co_label)

which doesn’t currently have a wrapper function.

Questions

What happens if we rebuild in a binary distribution?

Experimenting shows:

$ m3 distrebuild package:main_pkg{x86}
Building: package:main_pkg{x86}/distclean ..
> Building package:main_pkg{x86}/distclean[T]
Can't build package:main_pkg{x86}/distclean: Missing source directory
  package:main_pkg{x86}/distclean[T] depends on checkout:main_co/*
  Directory /home/tibs/sw/m3/tests/transient/binary/src/main_co does not exist

but, unfortunately:

$ m3 rebuild main_pkg
Killing package:main_pkg{x86}/built
Clearing tags for package:main_pkg{x86}/built
  package:main_pkg{x86}/built
  package:main_pkg{x86}/installed
  package:main_pkg{x86}/postinstalled
Building package:main_pkg{x86}/postinstalled
> Building package:main_pkg{x86}/built
Can't build package:main_pkg{x86}/postinstalled - Missing source directory
  package:main_pkg{x86}/built depends on checkout:main_co/*
  Directory /home/tibs/sw/m3/tests/transient/binary/src/main_co does not exist

so we’ve now lost our “built” tags. Oh well. At least it told us.