OpenCM User's Guide

Jonathan S. Shapiro

  Copyright (C) 2002, The EROS Group, LLC and Johns Hopkins University.

Permission is granted to redistribute this document without royalty or fee, provided that this copyright notice remains intact.

 

OpenCM: An Introduction

This document describes how to use the OpenCM configuration management system. It is intended for both new and expert users. The first portion of the document provides a tutorial introduction to OpenCM. Later sections deal with specific "expert" issues.

This document applies to versions 0.1.2alpha5pl2 and later of the OpenCM configuration management system, which is most commonly installed as a program named cm.

OpenCM has been tested only on Linux systems, though we have every expectation that it will work on UNIX systems in general. A port to the cygwin environment would be helpful, and we would be happy to re-integrate the necessary changes.

NOTE: This manual makes reference in many places to disconnected development and replication functionality (the two are intimately related). These features are not included in release 0.1. Replication is very sensitive to schema changes, and we found that keeping the code up to date was significantly delaying our release date. Rather than continue to delay the release, we removed the replication feature temporarily. Replication should re-appear in release 0.2 or 0.3, as it is very high on our priority list. Rather than remove references to the replication logic from the manual, we have left them present so you can see what is coming.

Why Another CM System?

In the open source world, the current "gold standard" for configuration management is CVS. CVS has been successful because it is an effective 80% solution. It is an excellent example of open source at its best and worst: on the plus side, the community needed a tool and they went out and built one. On the minus side, when it hit "good enough," development on it essentially reverted to maintenance mode. OpenCM will probably be subject to the same effect, but we hope we have set a higher initial bar.

While a lot of people use CVS, everybody who uses it has a pet list of things they wish CVS did better. In the EROS project, our list of key missing features in CVS was:

There are several existing alternatives to CVS, each with strengths and weaknesses:

Perforce
The Perforce system is a commercial product, but is licensed at no charge for use on open source projects. To all appearances, Perforce is a good product, but we have commercial collaborators and a need to manage open but unreleased code. The Perforce license does not support this. Also, Perforce does not provide disconnected development, which is a critical concern for us.
BitKeeper
Another frequently mentioned candidate is BitKeeper, which has had success in a few of the Linux kernel projects. At the time we started the OpenCM work, BitKeeper didn't exist. BitKeeper shares several features with OpenCM, but its license is, for us, problematic. The BitKeeper authentication model did not satisfy our need to avoid adding accounts on the server, and BitKeeper provides no end to end integrity controls. Its repository architecture is not conceived as a software distribution mechanism, and no architectural attention has been given to the possibility of damage to a managed code base by a hostile mirror.
PRCS
Josh MacDonald's PRCS is a system that heavily influenced the OpenCM design, and we anticipate integrating some of his XDELTA and XDFS work in the future. Josh's work does not address our concerns about end to end provenance control, integrity, or disconnected development, though it could easily evolve in these directions.
VESTA
VESTA was announced as we were writing this. We need a good comparative description here.
None of these quite met our needs, and we reluctantly decided that we needed to build something different.

Features at a Glance

OpenCM is designed as a mostly drop-in replacement for CVS. Like CVS, OpenCM uses a "checkout, modify, merge" development process. Unlike CVS, it provides integrity checks, replication, access controls, and provenance tracking.

OpenCM uses SSL-based cryptographic authentication for user management. The OpenCM administrator can add new users to the OpenCM repository subsystem without giving them access to the rest of the machine.

OpenCM is a true configuration management system. Every change to a branch is committed as a single, traceable unit. It is therefore possible to recover with confidence any version of a software system.

OpenCM tracks both commentary and change history across replicates. Patches are not used as a change transmission device. The entire trail of evolution is preserved by every merge.

While the command-line OpenCM client (the one described in this manual) is designed for workspaces that live in file systems, nothing in the OpenCM protocol or repository relies on this assumption. With a suitable client, OpenCM could equally well be used for a repository-based IDE.

OpenCM lets you develop while disconnected. You can check out a project, get on an airplane, modify the code, and commit to your laptop. Later, you can merge the result back into the main repository without losing any of the history or comments from your work.

OpenCM can be used as a source or binary deployment vehicle. It is very straightforward to set up replicating repositories, and a user can determine whether the copies obtained from the replicates are authentic.

OpenCM handles binary files as well as text.

NOTE: Binary merge management is not currently addressed. Future versions of OpenCM will have content-sensitive merge facilities to better support this.

OpenCM uses your connection efficiently. Because OpenCM uses cryptographic hashes for object names, it is readily able to determine what objects you already have. Similarly, there is no need for OpenCM to connect to the server to determine what has changed in your workspace.
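
The following is a minimal sketch of the idea, not OpenCM's actual implementation. If objects are named by a hash of their content, a simple set difference on names tells a client what it still needs, and rehashing local files reveals changes without contacting the server. SHA-256 and all function names below are assumptions made for illustration; this manual does not say which hash OpenCM uses.

```python
import hashlib


def object_name(content: bytes) -> str:
    """Name an object by the hash of its bytes (SHA-256 is an assumed stand-in)."""
    return hashlib.sha256(content).hexdigest()


def names_to_fetch(wanted, local_store):
    """Return the wanted object names that are not already held locally."""
    return set(wanted) - set(local_store)


def locally_changed(recorded, workspace_files):
    """Compare recorded names with freshly computed ones; no server needed.

    recorded:        path -> object name noted at checkout time
    workspace_files: path -> current bytes of the file
    """
    return sorted(
        path
        for path, content in workspace_files.items()
        if recorded.get(path) != object_name(content)
    )


# Hypothetical data: the client already holds one of the two objects it needs.
store = {object_name(b"chapter one"): b"chapter one"}
wanted = {object_name(b"chapter one"), object_name(b"chapter two")}
missing = names_to_fetch(wanted, store)

# Hypothetical workspace: one file was edited after checkout.
recorded = {"ch1.txt": object_name(b"chapter one")}
changed = locally_changed(recorded, {"ch1.txt": b"chapter one, revised"})
```

Only the object for "chapter two" would need to cross the network here, and the edit to ch1.txt is detected purely from local data.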

1 Basic Concepts

OpenCM is a configuration management system: a system for managing source code, documents, or other complex, changing content. It is appropriate when you have multiple documents or files, and you need to keep versions of these files synchronized with each other as part of some larger work.

For example, you might have a book with 10 chapters, each stored in a separate file. You might revise two of these chapters to make a new revision of the book. The new revision contains 8 chapters that are the same as the previous book, and two chapters that have been replaced. Organizing these relationships is the job of a configuration management system.

1.1 Versioning Concepts

To manage versions of your work, OpenCM applies two levels of organization to your work. As you work, OpenCM maintains a "branch." A branch is a sequence of configurations of a work in progress. A configuration is a consistent set of files or objects. OpenCM records each file or object as a separate entity.

As you make changes to your work, you periodically will reach a point where you want to save your work. In OpenCM, this is done using the commit command. The commit command examines your workspace, identifies all of the files that have been changed, added, or deleted, and builds a new configuration of your work. Committing your changes has two consequences:

  1. Your entire current working state is saved as a consistent whole.
  2. Subject to access checks, other users can obtain a copy (check out) of the new version that you have just committed.

OpenCM tracks versions of systems, not files. Each configuration is a complete version of your work. Even if you change just one line of one file, a new configuration is created when you commit. For this reason, OpenCM does not track versions of files. It tracks versions of branches. A branch is simply a sequence of committed configurations. A figure is needed here.
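
Until a figure is available, the relationship can be sketched in a few lines of Python. This is a toy model under our own assumptions (the class name, the dictionary layout, and the object names are all invented), not a description of how OpenCM stores branches. It replays the ten-chapter book example: two chapters are revised, and the second configuration shares the other eight objects with the first.

```python
from typing import Dict, List

# A configuration maps each path to the name of the object stored for it.
Configuration = Dict[str, str]


class Branch:
    """A branch modeled as an ordered list of committed configurations."""

    def __init__(self) -> None:
        self.configurations: List[Configuration] = []

    def commit(self, workspace: Configuration) -> int:
        """Record the whole workspace as one new configuration; return its index."""
        self.configurations.append(dict(workspace))
        return len(self.configurations) - 1


# The ten-chapter book (object names are made up).
book = {f"chapter{n:02d}.txt": f"object-{n:02d}-v1" for n in range(1, 11)}
branch = Branch()
first = branch.commit(book)

# Revise two chapters, then commit again.
book["chapter03.txt"] = "object-03-v2"
book["chapter07.txt"] = "object-07-v2"
second = branch.commit(book)

old, new = branch.configurations[first], branch.configurations[second]
unchanged = [path for path in new if new[path] == old[path]]
```

Each commit captures every path, so either configuration can be recovered as a whole even though only two objects differ between them.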

1.2 Users and Access Control

Every OpenCM user has one (or more) unique cryptographic key(s) that identify that user. Whenever you use OpenCM, your session begins with key authentication. Your key is used to determine whether you can read or write on this repository (at all) and whether you have read or write access to specific objects in the repository. In OpenCM, the access-controlled objects are branches, users, groups, and directories. Whenever you change one of these objects, a new "revision" is created in the repository recording the time of the change, the identity of the previous revision, and the authenticated user. A figure for revisions is needed here.

OpenCM users can have multiple keys used for different purposes. For example, it is usually appropriate for the OpenCM administrator to have one key for administrative purposes and another for day to day activities.

1.3 Name Spaces

Every OpenCM user has a unique name space, or home directory, that is created when the user is added to the repository. The home directory stores key-value pairs that map human readable names (keys) to the cryptographically generated names (values) of OpenCM objects of interest. Since the names of the OpenCM archived objects are not easily remembered (given the cryptographic nature of the names), the directory entries provide an essential means for defining easily remembered or easily understood names for referencing OpenCM objects.

Most needed entries in the home directory are inserted automatically by the system, for such actions as creating new branches, adding users, and creating groups. These entries allow the user to manipulate the associated OpenCM objects via a human-readable name, since most OpenCM client commands accept either an absolute OpenCM object reference or a key from a user's home directory. The entries in OpenCM directories are user-specific and each user maintains total control over the entries in his own directory. So, even though the system will automatically enter key-value pairs on behalf of a user, the user can always delete or modify those entries. OpenCM provides a complete set of client commands for listing and modifying the contents of the home directory.
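
As a rough illustration of "either an absolute reference or a key from the home directory", the lookup might be pictured as below. The function name, the prefix test, and the placeholder URI are assumptions for this sketch; the real client's rules are not spelled out in this manual.

```python
def resolve(reference, home_directory):
    """Return an object URI for either a directory key or an absolute reference."""
    if reference.startswith("opencm://"):
        return reference              # already an absolute object reference
    return home_directory[reference]  # raises KeyError if the name is not bound


# Hypothetical home directory with a single entry.
home_directory = {
    "sample": "opencm://repo-id-placeholder/object-name-placeholder",
}

by_name = resolve("sample", home_directory)
by_uri = resolve("opencm://repo-id-placeholder/object-name-placeholder", home_directory)
```

Both calls name the same object; the directory entry merely saves the user from typing the unreadable form.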

1.4 More on Names

All objects in an OpenCM archive are ultimately referenced via a Uniform Resource Identifier (URI). Although the URI format is commonly understood, OpenCM uses cryptographically generated names, which tend to make the URIs of objects unreadable. (This was the motivation for providing a home directory for each user as previously explained.) To generate a URI, OpenCM uses a combination of the universally unique repository ID (as the "network location" part of the URI) and a universally unique random name (as the "path" part of the URI). Although unreadable, this aspect of OpenCM is absolutely essential for supporting distributed modification and replication of any archived object because any OpenCM archived object is guaranteed to have a unique URI associated with it.
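
To make the two parts of such a URI concrete, here is a small sketch. It assumes the opencm:// scheme shown later in this manual, and it assumes 160-bit random hexadecimal strings for both the repository ID and the object name; the real lengths and encodings may differ.

```python
import secrets
from urllib.parse import urlsplit


def new_object_uri(repository_id: str) -> str:
    """Build a URI from a repository ID and a fresh random name."""
    random_name = secrets.token_hex(20)  # 160 random bits; the length is assumed
    return f"opencm://{repository_id}/{random_name}"


repository_id = secrets.token_hex(20)    # stand-in for a real repository ID
uri = new_object_uri(repository_id)
parts = urlsplit(uri)                    # netloc = repository ID, path = "/" + name
```

Because the name is drawn at random from a very large space, two repositories can mint names independently with a negligible chance of collision, which is what makes distributed modification workable.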

1.5 Workspaces and Development Model

An OpenCM user interacts with a repository via a local Workspace. The user's Workspace is simply a local organization of one version (usually the latest) of a given branch. The Workspace is organized in a file system directory on the user's local machine. Any modifications to the objects are done locally. When ready, the user commits or uploads the entire set of modifications to the repository.

A typical development scenario is that a user checks out a branch of a repository to which he has access. He then makes modifications to the local copies of the repository objects, and when ready, commits those modifications back to the repository. For concurrent development, OpenCM supports merging different branches (assuming they have a common ancestor) and updating a user's Workspace to incorporate changes committed by other users. In addition, a user may create empty branches, as well as fork current branches in order to support new/parallel lines of development.

All development on a branch is done in the context of a Workspace. In other words, all direct modifications of objects and any indirect modifications as a result of merging or updating a branch happen only to the local copies of objects in a user's Workspace. No permanent changes to archived objects happen until a Workspace is committed to the repository.
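
The rule that nothing is permanent until commit can be illustrated with a toy workspace. The class and its fields are invented for this sketch; in particular, the repository branch is modeled as a plain Python list of configurations, which is only an approximation of the real thing.

```python
class Workspace:
    """Local copy of one configuration; edits stay local until commit()."""

    def __init__(self, repository_branch):
        self.branch = repository_branch          # list of committed configurations
        self.base = dict(repository_branch[-1])  # configuration that was checked out
        self.files = dict(self.base)             # local, editable copy

    def status(self):
        """Classify local paths relative to the checked-out configuration."""
        added = sorted(p for p in self.files if p not in self.base)
        deleted = sorted(p for p in self.base if p not in self.files)
        changed = sorted(
            p for p in self.files if p in self.base and self.files[p] != self.base[p]
        )
        return {"added": added, "changed": changed, "deleted": deleted}

    def commit(self):
        """Publish the whole workspace as one new configuration."""
        self.branch.append(dict(self.files))
        self.base = dict(self.files)


branch = [{"README": "v1", "main.c": "v1"}]  # hypothetical repository state
ws = Workspace(branch)
ws.files["main.c"] = "v2"                    # modify a file
ws.files["util.c"] = "v1"                    # add a file
del ws.files["README"]                       # delete a file
pending = ws.status()
before_commit = len(branch)                  # the repository is still untouched
ws.commit()
```

Until ws.commit() runs, the branch still holds exactly one configuration; afterwards it holds two, and the workspace reports no pending changes.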

1.6 Users and Access Control

This section needs to be filled in. It needs to describe:

2 Setting Up OpenCM

So far, OpenCM has been tested only on Linux systems. For now, you can obtain the latest copy of OpenCM from the OpenCM web site. OpenCM is delivered as a gzipped tar file. For Linux, it is also available as an RPM file.

2.1 Installing OpenCM

While it is possible to rebuild OpenCM from scratch, we recommend that you install OpenCM using the RPM mechanism. RPM keeps track of what files need to be deinstalled if you decide to remove OpenCM.

NOTE: There are other package management systems that do this as well. We do not currently support them simply because we do not know how to build packages for them. If you have a favorite package utility that you would like us to support, please file a bug report in the bug database. Take a moment to see if such a request already exists, and add your vote to it if appropriate.

2.1.1 Installing Using RPM

To install OpenCM with RPM:

# rpm -ivh opencm-0.1.2alpha5pl2-1.rpm

Installing the OpenCM RPM installs the necessary commands, this manual, and the help files under /usr/bin and /usr/share. The next step is to set up a repository.

You can perform a minimal test of the resulting installation by checking the version of the OpenCM system:

# cm version
Client: opencm version 0.1.2alpha5pl2

2.1.2 Rebuilding OpenCM From Scratch

Before downloading OpenCM, we recommend that you begin by creating an empty directory. Download the OpenCM tar file to that directory and follow the directions below.

First, unpack the tar file by using the following commands:

# gunzip opencm-0.1.2alpha5pl2-1.tar.gz
# tar -xvf opencm-0.1.2alpha5pl2-1.tar
If your tar command supports the -z option, you can skip the gunzip step by passing -z to tar. Note that some web servers will decompress the file before transmitting it, but will still give you a file name ending in .gz. If gunzip indicates that the file is not a compressed file, try using tar on the .gz file directly. What the heck, it might work.
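
If you are unsure whether a downloaded file is really compressed, one way to check is to let a library detect the format for you. The helper below is our own illustration (the function name is invented); it relies on the standard Python tarfile module, whose "r:*" mode opens an archive with transparent compression detection.

```python
import tarfile


def archive_members(archive_path: str) -> list:
    """List the members of a tar archive, compressed or not."""
    # Mode "r:*" asks tarfile to detect the compression itself, so a file
    # named .tar.gz that a web server already decompressed still opens.
    with tarfile.open(archive_path, mode="r:*") as archive:
        return archive.getnames()
```

If the call succeeds and the listed names begin with opencm/, the download is intact and can be unpacked with tar as described above, whatever its file name claims.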

You will now have a subdirectory called opencm/. Change into this directory using cd. Configure and compile OpenCM using autoconf and configure:

# cd opencm
# autoconf
# ./configure
# make

With a bit of luck, this will get OpenCM built for your system. You will need to have the OpenSSL system installed to perform this compile.

By default, OpenCM will configure to install under /usr/local. If you prefer to install the OpenCM software in the "official" locations, you should replace the configure line above with

# ./configure --prefix=/usr

Before you actually do the installation (which must be done as root), it is probably a good idea to check if OpenCM is vaguely working. To do this, you might try the OpenCM binary by typing:

# src/cm version
Client: opencm version 0.1.2alpha5pl2

If OpenCM has built successfully, you can install it (as root) using

# make install

OpenCM requires no special privileges. It can be run by any user for personal use. For use as a public repository, it should be run as a separate user account. Let's test it before we set that up.

2.2 Creating the Administrative User

Before you create the repository, you must first create the cryptographic keys that will be used by the administrator of the repository. OpenCM uses cryptographic methods (SSL/TLS) for user authentication. It does not rely on or use the host operating system for authentication at all. This is necessary for several reasons:

  1. If OpenCM relied on the host authentication mechanism, it would not be able to distribute successfully across different types of hosts.
  2. It is frequently desirable to authorize a user to use the OpenCM repository on your machine. Because OpenCM uses cryptographic authentication, it is not necessary to give this user an account on the host. This ensures that their access is restricted to OpenCM.
  3. Cryptographic authentication does not rely on a single administrator. Since OpenCM is designed as a globally replicatable system, dependency on a local administration policy would be inappropriate.

Even if your users do have a local account on the repository machine, they will require an OpenCM cryptographic account.

The administrative user has the authority to add new users to the repository, and to revoke access to the repository. Because of this, it is a very bad idea to use the administrative user's cryptographic key as your daily working identity. We will show how to add other users momentarily.

To create the administrative user key:

cm create user admin

You will be prompted for various information to create the X.509 keys.

When you create a user key, you must provide a valid email address. OpenCM has no way to check that you provided a valid email address, but the tool relies on the email address from the X.509 certificate to identify a user in a human-friendly manner and (soon) to notify subscribed users of changes to repository objects. For the Administrative user admin, we suggest using opencm-admin@yourhost.com, and setting up an appropriate email alias in your system's aliases file. Later, when the time comes to authorize users to access the repository, you should take care to check the email address they have provided.

OpenCM will also show the common name field and other information for user display purposes if provided, so it's a good idea to enter pertinent data for that entry as well. For the admin user, we suggest you use something like OpenCM Administrator for MumbleCo or OpenCM Administrator for MyProject. That way, you will be able to figure out later what project or activity this key goes with. Take the time to figure out what information the administrative key should provide. Hopefully, you are going to live with this key for a long time.

NOTE: Typing passwords can be a real pain in the neck, but you should always use a password for your OpenCM keys. The next version of the OpenCM client will include a built-in key management agent similar to ssh-agent to eliminate the need for this, so there is no good excuse for avoiding a password.

After the "create user" command has run, you will find that you have a new directory called .opencm under your home directory. The .opencm directory should now contain:

users/
  admin.pem
  admin.key

2.3 Creating a Repository

The next step in the setup process is to create a repository. If you are creating a repository that will run as a server, this is the tricky part.

OpenCM currently supports two file-based mechanisms for storing the repository:

fs
The fs storage type is a mechanism based on flat files. It is used primarily for testing OpenCM, and is not recommended for use as a working repository. In a fs repository, every version of every managed object is stored in a separate file.
gzfs
The gzfs storage mechanism is exactly like the fs mechanism, except that gzip compression is used. For now, newly created repositories should use the gzfs mechanism.

In the future, other delta-encoding storage mechanisms will probably be supported (most likely, the long awaited sxd2, which will fix most of the issues with the old sxd repository). In addition, support for database storage mechanisms has been considered, but is not available at this time.

Once you have an admin user key, you can create the actual repository. You will need to specify the path to the admin.pem file, which is the file created when you created the admin user previously. There is no need for this file to have any particular name. Whatever PEM file you supply will be accepted as the administrative user's identity for this repository. If you already have an administrator key set up from creating some previous repository, it can be reused. It is possible to change the administrator key later.

2.3.1 Creating a Personal Repository

OpenCM is designed to store all of its content in a single, server-wide repository. If you are creating a repository for personal use, this repository can be created using the create repository command:

$ cm create repository /path/to/directory gzfs admin.pem

When this command succeeds, the directory /path/to/directory will exist, and a new OpenCM repository will have been created.

After you create the repository, you can see if you got this right by typing:

$ cm --repository file:/path/to/directory version

2.3.2 Creating a Public Repository

When OpenCM is installed, it creates a new, non-privileged user called "opencm." To reduce any risk from server compromise, the OpenCM daemon (the public server process) runs as this user. Unfortunately, this makes creating the public repository slightly more complicated. You will only need to do these steps once.

The first step is to become root on the server machine. Since the OpenCM repository will probably live inside a directory that the OpenCM user cannot write, you'll need to create the top-level repository directory by hand. We will use /home/OPENCM as our example repository directory. Make this directory writable to the opencm user, and you can then run the create repository command using the su command:

$ su
password: type root password here
# mkdir /home/OPENCM
# chown opencm:opencm /home/OPENCM
# su opencm -c "cm create repository \
                /home/OPENCM gzfs /path/to/admin.pem"

Note that we had to break up the last line for publication. The last command should be typed on a single line.

Before you can use the public repository, you also need to edit the file /etc/opencm.conf. Find the line with the OPENCM_REPOSITORY variable. This variable is commented out in the distributed configuration file. Change this line to read:

OPENCM_REPOSITORY=file:/home/OPENCM

The opencm server will not start automatically unless this variable is set in /etc/opencm.conf.

2.3.3 What Happened

When you create the new repository, OpenCM automatically creates several objects in the repository:

2.3.4 Locating the Repository

Specifying the --repository option for every command quickly becomes awkward. When trying to find your repository, OpenCM checks the following places in order:

  1. The --repository command line option, which expects a URI:
    $ cm -u admin --repository file:/path/to/directory version
    
  2. Your workspace (discussed below), which identifies the repository you used when you checked out your working copy.
  3. The OPENCM_REPOSITORY environment variable:
    # export OPENCM_REPOSITORY='file:/path/to/directory'
    

Now that you have a repository created, it is a good idea to set the OPENCM_REPOSITORY environment variable to tell OpenCM where it is. From this point forward, we will assume that you have set the OPENCM_REPOSITORY variable - otherwise the example commands get too long.
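
The search order can be summarized as a short function. This is a sketch of the documented order only; the function and parameter names are ours, and how the real client reads the workspace is not modeled.

```python
import os


def locate_repository(cli_option=None, workspace_repository=None, environ=None):
    """Return the repository URI using the documented search order."""
    environ = os.environ if environ is None else environ
    candidates = (
        cli_option,                        # 1. --repository on the command line
        workspace_repository,              # 2. repository recorded in the workspace
        environ.get("OPENCM_REPOSITORY"),  # 3. environment variable
    )
    for uri in candidates:
        if uri:
            return uri
    raise LookupError("no repository specified")


env = {"OPENCM_REPOSITORY": "file:/path/to/directory"}
from_env = locate_repository(environ=env)
from_cli = locate_repository(cli_option="file:/other/repo", environ=env)
```

An explicit --repository option always wins, which is why the environment variable is safe to leave set while you experiment with a second repository.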

NOTE: A future version of OpenCM will use a systemwide configuration file to locate the repository for the opencm server command, allowing the server to more easily be started from the system startup scripts.

2.3.5 Verifying The Repository

Now that you have created the repository, it is possible to use the new administrative key to test the newly created repository:

$ cm -u admin version
Client: opencm version 0.1.2alpha5pl2
Repository: opencm version 0.1.2alpha5pl2

If the version command is able to make a connection to the repository, it reports the version of the server as well as the client. If the command reports a repository version, you can have reasonable confidence that the repository has been created correctly.

2.4 Creating Users

The repository is now ready to use, but using it with the administrator account is generally not a bright idea. A better approach is to create a user identity for each user who will use the repository. This proceeds in two steps:

  1. Each user creates a new public/private key pair for themselves, and provides the public key to the repository administrator.
  2. The repository administrator adds an account for the user.

2.4.1 Creating the Key Pair

To create a new public/private key pair, you can proceed exactly as you did when creating the administrator key above. At this point, you are creating a key for your own use. This tutorial assumes that your name is "Jack."1 To create the key for Jack, type:

cm create user jack

Once again, be sure to use a valid email address.

The create user command creates two files. The jack.key file is the private key for the new user. The jack.pem file is the public key and certificate for the new user. The private key should be secured by using an appropriate pass phrase. The private key file is never an argument to any OpenCM command. The server administrator has no need to see your private key. Indeed, this is why creating the user key and adding a new user to a repository is done in separate steps.

2.4.2 Creating the User Account

Once a user key pair has been created, an account for this user needs to be created on the repository. This step must be done by a member of the repository administrative group. The user should send the PEM file to the administrator to have the account created. If you send the PEM file via email, it is probably a good idea to send it as an attachment rather than as part of the message body - some mail agents mangle long lines of text, and it may be difficult for the administrator to correctly reconstruct your certificate if this happens.

When the administrator receives the certificate, they can authorize this user by typing:

cm -u admin adduser jack.pem rw

The argument jack.pem is the path to the certificate file for the new user (NOT the private key file). The final argument to this command indicates the overall repository access for the new user. Choices are r (read only access), rw (read and write access), and w (write only access). A user who has read-only access can obtain material from the repository but cannot modify anything. Write-only access is a bug that should be removed.

Jack is now ready to start using OpenCM for real work.

2.4.3 What Jack Sees

If you are Jack, and you have followed the setup procedure up to this point (that is, you are also the administrator), your .opencm directory should now contain:

users/
  admin.pem
  admin.key
  jack.pem
  jack.key

When the new user Jack is created in the repository, a directory will be created for them as well. Any user can list the content of their main directory with the ls command:

$ cm -u jack ls
[User] self         # Jack's user object
[Group] everyone    # the group of all users

2.4.4 Deciding Which User to Use

Specifying the -u option for every command quickly becomes awkward. OpenCM determines the name of the certificate file to use in the following order:

  1. If the command line -u name has been supplied, OpenCM will use this name.
  2. Otherwise, OpenCM uses the name specified by the OPENCM_USER environment variable.
  3. Otherwise, OpenCM assumes that the user name is default.

Once a name is chosen, OpenCM uses the certificate and key found in ~/.opencm/users/name.pem and ~/.opencm/users/name.key.2 If you have a particular key that you wish to use by default, the easiest thing to do is to copy (or create symbolic links to) the PEM and key files:

# cd ~/.opencm/users
# cp jack.pem default.pem
# cp jack.key default.key

Future versions of OpenCM will provide an automated means to specify a preferred key on a server by server basis.

For the balance of this manual, we will assume that you have either created a default.pem and default.key file or that you have set the OPENCM_USER variable to your preferred user name.
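
For reference, the selection rules from this section can be written down as a small function. It is an illustration of the documented order and file locations, with invented function and parameter names, and is not taken from the OpenCM sources.

```python
import os
from pathlib import Path


def select_user(cli_user=None, environ=None, home=None):
    """Return (name, certificate path, key path) using the documented order."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    # 1. -u on the command line, 2. OPENCM_USER, 3. the literal name "default".
    name = cli_user or environ.get("OPENCM_USER") or "default"
    users = home / ".opencm" / "users"
    return name, users / f"{name}.pem", users / f"{name}.key"


name, cert, key = select_user(environ={"OPENCM_USER": "jack"}, home="/home/jack")
```

With neither -u nor OPENCM_USER set, the function falls through to default.pem and default.key, which is why copying or linking your preferred key to those names works.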

2.4.5 Pass Phrases

You may come to feel (as we do) that typing a pass phrase for every OpenCM command is impossibly cumbersome. Future versions of OpenCM will make it possible to avoid retyping a password for this key. As a temporary measure, if you feel that your machine's login mechanism provides adequate security for your short-term needs, you can remove the password as follows:

# cd ~/.opencm/users
# openssl rsa -in jack.key -out tmp.key
# mv tmp.key jack.key

We cannot really encourage this practice, but we do understand why it is tempting. We are working to integrate a password manager into OpenCM as quickly as we can.

3 Client/Server OpenCM

OpenCM is designed as a client/server application. In most installations, a server process (daemon) will manage the repository. This chapter describes how to configure client and server, and how to start the built-in OpenCM CGI support.

3.1 Setting up a Network Repository Daemon

Once you have created an OpenCM repository, you can start up OpenCM as a server process by executing the cm server command:

cm server file:/local/path/to/repository

To avoid accidentally exposing the wrong repository, the cm server command does not honor the OPENCM_REPOSITORY environment variable or the settings of your local workspace (if any). It requires that the local repository URI be explicitly provided on the command line.

Once started, the server listens for client connection requests, authenticates clients, and then processes OpenCM client commands sent over the OpenCM wire protocol. The daemon will remain running until explicitly terminated (e.g. via SIGHUP). A server process is not required for OpenCM operation, but without it, remote access to the repository is unavailable.

3.2 Using a Network Repository (Client)

To tell the client that it should access a network repository, simply specify an OpenCM URI as the value of the OPENCM_REPOSITORY environment variable:

# export OPENCM_REPOSITORY='opencm://host-name'

Any valid host name or IP address can be used as the repository host name. Assuming that you are an authorized user on the server, all requests will now be processed using the remote repository.

By now you may have noticed that OpenCM objects are named by URIs of the form:

opencm://host-crypto-name/object-crypto-name

Eventually, it will be possible to connect directly to a publicly registered repository using only the cryptographic host name.

3.3 OpenCM CGI Support

OpenCM provides built-in support for execution as a CGI program....

Describe how to set up the CGI service here.

4 Basic Usage

To get a sense of how to use OpenCM, let's go through the process of creating a new project and making a few changes. Along the way, we will introduce the key ideas on which OpenCM is built.

4.1 Starting a new Project

This section describes how to create a new project, and introduces the notion of a "repository name."

4.1.1 Creating a New Project

If you are starting a new project from scratch, you can create a new line of development by typing:

$ cm create project sample

This creates a new, empty line of development and binds it to the name sample in your OpenCM directory.

You will be prompted for various information about the new project, including a name and a description of the project. The name is used by various OpenCM commands to describe what you are working on. It is a good idea to keep it descriptive but short, much as you would a conventional file name.

4.1.2 Importing an Existing Code Base

If you already have a directory of files, you can create a new project incorporating them in one step. First, place all of the files in your project under a common (UNIX) directory somewhere. There should be nothing in this directory other than your project files. Make this directory your current directory. Then type:

$ cm import sample
# other commands might add or drop files here
$ cm commit

In effect, the import command creates a new project, checks it out, and automatically adds all of the files in the current directory and below. At this point, you can type other commands to add or remove files if you like. The commit command then uploads all of this state to the repository.

4.2 Repository Names

At this point, we should digress briefly to introduce the first of the three OpenCM namespaces: repository names.

Repository names are names that appear in a user's repository directory. Each OpenCM repository maintains a directory space for every user. The main function of this space is to provide a place to keep track of work that you have in progress. In our project creation example:

$ cm create project sample

the name sample is a repository name. As with conventional files, objects in the repository can be renamed or deleted. The mv and rm commands are reserved for manipulating managed content, so for repository names we use rebind and unbind. To rename an object in the repository, use the rebind command:

$ cm rebind sample new-name

The project should now be called new-name. You can see this by using the cm ls command:

$ cm ls
everyone    # the group of all users
new-name    # the project you just renamed
self        # the user object for the administrative user
users       # a directory containing entries for each user
            # this is now redundant and may soon go away.

Repository objects can be organized into subdirectories. To illustrate this, let's create a directory to hold the lines of development associated with this new project, and relabel our new line of development as the main line of development for the project.

$ cm mkdir my-project
$ cm rebind new-name my-project/main

Finally, you can remove an object. We won't do it here, because we will need this new project for the rest of the chapter, but assuming that your object's repository name is ties, you could remove it by:

$ cm unbind ties

The object will not actually go away until all names binding it are removed. Once all bindings are gone, the object will be removed the next time the repository store is garbage collected.
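
The relationship between names and objects can be pictured with a small Python model. This is a toy sketch of the behavior described above, not OpenCM's implementation: the Directory class, its method names, and the object ids are all hypothetical.

```python
class Directory:
    """Toy model of a repository directory: names bound to object ids."""

    def __init__(self):
        self.bindings = {}    # name -> object id
        self.store = set()    # object ids currently held in the store

    def bind(self, name, obj):
        self.store.add(obj)
        self.bindings[name] = obj

    def rebind(self, old, new):
        # Renaming moves the binding; the object itself is untouched.
        self.bindings[new] = self.bindings.pop(old)

    def unbind(self, name):
        # Only the name goes away; the object stays until collection.
        del self.bindings[name]

    def collect_garbage(self):
        # Drop every stored object that no name refers to any more.
        live = set(self.bindings.values())
        removed = self.store - live
        self.store &= live
        return removed


d = Directory()
d.bind("ties", "obj-1")
d.bind("ties-alias", "obj-1")   # a second name for the same object
d.unbind("ties")
print(d.collect_garbage())      # set(): "ties-alias" still binds obj-1
d.unbind("ties-alias")
print(d.collect_garbage())      # {'obj-1'}
```

The point of the sketch is that unbind only ever removes a name. Whether the object disappears depends on whether any other name still refers to it when the store is next collected.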

4.3 Using the Workspace

To start working on your new project, create an empty (UNIX) directory somewhere, make it your current directory and type:

$ cm checkout sample

This will check out a copy of the previously created sample project. If you do not specify a particular version, OpenCM will check out the most recent version of the line of development.

Since the sample project has no files yet, you will not initially see anything new appear in your current directory. Running the UNIX ls -a command, however, will reveal that a new OpenCM workspace has been created. You will find that there is now a .opencm directory here. Within this directory you will find a file named Workspace that records information about your workspace and the pending operations that need to be committed.

4.3.1 Adding Files

To add a file to the sample project, you first need to create the file using a text editor. Assuming you called the file hello.c, you can then type:

$ cm add file hello.c
A ./hello.c

The output from this command shows that hello.c was added to the project. Another way to display this information is with the cm status command.

$ cm status
[ A       ]      65 ./hello.c

In addition to displaying the flag 'A' to show that this file was added to the project, this command displays the length of the file.

Something to notice here is that you did not need to specify a repository. When you are operating within a workspace, OpenCM learns the identity of the repository by consulting the Workspace file. The correct Workspace file is located by proceeding up the file tree from the current directory until a directory containing a .opencm directory is found.

OpenCM assumes that the workspace's repository should take precedence over the one specified in your OPENCM_REPOSITORY environment variable. If you need to override the workspace repository for some reason, use the --repository option.
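
The upward search for a .opencm directory can be sketched in a few lines of Python. This is an illustration of the idea rather than OpenCM's actual code: the function name is hypothetical, and the sketch assumes that the search begins with the current directory itself before moving to its parents.

```python
from pathlib import Path
from typing import Optional


def find_workspace_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above start holding .opencm."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".opencm").is_dir():
            return candidate
    return None  # reached the top of the file tree: not in a workspace


root = find_workspace_root(Path.cwd())
print(root if root else "not inside an OpenCM workspace")
```

Because the nearest enclosing directory wins, running a command from any subdirectory of a workspace finds the same workspace root.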

4.3.2 Checking Status, Committing Changes

When you are satisfied with your current round of changes, the cm commit command can be used to check them in to the repository. Once the changes are in the repository, you can recover this particular state of your workspace at a later time.

Before we go ahead and commit these changes, let's take a look at the current state of the project:

$ cm ls sample
[Version] 0

Once again, note that we didn't need to specify a repository because we are operating from within a workspace. If we specifically list version 0 of this line of development, we will see that there are no files in it:

$ cm ls sample/0
# OpenCM prints no response -- this could be clearer, but it is correct.

Now, let's go ahead and commit our new hello.c file into the repository. The cm commit command will prompt you for a message indicating what this commit is about:

$ cm commit
[ A       ]      65 ./hello.c
opencm: Uploading CommitInfo Record...
opencm: Uploading "./hello.c"

After the commit, the cm status command shows nothing, because by default it does not display information about unmodified state in the workspace:

$ cm status
# OpenCM prints no response

To see a more detailed status view, try the -v option:

$ cm status -v
[same     ]      65 ./hello.c

Once again listing the project, we see that it now has two revisions:

$ cm ls sample
[Version] 0
[Version] 1
$ cm ls sample/1
[Entity] hello.c
# shap: the various outputs need cleanup for consistency.

4.3.3 Modifying and Renaming Files

Let's copy the file so we can illustrate rename, modify, and delete:

# First, the UNIX command to copy the file:
$ cp hello.c hello2.c
# Next, add and commit this new file:
$ cm add file hello2.c
N ./hello2.c
$ cm commit -m 'New file for demo purposes'
[ A       ]      65 ./hello2.c
opencm: Uploading CommitInfo Record...
opencm: Uploading "./hello2.c"

The -m option can be used to avoid bringing up an editor for short messages by providing them directly on the command line.

While the output badly needs to be cleaned up, you can see this commit message using the cm show object command:

$ cm show object sample/2
URI:          opencm://rIrdJD4vIwPKANxDrEAxLYaQ3ZWQSA/9v8eiEdriVAD4CG76VNhQrUfYXrKjC
Sequence No:  4
nRevisions:   3
Read Group:   opencm://rIrdJD4vIwPKANxDrEAxLYaQ3ZWQSA/xgSjAvXiK3YzGEpvhD-kw4gbthZLkg
Mod Group:    opencm://rIrdJD4vIwPKANxDrEAxLYaQ3ZWQSA/xgSjAvXiK3YzGEpvhD-kw4gbthZLkg
Notify:       <null>
Name:         Sample
Description:  This is an initial creation of a sample project for purposes of
document writing.

Flags:        0x0
Signature     0e12b666a734d423b4663f9d4ebef461f39426e6ca781cbb8c1a0dd21ee67f1fbf51a07d5eec63fc662b153b5af7f249c1d76875bbb4906d1ef1bd00ee4d5fc6630d3eb9fe35481daf219981dc8063398dbbc03b3d3f321c16eb5dc1e36399e4599ee2ab25398e12d1a494fd43de0ace642752140c5fac9159a32fff876daf83701274b918296ed6077449814800279ea89ded6582f2ec4ad6717f925ec3ef8febe990f08a34634ccd5b93e25b45f4fcd82c2dd522fa93c18907344141e5169c2eb9f5580ed094f4fe726764243f19605b0d5d4274c18f746c12933f999f37e646e1bd13068ebd1af675e32e459dcd0204b86dd152f80ff78c8fd3c4004cf2d4

[Change]
Change:       "chg_sha1_B53m9BMP-kwMwaHLNkAQj73eBKpsMK"
File ACLs:    "<null>"
CommitInfo:   "cmt_sha1_rR0IGBoWy9hLA2y7CFA6--m3E1cDOH"
Parent:       "chg_sha1_-CPBrDEeVETNQB4BpDACsjyeP5GksJ"
Merge Parent: "<null>"
Entities:
   0  ent_sha1_NvCPXBcfKSGDQlGA76AMu+o4VyQnjA
   1  ent_sha1_t+oIyCUnEaeLgO--HbAtTKop9auUSB

Right now, the show command is not greatly useful because it displays information mostly in internal terms. The CGI browser interface does a better job at the moment, and we expect significant improvements in the next version of OpenCM. Some of the items that are shown are important for collaboration, though, and we will discuss them further in the next chapter.

For now, let's change hello2.c to be a "goodbye" message:

# After editing:
$ cm status
[  M      ]      67 ./hello2.c

The M flag indicates that the file has been modified.

The name hello2.c at this point really isn't appropriate, and we should probably rename the file. If you were not using a configuration management system, you would do this with the UNIX mv command. Watch what happens if you do that:

$ mv hello2.c goodbye.c
$ cm status
opencm: File "./hello2.c" is missing.
? ./goodbye.c

OpenCM is warning you that something confusing has happened. At this point (don't do this!), you could add the goodbye.c file and remove the hello2.c file by typing:

$ cm add file goodbye.c
N ./goodbye.c
$ cm remove file hello2.c

but this is not the best way to proceed. If you do this, OpenCM will not know that the file was renamed, and it will have difficulty performing merges correctly for this file.

Let's put the file back to its original name and do things the right way:

# Restore the file back to its original name:
$ mv goodbye.c hello2.c
$ cm mv hello2.c goodbye.c
$ ls
goodbye.c  hello.c
$ cm status
[  M  N   ]      67 ./goodbye.c
        Name was: "./hello2.c"

OpenCM has renamed the file on your behalf, and it knows that the file has been renamed. When you commit this change and other people fetch it from the repository, OpenCM will be able to correctly preserve any working changes they may have made to hello2.c in their workspace.

$ cm commit -m 'Created goodbye world and renamed appropriately'
[  M  N   ]      67 ./goodbye.c
        Name was: "./hello2.c"
opencm: Uploading CommitInfo Record...
opencm: Uploading "./goodbye.c"

4.3.4 Other Useful Commands

OpenCM provides a means to record "notes" about your work in progress. The cm note command allows you to record stray thoughts as you work so that you can later incorporate them into your commit message. OpenCM remembers these notes in its workspace records. When it later pops up an editor for you to enter a commit message, it includes these notes in the file, allowing you to edit them for incorporation into your commit message.

OpenCM provides both summary help and specific command help:

$ cm help
The opencm commands are:

add         adduser     admin       bind        browse      checkout
commit      create      debug       diff        gadd        gremove
help        log         import      logmail     ls          merge
mkdir       mv          ndiff       note        notes       options
rebind      revert      rm          server      set         show
status      tag         unbind      update      version     whoami
$ cm help commit
commit [fsname]*

Uploads local changes to the repository. If one ore more fsnames are
provided, only those files will be commited. Otherwise, all local
changes are uploaded.

See also: import, tag

One last commonly used option is -z[0-9], which enables zlib compression when running commands against a remote server. By default, -z3 is in effect, which seems to provide most of the benefit compression can provide without overly burdening the CPU. In some cases, you may wish to disable compression completely, with -z0.
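
To get a feel for the trade-off between compression levels, you can experiment with zlib directly. The following sketch uses Python's standard zlib module on a made-up payload. It says nothing about OpenCM's wire protocol, and the idea that -zN corresponds to zlib level N is our assumption for the purposes of the illustration.

```python
import zlib

# Hypothetical payload standing in for file content sent to a server.
payload = b"int main(void) { return 0; }\n" * 200

for level in (0, 3, 9):
    packed = zlib.compress(payload, level)
    assert zlib.decompress(packed) == payload  # lossless at every level
    print(f"level {level}: {len(payload)} -> {len(packed)} bytes")
```

Level 0 stores the data without compressing it, so its output is slightly larger than the input. Higher levels spend more CPU time looking for a smaller encoding.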

4.4 Branching and Merging

Sometimes it is appropriate to make a sequence of changes experimentally. If the change is small, it may be perfectly fine to do this in your workspace, but some changes are larger and more complicated than this. Often, you want to save your work at various points so you have a way to back up. In many configuration management systems, this is more trouble than it is worth, and users avoid doing it. In OpenCM it is relatively easy to do.

The first step is to create a new "branch." A branch is simply a line of development that diverges from some existing line of development. It may represent a new product version that will never rejoin the original effort, or it may be a private "work in progress" that will eventually be merged back into the main code base. In this case, we will use a somewhat contrived example.

In OpenCM, you begin a divergent line of development using the create branch command:

$ cm create branch sample mybranch

This command creates a new line of development in the repository called mybranch that is derived from the sample branch. You can now check out mybranch in a new workspace directory and continue your work.

NOTE: This command is presently misimplemented. It is currently a repository manipulation command, when it should be a workspace command that does not require a new checkout. There needs to be a way to "fork" an existing line of development without checking anything out, but it shouldn't be called create branch. We need to fix this for the 0.1 release.

Having checked out the new branch, you can proceed with changes to it. Let's add the well-known missing period in hello.c:

$ cm status
[  M      ]      66 ./hello.c
$ cm commit -m 'Brian cannot spell.'

Now we can go back to our workspace for the sample line of development. You may need to check it out again. You can now type:

$ cm merge mybranch
Keeping "./goodbye.c" -- unchanged from common in mergespace
Updating "./hello.c".
Removing "./hello.c"
Renaming ".opencm/scratch/mergeoutput-1" => "./hello.c"
Merge is complete. Please run 'commit' to save any changes.

This will merge the changes made on the mybranch line of development into the sample line of development. You should be able to see the effects using the cm status command:

$ cm status

You can now commit this change integration using the cm commit command.

Unlike CVS, which doesn't really know what commits are, OpenCM remembers the merge operations you have done. If you are working on a long-running effort that you intend to merge, it is likely that the baseline will move several times before you are ready to do your final integration. If this is so, you can merge the baseline into your working branch any number of times to minimize the size of the changes, and then merge your work back into the baseline at the end of your efforts.

4.5 Tagging

As we've learned, each new branch, or line of development, can be mapped to a human-readable name in a user's OpenCM directory. However, the human-readable name of a branch, by default, always refers to the latest version of that branch. Frequently, it's useful to keep track of one specific version of a branch. OpenCM allows you to tag specific branch versions with a unique name, so they're easier to manipulate. For example, you may want to tag a version of development that represents what was shipped to a customer. Such a tag would allow you to always view the exact state of your development at the time it was shipped. You can always specify a specific version of a branch by appending a forward slash and the version number to a branch specification, but a tag allows you to identify the version using a unique name.
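
The "branch, slash, version number" convention can be illustrated with a short Python sketch. This is not how OpenCM parses its arguments. The function name is hypothetical, and the rule used here (a trailing all-digit component is a version number) is an assumption made for illustration.

```python
from typing import Optional, Tuple


def split_version_spec(spec: str) -> Tuple[str, Optional[int]]:
    """Split 'branch/N' into (branch, N); N is None when omitted."""
    head, sep, tail = spec.rpartition("/")
    if sep and head and tail.isdecimal():
        return head, int(tail)
    return spec, None  # no version given: means the latest version


print(split_version_spec("sample/1"))         # ('sample', 1)
print(split_version_spec("sample"))           # ('sample', None)
print(split_version_spec("my-project/main"))  # ('my-project/main', None)
```

Under this rule a branch whose last name component consists only of digits would be ambiguous, which is one more reason a tag with a unique name is easier to work with than a bare version number.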

5 Using OpenCM in Multiperson Teams

Now that you have the basics of using OpenCM, we can turn to using OpenCM in team situations.

5.1 Users and Groups

OpenCM enforces access controls on every repository object. Each object has an authorized writer and an authorized reader. The authorized reader/writer can refer to either a user or a group. In this section, we will show how to create a new group and make the sample project accessible to this group.

The first step is to create a group that will control our project:

$ cm create group mygroup
Group name: shared project group

By default, the creating user is automatically added to any group they create, and becomes the authorized reader and writer for that group.

Since this project is to be collaborative, we will want this group to control both reading and writing for our project. We also want to let anyone in this group control access to the group:

$ cm set group sample mygroup rw
$ cm set group mygroup mygroup rw

The second command is a bit tricky: it means that anybody who is already in mygroup can add or remove people from mygroup. This authority should definitely be treated with care.

If we wanted to make this a publicly readable project, we would set its reader group to everyone:

$ cm set group sample everyone r

Now that we have done this, anybody who knows the name of this group can add it to their workspace and use it.

Groups are recursive: they can contain other groups. Group membership is transitive.
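
Transitive membership means that a user belongs to a group if they appear in it directly or in any group nested inside it. Here is a minimal Python sketch of such a check. The function, the data layout, and the user names are hypothetical, and OpenCM's own membership test may well differ.

```python
def is_member(user, group, groups, _seen=None):
    """True if user belongs to group directly or through nested groups.

    groups maps a group name to the set of its direct members, each of
    which is either a user name or another group name.
    """
    seen = set() if _seen is None else _seen
    if group in seen:          # guard against groups that contain each other
        return False
    seen.add(group)
    for member in groups.get(group, ()):
        if member == user:
            return True
        if member in groups and is_member(user, member, groups, seen):
            return True
    return False


# Hypothetical membership data for illustration only.
groups = {
    "mygroup": {"alice", "reviewers"},
    "reviewers": {"bob"},
}
print(is_member("bob", "mygroup", groups))    # True, via reviewers
print(is_member("carol", "mygroup", groups))  # False
```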

5.2 Binding Objects

Suppose that I have created a line of development and added you to the write group for it. How do you gain access to it so that you can work with it?

The answer is that you need to know the object's URI. The object URI is a cryptographic name that is universally unique. The URI of an object can be discovered by using the cm show command on the object:

$ cm show sample
URI:          opencm://Amw2iBkmqxZBgmsWYrAFe1wDQrKTDC/tGgt8XvIthbquj6BudgpYL2IhWKgoR
Sequence No:  4
.....

Any user who knows this URI can bind this object into their own workspace:

$ cm bind opencm-sample opencm://Amw2iBkmqxZBgmsWYrAFe1wDQrKTDC/tGgt8XvIthbquj6BudgpYL2IhWKgoR

Binding an object means that you have created a directory entry for it. It does not necessarily mean that you have access to the object. Access is still controlled by the read and write groups for the object.

In practice, it is inconvenient to bind objects one at a time in this way. A more convenient solution is to create a shared directory for the project and construct a shared binding for the directory. Once the directory is shared, all objects within it can be seen by all parties. It is still necessary to set the access rights on the object to allow the desired access.

5.3 Updating Your Workspace

Now that two or more users can check out an object and work on it, it becomes possible for modifications to overlap. You and I may both check out a branch. Suppose that I commit my changes before you do. You may want to incorporate these changes into your workspace and continue working. It is necessary that you do this before you commit so that work will not be lost.

To integrate changes made by another person into your workspace, use:

$ cm update

This command performs a three-way merge. It updates the baseline version used in your workspace to be the latest version on the branch, and it attempts to re-merge your modifications into the new version.
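
The idea behind a three-way merge is easiest to see at the granularity of whole files. The Python sketch below compares three snapshots: the baseline your workspace was checked out from (base), your workspace (mine), and the latest version on the branch (theirs). It is a simplified illustration with hypothetical names. It treats each file as a unit, whereas a real merge also has to combine changes within a file, and it is not OpenCM's algorithm.

```python
def merge3(base, mine, theirs):
    """Three-way merge of {path: content} maps; returns (merged, conflicts)."""
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(path), mine.get(path), theirs.get(path)
        if m == t:          # both sides agree (or neither changed it)
            result = m
        elif m == b:        # only the other side changed it
            result = t
        elif t == b:        # only my side changed it
            result = m
        else:               # both changed it differently
            conflicts.append(path)
            result = m      # keep my copy for manual resolution
        if result is not None:   # None means the file is absent or deleted
            merged[path] = result
    return merged, conflicts


base = {"hello.c": "v1", "goodbye.c": "v1"}
mine = {"hello.c": "v1", "goodbye.c": "v2-mine"}
theirs = {"hello.c": "v2-theirs", "goodbye.c": "v1"}
merged, conflicts = merge3(base, mine, theirs)
print(merged["hello.c"])    # v2-theirs
print(merged["goodbye.c"])  # v2-mine
print(conflicts)            # []
```

The baseline is what makes this work: without it there would be no way to tell which side changed a file and which side merely still has the old copy.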

Usually, updating eagerly is the best thing to do. This way your workspace never gets too far out of date. Sometimes a change is large and you may wish to defer the update. This is also a reasonable strategy for development. Be aware, however, that OpenCM will not let you commit your changes unless your workspace is up to date.

6 CVS Quickstart

This chapter is written for users who already know CVS. It describes how to import an existing CVS workspace into OpenCM. The goal is to give CVS users a way to learn the workings of OpenCM with low initial overhead, so that you can make an informed decision about whether to import an entire CVS repository.

6.1 OpenCM Command Summary

NOTE: Use the command-line opencm help feature to see details on any of the commands.

6.2 Importing a CVS workspace

7 Getting Started

7.1 Creating a Project

7.2 Working with Files

7.3 Committing

7.4 Updates

8 Command Reference

9 Handling Repository Upgrades

Most releases of OpenCM are simple. The only change they make to existing repositories is to upgrade the version number. OpenCM will automatically perform simple upgrades like this the first time it is run on a given repository, without asking you for permission.

On rare occasions, however, it is necessary to upgrade or convert the repository in a significant way. This may occur because of a change in the repository schema, or because of a change in the supported repository types. For example, we did this type of upgrade when we dropped the old SXD version 0 repository type.

When we write upgrade procedures, we typically build them to be done "in place," converting each object and then deleting the old object as soon as it is fully converted. We do this because repositories can be very large (ours is currently 1.6 gigabytes), and an in-place conversion avoids the need to consume very large amounts of disk space - which can cause conversion to fail if the disk space runs out. Also, upgrades can take a while on a large repository, and we don't want users to wonder why OpenCM seems to be doing nothing.

We are careful, but no matter how careful we may be there is no such thing as a perfect upgrade process. Therefore, this type of "invasive" upgrade is never done automatically. If a potentially destructive upgrade to the repository is needed, OpenCM will complain and exit. To get the upgrade to happen, you should run

$ cm upgrade --repos file:/path/to/directory ls

OpenCM repository files have an owning "user" in the eyes of the native operating system. If you are running the upgrade on a personal repository, you should run it as whatever user you would normally use. Typically this will be yourself. If you are running the upgrade on a server repository, you should run this as user opencm, or whatever user owns your repository files according to your operating system.

It is strongly recommended that you make a backup of your repository directory before doing any upgrade of this kind. In at least one of the OpenCM alphas, we got the upgrade wrong and were very glad to have a backup ourselves!

10 Futures and Things To Do

11 Holding Buffer

As a system designed for archival purposes, OpenCM never deletes data on the repository. Thus, it treats all stored data as either frozen or mutable. The content of any archived object is frozen. If an archived object is allowed to be modified, its frozen content is referenced by an additional mutable object that chains together the history of modifications to the underlying content. The state of any mutable object is determined by a version index stored directly in the mutable object and a sequence number in that object's revision record. Thus, a trail of versions can be established for any mutable data by referencing these version and sequence numbers. The repository itself has no other knowledge of the stored data.

======================== Command line options:

long opts:

short opts:


Footnotes

  1. If your name isn't Jack, and you don't know your name, contact your local administrator for support.

  2. The OpenCM --configdir option can be used to override the location of the .opencm directory.