The Future of Distributed Software Development on the Internet
                From CVS to WebDAV to Delta-V
                By Jim Whitehead
                Every day, developers with a shared software vision band 
                together from around the world to develop Open Source software. 
                A similar trend occurs in the corporate world: Large companies 
                with physically dispersed divisions create distributed teams to 
                work together on software projects. Cross-organizational 
                projects also occur with greater frequency, such as a 
                subcontractor working closely with a primary systems-integration 
                contractor on a large project.
                These geographically dispersed teams share the same needs for 
                distributed source-code control. When it comes to working on the 
                design documents, test cases, specifications, and source code 
                that comprise the project, individual team members need to work 
                on pieces in isolation, then integrate those pieces with the 
                modifications of their coworkers, without clobbering anyone 
                else's changes. Changes need to be tracked so that errors and 
                exploratory design changes can be undone easily. Tracking 
                creates a group memory of how files have changed over time -- 
                valuable for later reconstruction of detailed design rationales. 
                Released and stable configurations of the project are tracked so 
                they can be regenerated quickly, and so that bug fixes can be 
                made to the appropriate release. These capabilities are all 
                provided by software configuration management (SCM) systems.
                SCM systems use a library metaphor to control access to 
                project documents and source code. At first, the SCM repository 
                holds all development files in a "checked-in" state. To work on 
                a file, one needs to check it out, just like taking a book out 
                of a library. Once changes are complete, the file is checked 
                back in, accompanied by brief comments describing the changes. 
                A checked-in file is immutable, and can't be changed again 
                without checking it out. 
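                The library metaphor can be sketched in a few lines of Python. 
                This is a toy model only: the Repository class and its method 
                names are invented for illustration and do not correspond to 
                the interface of any real SCM system. It assumes an exclusive 
                check-out, as with a library book.

```python
class Repository:
    """Toy model of the check-out / check-in cycle (hypothetical API)."""

    def __init__(self):
        self._revisions = {}    # path -> list of (content, comment) tuples
        self._checked_out = {}  # path -> user currently holding the check-out

    def add(self, path, content, comment="initial revision"):
        """Place a new file in the repository in the checked-in state."""
        self._revisions[path] = [(content, comment)]

    def check_out(self, path, user):
        """Hand the latest content to one user for editing."""
        if path in self._checked_out:
            holder = self._checked_out[path]
            raise RuntimeError(f"{path} is already checked out by {holder}")
        self._checked_out[path] = user
        return self._revisions[path][-1][0]

    def check_in(self, path, user, content, comment):
        """Store a new immutable revision and release the check-out."""
        if self._checked_out.get(path) != user:
            raise RuntimeError(f"{path} is not checked out by {user}")
        self._revisions[path].append((content, comment))
        del self._checked_out[path]
        return len(self._revisions[path])  # new revision number, 1-based

    def history(self, path):
        """Return the check-in comments, oldest first."""
        return [comment for _, comment in self._revisions[path]]
```

                Checking in without a prior check-out raises an error, which 
                is how the sketch enforces that a checked-in file stays 
                unchanged.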
                Once a change-tracking system is in place, it's possible to 
                view previous revisions of a file and see differences between 
                revisions. Another typical feature is viewing the change history 
                of a file -- listing the modification date for all revisions, 
                the person who made the change, and the comments he or she 
                submitted with the change. It's also possible to discard some 
                revisions -- a useful capability if an exploratory change 
                doesn't work out as intended.
                Revision tracking also makes configuration tracking possible. 
                Since any nontrivial software system is composed of multiple 
                source objects, which are described by multiple design and 
                requirements documents, freezing the state of an entire project 
                requires knowing the exact version of each file in the project 
                so that a consistent snapshot can be made. SCM systems provide 
                this capability, allowing users to create baselines that can be 
                used for testing and release tracking. Since all checked-in 
                revisions are immutable, it's possible to revert to a previous 
                project configuration, a critical capability for supporting 
                previously released software projects.
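                A baseline can be pictured as a table that maps each file to 
                one revision number. The sketch below assumes a hypothetical 
                in-memory store (a dictionary from path to a list of revision 
                contents); the function names are illustrative, not taken 
                from any SCM product.

```python
def make_baseline(revisions):
    """Record the current revision number (1-based) of every file."""
    return {path: len(history) for path, history in revisions.items()}


def restore(revisions, baseline):
    """Rebuild the project exactly as it was when the baseline was made."""
    return {path: revisions[path][rev - 1] for path, rev in baseline.items()}


# Hypothetical project: each list holds the immutable revisions of one file.
project = {
    "main.c": ["int main() {}"],
    "README": ["first draft", "second draft"],
}
release_1 = make_baseline(project)        # freeze the release configuration

project["main.c"].append("int main() { return 0; }")  # later development
snapshot = restore(project, release_1)    # regenerate the release
```

                Because old revisions are never modified, the baseline needs 
                to store only revision numbers, not copies of the files.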
                Remote Configuration Management with CVS
                Today, the distributed configuration management system of 
                choice for Open Source developers is the Concurrent Versions 
                System (CVS). Currently in use by the Apache HTTP Server and 
                Netscape Communicator (Mozilla) Web-browser Open Source 
                projects, CVS has many advantages for distributed teamwork. 
                Since CVS is itself an Open Source project, it's freely and 
                widely available. In addition to providing typical versioning 
                and configuration-management features, CVS also offers excellent 
                work isolation for team members, and the CVS client/server 
                protocol allows this teamwork to occur remotely. The cvsweb 
                utility allows CVS version histories, old revisions, and 
                differences between revisions to be browsed in a read-only 
                manner on the Web. CVS front ends have been developed for UNIX, 
                PC, and Mac systems, allowing developers from all platforms to 
                participate on a project (see the box titled "Giving 
                CVS a Facelift"). Since many Open Source projects use CVS, 
                there is a large and growing pool of developers who know CVS, 
                and understand how to use it for teamwork. In conjunction with 
                an email mailing list, a Web site giving project overview and 
                documentation, and a bug reporting and tracking system, CVS is a 
                key coordination infrastructure for performing collaborative 
                teamwork via the Internet.
                Jim Jagielski's article in this issue on the Apache 
                development process highlights how CVS is used on a successful 
                Open Source development project. Using the CVS 
                update-edit-commit work cycle, Apache developers are able to 
                work on source code on their local machines, thereby isolating 
                themselves from the changes made by other developers. When local 
                changes are complete, they are merged with the intervening 
                modifications of other developers, and then committed to the 
                central development server. 
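                The update-edit-commit cycle can be modeled roughly as 
                follows. This is a simplified sketch, not CVS itself: the 
                class names are invented, the repository holds a single file, 
                and the merge is a line-by-line three-way merge that assumes 
                edits never add or remove lines.

```python
def merge_lines(base, mine, theirs):
    """Toy three-way merge; assumes all three texts have the same line count."""
    b, m, t = base.splitlines(), mine.splitlines(), theirs.splitlines()
    if not (len(b) == len(m) == len(t)):
        raise ValueError("toy merge only handles edits that keep the line count")
    merged = []
    for base_line, my_line, their_line in zip(b, m, t):
        if my_line == base_line:
            merged.append(their_line)       # only the other developer changed it
        elif their_line in (base_line, my_line):
            merged.append(my_line)          # only I changed it, or we agree
        else:
            raise ValueError(f"conflict: {my_line!r} vs {their_line!r}")
    return "\n".join(merged)


class CentralRepo:
    def __init__(self, content):
        self.revisions = [content]

    @property
    def head(self):
        return len(self.revisions)          # number of the newest revision


class WorkingCopy:
    def __init__(self, repo):
        self.repo = repo
        self.base = repo.head               # revision this copy started from
        self.content = repo.revisions[-1]   # edited locally, in isolation

    def update(self):
        """Merge revisions committed by others since our base revision."""
        if self.base != self.repo.head:
            base_text = self.repo.revisions[self.base - 1]
            self.content = merge_lines(base_text, self.content,
                                       self.repo.revisions[-1])
            self.base = self.repo.head

    def commit(self):
        """Refuse to commit until intervening changes have been merged."""
        if self.base != self.repo.head:
            raise RuntimeError("working copy is out of date; update first")
        self.repo.revisions.append(self.content)
        self.base = self.repo.head
```

                In this model, a developer whose commit is refused runs 
                update, which merges the intervening changes into the local 
                copy, and then commits again.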
                CVS isn't the only configuration-management tool that 
                supports remote development teams. Commercial SCM systems 
                frequently provide this capability, examples being Rational 
                ClearCase MultiSite, Merant PVCS Replicator, and the 
                Continuus/DCM distributed change management product. Other Open 
                Source tools also offer distribution support, a notable one 
                being the Distributed Versioning System (DVS) available at the 
                University of Colorado. These systems are just the tip of the 
                iceberg. The Configuration Management Yellow Pages has an 
                exhaustive listing of existing commercial and Open Source 
                systems (see "Online").
                Today: Remote Web Authoring with WebDAV
                Exciting new work that's just starting in the Internet 
                Engineering Task Force (IETF) promises to make it easier to 
                perform remote collaborative project work over the Web. The new 
                effort is called Delta-V, and its goal is to provide versioning 
                and configuration management capabilities for the Web by 
                extending the Web's core protocol, HTTP. Using Delta-V, 
                collaborative teams will be able to edit the source code, 
                documents, Web pages, and binary graphics in a project, then 
                record important revisions and manage project configurations -- 
                all in-place on the Web. The Delta-V activity is building upon 
                the work of the WebDAV Distributed Authoring Protocol, an IETF 
                standard that has extended HTTP with operations for remote 
                collaborative authoring on the Web. Delta-V extends HTTP and 
                WebDAV with versioning, isolation of individual changes from 
                collaborators' changes, and SCM capabilities.
                The WebDAV protocol, the foundation on which Delta-V is 
                built, extends the Web to make authoring of Web resources as 
                easy as browsing them. Unlike CVS, which downloads files to a 
                local hard drive to retain compatibility with existing 
                applications, with WebDAV Web resources are edited directly on a 
                Web server. Applications must be modified to interact with the 
                Web server using the WebDAV protocol. Though WebDAV is still in 
                the early stages of adoption, Internet Explorer 5 and the Office 
                2000 suite of applications have already integrated WebDAV 
                support via a feature called "Web Folders," providing remote 
                authoring for Word, Excel, and PowerPoint documents directly on 
                the Web (see the box titled "Web 
                Folders and WebDAV"). Additionally, WebDAV Explorer provides 
                a file-system explorer interface for a WebDAV server. There are 
                many existing WebDAV servers, including the mod_dav module for 
                the Apache server, Microsoft Internet Information Server (IIS) 
                5, Glyphica PortalWare, Xythos Storage Server, DataChannel RIO, 
                Intraspect Knowledge Server, Digital Creations Zope, CyberTeams 
                WebSite Director lite, and the freely available WebRFM. The IBM 
                DAV4J server, available from AlphaWorks, also provides a Java 
                client API for WebDAV.
                WebDAV features are designed to accommodate existing tools, 
                making it straightforward to integrate WebDAV-based remote 
                authoring into them. WebDAV's namespace operations provide the 
                ability to create and list collections, and to copy and move Web 
                resources, thus supporting the needs of "File... Open" and 
                "File... Save" user-interface dialog boxes. Locking of entire 
                Web resources provides overwrite protection for all types of Web 
                resources (HTML pages, GIF images, word processing documents, 
                and source-code text files), and in fact, one of WebDAV's design 
                principles is to provide equal support for all Web-resource 
                types. WebDAV also provides support for storing and retrieving 
                metadata, in the form of attribute-value pairs called 
                properties, associated with a resource. The name of a WebDAV 
                property is a URL, used in this case as a property identifier, 
                not as a locator, and a property value is well-formed XML, 
                gaining XML's advantages for representing structured data and 
                for internationalizing string values.
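                As an illustration of properties, the sketch below builds the 
                XML body of a WebDAV PROPPATCH request that sets one property, 
                using the propertyupdate/set/prop structure from the WebDAV 
                specification (RFC 2518). The property namespace 
                http://example.com/props/ and the property name "author" are 
                made-up examples, and the code only constructs the body; it 
                sends nothing to a server.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"                      # namespace of WebDAV's own XML elements
ET.register_namespace("D", DAV_NS)   # serialize DAV: elements with a D: prefix


def proppatch_body(namespace, name, value):
    """Return the XML body for a PROPPATCH that sets a single property."""
    root = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_el = ET.SubElement(root, f"{{{DAV_NS}}}set")
    prop = ET.SubElement(set_el, f"{{{DAV_NS}}}prop")
    # The namespace URI plus the element name identify the property;
    # the URI is used as an identifier and is never fetched.
    custom = ET.SubElement(prop, f"{{{namespace}}}{name}")
    custom.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical property: namespace and name are illustrative only.
body = proppatch_body("http://example.com/props/", "author", "Jim Whitehead")
```

                A client would send this body in a PROPPATCH request to the 
                URL of the resource whose property is being set.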
                Early Web-authoring tools encountered the "lost update 
                problem," which occurs when two or more simultaneous authors of 
                a Web page clobber each other's work with successive saves to 
                the same URL without first merging their changes. Although HTTP 
                1.1 has support for detecting lost updates through unique 
                identifiers associated with the document state, no support is 
                provided for preventing lost updates in the first place. To 
                solve this problem, WebDAV uses long-duration, whole-resource 
                locking as its concurrency control mechanism. The WebDAV 
                protocol provides a write lock, but no read lock capability. On 
                the Web, by default a resource is readable, although it may be 
                protected by access control. Because HTTP doesn't require 
                that a Web browser obtain a lock in order to read a resource, 
                as traditional database locking does, retrofitting the Web 
                with read locks was neither feasible nor desirable. Web 
                servers implement the write operation PUT by saving the contents 
                of the resource in a temporary buffer until the entire new 
                resource has been transmitted, then using internal concurrency 
                control to block read access while the new value is quickly 
                updated. So the traditional database problem of reading a value 
                in an inconsistent state is avoided. Another traditional 
                database issue, deadlock, is also avoided with WebDAV locks. 
                Since locks are granted via a protocol request, with a given 
                request either granted or denied, there's no blocking, and hence 
                no possibility of deadlock.
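                The grant-or-deny behavior can be sketched with a small lock 
                table. This is an illustrative model, not a server 
                implementation: the LockTable class and its methods are 
                hypothetical, and lock timeouts and lock owners are left out. 
                The token format borrows the opaquelocktoken URI scheme that 
                the WebDAV specification defines for lock tokens.

```python
import uuid


class LockTable:
    """Toy write-lock table: requests are granted or denied, never queued."""

    def __init__(self):
        self._locks = {}  # URL -> lock token

    def lock(self, url):
        """Return a new lock token, or None if the resource is already locked."""
        if url in self._locks:
            return None   # denied immediately; the caller never waits
        token = f"opaquelocktoken:{uuid.uuid4()}"
        self._locks[url] = token
        return token

    def may_write(self, url, token=None):
        """A write is allowed on an unlocked resource or with the right token."""
        held = self._locks.get(url)
        return held is None or held == token

    def unlock(self, url, token):
        """Release the lock if the token matches; report whether it was released."""
        if self._locks.get(url) == token:
            del self._locks[url]
            return True
        return False
```

                Reads never consult the table, matching the absence of a read 
                lock, and because lock() returns at once in either case, no 
                client ever waits on another, so a cycle of waiting clients 
                cannot form.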
                WebDAV servers have used differing strategies to implement 
                the features in the protocol -- the major difference is the 
                underlying repository chosen by the server to store properties 
                and resources. Microsoft's IIS 5 server uses the Windows 2000 
                file system as its repository, and provides an extremely tight 
                integration between file system services and WebDAV services. 
                When a file is locked via WebDAV, it is also locked in the file 
                system, and hence a local user cannot clobber a file locked by a 
                remote user. IIS 5 also uses Windows 2000 user and 
                access-control lists to determine whether a WebDAV user has 
                access to a particular file; there is no separate Web 
                access-control mechanism used by IIS 5. In contrast, the mod_dav 
                Apache module also uses a file system repository, but requires 
                that the Apache server own all WebDAV-authorable files, thus 
                effectively preventing local access to the files. This avoids 
                the need to assume root privileges under UNIX to change the 
                ownership of files -- a security risk -- and lets mod_dav create 
                users that don't have local system accounts, only WebDAV 
                authoring privileges. Restricting local file access prevents 
                another potential problem: Since mod_dav stores properties in a 
                separate database, moving or deleting a file without telling 
                mod_dav results in "ghost" property entries for a resource that 
                no longer exists. 
                Other WebDAV servers store their information in databases 
                instead of the file system. The Glyphica PortalWare server has 
                created a content management system that sits on top of the 
                Versant object-oriented database system. All documents that are 
                submitted to PortalWare are indexed for full-text searching, and 
                have properties associated with them in the database. The Xythos 
                Storage Server uses a relational database for storage, instead 
                of an object-oriented one. The Xythos server uses standard SQL 
                via JDBC to interface with its database, which, combined with 
                the cross-platform support of databases like Oracle, Sybase, and 
                Informix, lets the Xythos server run cross-platform, and on a 
                variety of databases. Both servers gain several typical database 
                advantages, including transaction support that's useful in 
                implementing WebDAV methods, and good recovery from disasters 
                like power outages and disk failures.
                The Future: Web-Based Delta-V
                While WebDAV's remote-authoring features are useful for 
                performing remote collaborative authoring, they highlight the 
                need for versioning support to preserve the history of work. The 
                work on Delta-V is intended to fill this role, adding versioning 
                support to WebDAV. Work on Delta-V is ongoing, so details of the 
                protocol may change as the standardization work continues, but 
                there's increasing convergence on its features and benefits. 
                (Figure 1 provides a high-level architecture diagram showing several 
                applications using Delta-V.) 
                Work is progressing rapidly, driven by working group 
                participants with a deep background in SCM, document management, 
                software environments, and Web portal systems. These 
                participants come from the leading companies in these areas: 
                IBM, Microsoft, Novell, Rational, Merant, DataChannel, Object 
                Technology International, and Dynamic Diagrams, with university 
                participation from U.C. Irvine.
                The Delta-V protocol addresses several shortcomings in CVS. 
                The primary advantage of Delta-V is its tight integration with 
                the Web. Using CVS to manage a Web site requires understanding 
                how the file structure managed by CVS maps into URLs served by 
                HTTP, a difficult concept for many users. With Delta-V, Web 
                resources are edited in-place, at a specific URL, and no mapping 
                of filenames to URLs is necessary. Furthermore, the Web-native 
                Delta-V protocol can handle the different types of Web resources 
                better than a file-oriented system like CVS. By versioning Web 
                resources, Delta-V allows HTML links to old revisions of Web 
                pages, creating a sort of time machine for the Web. Linking to a 
                specific revision often can preserve the semantic meaning of a 
                link, such as when linking to a Web-log site that changes 
                frequently, where the linked-to information may be gone in a 
                week. If the site used Delta-V to version its content, these old 
                revisions would still be accessible.
                The Delta-V protocol has several unique features. Delta-V 
                assumes that most editing will take place directly on Web 
                resources, which differs from CVS in that there's no local 
                replica. Isolation from the changes of other team members is 
                provided by "workspaces," which provide each collaborator with 
                his or her own view on the resources being edited. Unlike the 
                local replicas that provide isolation in CVS, workspaces isolate 
                collaborators as they work on the remote Web server. Overwrite 
                conflicts are avoided because a resource can be checked out by 
                multiple people simultaneously, and each check out creates a 
                separate working resource. Each collaborator actively working on 
                a resource has a separate virtual working area, identified by 
                his or her workspace, and modifications are made first in a 
                workspace, then merged with the changes of other collaborators. 
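                The idea of per-collaborator working resources can be sketched 
                as below. Since the protocol is still being designed, this is 
                a conceptual toy, not Delta-V's actual methods: the class and 
                method names are invented, workspaces are plain strings, and 
                the merge step is omitted.

```python
class VersionedResource:
    """Toy model: one resource, many simultaneous check-outs."""

    def __init__(self, content):
        self.revisions = [content]  # immutable checked-in revisions
        self.working = {}           # workspace name -> working resource

    def check_out(self, workspace):
        """Create a separate working resource for this workspace."""
        self.working[workspace] = self.revisions[-1]

    def edit(self, workspace, content):
        """Change only the working resource that belongs to the workspace."""
        if workspace not in self.working:
            raise RuntimeError(f"{workspace} has not checked out the resource")
        self.working[workspace] = content

    def view(self, workspace):
        """Each workspace sees its own working resource, if it has one."""
        return self.working.get(workspace, self.revisions[-1])

    def check_in(self, workspace):
        """Turn the working resource into a new revision (merge not modeled)."""
        self.revisions.append(self.working.pop(workspace))
        return len(self.revisions)
```

                Two collaborators who check out the same resource each edit 
                their own working resource, so neither sees or overwrites the 
                other's unfinished changes.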
                
                Another drawback of CVS is its client/server protocol, which 
                is tightly coupled to CVS's repository. Unlike CVS, HTTP and 
                WebDAV have a proven track record of mapping to multiple types 
                of server back-end stores, such as databases, document 
                management systems, and file systems. Delta-V provides a 
                cross-platform integration layer, thus bringing the benefits of 
                remote Web collaboration support to a diverse set of existing 
                back-end repositories that do not currently provide Web 
                authoring or versioning support. Judging by the participants in 
                the working group, the Delta-V protocol will be mapped to SCM 
                systems, document management systems, and content management 
                systems, all of which employ a database to provide their 
                features. This makes the Delta-V protocol a more powerful data 
                integration technology than the CVS client/server protocol, 
                which maps only to the CVS repository.
                Delta-V provides versioning of collections, a feature not 
                supported by CVS. When a collection is versioned, collections 
                and their contents follow the check-out/edit/check-in model. 
                When a collection is checked in, its membership is frozen, and 
                can't be changed until the collection is checked out again. 
                Making a new file or deleting an existing file requires the 
                parent collection to be checked out. When all collections in a 
                project are versioned, it's possible to record permanently the 
                membership of each collection for each moment in time, thus 
                making configuration management support possible. Once both 
                collections and their contents are versioned, it's possible to 
                explicitly pick a single revision of each collection and file 
                (often the most recent revision), creating a snapshot of the 
                entire project. 
                CVS doesn't provide full versioned collection support, 
                leading to odd glitches. As an example, consider renaming a file 
                from A to B. Using CVS, this requires three steps: copying file 
                A's contents into the new location at B; using a cvs 
                add to put B into the CVS repository; and a cvs 
                remove to delete file A. If the collection containing B 
                were reverted to a previous state when A was present but B had 
                not yet been added, the collection would contain both A and B. 
                Since CVS doesn't store previous revisions of collections, it 
                doesn't know when B was added, and so can't revert the 
                collection correctly. Because Delta-V versions collections, it 
                can avoid this problem. Renaming the file in Delta-V would 
                involve checking out the collection to make it editable, moving 
                the file from A to B, and then checking in the collection. If 
                the collection is reverted to the original version, just before 
                the initial check out, it will contain A, but not B, and 
                similarly the following revisions will contain B, but not A. 
                Versioned collections thus provide the foundation for rigorous 
                configuration management. 
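                The rename example can be traced with a toy model of a 
                versioned collection. The class below is hypothetical and 
                tracks only member names, not file contents; it illustrates 
                the check-out/move/check-in sequence described above rather 
                than the protocol's actual requests.

```python
class VersionedCollection:
    """Toy model: each check-in freezes the collection's membership."""

    def __init__(self, members):
        self.revisions = [frozenset(members)]  # one frozen set per revision
        self._working = None                   # editable only while checked out

    def check_out(self):
        self._working = set(self.revisions[-1])

    def move(self, old, new):
        """Rename a member; allowed only while the collection is checked out."""
        if self._working is None:
            raise RuntimeError("check out the collection before changing it")
        self._working.remove(old)
        self._working.add(new)

    def check_in(self):
        self.revisions.append(frozenset(self._working))
        self._working = None
        return len(self.revisions)             # new revision number, 1-based

    def members_at(self, revision):
        """Return the membership recorded for a given revision."""
        return self.revisions[revision - 1]


docs = VersionedCollection({"A"})
docs.check_out()
docs.move("A", "B")
docs.check_in()
```

                Here revision 1 of the collection lists only A and revision 2 
                lists only B, so reverting to either one never yields a 
                collection that holds both names.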
                Since Delta-V assumes work will take place directly on a Web 
                server, rather than on a local replica, existing WebDAV editing 
                tools, like Office 2000, that are not versioning-aware need to 
                be accommodated. Delta-V can automatically record, as separate 
                revisions, changes to a document made by a versioning-unaware 
                client. Delta-V also divides its functionality into two layers: 
                a simple versioning layer, and a more complex SCM layer. Since 
                authoring clients (word processors, text editors, spreadsheets, 
                and so on) typically work on a single file at a time, they are 
                only expected to use the basic versioning layer to support a 
                check out/edit/check in style of work. The typical authoring 
                client is not expected to provide a user interface for 
                operations like creating and reverting configurations, since a 
                configuration spans an entire project, far greater than their 
                single-file editing scope. A separate SCM control panel 
                application will make use of the features in the SCM layer. This 
                control panel will operate at a collection and project level, 
                providing the capability to create a project configuration or 
                revert to a previous configuration. It will complement the 
                single-file focus of the authoring tools with project-wide 
                capabilities. A full-featured programming environment will be a 
                third class of Delta-V application, one that uses both the 
                versioning and configuration capabilities of Delta-V, providing 
                support for editing individual source-code files, as well as 
                project-level SCM support.
                Despite their differences, Delta-V and CVS have much to offer 
                each other. Though Delta-V has been designed for collaborators 
                to work directly on a Web server, it's technically feasible to 
                use the protocol to create local replicas, as in CVS. In fact, 
                though it has not been attempted, it appears to be possible to 
                replace the CVS client/server protocol with Delta-V, and an 
                existing WebDAV client called sitecopy provides a glimpse of how 
                this could be done. The sitecopy utility allows a local 
                file-system directory to be replicated to a remote WebDAV 
                server, so a Web site can be created locally using file-system 
                based authoring tools, then published remotely using the WebDAV 
                protocol. In its remote replication support, sitecopy is similar 
                to the CVS update operation. Though sitecopy and WebDAV don't 
                support versioning, it's not a far stretch to imagine adding 
                bidirectional synchronization, conflict flagging, and versioning 
                operations to sitecopy, thus creating a system that has many of 
                the capabilities of CVS. But why recreate the CVS user 
                interface? It's far better to integrate the Delta-V protocol 
                into CVS, retaining the benefits of CVS without having to 
                learn a new system. Since Delta-V can map to multiple back-end 
                repositories, Delta-V would allow the CVS style of work to be 
                used against multiple repositories, not just with CVS. 
                The Delta-V protocol opens up several intriguing 
                possibilities for building software systems. These possibilities 
                vary based on where the source code, compiler, and object files 
                are located -- on the remote Delta-V server or on the local 
                machine. If they're all on the local machine, then the build 
                process is very CVS-like, with source code replicated to the 
                local machine before the compiler begins operation, yielding 
                object files that reside locally. But if the source code, 
                compiler, and object files are held remotely, a client would 
                initiate a build by sending a build request to a remote compile 
                server, giving the URL of a makefile and a workspace, storing 
                the object files in the same version-controlled URL hierarchy as 
                the source code. In this scheme, a different compile server 
                could compile each platform variant. While the compiler wouldn't 
                typically be placed on the same machine as the Delta-V server -- 
                so compiles don't adversely affect server performance -- it 
                would be reasonable to place the compile server on the same 
                local storage area network as the Delta-V server. Many 
                interesting configurations are possible for build management 
                using Delta-V, undoubtedly an area where implementations will 
                innovate on different strategies.
                With a proven track record based on successful use on a wide 
                range of Open Source projects, CVS is a low-cost, high-value 
                system available today. Looking to the future, the Delta-V 
                protocol melds versioning and SCM with the Web, adding powerful 
                team collaborative work facilities, with the potential for a 
                value-adding integration with CVS. Whether you're looking at the 
                state of things today, or the promise of the future, the 
                implication of these two technologies is clear: It's easier than 
                ever before to assemble a virtual team for remote collaborative 
                project work.
                
                
                
                Jim is the Chair of the IETF WebDAV Working Group, and an 
                active participant in the Delta-V Working Group. He is also a 
                Ph.D. student in the Department of Information and Computer 
                Science at the University of California, Irvine. His professional 
                experience includes a position at Raytheon, where he designed 
                firmware in C and Ada for the German civilian air traffic 
                control system (DERD) and for a prototype Microwave Airplane 
                Landing System.