-------- Original Message --------
Subject: Re: Source code for Linux XFS now available!
Date: Fri, 31 Mar 2000 15:30:44 +1000
From: Ken McDonell <kenmcd@xxxxxxxxxxxxxxxxx>
To: "David S. Miller" <davem@xxxxxxxxxx>
CC: mostek@xxxxxxx, slinx-xfs@xxxxxxxxxxxxxxxxxxxx,
slinx@xxxxxxxxxxxxxxxxxxxx,stg-fs@xxxxxxxxxxxxxxxxxxxx,
cxfs_biz@xxxxxxxxxxxxxxxxxxxx
On Thu, 30 Mar 2000, David S. Miller wrote:
> ...
>
> One would think that with a year or so of internal work this sort of
> stuff would have been cleaned up already. If you guys had released
> this a year ago things would be much further along than it is right
> now, I think that keeping it internal for so long was the biggest
> mistake SGI made about XFS on Linux. ...
>
> But that's SGI politics for you.
Dan has already pointed out the encumbrance issues, and there have
been follow ups from David and Steve.
Let me just add what is hopefully the last word on the topic ...
I've attached the timeline for the encumbrance review. The whole story
of why it took so long would require Jim, Russell, Steve, et al to
splice in the technical milestones from the process of ripping XFS out
of IRIX and stuffing it into Linux.
Also attached is the draft text from a paper we've submitted to USENIX
that describes the encumbrance review process at a high level.
Hopefully, this will convey the message that this was not only a joust
with the lawyers but a large and complex undertaking based on the sheer
size of the code bodies that had to be reviewed.
--------------
XFS Encumbrance Review Timeline
Date            Event
14 May 1999     First suggestion that encumbrance review be sent to
                engineering group in Melbourne.
20 May 1999     SGI announces intention to contribute XFS to the open
                source community. Press release and announcement at
                Linux Expo, Raleigh.
7 June 1999     Melbourne team signs up for encumbrance review. Initial
                communication on this topic with first of the SGI
                lawyers.
30 June 1999    Initial attempts to contact those still at SGI who were
                involved with initial XFS development and subsequent XFS
                maintenance and evolution. Contact with Kirk McKusick,
                re. BSD4.4-lite experiences. Located primitive file
                comparison tools from earlier open source project within
                SGI.
23 Jul 1999     Relevant licenses and source code bases identified.
4 Aug 1999      Initiate contact with XFS developers no longer at SGI.
9 Aug 1999      Source of xfs log routines (5 files) and XFS licensing
                terms and conditions (GPL) published on oss.sgi.com and
                announced at LinuxWorld in San Jose.
16-20 Aug 1999  Encumbrance review team meets current engineering
                developers for XFS architecture brain dump. Serious
                negotiations with second SGI lawyer, but still no
                agreement on process.
25 Aug 1999     Start negotiations with third SGI lawyer on formal
                process and statement of work. Initiate discussions
                with SCO.
30 Sep 1999     Abandon discussions with SCO.
4 Oct 1999      Legal and engineering resume dialog on statement of work
                and encumbrance review process description.
20 Oct 1999     Completed first version of the required production
                strength encumbrance tools to process large numbers of
                source files.
26 Oct 1999     First contact with fourth SGI lawyer.
12 Dec 1999     Agreement between legal and engineering on statement of
                work and encumbrance review process description.
31 Jan 2000     A further 103 source files made available on oss.sgi.com.
21 Feb 2000     77 more source files published on oss.sgi.com.
28 Feb 2000     First pass encumbrance review of all source files
                completed. A small number of files require additional
                review and/or relief.
29 Mar 2000     Encumbrance team finishes review and relief work. The
                remaining 221 source files are cleared for release under
                the GPL.
30 Mar 2000     CVS tree with complete buildable source for XFS made
                available on oss.sgi.com.
--------------
[draft text from USENIX conference paper in preparation]
Encumbrance Review
For XFS to be a viable alternative filesystem for the open source
community, it was deemed essential that XFS be released with a license
at least compatible with the GNU General Public License (GPL).
The IRIX operating system in which XFS was originally developed has
evolved over a long period of time, and includes assorted code bases
with a variety of associated third party license agreements. For the
most part these agreements are in conflict with the terms and
conditions of the GPL.
The initial XFS project was an SGI initiative that started with a new
top-to-bottom filesystem design, rather than extending an existing
filesystem. Based upon the assertions of the original developers and
the unique features of XFS, there was a priori a low probability of
overlap between the XFS code and the portions of IRIX to which
third party licenses might apply. However, it was still necessary to
establish that the XFS source code to be open sourced was free of all
encumbrances, including any associated with terms and conditions of
third party licenses applying to parts of IRIX.
SGI's objectives were:
+ to ensure the absence of any intellectual property infringements;
+ to establish the likely derivation history to ensure the absence
of any code subject to third party terms and conditions.
This was a major undertaking, as the initial release of buildable XFS
open source contained some 400 files and 199,000 lines of source. The
process was long, but relatively straightforward, and encumbrance relief
was usually by removal of code.
The encumbrance review was a combined effort for SGI's Legal and
Engineering organizations. The comments here will be confined to the
technical issues and techniques used by the engineers.
Encumbrance Review Process
We were faced with making comparisons across several large code bases,
and in particular at least UNIX System V Release 4.2-MP, BSD4.3 NET/2,
BSD4.4-lite and the open source version of XFS.
1. Historical survey. We contacted as many as possible of the original
XFS developers and subsequent significant maintainers, and asked a
series of questions. This information was most useful as guideposts
or to corroborate conclusions from the other parts of the review.
2. Keyword search (all case insensitive). In each of the non-XFS code
bases, search for keywords associated with unique XFS concepts or
technologies, e.g. journal, transaction, etc. In the XFS code base,
search for keywords associated with ownership, concepts and
technologies in the non-XFS code bases, e.g. at&t, berkeley, etc.
3. Literal copy check. Using a specially built tool, compare every line
of each XFS source file against all of the source in the non-XFS
code bases. The comparison ignores white space, and some commonly
occurring strings are filtered out, e.g. matching "i++;" is never
going to be helpful.
4. Symbol matching. Tools were developed to post-process the ASCII
format databases from cscope to generate lists of symbols and their
associated generic type (function, global identifier, macro, struct,
union, enum, struct/union/enum member, typedef, etc.). In each XFS
source file the symbols were extracted and compared against all
symbols found in all the non-XFS code bases. A match occurred when
the same symbol name and type was found in two different source files.
Some post-processing of the symbols was done to include plausible
name transformations, e.g. adding an "xfs_" prefix, or removal of
all underscores, etc.
5. Prototype matching. Starting with a variant of the mkproto tool,
the source code was scanned to extract ANSI C prototypes. Based on
some equivalence classes, "similar" types were mapped to a smaller
number of base types, and then the prototypes compared. A match
occurred when the type of the function and the number and type of
the arguments agreed.
6. Similarity of function, design, concept or implementation. This
process is based upon an understanding, and a study, of the source
code. In the XFS code, for each source file, or feature implemented
in a source file, or group of source files implementing a feature,
it was necessary to conduct a review of the implementation of any
similar source file or feature in each of the non-XFS code bases.
The objective of this review is to determine if an issue of
potential encumbrance arises as a consequence of similarity in the
function, implementation with respect to algorithms, source code
structure, etc.
7. Evidence of license agreements. The XFS code was examined
(especially in comments) to identify any references to relevant
copyrights or license agreements.
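
To make step 3 concrete, here is a minimal sketch of a literal copy
check in Python. It is not the tool used in the review; the TRIVIAL
filter list, the function names and the choice of Python are all
assumptions made for illustration. It strips white space from every
line, drops lines too common to be meaningful, and reports XFS lines
that also occur somewhere in the reference code.

```python
import re
from collections import defaultdict

# Hypothetical filter list: lines too common to count as evidence.
TRIVIAL = {"", "{", "}", "i++;", "break;", "return;", "else", "}else{"}

def normalize(line):
    """Remove all white space so formatting differences are ignored."""
    return re.sub(r"\s+", "", line)

def build_index(reference_files):
    """Map each normalized, non-trivial line to the places it occurs.

    reference_files: dict of file name -> source text (non-XFS code).
    """
    index = defaultdict(list)
    for name, text in reference_files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            key = normalize(line)
            if key not in TRIVIAL:
                index[key].append((name, lineno))
    return index

def literal_matches(xfs_text, index):
    """Return (XFS line number, line, reference locations) per hit."""
    hits = []
    for lineno, line in enumerate(xfs_text.splitlines(), start=1):
        key = normalize(line)
        if key in index:  # trivial lines were never indexed
            hits.append((lineno, line.strip(), index[key]))
    return hits
```

Every hit is only a possible match; identical one-line statements are
common in independent C code, so each one still needs to be read in
context.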
In all of the steps above, the outcome was a list of _possible_ matches.
For each match, it was necessary to establish in the context of the
matches (in one or more files), if there was a real encumbrance issue.
A modified version of the tkdiff tool was used to allow the areas of the
"match" to be graphically highlighted without the visual confusion of
all of the minutiae of the line-by-line differences. However, the
classification of the matches was ultimately a manual process, based on
the professional and technical skills of the engineers.
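
As an illustration of how one of these candidate lists might be
produced, the following Python sketch approximates the symbol matching
of step 4. It is an assumption-laden sketch, not the review's actual
tooling: it takes symbols already extracted as (name, kind, file)
tuples, and the two name transformations shown (adding an "xfs_" prefix
and removing underscores) are just the examples mentioned above. All
symbol and file names in the usage example are hypothetical.

```python
from collections import defaultdict

def squash(name):
    """Remove all underscores from a symbol name."""
    return name.replace("_", "")

def match_symbols(xfs_symbols, other_symbols, prefix="xfs_"):
    """Pair XFS symbols with non-XFS symbols of the same name and kind.

    Each symbol is a (name, kind, file) tuple, assumed to have been
    extracted beforehand (for example from a cross-reference database).
    """
    index = defaultdict(set)
    for name, kind, path in other_symbols:
        # Index the plain name and the prefixed name,
        # each with and without underscores.
        for form in (name, prefix + name):
            index[(form, kind)].add((name, path))
            index[(squash(form), kind)].add((name, path))

    matches = set()
    for name, kind, path in xfs_symbols:
        for form in (name, squash(name)):
            for other_name, other_path in index.get((form, kind), ()):
                matches.add((name, path, other_name, other_path, kind))
    return sorted(matches)

# Hypothetical input: "bmap" exists as both a function and a macro.
other = [("bmap", "function", "ufs_bmap.c"), ("bmap", "macro", "param.h")]
xfs = [("xfs_bmap", "function", "xfs_bmap.c")]
for m in match_symbols(xfs, other):
    print(m)  # only the function pairing is reported
```

The kind is part of the lookup key, so a function named like a macro
elsewhere is not reported; this keeps the list handed to the engineers
shorter without hiding same-type collisions.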
Encumbrance Relief
Especially in view of the size of the XFS source, a very small number
of real encumbrance issues were identified.
In all cases the relief was relatively straightforward, with removal
of code required for IRIX, but not for Linux, being the most common
technique.