Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements.  See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License.  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Introduction
============

Apache MADlib is released as both a source tarball and a series of
binary convenience artifacts for Linux and Mac OS X operating systems.
The MADlib community takes great care to ensure that each release is
done in accordance with the ASF's release policy:

    http://www.apache.org/legal/release-policy.html

The latest state of the recommended MADlib release process can be found
on the MADlib wiki:

    https://cwiki.apache.org/confluence/display/MADLIB/Release+Process

In all this, MADlib looks like any other project developed at the
Apache Software Foundation. There is, however, one major difference
that anybody reviewing MADlib releases or considering consuming MADlib
downstream needs to be aware of: portions of the MADlib source code lack
the obligatory ASF licensing header information:

    http://www.apache.org/legal/release-policy.html#license-headers

This is intentional and simply reflects the original BSD license that
MADlib had (more on that later in the Historical Background section).
In fact, this was explicitly approved by the ASF's VP Legal:

    https://s.apache.org/EOT5
    https://issues.apache.org/jira/browse/LEGAL-293

It does, however, trip up human reviewers and tools such as the Apache
Release Audit Tool (RAT). For every release of MADlib, the community
itself and all downstream consumers (including external reviewers) have
to make sure that every NEW file added to the project carries the
proper licensing header. This may look like a daunting task at first,
but with the few tips summarized below it doesn't have to be.

Tips for reviewers and consumers of MADlib source code
======================================================

1. MADlib provides an exclusion list for the RAT tool in its pom.xml
   file. Running RAT via

       $ mvn apache-rat:check

   and inspecting RAT's report afterwards provides a good baseline for
   which source files don't need to have a license header.

2. A second level of validation is to see how this exclusion list
   differs between the previous official release of MADlib and the one
   under review. Running a simple diff or a git diff on the pom.xml
   file will provide all the details.

3. Finally, a third level of validation is to see what new code was
   added to the project. This is where you would use git, by running
   something along the lines of:

       $ git diff --stat rel/XXXX..HEAD

   where XXXX is the release tag of the official release immediately
   preceding the one being reviewed. Correlating the output of this
   command with the RAT exclusion list will provide a full
   understanding of where licensing headers belong and where they
   don't.

4. For the really paranoid, you could always compare ANY release of
   MADlib to the state of the source code base when it was imported
   into the ASF's repository by running:

       $ git diff --stat asf_import..HEAD

Historical Background
=====================

Prior to the software grant to the ASF on Sept 15, 2015 as an
incubating project, MADlib was an open-source library licensed under a
2-clause BSD license, with multiple contributors since its inception in
approximately 2011.

After the grant to the ASF, the MADlib community requested guidance
from ASF legal regarding how to manage license headers for legacy
BSD-licensed files, modified BSD-licensed files, and new files. The
intent of the request was to ensure that the Apache MADlib (incubating)
project was acting as a "good Apache citizen" and respecting the
guidelines of the ASF with respect to software licensing.

The ultimate resolution (articulated in LEGAL-293) came down to:

  * don't do anything with existing (BSD) files even if we edit them
  * every new file we create gets an ASF license header
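
The "new files get an ASF header" rule can be spot-checked mechanically
alongside the git-based tips above. What follows is a minimal sketch, not
part of MADlib's tooling: the function names, the marker string (taken
from the first line of the standard ASF header) and the 30-line scan
depth are all assumptions made for illustration.

```shell
# Hypothetical helpers (not shipped with MADlib) for spotting newly
# added files that lack an ASF license header.

# Assumption: the header is recognizable by its opening phrase.
ASF_MARKER='Licensed to the Apache Software Foundation'
# Assumption: a header, if present, starts within the first 30 lines.
HEADER_SCAN_LINES=30

# has_asf_header FILE
# Succeeds if the marker appears near the top of FILE.
has_asf_header() {
    head -n "$HEADER_SCAN_LINES" "$1" | grep -q "$ASF_MARKER"
}

# list_missing_headers
# Reads file paths from stdin, one per line, and prints those that
# exist as regular files but do not carry the marker.
list_missing_headers() {
    while IFS= read -r path; do
        # Skip paths that are not regular files in the working tree.
        [ -f "$path" ] || continue
        has_asf_header "$path" || printf '%s\n' "$path"
    done
}
```

With these definitions sourced into a shell at the repository root, the
files added since the previous release can be piped through the filter:

    $ git diff --name-only --diff-filter=A rel/XXXX..HEAD | list_missing_headers

Any path it prints still has to be checked by hand against the RAT
exclusion list in pom.xml, since some new files may legitimately be
excluded there.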