Thursday, October 17, 2019

A Tool for Jakarta EE Package Renaming in Binaries

In a previous post, I laid out my thinking on how to approach the package renaming problem which the Jakarta EE community now faces. Regardless of whether the community chooses big bang or incremental, there are still existing artifacts in the world using the Java EE package names that the community will need to use together with the new Jakarta EE package names.

Tools are always important to take the drudgery away from developers. So I have put together a tool prototype which can be used to transform binaries such as individual class files and complete JARs and WARs to rename uses of the Java EE package names to their new Jakarta EE package names.

The tool is rule driven, which is nice since the Jakarta EE community still needs to define the actual package renames for Jakarta EE 9. The rules also allow users to control which class files in a JAR/WAR are transformed. Different users may want different rules depending upon their specific needs. And the tool can be used for any package renaming challenge, not just the specific Jakarta EE package renames.
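To make the rule-driven idea concrete, here is a minimal sketch of applying prefix rename rules to a dotted class or package name. The specific renames shown (javax.servlet to jakarta.servlet and so on) are purely illustrative since the actual Jakarta EE renames are not yet defined, and this is not the tool's actual rule format or API.

import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a tiny prefix-based rename rule engine.
public class RenameRules {
    // Maps an old package prefix to a new one; the longest matching prefix wins.
    private final Map<String, String> rules = new LinkedHashMap<>();

    public RenameRules addRule(String oldPrefix, String newPrefix) {
        rules.put(oldPrefix, newPrefix);
        return this;
    }

    // Apply the longest matching rule to a dotted class or package name.
    public String apply(String name) {
        String match = null;
        for (String oldPrefix : rules.keySet()) {
            if ((name.equals(oldPrefix) || name.startsWith(oldPrefix + "."))
                    && (match == null || oldPrefix.length() > match.length())) {
                match = oldPrefix;
            }
        }
        return (match == null) ? name : rules.get(match) + name.substring(match.length());
    }

    public static void main(String[] args) {
        RenameRules rules = new RenameRules()
            .addRule("javax.servlet", "jakarta.servlet")           // hypothetical rename
            .addRule("javax.persistence", "jakarta.persistence");  // hypothetical rename
        System.out.println(rules.apply("javax.servlet.http.HttpServlet"));
        // prints jakarta.servlet.http.HttpServlet
        System.out.println(rules.apply("javax.sql.DataSource"));
        // prints javax.sql.DataSource since no rule matches
    }
}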

The tool provides an API allowing it to be embedded in a runtime to dynamically transform class files during the class loader definition process. The API also supports transforming JAR files. A CLI is also provided for use from the command line. Ultimately, the tool can be packaged as Gradle and Maven plugins to incorporate into a broader tool chain.
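To illustrate the runtime embedding idea (this is not the tool's actual API), here is a sketch that hooks a renaming step into class definition using the standard java.lang.instrument agent API; the PackageRenamer nested class is a hypothetical placeholder for the real rule-driven byte code rewriting.

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Sketch only: wire a (hypothetical) rule-driven transformer into class
// definition via a Java agent.
public class RenameAgent {
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                    Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                    byte[] classfileBuffer) {
                // Delegate to the hypothetical rewriter. Returning null tells
                // the JVM to keep the original class bytes unchanged.
                return PackageRenamer.apply(classfileBuffer);
            }
        });
    }

    // Placeholder for the actual rule-driven rewriting of package references.
    static class PackageRenamer {
        static byte[] apply(byte[] classfileBuffer) {
            return null; // no-op in this sketch
        }
    }
}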

Given that the tool is a prototype, and there is much work to be done in the Jakarta EE community regarding the package renames, I have started a list of TODOs in the project's issues for known work items.

Please try out the tool and let me know what you think. I am hoping that tooling such as this will ease the community cost of dealing with the package renames in Jakarta EE.

PS. Package renaming in source code is also something the community will need to deal with. But most IDEs are pretty good at this sort of thing, so I think there is probably sufficient tooling in existence for handling the package renames in source code.

Friday, May 17, 2019

I am an Incrementalist: Jakarta EE and package renaming


Eclipse Jakarta EE has been placed in the position that it may not evolve the enterprise APIs under their existing package names. That is, the package names starting with java or javax. See Update on Jakarta EE Rights to Java Trademarks for the background on how we arrived at this state.

So this means that after Jakarta EE 8 (which is API identical to Java EE 8 from which it descends), whenever an API in Jakarta EE is to be updated for a new specification version, the package names used by the API must be renamed away from java or javax. (Note: some other things will also need to be renamed such as system property names, property file names, and XML schema namespaces if those things start with java or javax. For example, the property file META-INF/services/javax.persistence.PersistenceProvider.) But this also means that if an API does not need to be changed, then it is free to remain in its current package names. Only a change to the signature of a package, that is, adding or removing types in the package or adding or removing members in the existing types in the package, will require a name change to the package.

There has been much discussion on the Jakarta EE mail lists and in blogs about what to do given the above constraint and David Blevins has kindly summed up the two main choices being discussed by the Jakarta EE Specification Committee: https://www.eclipse.org/lists/jakartaee-platform-dev/msg00029.html.

In a nutshell, the two main choices are (1) “Big Bang” and (2) Incremental. Big Bang says: Let’s rename all the packages in all the Jakarta EE specifications all at once for the Jakarta EE release after Jakarta EE 8. Incremental says: Let’s rename packages only when necessary such as when, in the normal course of specification innovation, a Jakarta EE specification project wants to update its API.

I would like to argue that Jakarta EE should choose the Incremental option.

Big Bang has no technical value and large, up-front community costs.

The names of the packages are of little technical value in and of themselves. They just need to be unique and descriptive to programmers. In source code, developers almost never see the package names. They are generally in import statements at the top of the source file and most IDEs kindly collapse the view of the import statements so they are not “in the way” of the developer. So, a developer will generally not really know or care if the Jakarta EE API being used in the source code is a mix of package names starting with java or javax, unchanged since Jakarta EE 8, and updated API with package names starting with jakarta. That is, there is little mental cost to such a mixture. The Jakarta EE 8 API are already spread across many, many package names and developers can easily deal with this. That some will start with java or javax and some with jakarta is largely irrelevant to a developer. The developer mostly works with type and member names which are not subject to the package rename problem.
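For example, assuming purely for illustration that an updated JSON-P moves to a jakarta.json package while the servlet API stays on its javax.servlet names, a source file mixing the two reads no differently to the developer:

// Hypothetical mixture: one API unchanged since Jakarta EE 8, one renamed in a
// later update. The IDE collapses these imports anyway.
import javax.servlet.http.HttpServletRequest;   // unchanged API, old package name
import jakarta.json.Json;                       // updated API, new package name (illustrative)
import jakarta.json.JsonObject;

public class MixedImports {
    public JsonObject describe(HttpServletRequest request) {
        // The developer works with type and member names; the package prefix
        // of each type is largely invisible at this point in the code.
        return Json.createObjectBuilder()
            .add("method", request.getMethod())
            .add("uri", request.getRequestURI())
            .build();
    }
}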

But once source code is compiled into class files, packaged into artifacts, and distributed to repositories, the package names are baked into the artifacts and play an important role in interoperation between artifacts: binary compatibility. Modern Java applications generally include many 3rd party open source artifacts from public repositories such as Maven Central and there are many such artifacts in Maven Central which use the current package names. If Jakarta EE 9 were to rename all packages, then the corpus of existing artifacts would no longer be usable in Jakarta EE 9 and later. At least not without some technical “magic” in builds, deployments, and/or runtimes to attempt to rename package references on-the-fly. Such magic may be incomplete, will break jar signatures, and will complicate builds and tool chains. It will not be transparent.

Jakarta EE must minimize the inflection point/blast radius on the Java community caused by the undesired constraint to rename packages if they are changed. The larger the inflection point, the more reason you give to developers to consider alternatives to Jakarta EE and to Java in general. The Incremental approach minimizes the inflection point, providing an evolutionary approach to the package naming changes rather than the revolutionary approach of the Big Bang.

Some Jakarta EE specifications may never be updated. They have long been stable in the Java EE world and will likely remain so in Jakarta EE. So why rename their packages? The Big Bang proposal even recognizes this by indicating that some specifications will be “frozen” in their current package names. But, of course, there is the possibility that one day, Jakarta EE will want to update a frozen specification. And then the package names will need to be changed. The Incremental approach applies this thinking to all Jakarta EE specifications: only rename packages when absolutely necessary to minimize the impact on the Java community.

Renaming packages incrementally, as needed, does not reduce the freedom of action for Jakarta EE to innovate. It is just a necessary part of the first innovation of a Jakarta EE specification.

A Big Bang approach does not remove the need to run existing applications on earlier platform versions. It increases the burden on customers since they must update all parts of their application for the complete package renaming when they need to access a new innovation in a single updated Jakarta EE specification, even when none of the other Jakarta EE specifications they use have any new innovations. Just package renames for no technical reason. It also puts a large burden on all application server vendors. Rather than updating parts of their implementations to support the package name changes of a Jakarta EE specification when that specification is updated for some new innovation, they must spend a lot of resources to support both old and new package names for the implementations of all Jakarta EE specifications.

There are some arguments in favor of a Big Bang approach. It “gets the job done” once and for all, and for new specifications and implementations the old java or javax package names will fade from collective memory. In addition, the requirement to use a certified Java SE implementation licensed by Oracle to claim compliance with Eclipse Jakarta EE evaporates once there are no longer any java or javax package names in a Jakarta EE specification. However, these arguments do not seem sufficient motivation to disrupt the ability of all existing applications to run on a future Jakarta EE 9 platform.

In general, lazy evaluation is a good strategy in programming. Don’t do a thing until the thing needs to be done. We should apply that strategy in Jakarta EE to package renaming and take the Incremental approach. Finally, I am reminded of Æsop’s fable, The Tortoise & the Hare. “The race is not always to the swift.”

Friday, April 04, 2014

Java 8, bnd and references to compile-time constants

Java 8 was recently released and I wanted to test it out with the OSGi build. I installed JDK 8 on my Mac, pointed JAVA_HOME at it and started a clean build of OSGi. After the build completed, I ran the Core Compliance Tests to verify the build. Unfortunately several of the tests were now failing.

In order to diagnose the issue, I compared it to the same build built with JDK 7. The JDK 7 build passed all the tests running under JDK 7 or JDK 8. The JDK 8 build failed the same tests running under JDK 7 or JDK 8. So the issue was in building with JDK 8, not running under JDK 8. The failure was that one of the test bundles was failing to resolve. This was caused by an import for a package that was not exported by any bundle. When building with JDK 8, bnd added a package to the test bundle's Import-Package statement that is not present when building with JDK 7.

Digging further into the test bundle, the only reference to that package was to a static final String constant. During compilation, javac copies the referenced string into the compiled class since the final String field is a compile-time constant, and thus the field is not accessed at runtime.

So in the example code:
public class Referencer {
    public static void main(String[] args) {
        // The value of Constant.hello is a compile-time constant, so javac
        // inlines the string here; Constant is not accessed at runtime.
        System.out.println(Constant.hello);
    }
}

class Constant {
    static final String hello = "Hello, World!";
}
the class file for Referencer has its own copy of the string "Hello, World!" and does not access Constant at runtime to obtain the string to print it out. In fact, the Referencer class file compiled by JDK 7 has no references to Constant.

But when compiling with JDK 8, javac adds a constant pool entry for the class holding the referenced constant even though this class is not referenced at runtime by the compiled class. So for the example above, the Referencer class file now has a constant pool entry for the Constant class.

During the building of a bundle, bnd analyzes the class files in the bundle to find class references to generate the necessary Import-Package statement.  So the presence of this new constant pool entry for the constant holding class caused bnd to add the package of that class to the Import-Package statement. bnd assumed that all class entries in the constant pool were runtime references.

An email conversation with Alex Buckley, JLS spec lead, confirmed that this new behavior for javac in JDK 8 is intentional. javac is now adding compile-time dependencies to the constant pool to support compile-time dependency analysis using class files.

This means that bnd's assumption that all class entries in the constant pool are runtime references is no longer valid for classes compiled by JDK 8. So Peter Kriens is making fixes to bnd for the 2.3 release to do deeper analysis so that only runtime references to classes result in their packages being added to the Import-Package statement. Compile-time-only dependencies won't result in their packages being added. So stay tuned for bnd 2.3 if you plan on using JDK 8 to build your bundles.

This also means that anyone doing bytecode analysis for runtime dependencies of class files needs to be aware that the constant pool can now also contain compile-time only dependencies.
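To see this for yourself, here is a minimal, self-contained sketch (not bnd's actual analysis code) that lists the CONSTANT_Class entries in a class file's constant pool. Running it against Referencer.class compiled by JDK 7 and then by JDK 8 should show the extra entry for Constant in the JDK 8 output.

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

// Sketch only: print the CONSTANT_Class entries in a class file so the
// compile-time-only reference added by JDK 8 becomes visible.
// Usage: java ListClassConstants Referencer.class
public class ListClassConstants {
    public static void main(String[] args) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            in.readInt();                        // magic 0xCAFEBABE
            in.readUnsignedShort();              // minor version
            in.readUnsignedShort();              // major version
            int count = in.readUnsignedShort();  // constant_pool_count
            String[] utf8 = new String[count];
            int[] classNameIndex = new int[count];
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                case 1:  utf8[i] = in.readUTF(); break;                        // Utf8 (u2 length + bytes)
                case 7:  classNameIndex[i] = in.readUnsignedShort(); break;    // Class
                case 8: case 16: case 19: case 20:
                         in.readUnsignedShort(); break;                        // String, MethodType, Module, Package
                case 15: in.readUnsignedByte(); in.readUnsignedShort(); break; // MethodHandle
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                         in.readInt(); break;                                  // 4-byte entries
                case 5: case 6:
                         in.readLong(); i++; break;                            // Long/Double use two slots
                default: throw new IOException("Unknown constant pool tag " + tag);
                }
            }
            for (int i = 1; i < count; i++) {
                if (classNameIndex[i] != 0) {
                    System.out.println("CONSTANT_Class: " + utf8[classNameIndex[i]]);
                }
            }
        }
    }
}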

Monday, October 14, 2013

API Design Practices That Work Well With OSGi

Introduction

This post describes some API design practices that should be applied when designing Java API to ensure the API can be used properly in an OSGi environment. Some of the practices are prescriptive and some are proscriptive. And, of course, other good API design practices also apply.

The OSGi environment provides a modular runtime using the Java class loader concept to enforce type visibility encapsulation. Each module will have its own class loader which will be wired to the class loaders of other modules to share exported packages and consume imported packages.

A package can contain an API. There are two roles of clients for these API packages: API consumers and API providers. API consumers use the API which is implemented by an API provider.

In the following design practices, we are discussing the public portions of a package. The members and types of a package which are not public or protected (that is, private or default accessible) are not visible outside of the package and are implementation details of the package. 

Packages must be a cohesive, stable unit

A Java package must be designed to ensure that it is a cohesive and stable unit. In OSGi, the package is the shared entity between modules. One module may export a package that another module can import. Because the package is the unit of sharing between modules, a package must be cohesive in that all the types in the package must be related to the specific purpose of the package. Grab bag packages like java.util are discouraged because the types in such a package often have no relation to each other. Such non-cohesive packages can result in lots of dependencies, as the unrelated parts of the package reference other unrelated packages, and changes to one aspect of the package impact all modules that depend on the package even though a module may not actually use the part of the package which was modified.

Since the package is the unit of sharing, its contents must be well known and the contained API must only change in compatible ways as the package evolves in future versions. This means a package must not support API supersets or subsets; for example, see javax.transaction as a package whose contents are very unstable. The user of a package must be able to know what types are available in the package. This also means that packages should be delivered by a single entity (for example, a jar file) and not split across multiple entities since the user of the package must know that the entire package is present.

Finally, the package must evolve in a compatible way over future versions. So a package should be versioned and its version number must evolve according to the rules for semantic versioning.
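For example, with bnd-based builds the package version can be recorded directly in package-info.java. The annotation shown here is an assumption (it presumes the OSGi versioning annotations, or bnd's equivalent, are on the build path), and the package name is hypothetical:

// package-info.java for the hypothetical com.example.order package.
// Per semantic versioning: a bug fix bumps the micro version, a compatible
// API addition bumps the minor version, and a breaking change bumps the major.
@org.osgi.annotation.versioning.Version("1.2.0")
package com.example.order;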

Minimize package coupling

The types in a package can refer to the types in other packages. For example, the parameter types and return type of a method and the type of a field. This inter-package coupling creates what are called uses constraints on the package. This means that an API consumer must use the same referenced packages as the API provider in order for them to both understand the referenced types.

In general, we want to minimize this package coupling to minimize the uses constraints on a package. This simplifies wiring resolution in the OSGi environment and minimizes dependency fan-out, simplifying deployment.
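As a small sketch of how the coupling arises (all package and type names here are hypothetical), every foreign type appearing in the signatures of an API package becomes a uses constraint on that package:

// com/example/money/Amount.java, a second, hypothetical package
package com.example.money;

public final class Amount {
    public final String currencyCode;
    public final long cents;

    public Amount(String currencyCode, long cents) {
        this.currencyCode = currencyCode;
        this.cents = cents;
    }
}

// com/example/payment/PaymentService.java, the hypothetical API package
package com.example.payment;

import com.example.money.Amount;

// Because Amount appears in a method signature, com.example.money becomes a
// uses constraint on this package, roughly:
//   Export-Package: com.example.payment; uses:="com.example.money"
// API consumers and the API provider must then resolve to the same
// com.example.money package. Fewer foreign types in signatures means fewer
// uses constraints, simpler resolution and simpler deployment.
public interface PaymentService {
    String charge(Amount amount, String accountId); // returns a receipt id
}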

Interfaces preferred over classes

For an API, interfaces are preferred over classes. This is a fairly common API design practice that is also important for OSGi. The use of interfaces allows implementation freedom as well as multiple implementations. Interfaces are important to decouple the API consumer from the API provider. They allow a package containing the API interfaces to be used by both the API provider who implements the interfaces and the API consumer who calls methods on the interfaces. In this way, API consumers have no direct dependencies on an API provider. They both only depend upon the API package.

Abstract classes are sometimes a valid design choice instead of interfaces, but generally interfaces are the first choice.

Finally, an API will often need a number of small concrete classes such as event types and exception types. This is fine, but these types should generally be immutable and not intended for subclassing by API consumers.
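A compact sketch of the resulting shape of an API package (hypothetical names): consumer and provider share only the interface plus a small immutable value type.

// com/example/weather/WeatherService.java
package com.example.weather;

// API consumers call this interface; API providers implement it. Both depend
// only on the API package, not on each other.
public interface WeatherService {
    WeatherReport currentReport(String city);
}

// com/example/weather/WeatherReport.java
package com.example.weather;

// Small immutable concrete class carried by the API; not intended for
// subclassing by API consumers.
public final class WeatherReport {
    private final String city;
    private final double celsius;

    public WeatherReport(String city, double celsius) {
        this.city = city;
        this.celsius = celsius;
    }

    public String getCity() { return city; }
    public double getCelsius() { return celsius; }
}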

Avoid statics

Statics should be avoided in an API. Types should not have static members. Static factories should be avoided. Instance creation should be decoupled from the API. For example, API consumers should receive object instances of API types through dependency injection or an object registry like the OSGi service registry.

The avoidance of statics is also good practice for making testable API since statics cannot be easily mocked.

Singletons

Sometimes there are singleton objects in an API design. However, access to the singleton object should not be through statics such as a static getInstance method or a static field. When a singleton object is necessary, the object should be defined by the API as a singleton and provided to API consumers through dependency injection or an object registry as mentioned above.
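A sketch of the difference, with hypothetical names; the consumer receives the (possibly singleton) instance instead of reaching for a static accessor, which also keeps the type easy to mock in tests:

// Discouraged: the API bakes lookup into a static accessor.
//   Registry registry = Registry.getInstance();

// Preferred: the consumer is handed the instance by a dependency injection
// container or an object registry such as the OSGi service registry, and
// depends only on the API type.
interface Registry {                   // API type; how many instances exist
    void record(String orderId);       // is the provider's concern, not the API's
}

public class OrderProcessor {
    private final Registry registry;

    public OrderProcessor(Registry registry) {  // constructor injection
        this.registry = registry;
    }

    public void process(String orderId) {
        registry.record(orderId);
    }
}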

Avoid class loader assumptions

APIs often have extensibility mechanisms where the API consumer can supply the name of a class the API provider must load. The API provider must then use Class.forName (possibly using the thread context class loader) to load the class. This sort of mechanism assumes class visibility from the API provider (or thread context class loader) to the API consumer. API designs must avoid class loader assumptions. One of the main points of modularity is type encapsulation. One module (for example, API provider) must not have visibility to the implementation details of another module (for example, API consumer).

API designs must avoid passing class names between the API consumer and API provider and must avoid assumptions regarding the class loader hierarchy and type visibility. To provide an extensibility model, an API design should have the API consumer pass class objects, or better yet, instance objects to the API provider. This can be done through a method in the API or through an object registry such as the OSGi service registry. See the whiteboard pattern.
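Here is a sketch of the whiteboard style using the real OSGi BundleContext and BundleActivator types; the ReportFormatter extension interface and its upper-casing implementation are hypothetical. The API consumer contributes an instance to the service registry rather than handing a class name to the API provider to load:

import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;
import org.osgi.framework.ServiceRegistration;

// Hypothetical extension point defined by the API.
interface ReportFormatter {
    String format(String report);
}

// The API consumer registers an instance (whiteboard pattern) instead of
// passing a class name for the provider to load with Class.forName. The API
// provider simply tracks ReportFormatter services and uses whatever appears.
public class UpperCaseFormatterActivator implements BundleActivator {
    private ServiceRegistration<ReportFormatter> registration;

    public void start(BundleContext context) {
        registration = context.registerService(ReportFormatter.class,
            new ReportFormatter() {
                public String format(String report) {
                    return report.toUpperCase();
                }
            }, null);
    }

    public void stop(BundleContext context) {
        registration.unregister();
    }
}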

The java.util.ServiceLoader class also suffers from class loader assumptions in that it assumes all the providers are visible from the thread context class loader or the supplied class loader. This assumption is generally not true in a modular environment.

Don't assume permanence

Many API designs assume only a construction phase where objects are instantiated and added to the API but ignore the destruction phase which can happen in a dynamic system. API designs should consider that objects can come and go. For example, most listener APIs allow listeners to be added and removed. But many API designs only assume objects are added and never removed. For example, many dependency injection systems have no means to withdraw an injected object.

In a modular system, modules can be added and removed, so an API design that can accommodate such dynamics is important. The OSGi Declarative Services specification defines a dependency injection model for OSGi which supports these dynamics including the withdrawal of injected objects.
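A minimal sketch, with hypothetical names, of an API that plans for withdrawal as well as registration:

// The API allows contributed objects to be withdrawn, not only added, so the
// provider can release them when the contributing module goes away.
public interface HealthCheckRegistry {
    void addCheck(HealthCheck check);
    void removeCheck(HealthCheck check);   // the destruction phase matters too
}

interface HealthCheck {
    boolean isHealthy();
}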

Clearly document type roles for API consumers and API providers

As mentioned in the introduction, there are two roles for clients of an API package: API consumers and API providers. API consumers use the API and API providers implement the API. For the interface (and abstract class) types in an API, it is important that the API design clearly document which of those types are only to be implemented by API providers vs. those types which can be implemented by API consumers. For example, listener interfaces are generally implemented by API consumers and instances passed to API providers.

API providers are sensitive to changes in types implemented by both API consumers and API providers. The provider must implement any new changes in API provider types and must understand and likely invoke any new changes in API consumer types. An API consumer can generally ignore (compatible) changes in API provider types unless it wants to invoke the new function. But an API consumer is sensitive to changes in API consumer types and will probably need modification to implement the new function. For example, in the javax.servlet package, the ServletContext type is implemented by API providers such as a servlet container. Adding a new method to ServletContext will require all API providers to be updated to implement the new method, but API consumers do not have to change unless they wish to call the new method. However, the Servlet type is implemented by API consumers and adding a new method to Servlet will require all API consumers to be modified to implement the new method and will also require all API providers to be modified to utilize the new method. Thus the ServletContext type has an API provider role and the Servlet type has an API consumer role.

Since there are generally many API consumers and few API providers, API evolution must be very careful when considering changes to API consumer types while being more relaxed about changing API provider types. This is because you will need to change the few API providers to support an updated API, but you do not want to require the many existing API consumers to change when an API is updated. API consumers should only need to change when the API consumer wants to take advantage of new API. OSGi is now defining documentary annotations, @ProviderType and @ConsumerType, to mark the roles of types in an API package.
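A sketch of marking the roles on a hypothetical API; the annotation package name org.osgi.annotation.versioning is an assumption about where the in-progress annotations will land:

import org.osgi.annotation.versioning.ConsumerType;
import org.osgi.annotation.versioning.ProviderType;

// Implemented by the few API providers; adding a method here is a compatible,
// minor change from the point of view of API consumers.
@ProviderType
public interface MessageContext {
    String getProperty(String name);
}

// Implemented by the many API consumers; adding a method here would force
// every consumer to change, so it is a breaking, major change.
@ConsumerType
interface MessageListener {
    void onMessage(String message);
}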

Conclusion

When next designing an API, please consider these API design practices. Your API will then be usable in both OSGi and non-OSGi environments.

Friday, January 20, 2012

Juke Box Hero, Got Stars In His Eyes


I learned Wednesday that I was named a JavaOne Rock Star for my Why OSGi? presentation with Peter Kriens at JavaOne 2011. Nice!

Friday, December 09, 2011

Bndtools at the OSGi Alliance

The OSGi Alliance has been using bnd for a long time in the OSGi build. bnd is used by the ant build to create the bundles and execute the compliance tests as part of our continuous builds. It is also installed in Eclipse as an IDE plugin to provide IDE support for compilation classpath and test execution by OSGi members working in the Expert Groups.

Recently Bndtools development has been underway to create a better integration of bnd with the Eclipse IDE for bundle development. Bndtools 1.0 was just released and is available for installation into the Eclipse IDE as a replacement for bnd's Eclipse IDE support.

Since the OSGi Alliance has long used bnd, we already had the bnd infrastructure in place for our build. All that we needed to do to start using Bndtools was to update each project's .project file (using the Add Bndtools Project Nature menu item). This simple change then enabled Bndtools to manage the project within the Eclipse IDE. The OSGi Alliance will continue to use bnd in the ant build, but for our Eclipse IDE use we have moved to Bndtools.

Thanks to Neil Bartlett, Peter Kriens and the other bnd and Bndtools contributors for their hard work in making bnd and Bndtools the premier tooling for OSGi development.

Tuesday, October 04, 2011

Java 8 and the 1990s

I attended the first Jigsaw session at JavaOne today where Mark Reinhold presented the latest on Jigsaw. After what appeared to be the end of the presentation, Mark continued on and began discussing why OSGi is wrong for using packages as the primary unit of import and export between modules and why Jigsaw is right for requiring (a.k.a. importing) modules (but apparently, and non-symmetrically, exporting packages and types). Mark made several strange arguments.

He found the idea of a resolver matching package importers to package exporters (essentially a broker pattern where the resolver acts as the broker) to be bad/complicated/etc. He prefers the developer to effectively be the "resolver" and declare the specific modules to be imported. This removes an important level of indirection between the thing being provided and the artifact providing it. This is like saying, "Don't use interfaces, use concrete implementation types," because we don't want to have to figure out how to map the use of the interface onto a concrete implementation type.

Mark also stated that requiring modules mapped well onto native package managers (e.g. rpm, apt) while importing packages provided no simple mapping. So therefore requiring modules is the way to go. It seems rather sad to me that the design of modules for Java 8 is being driven by the capabilities of native package managers designed in the 1990s for native code. Shouldn't the design of a module system for Java be driven by the capabilities and attributes of Java?