Datasystems problems in detail

The central classes in the Datasystems API contain some of the hardest-to-understand and perhaps buggiest code in the NetBeans Platform. It is rather old and the basics have not changed much in the past several years, but small workarounds and modifications have accreted since then and are hard to separate from the basic functionality. This complexity adversely impacts reliability, and some very desirable performance optimizations are too hard to do in the current code. Even the developers assigned to maintain the API are unwilling to make many changes for fear of introducing regressions.

Datasystems uses a complex threading model that is not well understood. While folder recognition is basically single-threaded, modules can start or intercept the process from any thread, meaning that fine-grained locking is necessary. Some modules, and even the core window system, are unable to guarantee that certain operations will always work, though they succeed almost all of the time. Other problems can cause data corruption and deadlocks.

The API is also difficult to understand for a programmer getting his or her feet wet in NetBeans. There are too many options, many of which have been unused for years. Naming conventions are not always logical. Numerous assumptions are not documented anywhere.

The clustering of files into one data object has caused problems for version control integration, which need to be thought out carefully. Coupling visualisation with the data layer itself is evil.

Finally, there are high-level architectural concerns over Datasystems' place. The current API directly depends on a wide variety of other code, making it difficult to test in isolation. Some of the more advanced features are unwanted overhead in applications based on the NetBeans platform, and this may even be true of a couple of features in the NetBeans IDE, if the new Projects system is implemented.

Settings system problems

Although settings may look like a separate area to solve, this is not the case. Most aspects of settings currently use Datasystems internally, which is an unwanted dependency. The settings system was never designed as a whole - individual aspects of it were accreted during each historical NetBeans release cycle, subject to compatibility and resource constraints. The lack of Datasystems-independent Settings APIs needs to be addressed as part of the Datasystems redesign.

The settings system itself has an armful of problems. One is coding complexity - few beginning NetBeans programmers really figure out how all of these things work together. While most application frameworks would permit you to write a few lines of configuration to a *.ini file or the like, there is nothing comparably straightforward in NetBeans. Even if you buckle down and copy the boilerplate from a sample module, making your own customizations, any kind of minor error will lead to runtime failures that are very hard to diagnose without expertise in NetBeans internals.
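For comparison, the kind of "few lines of configuration" meant here can be sketched with nothing more than java.util.Properties from the JDK. This is only an illustration of the simplicity being asked for, not a proposed NetBeans API: the class SimpleSettingsSketch, its methods and the key editor.tabSize are all hypothetical names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper, not part of any NetBeans API.
final class SimpleSettingsSketch {

    /** Writes the settings as key=value lines to a plain-text file. */
    static void save(Path file, Properties settings) {
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            settings.store(out, "example module settings");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the settings back; a missing file yields empty settings. */
    static Properties load(Path file) {
        Properties settings = new Properties();
        if (Files.exists(file)) {
            try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                settings.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return settings;
    }

    /** Stores one value in a temporary file and returns what is read back. */
    static String roundTrip(String key, String value) {
        try {
            Path file = Files.createTempFile("example-module", ".properties");
            try {
                Properties settings = new Properties();
                settings.setProperty(key, value);
                save(file, settings);
                return load(file).getProperty(key);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "editor.tabSize" is an invented key used only for this demonstration.
        System.out.println(roundTrip("editor.tabSize", "4"));
    }
}
```

No per-setting Java class is required in this style, and a malformed or missing file degrades to default values rather than to a hard-to-diagnose runtime failure.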

Furthermore, all of the existing coding styles require at least some Java classes to be written for each distinct kind of setting. While this is sometimes appropriate, it can also be overkill, and just serves to increase JVM memory consumption. Registration of options and services also ties into the Lookup subsystem in such a way that care must be taken to avoid settings being loaded before anyone asks for them - all such objects are placed in one global area although in practice they are only needed in isolation. There is some inherent overhead in how settings are stored, some of which has been optimized away in NetBeans 3.5 at the expense of added internal complexity.

Project-specific settings are possible using one of several semi-documented tricks. It is likely that the NetBeans 4.0 Projects infrastructure will not use the current system of making settings project-specific.

Use of Java serialization for options and services is also a problem in the current system - serialization does not work well in practice for long-term persistence of data.
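One reason serialization ages badly can be shown with a small sketch: the serialized stream records the fully qualified class name and the field names of the object, so later renaming or restructuring of the class invalidates stored data. The class ExampleOption below is hypothetical and merely stands in for any serialized option or service.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class SerializationCouplingSketch {

    // Hypothetical option class, invented for this illustration.
    static final class ExampleOption implements Serializable {
        private static final long serialVersionUID = 1L;
        int tabSize = 4;
    }

    /** Serializes a value with standard Java serialization. */
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Checks whether the given ASCII text appears literally in the stream. */
    static boolean streamContains(byte[] stream, String text) {
        String raw = new String(stream, StandardCharsets.ISO_8859_1);
        return raw.contains(text);
    }

    public static void main(String[] args) {
        byte[] stream = serialize(new ExampleOption());
        // Both implementation details are baked into the persisted form.
        System.out.println("class name stored: "
                + streamContains(stream, ExampleOption.class.getName()));
        System.out.println("field name stored: "
                + streamContains(stream, "tabSize"));
    }
}
```

Because these implementation details become part of the persisted format, every refactoring of an option class is also a change of file format.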

Concrete known Datasystems issues

  • During recognition of a FileObject, all loaders are asked to recognize it. A more structured hierarchy is needed to prevent all loaders from being initialized in memory before they are needed, and also to minimize the amount of client code that is called.
  • There has to be a clear division between recognition of the (MIME) type of a file and creation of a DataObject
  • It is necessary to allow other modules to declaratively extend capabilities of foreign loaders (add cookies to them) (Issue 20191)
  • It is necessary to allow other modules to declaratively add actions to foreign loaders
  • Loaders are still serialized, they should be handled by the settings infrastructure
  • Package org.openide.loaders is full of garbage (API, SPI, support, deprecations) and should be redesigned
  • DataLoader should no longer extend SharedClassObject
  • The SPI should be separated and simplified (no need to subclass both MultiFileLoader and MultiDataObject)
  • Datasystems should be separated from the layers above it: the Nodes API and any windowing framework. The Datasystems API should be usable standalone, without the presentation layer. Node.Cookie has to be replaced, and because Lookup is becoming more standard in a variety of APIs (Looks, Actions, etc.), it should be reused in the loaders package too.
  • The problem with recognizing DataObjects - (see hack in DataFolder.handleMove) - (Issue 8705)
  • The problem with 500ms timeout for recognizing DataObjects - (Issue 20022)
  • Implementation and interface hierarchies are mixed in the loaders package. Nearly every implementation of DataObject is a subclass of MultiDataObject, but there are two public classes, DataShadow and (used to be) DataFolder, that subclass DataObject directly. As a result it is very inconvenient to create one's own subclass of those objects, because they lack the support found in MultiDataObject.
  • Recognition of the list of templates must not initialize loaders from all modules; otherwise we will not be able to improve performance.
  • The Filesystems API supports direct implementation of the move operation. But the current semantics of DataObject.move, and especially DataFolder.move, cannot use this implementation if they want to honor modification of the content of the file during the operation. See dev@openide for details.
  • When a module wants to create a data object consisting of more than one file, it can start writing the individual files to the disk (to the filesystem layer). While only a partial set of files exists, another loader can grab any of them. There should be a way to temporarily stop recognition and let the module finish its work (and then start the recognition process).
  • After a data object/loader recognizes that the set of files belonging to the data object has changed, it should fire an event to notify modules about the change.
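
The first two items in the list (lazy loader initialization, and a clear split between type recognition and DataObject creation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the existing org.openide.loaders API: LazyLoaderRegistrySketch, its Loader interface and all method names are hypothetical, recognition is reduced to a file-extension lookup, and a data object is represented by a plain String.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical registry, invented for illustration only.
final class LazyLoaderRegistrySketch {

    /** Stand-in for a loader; it returns a String instead of a real DataObject. */
    interface Loader {
        String createDataObject(String fileName);
    }

    private final Map<String, String> mimeTypeByExtension = new HashMap<>();
    private final Map<String, Supplier<Loader>> factoryByMimeType = new HashMap<>();
    private final Map<String, Loader> loaderByMimeType = new HashMap<>();

    /** Declarative registration: no loader code is needed to know the type. */
    void registerMimeType(String extension, String mimeType) {
        mimeTypeByExtension.put(extension, mimeType);
    }

    /** Registers a factory; the loader itself is not created yet. */
    void registerLoader(String mimeType, Supplier<Loader> factory) {
        factoryByMimeType.put(mimeType, factory);
    }

    /** Step 1: type recognition, which touches no loader at all. */
    Optional<String> recognizeMimeType(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return Optional.empty();
        }
        return Optional.ofNullable(mimeTypeByExtension.get(fileName.substring(dot + 1)));
    }

    /** Step 2: creation, which instantiates only the one matching loader. */
    Optional<String> createDataObject(String fileName) {
        return recognizeMimeType(fileName)
                .filter(factoryByMimeType::containsKey)
                .map(mime -> loaderByMimeType.computeIfAbsent(
                        mime, m -> factoryByMimeType.get(m).get()))
                .map(loader -> loader.createDataObject(fileName));
    }

    /** Number of loaders that have actually been instantiated so far. */
    int instantiatedLoaderCount() {
        return loaderByMimeType.size();
    }

    public static void main(String[] args) {
        LazyLoaderRegistrySketch registry = new LazyLoaderRegistrySketch();
        registry.registerMimeType("java", "text/x-java");
        registry.registerMimeType("xml", "text/xml");
        registry.registerLoader("text/x-java", () -> name -> "JavaDataObject[" + name + "]");
        registry.registerLoader("text/xml", () -> name -> "XmlDataObject[" + name + "]");

        System.out.println(registry.recognizeMimeType("Foo.java"));
        System.out.println(registry.createDataObject("Foo.java"));
        System.out.println("loaders instantiated: " + registry.instantiatedLoaderCount());
    }
}
```

With such a split, listing templates or computing file types would need only the declarative registrations, and a module's loader code would be initialized only when one of its files is actually opened.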
