Shearer Software

 

 

SD East 2002

Software Development Conference, Boston

Tuesday, November 19, 2002


Java 1.4 - New I/O, Exceptions, Logging (Half-Day Tutorial)

Jason Hunter, www.servlets.com

This was my opportunity to catch up on the Java world, having worked more in Python and other languages recently.

New I/O

A much improved design that exposes many of the more advanced I/O features in modern operating systems. There are new classes for buffers and channels (generalized sockets), supporting scatter reads and gather writes, as well as memory-mapped files: read-only, read-write, or copy-on-write. All this goes directly to the OS for great performance.
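To make the memory-mapping part concrete, here is a minimal sketch of a read-write mapping. It was not shown in the talk; the class and method names are my own, and it is written for a current JDK (the try-with-resources syntax is newer than 1.4, though the NIO calls themselves date from 1.4).

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the tutorial.
class MappedFileDemo {
    // Maps the whole (non-empty) file read-write, overwrites its first
    // byte, and returns the byte that was there before.
    static byte swapFirstByte(File file, byte replacement) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            MappedByteBuffer map =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());
            byte old = map.get(0);
            map.put(0, replacement); // a plain memory write; no write() call
            map.force();             // ask the OS to flush the change to disk
            return old;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("mapped", ".bin");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.write("abc".getBytes(StandardCharsets.US_ASCII));
        }
        System.out.println((char) swapFirstByte(f, (byte) 'z'));
    }
}
```

Swapping FileChannel.MapMode.READ_WRITE for READ_ONLY or PRIVATE gives the read-only and copy-on-write variants.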

Jason showed sample file-copy code from a presenter at JavaOne 2001, which fit easily on a slide. Unfortunately, even though the code was only a few lines long, he pointed out two things that were buggy or questionable. First, it used a file-size variable, which would normally be redundant if the underlying read call reported EOF correctly. Second, it didn't handle the case where the destination completed only a partial write, and it failed in a way that could lose bits of data throughout the file; he speculated that FileChannels must always complete writes in full, and that this was why the author wrote the code that way. This kind of subtle issue really makes me think that the API failed. How are developers supposed to do it correctly on the fly if experts can't figure out whether 10 lines of code are right?
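For comparison, here is my own sketch of a copy loop that sidesteps both objections. This is not the JavaOne code, which I don't have; the class name and buffer size are arbitrary. It stops when read() returns -1 rather than tracking the file size, and it keeps writing until the buffer is drained, so a partial write cannot drop data.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical demo class, not the code shown on the slide.
class ChannelCopy {
    // Copies 'from' to 'to' and returns the number of bytes written.
    static long copy(String from, String to) throws IOException {
        long total = 0;
        try (FileChannel in = new FileInputStream(from).getChannel();
             FileChannel out = new FileOutputStream(to).getChannel()) {
            ByteBuffer buf = ByteBuffer.allocateDirect(8192);
            while (in.read(buf) != -1) {      // -1 signals end of file
                buf.flip();                   // switch from filling to draining
                while (buf.hasRemaining()) {  // tolerate partial writes
                    total += out.write(buf);
                }
                buf.clear();                  // ready for the next read
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copy(args[0], args[1]) + " bytes copied");
    }
}
```

The inner loop is the part that is easy to forget: write() reports how many bytes it took, and nothing in the method signature forces the caller to check.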

One benchmark gave 47% (new buffers) to 96% (transferTo method) speedup over previous Java code. Native code gave a 116% speedup, so it's fairly close. The range was something like 70MB/sec to 140MB/sec.
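The transferTo variant is shorter still. A rough sketch, with names of my own choosing: transferTo takes an explicit byte count, so the file size does appear here, and since the method may move fewer bytes than requested it is called in a loop.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

// Hypothetical demo class; not the benchmark code mentioned above.
class TransferCopy {
    // Copies 'from' to 'to' with FileChannel.transferTo and returns the
    // number of bytes transferred.
    static long copy(File from, File to) throws IOException {
        try (FileChannel in = new FileInputStream(from).getChannel();
             FileChannel out = new FileOutputStream(to).getChannel()) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                long moved = in.transferTo(position, size - position, out);
                if (moved <= 0) {
                    break; // nothing more to move, e.g. the source shrank
                }
                position += moved;
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copy(new File(args[0]), new File(args[1])) + " bytes");
    }
}
```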

Finally, native file locking. Methods on FileChannel support exclusive and shared locks and byte ranges, either blocking or non-blocking. Locks depend on the underlying operating system, which may enforce them (a la Windows) or treat them as advisory only (typical in Unix). This same difference led to a few problems during the Mac transition to Unix-based Mac OS X, when some apps had depended on enforced file locking that suddenly became advisory-only.
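A small sketch of the non-blocking, byte-range form, under my own naming. tryLock returns null when another process already holds an overlapping lock; lock() with the same arguments would block instead, and passing true as the last argument asks for a shared lock.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

// Hypothetical demo class, not from the tutorial.
class LockDemo {
    // Tries to take an exclusive lock on the first 'length' bytes without
    // blocking. Returns true if the lock was obtained (it is released
    // again before returning), false if another process holds it.
    static boolean tryExclusive(File file, long length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            FileLock lock = channel.tryLock(0, length, false); // false = exclusive
            if (lock == null) {
                return false;
            }
            try {
                return lock.isValid() && !lock.isShared();
            } finally {
                lock.release();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("lock", ".dat");
        f.deleteOnExit();
        System.out.println("locked: " + tryExclusive(f, 16));
    }
}
```

Whether another program is actually stopped from writing to those bytes still depends on the operating system, as described above.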

Select() support for async I/O. As an example, this would allow Tomcat to support keep-alive without keeping a huge number of threads running.
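The shape of a selector loop, as a self-contained sketch. To avoid needing a network, an in-process Pipe stands in for a client socket; that substitution and all the names are my assumptions. A server would register many SocketChannels with the same Selector and service them all from this one loop.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical demo class; the Pipe is a stand-in for a socket.
class SelectDemo {
    // Writes a short message into a pipe and reads it back through a
    // Selector, the way a server thread would wait on many connections.
    static String roundTrip(String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            source.configureBlocking(false); // required before register()
            source.register(selector, SelectionKey.OP_READ);
            sink.write(ByteBuffer.wrap(payload));

            ByteBuffer buf = ByteBuffer.allocate(256);
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            while (received.size() < payload.length) {
                if (selector.select(1000) == 0) {
                    break; // timed out with nothing ready
                }
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove(); // selected keys must be cleared by hand
                    if (key.isReadable()) {
                        buf.clear();
                        int n = ((Pipe.SourceChannel) key.channel()).read(buf);
                        if (n > 0) {
                            received.write(buf.array(), 0, n);
                        }
                    }
                }
            }
            return new String(received.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("keep-alive"));
    }
}
```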

Preferences

JNDI too heavyweight, other solutions have drawbacks too.

Pluggable backends, by default stores in XML on Unix, registry on Windows.

Unfortunately, only available in J2SE 1.4, and it may be difficult for the user to find & edit files manually. Hierarchical, under separate System and User trees, similar to the Windows registry. Strongly typed interface.
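A minimal sketch of the typed interface, using the User tree. The node path and key name are made up for illustration, and where the value actually ends up depends on the platform backend.

```java
import java.util.prefs.Preferences;

// Hypothetical demo class; the node path and key are illustrative only.
class PrefsDemo {
    // Increments a per-user counter stored under the given node path and
    // returns the new value. getInt takes a default for the first run.
    static int bumpLaunchCount(String nodePath) {
        Preferences prefs = Preferences.userRoot().node(nodePath);
        int count = prefs.getInt("launchCount", 0) + 1;
        prefs.putInt("launchCount", count);
        return count;
    }

    public static void main(String[] args) {
        System.out.println("launch #" + bumpLaunchCount("/com/example/notes"));
    }
}
```

Preferences.systemRoot() gives the System tree with the same interface, and there are matching get/put pairs for boolean, long, double, String, and byte arrays.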

Assertions

Compiler and runtime support for assertions. Disabled by default, but can be enabled with a flag at runtime. Very little overhead if assertions are disabled (essentially, a global is checked, but the test code is not run).
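What that looks like in code, as a sketch with names of my own. The second method uses a common idiom for detecting the flag: the assignment sits inside the assert, so it only runs when assertions are enabled.

```java
// Hypothetical demo class, not from the tutorial.
class AssertDemo {
    // The assert documents a precondition; when assertions are disabled
    // the condition is never evaluated.
    static double average(int[] values) {
        assert values.length > 0 : "average() needs at least one value";
        long sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        return (double) sum / values.length;
    }

    // Returns true only when the JVM was started with assertions enabled.
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true; // intentional assignment, not a comparison
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
        System.out.println(average(new int[] {2, 4}));
    }
}
```

Run it as `java -ea AssertDemo` to turn the checks on; without the flag, an empty array slips through and the division quietly yields NaN.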

You can catch assertions like exceptions if you want. There's new support for walking the exception stack trace to display it for the user with custom formatting, as in Python.
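The stack-walking support is Throwable.getStackTrace(), which returns an array of StackTraceElement. A sketch of a custom formatter follows; the output layout is just something I made up. An AssertionError is a Throwable like any other, so it can be passed to the same method.

```java
// Hypothetical demo class; the output layout is arbitrary.
class TraceDemo {
    // Renders a throwable and its frames in a custom one-frame-per-line
    // format instead of the default printStackTrace() layout.
    static String format(Throwable t) {
        StringBuilder sb = new StringBuilder(t.toString());
        StackTraceElement[] frames = t.getStackTrace();
        for (int i = 0; i < frames.length; i++) {
            StackTraceElement f = frames[i];
            sb.append("\n  #").append(i).append(' ')
              .append(f.getClassName()).append('.').append(f.getMethodName())
              .append(" (line ").append(f.getLineNumber()).append(')');
        }
        return sb.toString();
    }

    static void fail() {
        throw new IllegalStateException("boom");
    }

    public static void main(String[] args) {
        try {
            fail();
        } catch (IllegalStateException e) {
            System.out.println(format(e));
        }
    }
}
```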

Logging

For diagnosing problems, for users, field service engineers, support, developers. Configurable levels, output handlers, filters, formatters. Localizable. The default file handler writes in XML format and supports log rotation. A memory handler just stores a rotating log of the last n entries, but on a trigger such as a SEVERE-level entry, will automatically dump its contents to another handler, such as a socket or a file. Sounds great for QA (though if you kept it enabled after release, you should really have the user's permission before sending via a socket; it would work like Netscape's Talkback).
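The memory handler behaviour is easy to try out. In this sketch a small collecting handler of my own stands in for the file or socket target; the buffer size of 3 and the messages are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.MemoryHandler;

// Hypothetical demo class, not from the tutorial.
class MemoryLogDemo {
    // Stand-in for a file or socket handler: it remembers what it is given.
    static class CollectingHandler extends Handler {
        final List<String> messages = new ArrayList<String>();
        @Override public void publish(LogRecord record) {
            messages.add(record.getLevel() + ": " + record.getMessage());
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Logs three INFO records and one SEVERE record through a MemoryHandler
    // that holds 3 records and pushes on SEVERE; returns what the target saw.
    static List<String> run() {
        CollectingHandler target = new CollectingHandler();
        MemoryHandler memory = new MemoryHandler(target, 3, Level.SEVERE);
        Logger log = Logger.getAnonymousLogger();
        log.setUseParentHandlers(false); // keep the console quiet
        log.addHandler(memory);

        log.info("one");
        log.info("two");
        log.info("three");       // buffer is now full; target has seen nothing
        log.severe("disk full"); // evicts "one", then pushes the buffer
        return target.messages;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The target ends up with the last three records, the SEVERE one included, which is the "what happened just before it broke" view you want from a field report.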

It's conceptually similar to the existing open-source log4j. Use log4j instead if you need compatibility with prior versions of Java.

Trillenium

The trillenium is a trillion milliseconds since the epoch. It happened last year. Apparently, no one else celebrated it but him, even in Silicon Valley. He's been hoping to find a sympathetic audience member each time he gives a talk, but again left disappointed.


Hands-on XSLT (Half-Day Tutorial)

Elliotte Rusty Harold

XSLT (Extensible Stylesheet Language Transformations) is a language for specifying the conversion of one XML document into another, or into an HTML document. It's a standard with multiple implementations, giving it a degree of future-proofness that I like. I've thought about adding an XSLT module to my own HTML content management system, either as an input format, an export format, or both. The only problem with going all-XSLT has been its very difficult syntax and its complete incompatibility with current HTML editors, not to mention current HTML-editing brains. I just can't see large numbers of people taking to it the same way they have to HTML. Though of course I may just know HTML too well and XSLT too little.

First, the requirements. People had to bring laptops to this one. I realized this morning that I needed to install Saxon, an open-source XSLT processor, and it was only 15 minutes before the bus to the train station arrived. I still made it, though, and was able to test Saxon on the train. It worked out of the box, even piping into BBEdit. In the end it didn't matter anyway, because Elliotte distributed CDs with all the software as well as example files. On top of that, there was WiFi access live in the room! I don't know at what point in the day it was turned on, but it was real Internet access, and it was much snappier than the sluggish connection I made a few minutes earlier on Newbury St.

The whole thing was a bit of an experiment, and the success was mixed. Personally, I got something out of it. Despite reading about XSLT (and re-reading until I thought I understood it), I hadn't found the time to really experiment, except for one very limited test. This session gave me a whole bunch of problems to solve, and the successes and mistakes I made really did clarify my previously murky understanding. I started to get into the flow of writing xsl:template elements. There was some useful discussion of xsl:for-each (stylesheet pulling from source XML) vs. xsl:template only (stylesheet being controlled by source XML).

The class as a whole didn't go off exactly as planned. We only got to eight out of his eighteen exercises before he decided to ditch the rest of them and make an all-out sprint to the end of the lecture, and even so had to skip portions of the material. (By the way, the page breaks in the handout were terrible. It had been printed from a web browser, likely after being transformed from XML.) When trying to solve the example problems, some people in the room were lost, and it was hard to decide when to move the class along.

With the syllabus he'd set himself, and more than 50 people in attendance at varying skill levels, it was probably an impossible task. When he asked for a show of hands at one point on the progress on one problem after a few minutes, the room was fairly well split between no idea / getting there / already done. All in all, he did a good job of keeping the class moving even though the initial plan was too ambitious. I don't know for sure how a single person trying to manage that many other people could have handled it any better. Maybe more teaching up front would have helped. Probably those who didn't get the first problem working in time were sunk.

The discussion on XPath was very helpful. Pretty soon I was off and writing complex query expressions, such as one to print the names of all the elements with an atomic weight greater than that of the element with atomic number 92. Come to think of it, you'd actually need two queries to do that in the current stable MySQL (that is, until it gets nested select statements). This code fragment does something reasonably complex, but doesn't look too bad. (Note the escaping necessary for the greater-than test.)

  <ul>
  <xsl:for-each select="ATOM[ATOMIC_WEIGHT &gt; ../ATOM[ATOMIC_NUMBER = '92']/ATOMIC_WEIGHT]">
  <li><xsl:value-of select="NAME" /></li>
  </xsl:for-each>
  </ul>
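To try a query like this without a separate processor, the XSLT support bundled with Java (javax.xml.transform, also new in 1.4) is enough. This sketch is mine, not from the class: the five-element table is sample data I typed in, the output is plain text rather than a list, and it looks up uranium's weight once in an xsl:variable, a variation on the one-line predicate in the fragment above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; the XML below is illustrative sample data.
class HeavyAtoms {
    static final String XML =
        "<PERIODIC_TABLE>"
      + atom("Thorium", 90, "232.0381")
      + atom("Uranium", 92, "238.0289")
      + atom("Neptunium", 93, "237.048")
      + atom("Plutonium", 94, "244")
      + atom("Americium", 95, "243")
      + "</PERIODIC_TABLE>";

    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/PERIODIC_TABLE'>"
      + "<xsl:variable name='uranium'"
      + " select='number(ATOM[ATOMIC_NUMBER = 92]/ATOMIC_WEIGHT)'/>"
      + "<xsl:for-each select='ATOM[ATOMIC_WEIGHT &gt; $uranium]'>"
      + "<xsl:value-of select='NAME'/><xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String atom(String name, int number, String weight) {
        return "<ATOM><NAME>" + name + "</NAME><ATOMIC_NUMBER>" + number
             + "</ATOMIC_NUMBER><ATOMIC_WEIGHT>" + weight + "</ATOMIC_WEIGHT></ATOM>";
    }

    // Runs the stylesheet over the sample table and returns the names of
    // the elements heavier than uranium, each followed by a semicolon.
    static String heavierThanUranium() throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(heavierThanUranium()); // prints Plutonium;Americium;
    }
}
```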

One of his closing statements: If you put a stylesheet processing instruction into an XML document (<?xml-stylesheet ...?>), the type attribute is "text/xml". Please don't use type="text/xsl". That MIME type doesn't exist. It's a figment of Microsoft's imagination.

His slides, the XPath chapter of his book XML in a Nutshell (second edition), and the XLink chapter and XPointer chapter of his other book XML Bible (second edition) are online.