Ineffective XML Design

There is a common XML design flaw I see more often than I should. I especially see it in Eclipse EXSD designs. I saw it again today, after Mik posted their Discovery Site extension point definition:

name="Mozilla Bugzilla"
provider="Eclipse Mylyn"

The problem is described in chapter 12 of Elliotte Rusty Harold's must-read book, “Effective XML”. Basically, it's the battle over attributes versus elements. Elements are extensible; attributes aren't.

The problem in the above markup is the summary attribute in the Overview element. In practice a summary should be a couple of sentences, but what if somebody needs a couple of paragraphs for the summary? There is no easy way to do this. Sure, the attribute value is an externalized string, but that only pushes the issue off to a properties file.
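To make the trade-off concrete, here is a minimal Python sketch; it is not Eclipse code. The `summary` and `p` child elements in the second form are assumptions chosen for illustration, not part of the actual extension point schema. It relies on one fact from the XML specification: a conforming parser normalizes a literal line break inside an attribute value to a space, whereas child elements keep their structure.

```python
import xml.etree.ElementTree as ET

# Attribute form: the paragraph break is a literal newline inside the value.
ATTRIBUTE_FORM = '<overview summary="First paragraph.\nSecond paragraph."/>'

# Element form (hypothetical layout): each paragraph is its own child element.
ELEMENT_FORM = (
    "<overview><summary>"
    "<p>First paragraph.</p><p>Second paragraph.</p>"
    "</summary></overview>"
)


def summary_from_attribute(xml_text):
    """Return the summary attribute as the parser reports it."""
    return ET.fromstring(xml_text).get("summary")


def summary_paragraphs(xml_text):
    """Return the text of every <p> element in the summary, in order."""
    return [p.text for p in ET.fromstring(xml_text).iter("p")]


if __name__ == "__main__":
    # Attribute-value normalization replaces the newline with a space,
    # so the paragraph boundary is no longer visible to the application.
    print(repr(summary_from_attribute(ATTRIBUTE_FORM)))
    # The element form still yields two separate paragraphs.
    print(summary_paragraphs(ELEMENT_FORM))
```

The attribute can only ever hand back one flat string, while the element form can grow new children later without breaking existing readers.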

This same issue appears throughout Eclipse-related extension points and wherever XML is used to hold potentially long descriptions. The XML markup for the Templates files is notoriously bad. Unfortunately, nobody is willing to fix it or revisit it.

Please, if you are using XML, try to use it effectively. A little more thought about how you lay out your XML and what you are using it for can go a long way toward both maintainability and usability.

This entry was posted in eclipse, xml.

11 Responses to Ineffective XML Design

  1. AlBlue says:

    Uneffective? Ineffective or unaffected are words, but uneffective isn't. Oh, and I think you mean 'their' rather than 'there'. Oh, and the sentence 'but what if …' is a question, so should probably end with a question mark. Having said that, I agree with the sentiments of the post. 🙂

  2. A useful reference in this context is also the Best Practices for XML Internationalization document, which lists a number of drawbacks with linguistic content in XML attributes.

  3. David Carver says:

    @AlBlue: I've corrected the other items, but I believe "there" is correct according to this page. My composition teacher from high school had these as "No Excuse" errors. Mixing up There, They're, and Their would automatically fail your paper.

  4. AlBlue says:

    "I saw it again today, after Mik posted there Discovery Site extension point definition:" there – a location; if you meant a link to a blog then perhaps I can see what you mean. However, if it's the Discovery Site extension point belonging to Mik, then it's their. I parsed it in the latter format, and I'm not sure how to parse it in the former.

  5. David Carver says:

    @AlBlue…a good catch…it was hiding. 🙂

  6. David Green says:

    Perhaps you can enlighten me as to why summary="%connectorDescriptor.overview.summary.bugzilla" is a bad idea? The summary text uses newlines to delimit paragraphs, which is very easy to do in a translation bundle.

  7. David Green says:

    @David Carver I should mention some relevant points:

    Your blog fails to mention that Eclipse provides some well-established design patterns and tooling support for internationalization. Conforming to The Eclipse Way has significant benefits, and inventing a new way must be carefully and thoroughly justified.

    Most XML element-versus-attribute discussions focus on rich content containing metadata. These arguments (such as xml:lang, metadata, text directionality, including the arguments referenced by @Asgeir) are not relevant in this case. Eclipse handles the complexities of managing translation bundles, and the text in question is intended to be a lightweight short description that uses plain-text formatting such as newlines. Connector Discovery provides a means of linking to rich HTML/XHTML content that can provide document structure, metadata, etc.

    While I agree with the argument about XML and translation in the general case, Eclipse extension points such as Connector Discovery have the advantage of being able to leverage the Eclipse platform and its internationalization and localization features, which are not available in a normal XML document.

    Really this is a case of using the right tool for the right job. Suggesting that all translatable text in extension points should become XML elements is really suggesting that we re-invent well-established, proven Eclipse mechanisms and is contrary to the principles of Agile development.

  8. David Carver says:

    @David I would argue that the design choices come from what is most convenient for databinding, not what is necessarily the best design and use of XML as a whole.

    Typically the XML mapping is property = attribute, class = element. Unfortunately, this leads to very bad design choices and serialization. Let's take the summary attribute for example. Since attributes can't preserve whitespace or keep carriage returns and line feeds in their output, these are stripped when the document is read into the DOM and re-serialized. A better design for the overview is as follows: <overview …> <summary> <p>Some text</p> <p>Some more text</p> </summary> </overview>

    This approach avoids many of the problems associated with long descriptions. Not everybody does internationalization. Eclipse provides a way that is Java-like, but it isn't platform neutral the way XML is.

    If you are going to use XML for your storage, use it efficiently and life is actually much simpler. The problems that people have with XML come mostly from their prior design choices, not from XML's design.

    I believe XML is fine; just the choice of using an attribute in this case is wrong, especially according to the reasons outlined in Effective XML.

  9. David Green says:

    @David Carver This is not about databinding or XML mapping. Your argument is that user-visible text should not be loaded from XML attributes. That's not what we're doing here. We're loading text from a Java resource bundle, which provides facilities for preserving whitespace.

  10. philho says:

    I am not sure if you are right in this precise case, but I agree that Eclipse generates awful XML.

    I am looking specifically at a .classpath file, where I have a looong list of files and paths in the 'excluding' attribute of a 'classpathentry' tag. It uses some custom (albeit simple) markup instead of XML facilities (yet another sub-parser, even if here it is a simple split on pipe).

    Not to mention the (escaped) XML in XML attributes that we see here and there in the metadata…

    Note on the "don't put localizable strings in attributes" rule: funnily, it is badly broken in HTML (title, alt attributes among others…). But well, that's for historical reasons.

  11. David Carver says:

    @David Regardless, it's a fundamental design flaw. The summary attribute is not being represented correctly. As @phil has said, the item I chose is not necessarily the best example. A better example is the escaped XML that has to be done for, say, the WTP XML templates that are in the standard Template.xml file format.

    Eclipse internationalization only puts the editing into another format; if you were to reserialize that into a valid XML document, you would run into the issues regardless. If you have a plugin that isn't internationalized, it is user-visible text at that point. Shifting where the text is stored just hides the issue.

    XML is treated as a second-class citizen. Do not even get me started on the lack of DTDs or XML Schemas (EXSD is sort of a schema, but really it's being used for modeling…not validation).
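The escaped-XML point raised in the last few comments can be sketched in Python. Both document shapes below are hypothetical and exist only for illustration; they are not the actual WTP Template.xml format. The sketch shows that markup stored in an attribute must be escaped by the author and parsed a second time by the consumer, while the same markup stored as child elements is available after a single parse.

```python
import xml.etree.ElementTree as ET

# Hypothetical format 1: markup lives in an attribute, so every "<" and ">"
# has to be written as "&lt;" and "&gt;".
ESCAPED = '<template name="para" content="&lt;p&gt;Hello&lt;/p&gt;"/>'

# Hypothetical format 2: the same markup as ordinary child elements.
NESTED = '<template name="para"><content><p>Hello</p></content></template>'


def paragraph_from_attribute(xml_text):
    """Parse the outer document, then parse the attribute value again."""
    outer = ET.fromstring(xml_text)
    inner = ET.fromstring(outer.get("content"))  # second, hand-rolled parse
    return inner.text


def paragraph_from_element(xml_text):
    """One parse is enough; the paragraph is already part of the tree."""
    return ET.fromstring(xml_text).find("content/p").text


if __name__ == "__main__":
    print(paragraph_from_attribute(ESCAPED))
    print(paragraph_from_element(NESTED))
```

Both functions return the same text, but only the nested form lets generic XML tooling (validation, XPath, transformation) see the inner markup.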
