Builds: We make them complicated.


The fault, dear Brutus, is not in our stars, But in ourselves…

Ed Merks gave a talk about builds and provisioning, and about how programmers both hate and love them. While I agree with many of his points, I think some of his conclusions about where the problems with builds lie are wrong.

Ed is correct that programmers tend to shift the blame from themselves to the build when it is broken. After all, if it works on their system, it can’t possibly be broken. However, the reasons that Ed outlines are only part of the problem (i.e. dependency on the IDE to produce the build, which is in itself an anti-pattern of Continuous Integration).

Many of us use IDEs, and most IDEs have some kind of build management process within them. However these files are always proprietary to the IDE and often fragile. Furthermore they need the IDE to work. It’s okay for IDE users to set up their own project files and use them for individual development. However it’s essential to have a master build that is usable on a server and runnable from other scripts. So on a Java project we’re okay with having developers build in their IDE, but the master build uses Ant to ensure it can be run on the development server.

Martin Fowler, Continuous Integration

In my experience working as both a programmer and a maintainer of build systems, 90% of the failures happen because of integration issues: not necessarily integration dependencies on other projects, but integration issues within your own project. Many programmers do not follow good Continuous Integration practices. Common faults:

  • Checking code in without running all the unit tests locally.
  • Not synchronizing with the source code repository before checking code in.
  • Not running integration tests against the synchronized code.
  • Checking code in at the end of the day without running tests.
  • Not writing tests to make sure that what we wrote works.

These basic steps reveal the same problems that build servers tend to catch, because the build server doesn’t forget to do them. If a team follows true continuous integration practices, a broken build should be fixed in less than an hour, because the developer gets early feedback.
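As a rough illustration of the checklist above, here is a minimal Python sketch of a pre-check-in gate. The command lines (`svn update`, `ant test`, `ant integration-test`) and the function names are hypothetical placeholders, not taken from Ed's talk or from any particular project; substitute whatever your project actually uses.

```python
import subprocess

# Hypothetical commands for illustration only; replace them with the
# synchronize/test commands your own project really uses.
STEPS = [
    ("synchronize with the repository", ["svn", "update"]),
    ("run unit tests", ["ant", "test"]),
    ("run integration tests", ["ant", "integration-test"]),
]


def run_command(cmd):
    """Run one command; report success only if it exits with status 0."""
    try:
        return subprocess.call(cmd) == 0
    except OSError:
        # The tool is not installed or not on the PATH.
        return False


def safe_to_check_in(steps=STEPS, run=run_command):
    """Run every step in order and stop at the first failure."""
    for name, cmd in steps:
        if not run(cmd):
            print("FAILED: %s. Fix this before checking in." % name)
            return False
    print("All steps passed. Safe to check in.")
    return True
```

Calling `safe_to_check_in()` before every commit does locally what the build server would otherwise do for you later. The `run` parameter exists only so the gate can be exercised without touching a real repository.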

Overly Complicated Builds:

The general problem with Eclipse-based builds is legacy. They make assumptions about how builds should be done that do not necessarily fit more general practice. Regardless of how the current Eclipse Ant tasks are implemented, there is no practical reason that you need to run Eclipse headless to produce a build. I’ve done several builds outside of Eclipse with very simple Ant scripts using the Ant4Eclipse project. This depends only on the Eclipse OSGi runtime jar and, optionally, the Eclipse compiler. There is no need to launch Eclipse headless to run Ant scripts.

The key to making builds reproducible is to do The Simplest Thing that Can Possibly Work. In many cases we tend to over-architect the solutions. The recent work with P2 really has made the builds more complicated than they need to be. I’ve been working with some P2 provisioning for a product build, and it really does add a lot of extra complications to the whole process. Instead of a simple feature build with an update site, now we have to add in steps for directors, provisioning of the repository, provisioning of the product, and other items. In my opinion P2 has unnecessarily complicated matters, not made them easier.

Much of the excellent work that Nick Boldt has done with the Athena Common Builder exists to work around the assumptions made by a PDE-based build. Imagine the benefits, and how many fewer workarounds would be needed, if PDE reworked its builds.

I have the same opinion of the Eclipse MAP-based builds: they are overcomplicated. I have yet to run into a problem with builds that were built directly from HEAD. Letting Hudson or another build system do the tagging for a particular build is just as convenient and workable a solution as a MAP-based build. The problem with MAPs is that they do not catch integration issues early, so the programmers blame the build for the failures instead of their own lack of discipline. They also do not match what a programmer is building from, which is typically a copy of HEAD or something very close to it. MAP-based builds do NOT catch integration issues; they tend to hide them.

Ed is correct that, unfortunately, in the corporate world, if a team has a build engineer, it is usually the most junior programmer who has this responsibility. However, this is wrong: everybody should be responsible for the build. A good build can be reproduced by anybody with minimal effort. The corporate world tends to encourage knowledge silos; on Eclipse projects we can’t continue this practice. Everybody needs to wear multiple hats, including that of build engineer.

If your project consistently has bad builds, don’t necessarily blame the build engineer or say it is because the builds are too complicated. Look in the mirror: it is probably your own fault that the build is broken.


25 Responses to Builds: We make them complicated.

  1. Ed Merks says:

    I never suggested the IDE itself be used to do the build. I'm suggesting that exactly the same build process/infrastructure/technology as you use in the IDE be used as part of a continuous integration build. After all, OSGi runs quite nicely on servers or anywhere else last I checked…

  2. Antoine says:

    "The recent work with P2 really has made the builds more complicated than they need to be."

    +1! My take is that a build must be reproducible, therefore the dependencies of your project must be available constantly. I think that's the big missing piece in Eclipse today.

  3. David Carver says:

    @Ed You may not have said it, but the proposed solution implies it. Unfortunately, the way Eclipse-based builds are set up, you have to use the IDE (unless you are using Ant4Eclipse), i.e. run Eclipse in headless mode to do a build. This in and of itself requires the IDE to create a build. A build that an Eclipse developer uses is an IDE build. A headless Eclipse build is an IDE build without the developer pressing buttons.

    B3, from the sounds of it, is just following this pattern even more, by requiring an Eclipse-based plugin running headless to produce a build. Personally, I think we need to get away from tying the build to running Eclipse headless.

    @Antoine From a build perspective I really despise P2: way more complicated, increased build times, and an unpleasant workflow that I now need to manage and debug. The P2.INF files are worse to me than a well-structured XML configuration file.

    @All One other comment on why people don't like Ant. I don't think it is necessarily that it is XML based. Ant itself is more of a functional language than the OO or procedural languages that people are used to using. The syntax is not that bad, especially with content assistance, and in my opinion there are other, even wordier languages that people love, but those follow a workflow that more people are used to. Functional programming makes you think differently.

  4. Antoine says:

    Get on IRC, David, I need to talk to you.

  5. Eric Rizzo says:

    I couldn't agree more, David. The last product company I worked at spent several man-months trying to get an RCP and Update Site headless build working; what we had was brittle and rife with assumptions. Then they moved to Eclipse 3.4 (p2) and split out some new Features and the whole thing fell apart.

    As an example of how bad the situation is, ask Nick Boldt for his EMF build script one day; it'll make you want to gouge out your eyes.

  6. Ed Merks says:

    You're using the term "IDE" in a way different from how it's used at Eclipse. Headless is not simply the same as IDE. You mention exploiting OSGi and the same Eclipse compiler developers use in the IDE as part of a simple build; that's exactly the point.

    I will continue to argue strongly that solving the same problem in two different ways will result in there endlessly being two sources of issues instead of just one. Furthermore, if one of the solutions isn't used by the developers themselves, that solution will be poorly understood and therefore fragile and error prone. Unfortunately one person's simple script is another person's reason for wanting to gouge their own eyes out. I believe builds can be better driven from declarative data rather than from obscure scripts.

  7. David Carver says:

    @Ed I think the main difference of view is where the problem lies. The problem from my point of view is not a technical problem per se, but more of a development practice problem. Most projects and developers do NOT follow good continuous integration practices, practices that drastically reduce the number of bad builds.

    The problem with Eclipse builds is that the Ant tasks that are used in the builds depend on the Eclipse IDE running headless. You can't just run a PDE build with just Equinox and the Eclipse platform. It needs all the extra stuff from the PDE plugins, which require hooks into the Eclipse IDE plugins as well. So we are talking semantics. You view the IDE as the UI layer; I view it as the whole Eclipse plugin architecture and the dependencies that are inherited from that.

    Again, if I have to run Ant using Eclipse headless, then I'm adding unnecessary overhead to the build process, and I'm depending on functionality that is only available in the IDE, not outside of it.

    Ultimately, though, the issue of build failures is not entirely the fault of different builds; the majority is because of the development practices that programmers employ.

  8. > The recent work with P2 really has made the builds more complicated than they need to be.

    There was a false impression of simplicity in the older model because it pushed off important problems such as "will these pieces actually work together" to the runtime. In the past you could have features that would happily build, but then fail at install time or run time because their dependencies were not satisfied (either dependencies between bundles or environmental dependencies). The additional complexity of p2-based builds is pushing these install-time and runtime issues up the development cycle so they are caught and handled at build time instead. So yes, the build is more complex now, but I would argue that it is better to solve these resolution problems at build time rather than at install time or later, when the problem is in the end user's face.

    This is one of the reasons why "the simplest build is the best build" is just not true. For example, interpreted languages don't require compilation, which means their build is simpler, but type and dependency resolution problems instead arise at runtime. Simpler builds, more end user headaches.

  9. David Carver says:

    @John I think we have now gone to the other extreme. There has to be a middle ground somewhere. Also, I'm still not sure why we had to invent yet another dependency management system and couldn't leverage existing ones from the open source community.

    In many cases, when we are talking about builds, a smaller, more specific build is simpler to maintain and enhance than a generic monolithic build that can handle every possible use case that could be tossed at it. If something goes wrong in that monolithic build, you have no clue what caused the issue.

    Builds are like any other programming artifact: they need to be designed, documented, and treated as a first-level deliverable. Unfortunately, in many projects they are an afterthought or something that developers don't want to deal with. They are also the most critical piece of the development process.

  10. Ed Merks says:

    Ask anyone who's trying to set up a build if there are any technical problems with doing that.

    Another question one might ask: if Equinox, the platform, and JDT dependencies for a build are okay, at what point do similar dependencies become too many dependencies? I.e., which straw breaks the camel's back?

    Everything about builds can be improved: the technology, the practices, the attitude, and the processes around it. Of course continuous integration builds are invaluable…

  11. David Carver says:

    @Ed agreed… there are technical issues to be dealt with, but I still think they are 5% to 10% of the problem. 90% is caused by the developer not following good integration practices.

  12. Kim Moir says:

    > Instead of a simple feature build with an update site, now we have to add in steps for directors, provisioning of the repository, provisioning of the product, and other items. In my opinion P2 has unnecessarily complicated matters, not made them easier.

    Your assertion that a simple feature build was better overlooks the fact that some teams need to build products. Of course, most people are just looking to build a few bundles.

    If your deliverable is a product, p2 does make constructing this product much simpler. Pre-p2, the process to build the Eclipse SDK with the correct launchers and root files for multiple platforms was very error prone. Now, all the product components are expressed properly in the metadata. Before p2, components such as the executables weren't expressed in the update site and therefore couldn't be updated.

    Yes, frequent builds do alleviate integration issues. However, projects must also have the expertise to understand what they are building, how it's built, and how to fix it when things go wrong. If there are projects such as Athena to help teams get started with their builds, great! However, it is disingenuous to believe building complex software will be a push button process with no understanding required of the underlying processes. When the build breaks, who will fix it?

  13. David Carver says:

    @kim I've been working with both feature and product builds lately. Provisioning and setting up a product build correctly is still very much a black box. The P2.INF has magical incantations that control how the product is generated. If you need to update the P2.INF, you have to update property entries like something.2.requires=blahblahblah. If your items change, you have to go and update all the sequence information to be correct. It can also take a large amount of time to provision properly using the P2 director, and if you have something wrong in your P2.INF you don't notice until you try to run your product. P2 might have improved things for products, but for those that aren't doing products, it adds more complications.

    And I think we agree that the responsibility for the build, and for knowing what it does, is a team responsibility. This is the knowledge silo that must be eliminated. With this also comes the need for better documentation from the Eclipse community on the various pieces that go into a build. Currently everything is scattered. The hope is that the suggested crowdsourcing of the documentation at PDE can help address this in the future.

    I don't necessarily think that Athena is the simplest thing that can possibly work. In fact the Athena build script is very complex, because it has to work around the design decisions made by PDE Build and the legacy design of Eclipse builds in general. Simple doesn't necessarily mean a one-step solution. Ideally it does, but it has to be simple to maintain and update as well. Nick's done a great job getting it so that builds are much easier for a person to get running; however, in general the community needs to do a better job of documenting the various ways to build.

  14. Ed Merks says:

    I don't agree with the assertion that 90% of the problem is bad developer practices. As Kim suggests, who really understands the builds well enough to set them up and fix them? The fact that the answer is very few people is in and of itself a problem. Yet all developers need to know how to set up a development environment so they can perform their daily activities and they do so regularly. Why are these two completely different problems?

  15. David Carver says:

    @Ed It's only complicated because we make it so. How many developers do you know that want to take the time to understand how their software is put together? A vast majority just don't care. The build is somebody else's problem; it isn't their problem. The attitude, the coding practices, and the sense of craftsmanship just aren't there in many cases.

    I agree that there are technical hurdles to overcome, but I still feel that unless the programmers take and feel responsibility for the builds, the greatest technical solution is not going to address the biggest cause of build failures.

    Those that understand the builds are the ones that have taken the time to understand them. Just like those that know modeling or any programming language, they took the time to learn it.

  16. Eike Stepper says:

    Let's not argue about exact numbers, but from my own experience with the modeling builds at Eclipse I'd rather say that 90% of the build failures result from problems in the build. Too often I have seen that none of my committers changed a single line of code and all of a sudden the builds crashed. There are just too many layers of Ant scripts (PDEBuild, BaseBuilder, ModelingBuild, our own scripts) with too poor tooling support for Ant scripting. In contrast to Java/JDT there is no static checking available, no proper cross referencing, etc. Whenever one of the different teams that provide Ant script layers for my build decides to refactor or fix something in their layer, my build ceases to work without prior notice.

    The one thing I wish for most in the whole world is to get rid of any Ant scripts in my builds! I'm a Java developer and not an Ant/XML scripter. I see no reason and cannot imagine any argument that could turn me into one. For me the only alternative to pure Java/OSGi-based builds is fully declarative builds. Both approaches would be well understood and support excellent static checking and problem-finding tools.

    Martin Fowler is right in that continuous integration is invaluable, but how does he argue that Ant scripts must be used for that?

    If one were to choose an imperative approach to builds, IMO Ant is the poorest imaginable solution. Even the simplest statements like if-then-else or loops are not possible without add-on libraries or target juggling. Given that I have suffered from broken Ant builds for more than 8 years now, I think it's justified to repeat that there's probably nothing I hate more than Ant.

    I fully agree with Ed, though: builds don't need to be so awful. We (developers) get them right in our IDE, so what prevents us from using the same meta information to drive UI-less builds on a server or wherever?

  17. David Carver says:

    @Eike You hit on a point that is all too common. Programmers like to stay in their comfort zone. Ant is a Functional scripting language. In many cases what you are used to doing or the way you are used to doing it, is NOT necessarily the best way to do it. Don't knock the language because it is designed differently than you are used to working.

    Ant is designed to be a general purpose language. There are alternatives to Ant: Buildr (Ruby), Maven, Raven, etc. Personally I don't think we need to tie the builds or how an item gets built to any particular scripting/language. The proposals I've seen do not address a general purpose scripting need. There is much more that needs to go into a build besides just compiling, generating, and distributing plugins. I'd hate to have to write all that in a language like Java.

    The vast majority of problems I see are not with the builds once they are set up. They just run. You are right, there is a poor amount of useful documentation with the current scripts, which leads to the black box effect. Plus, when running Eclipse headless, you can't debug the Ant script (and yes, normally you can set breakpoints and debug Ant scripts with the Eclipse debugger).

    So views on where the problems lie depend on perspective. Regardless, if we can make it easier for everybody, and make developers care more about the way the build works, then I think we have a long-term win.

  18. Eike Stepper says:

    Some points I'd like to comment on:

    "Ant is a Functional scripting language"

    I doubt that.

    "In many cases what you are used to doing or the way you are used to doing it, is NOT necessarily the best way to do it."

    That's certainly true, but what's best is often determined by criteria that are different in different contexts. Besides "technically best" there could be e.g. "economically best" because Java engineers are easier to find, etc.

    "Don't knock the language because it is designed differently than you are used to working"

    I didn't knock it because I'm used to working differently as a Java developer, but rather because I find it an awkward stone-age technology for the purpose I'm forced to use it for.

    "Ant is designed to be a general purpose language"

    That makes it perfectly comparable to Java, which is also a general purpose language. If Ant were a good such language I wonder why we (developers) prefer Java to solve our customers' problems and not Ant. Strangely, we only torture ourselves with it 😛

    "Personally I don't think we need to tie the builds or how an item gets built to any particular scripting/language"

    That's why I also prefer the declarative way.

    "I'd hate to have to write all that in a language like Java"

    Of course we'd need convenient libraries, as we do with Ant. I still think that writing the glue code for all these library calls in Java is way more appropriate than doing it in Ant/XML.

    "They just run"

    I envy you! As I said, my experience is just the opposite. Even without touching the project code. Were the builds written in Java, I'd quickly find the root cause of problems, even if some build framework code was not written by me, even if Equinox was used in a headless build. The tools are just better. There are just more people whom I could ask for help.

  19. David Carver says:

    @Eike: Ant isn't a pure functional language; it is more of a hybrid. It has aspects of procedural programming, but it also has aspects of functional programming. Properties are write-once, and looping is handled through the execution of targets. Dependencies on targets will cause other targets to execute beforehand. It isn't a pure functional language like XSLT, Scheme, or Lisp, but it is very close.

    Ant and Java are both general purpose languages in a sense. However, Ant became popular for a reason. It provides a simpler way to set up builds than a pure custom Java implementation. As I said, you can debug Ant in Eclipse; however, the way Eclipse builds are set up and the fact that you have to run Eclipse headless with PDE Build is the problem, not Ant or the fact that it isn't written in Java.

    Some even say Ant is declarative, but I won't go that far.

    A good build file is one that can be created once and doesn't necessarily have to be modified that often. PDE Build again is probably not the best example to use for a comparison on what makes a good build file. There are still many legacy issues with PDE Build that don't allow it to play nicely.

    Eclipse has good tools for Ant; the fact that you CAN debug through an Ant script is a good thing. The problem is the way PDE Build and Eclipse builds are currently set up. They take away that debugging and breakpoint ability.

    You say we need a convenient library, but don't we already have that with the Ant tasks? Is the real dislike because of the XML syntax? If we had a DSL like:

        target: SomeTarget depends on SomeOtherTarget {
            property name = "some value"
            task javacc classpath=something
        }

    Would that make Ant better? I'm not sure about that. Ant tasks are written in Java, so I don't see the issue. You can always write a Java program that uses the Ant tasks directly, I suppose, but I'm not sure that is going to make the vast majority of programmers care any more about builds. They will still see it as somebody else's problem, because it worked on their system.

    So I'm still convinced that the complication is not in the build or the construction of the build, but in the fact that the vast majority of programmers just don't care to know how the build works.
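The Ant behaviour described in comment 19 (write-once properties, and targets that run their dependencies first) can be modelled in a few lines. The following is a toy Python sketch, not Ant itself; the `MiniBuild` class, its method names, and the target names are invented purely for illustration.

```python
class MiniBuild:
    """Toy model of two Ant behaviours: write-once properties and target dependencies."""

    def __init__(self):
        self.properties = {}
        self.targets = {}    # target name -> (dependency names, action or None)
        self.executed = []   # order in which targets actually ran

    def set_property(self, name, value):
        # Write-once: the first value wins and later assignments are ignored.
        self.properties.setdefault(name, value)

    def add_target(self, name, depends=(), action=None):
        self.targets[name] = (tuple(depends), action)

    def run(self, name):
        # Each target runs at most once per build.
        if name in self.executed:
            return
        depends, action = self.targets[name]
        for dependency in depends:
            self.run(dependency)  # dependencies execute beforehand
        if action is not None:
            action(self)
        self.executed.append(name)


def demo():
    build = MiniBuild()
    build.set_property("out.dir", "bin")
    build.set_property("out.dir", "ignored")  # has no effect
    build.add_target("init")
    build.add_target("compile", depends=["init"])
    build.add_target("test", depends=["compile"])
    build.add_target("dist", depends=["compile", "test"])
    build.run("dist")
    return build.properties["out.dir"], build.executed
```

Calling `demo()` returns `("bin", ["init", "compile", "test", "dist"])`: the second property assignment is dropped, and asking for `dist` pulls in its dependencies in order, each exactly once.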

  20. > David Carver said: The problem with Eclipse builds is that the Ant tasks that are used in the builds depend on the Eclipse IDE running headless. You can't just run a PDE build with just Equinox and the Eclipse platform. It needs all the extra stuff from the PDE plugins, which require hooks into the Eclipse IDE plugins as well.

    What bundles are needed depends highly on what it is you are building. We run Eclipse builds headless without the IDE every day with Buckminster, so the IDE is certainly not a requirement. If you want the Ant tasks found in PDE Build, then sure, you need that bundle, but you still don't need the IDE. The separation between the IDE itself and the build system is good. A full-blown configuration capable of building using PDE is perhaps a sixth of the size of the classic IDE. And that's using it as it stands today. If a proper build model is integrated into the system, I expect we can improve the separation of concerns and create more targeted and much smaller configurations.

    I have a hard time understanding why the Ant way of plugging things into a build is so great while the OSGi way of doing it is so bad. Personally, I think Ant is a poor mix between an attempt to be declarative and a scripting language. I know that James Duncan Davidson, who gave birth to Ant, agrees with me.

  21. David Carver says:

    @thomas Personally, having to run OSGi (i.e. launch it, load plugins, etc.) is something that I don't want to do. It's not uncommon for a headless Eclipse build (which is still Eclipse without the graphical interface) to have at least two or three instances running: the instance that is running the build, and maybe one or more instances that are running tests.

    Whether or not you use Ant as the scripting engine, you have added complications and added overhead, leading to more confusion, not less, about what the build is doing.

    Truly the best architected solution I've seen, when it is stable, is the Ant4Eclipse project. It relies only on the OSGi JAR file and the Eclipse compiler. You can't get much smaller than two jars, plus its own for making OSGi bundles. Everything else I've seen requires a full instance of Eclipse to be running.

    Whether you think Eclipse headless is the same as the IDE is semantics. If it requires Eclipse, it is still requiring the features of the IDE.

    Maybe I'm arguing semantics, but I still think programmers don't take responsibility for why builds fail. Builds do what they are told to do; programmers tell them what to do.

  22. @dave: So what do you consider a "full instance of eclipse" to be? Is it the Equinox OSGi runtime? Or does it become Eclipse when you add the org.eclipse.core.runtime bundle?

    With all due respect to Ant4Eclipse, my builds need the P2 publisher, the PDE branding mechanism, OSGi filter evaluation, map files, XML parsers and serializers, and more. Not only is all of this readily available to me as OSGi bundles, I also have everything configured. From within my IDE, I just right-click and build my update site. I do the exact same thing headlessly and the exact same code executes.

    Is the pain of using an OSGi runtime really worth the effort involved in maintaining another build system on the side?

  23. David Carver says:

    @thomas You ask a good question: where does one draw the line? Once you start adding the Eclipse platform code to it, and all the plugin dependencies that go with that, you become Eclipse at that point. It's Eclipse without the GUI.

    I don't need or even like MAP files; I build my code from HEAD. I liked the simplicity of unzipping the plugins and features into the appropriate spot, where they could be added to a classpath and referenced if needed.

    Using an AntRunner adds complications to the build and also complicates the debugging process. The fewer layers there are, the easier it is to find the root cause of a problem. You can still use P2 from normal Ant; you don't need to launch it from Eclipse headless. So everything you said you need, you can still get, but in a way that you can control better and debug better.

    Builds shouldn't be everything to everybody; they should be designed around the needs of building that particular application. Common tasks and patterns should be made into Tasks so they can be reused, but not necessarily tied to having to be launched from Eclipse. They need to be as simple as possible so that anybody can debug them if they break. I'm hoping that is what comes out of B3. Only time will tell.

  24. I would argue that Ant adds a level of indirection that makes debugging, if not impossible, then at least very hard. Keeping it all in one OSGi runtime, however, adds great benefits in that area. Suddenly, you can trace everything with one debugger. Single-step every code line, inspect every variable. It just works. No extra command line options, no nothing.

    You can call P2 from Ant, sure. But when you do that, you also start an OSGi runtime. This runtime contains, at minimum, the Equinox OSGi runtime, Eclipse runtime, Eclipse jobs, Eclipse net, Eclipse Communication Framework (typically configured with Apache httpclient), and about 15 or so P2 bundles including the SAT4J resolver. Do you call that runtime an Eclipse? If not, what's the difference between that and headless Eclipse?

    Building from HEAD is great. But as your project matures, you undoubtedly run into situations where you must maintain at least one release with bugfixes while you continue the development of your next release. At that time branching is inevitable.

    It sounds so simple when you say that you unzip your files into the right spot. But where do those zips come from? How do you know where to find them and that they are exactly right for you? How do you keep track of it all over time?

  25. David Carver says:

    @thomas I already said where I make the cutoff for what I consider the Eclipse IDE, whether it is running headless or not.

    You can have OSGi applications without them being Eclipse, based on the plugin configuration you describe. Again, we seem to come down to the comfort zone in which some are comfortable working and debugging. OSGi is fine for some things, but it isn't necessarily the correct tool for everything. The main argument I've heard from people so far is, "I'm an OSGI Java Programmer, I want Java/OSGI plugins to build things." That's fine, but I work with multiple systems that I need to build; many aren't OSGi, and an OSGi solution doesn't work there. OSGi is great, but we tend to use it as a hammer for every situation, instead of where it works best.

    I'm not going to be surprised if B3 ends up coming up with some DSL for the builds, or some new Ant-like system or API. As for the need to build from other branches: it's not that hard to manage other branches and tell a system like Hudson, Luntbuild, or even an Ant script which code to check out to build a particular release. Always building from a particular tag, which MAP files cause, hides potential integration issues, which are the cause of the majority of build failures in my opinion.
