Eclipse Modeling Framework: Q&A With the Authors
This week at EclipseZone has been all about EMF, with the launch of the refcard and a review of the new EMF book. Today we meet the four authors behind the second edition of the Eclipse Modeling Framework book.
The gang of four who wrote the book are Dave Steinberg, Frank Budinsky, Marcelo Paternostro and Ed Merks. I asked each of them about the book and about the Eclipse Modeling Framework, including their top tips for its use.
Competition Time! DZone, together with the Eclipse Foundation, is happy to offer you the chance to win either the new edition of the EMF book or an Eclipse shirt. To be in with a chance of winning, just tell us about your use of modelling or EMF, or what you like about EMF. We will then randomly select a winner from the entries received by 3 February, 2009.
James Sugrue: Could you please introduce yourselves?
Ed Merks: My name is Ed Merks. I'm the lead for the EMF project and the co-lead for the top-level Eclipse Modeling project. I work as a software consultant in partnership with itemis AG. I've been working on EMF since the very beginning.
Frank Budinsky: My name is Frank Budinsky. I'm a senior architect at IBM and a founding member of the EMF project at Eclipse. I've been involved in framework and generator design for many years, including as design lead for the EMF-based Tuscany Service Data Objects (SDO) project at Apache, and as co-chair of the SDO technical committee at OASIS.
Dave Steinberg: My name is Dave Steinberg. I'm a software developer at IBM and an EMF committer. I started working on EMF with Ed and Frank before it was open-sourced in 2002.
Marcelo Paternostro: My name is Marcelo Paternostro. I've been an EMF committer since 2004 and a user for even longer. I am actually quite proud of being one of the first developers to adopt it, when it was still an internal component at IBM. Before joining the EMF team, I worked on TPTP's first incarnation, Hyades, and on components for IBM's WebSphere Studio Application Developer, now Rational Application Developer (RAD).
James: This has been a long awaited book. Did you find the second edition difficult to write? Were there any particularly difficult sections?
Ed: The long waiting was the most difficult part because I only wrote a very small part of the content. Mostly I worked very hard on my delegation skills, i.e., I delegated the challenging task of writing the book to Marcelo and Dave.
Frank: The problem with writing a book about a fast-moving technology is that one can easily fall into the trap of not writing quickly enough to keep up with the changes happening to the technology itself. The distance to the finish line never seems to change since there's always something new that has to be covered. We fell into that trap for a little while on this book but, with a concerted push at the end, we managed to finally get it done. With Dave acting as lead author for this edition, there would be no compromise in quality or completeness just to get it done. I think the end result speaks for itself and was worth the wait.
Dave: Yes, writing the second edition was a difficult task. EMF has grown so much since the first edition that a lot more judgement was needed in determining what to cover and in how much detail. I also think we set the bar higher for ourselves this time, demanding a higher level of quality in terms of content, consistency, and language.
For me, the most difficult sections were actually the same ones as last time: the ones that could really be described as reference material. For example, there are chapters describing the mappings from UML, annotated Java, and XML Schema to Ecore. There are also sections describing every last generator and resource option. There were lots of little details to cover in those sections -- many more than when we did the first edition -- and we were really striving for completeness. So, gathering and organizing all that information was pretty challenging.
By contrast, the rest of the book talks a lot about concepts and is very example-driven. I think that kind of material is much easier and more fun to write. It's probably more fun to read, too, but people need the reference material, too, when they're looking for an answer to a specific question.
Marcelo: This is my first book. To me it was very surprising to realize how hard it is to write about a technology that I know from end to end. It is amazing how challenging it is to put concepts in words and how distinct the writer's and developer's mind-sets are.
On a more personal note, English is not my native language (I am from Brazil and moved to Canada 8 years ago). This book has given me a fantastic opportunity to improve my communication skills. It was like a huge "pair programming" experience to me: I would write the content that I set myself to work on and then often rewrite it with Dave, whom I usually refer to as an "English Grammar God". I like to think that he noticed a huge improvement between my very first lines back 4 years ago and the ones I wrote in 2008 ;-)
Dave: Okay, clearly I'm getting good value for the money I pay Marcelo to heap praise on me. And, yes, his writing has definitely improved since we started this project. Actually, I'm pretty sure that's true for all of us.
James: For those who have already read the first edition and are familiar with EMF, what are the main chapters that they should take notice of?
Marcelo: We've revised and improved every single chapter of the first edition and added a bunch more. The new edition has more pages than the first one and it doesn't contain the API summaries!
Trying to answer your question without saying "all chapters", I would suggest Chapters 15, 20, and 21. The first is a very deep and complete trip to the persistence world. The second shows how to run EMF in environments that are not the usual Eclipse IDE and serves well as a good review of some important concepts. The last one, Chapter 21, could be called "what we were also doing while we were writing this book", as it shows the main features introduced in the 2.3 and 2.4 releases.
Dave: I'd definitely agree with Chapter 21. It covers a lot of important new topics, including generics, content types, reference keys, and Ecore validation.
For people working with XML Schema, I'd add that Chapters 8 and 9 are very important. The mapping from XML Schema to Ecore is totally different from when the first edition was written. Chapter 8 introduces the extended modeling concepts in Ecore that support this mapping and Chapter 9 details it.
There's so much new material in Chapters 14-18, all about programming with models and the EMF runtime, that I couldn't even choose.
Frank: I'd say someone who is familiar with the first edition should leaf through the table of contents and notice how much new material there is and how much more detailed the coverage is in many of the original chapters. EMF has come a long way in the last 5 years, and the second edition has thorough coverage of all the new features. I'm sure that most readers, no matter how familiar with EMF they may be, will notice lots of neat things they didn't know about.
James: In your opinion, what is the most under-utilized part of EMF?
Marcelo: Although I can certainly think of parts that are less used, I don't believe there is a definitive answer to this question. When a developer decides that it is time to move past the basic "this is a code generation tool" idea, she ends up finding something in EMF that fits perfectly with what she is doing or that is in total agreement with her way of designing and implementing code. And, fortunately I guess, this "something in EMF" varies from person to person. I truly believe that we have at least one user for each line of code we've written.
Dave: Yeah, I agree with that. I'm not sure I'd say that it's under-utilized, but there's a whole lot of flexibility in EMF that many people don't know about. Most people probably assume that a modeled class in EMF maps to a generated class with one field per feature. In fact, there are numerous memory-saving options available in the generator, and you can even opt to delegate all storage of feature values onto an arbitrary external backing store, giving you total flexibility to do relational persistence, soft references, and custom lazy loading, for example.
Most people probably don't need anything fancier than one field per feature. But, if at some point they find they do, they might not immediately be aware that EMF will probably still be valuable to them.
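To make the contrast Dave describes concrete, here is a minimal plain-Java sketch of the two layouts: one field per feature versus delegating every read and write to an external backing store. It is an illustration only and does not use EMF's own classes or generator output; the names `Book`, `FeatureStore`, `MapStore`, `FieldBackedBook` and `StoreBackedBook` are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class StoreDelegationSketch {

    /** The modeled API; both implementations below expose it unchanged. */
    interface Book {
        String getTitle();
        void setTitle(String title);
    }

    /** Conventional layout: one Java field per modeled feature. */
    static final class FieldBackedBook implements Book {
        private String title;

        @Override public String getTitle() { return title; }
        @Override public void setTitle(String title) { this.title = title; }
    }

    /** Hypothetical backing store keyed by object id and feature name. */
    interface FeatureStore {
        Object get(String objectId, String feature);
        void set(String objectId, String feature, Object value);
    }

    /** In-memory stand-in for a database, cache, or lazy loader. */
    static final class MapStore implements FeatureStore {
        private final Map<String, Object> values = new HashMap<>();
        int reads = 0; // counts store lookups, to show that reads are delegated

        @Override public Object get(String objectId, String feature) {
            reads++;
            return values.get(objectId + "#" + feature);
        }

        @Override public void set(String objectId, String feature, Object value) {
            values.put(objectId + "#" + feature, value);
        }
    }

    /** Delegating layout: no feature fields, only an id and a store reference. */
    static final class StoreBackedBook implements Book {
        private final String id;
        private final FeatureStore store;

        StoreBackedBook(String id, FeatureStore store) {
            this.id = id;
            this.store = store;
        }

        @Override public String getTitle() { return (String) store.get(id, "title"); }
        @Override public void setTitle(String title) { store.set(id, "title", title); }
    }

    public static void main(String[] args) {
        MapStore store = new MapStore();
        Book plain = new FieldBackedBook();
        Book delegated = new StoreBackedBook("book-1", store);

        plain.setTitle("EMF");
        delegated.setTitle("EMF");

        // Callers see the same API regardless of where the value lives.
        System.out.println(plain.getTitle().equals(delegated.getTitle())); // prints "true"
        System.out.println(store.reads); // prints "1"
    }
}
```

Because client code depends only on the `Book` interface, the storage strategy can change without touching callers, which is the kind of flexibility the answer above refers to. A real store could sit in front of a relational database or hold soft references instead of a `HashMap`.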
Ed: I think EMF is a little like an iceberg, not so much because it's so big it can sink the Titanic, but rather because only the 10% above the waterline is noticed. As the guys suggested, that waterline is defined differently by different people.