Archive for October, 2010

RDF or not in Gen2Phen – 6th Assembly Meeting


This is a personal account and not necessarily my employer‘s view.

Until two weeks ago, I had never heard of Gen2Phen. Then my colleague Livia asked me to join her to go to their 6th general assembly meeting and present something about UniProt in RDF.

Gen2Phen is a big consortium, including SIB, working on genotype-to-phenotype information. They have two years to go in their grant, and are thinking about adopting SemWeb technologies to enhance data exchange and integration, data interpretation, and to impress funding agencies. Therefore, they invited someone—me in the end— from SIB to speak about our experiences.

My presentation consisted of two parts, an introduction to RDF and why we provide it, and a tour of UniProt‘s RDF. I aimed for 15 minutes, and got only five to present it due to the packed schedule. So I explained the very gist of “why RDF”, showed some examples, and talked about the problems we are encountering.

The problems got, predictably, most attention. Semantic Web “believers” spreading the vision are plenty. Hands-on experiences with complex data sets such as UniProt’s are rarer. I need to write about this in depth at some point. Suffice it to say, I think I dampened some enthusiasm. This despite the fact that I repeatedly stressed that I think of RDF and related technologies as valuable building blocks in the bigger picture, and as clear steps forward on some problems. But the Semantic Web seems to be an all-or-nothing affair for most people.

Tony Brooks is right in saying that given there are only two years left to go for Gen2Phen, it might be late to start with SemWeb technology. A large modeling effort and uncertain scalability challenges could delay the benefits until it’s too late. On the other hand, it’s not that much work to start experimenting. Install Virtuoso and D2R, fire up Protege, write some RDF using Jena, and get a feeling for the whole thing. Design some RDF schema that expresses the basics of the information at the heart of Gen2Phen, and see if existing systems can add it as in- and output format. That would be my recommendation, which I might or might not have gotten across — it was a packed event about an unfamiliar project where the SemWeb was only one of many sessions, so communication was somewhat difficult.

The meeting as such was very nice. Good conversations and awesome food — La Maison de la Lozere in Montpellier was brilliant. So was the city itself; I enjoyed wandering around the beautiful old town.

One other presentation I found interesting was Gudmundur Thorisson’s about ORCID. This initiative aims to unambiguously researchers with an ID instead of their name, which might occur many times. ORCID will then map an article’s DOI to the IDs of the authors, when it’s submitted. Also, and perhaps even more important, ORCID aims to do the same for data sets. Science really needs more, larger, better data sets in the open for people to analyze and train their algorithms on, but currently there is very little benefit for researchers to publish them. ORCID is not really functioning yet, but is backed by more than 120 organizations, and so has a decent chance at becoming the de facto norm in academia.

FrOSCamp 2010 Zuerich


So, another one of those belated meeting/event reports: on 2010-09-17, I was in Zurich for the first-ever FrOSCamp. It was an Open Source/Free Software event with an exhibition floor, talks, and “a fancy party with creative commons licensed beer and music”—what’s not to like!

I presented my “Praktisches RDF in Perl” talk that I recycled from the German Perl Workshop, to spread the word some more. This time, I had prepared an English version, but as I only had German speakers in the audience, I presented in German.

Unfortunately my presentation only drew a handful of people this time. Note to self: work on the abstract some more. I had suspected that my FrOSCamp one was wordy and not catchy, but didn’t get around to rewriting it. At least the audience were pretty engaged and asked lots of questions, which I prefer to a larger crowd that’s half asleep.

The presentation was recorded and is now online as slides+audio. This was a first for me. I could forget about it while presenting, but I was pretty nervous listening to it for the first time, not sure what mess of incoherent rambling and half-finished sentences to expect. Fortunately, I found it ok in the end. Of course, I found several things to improve, but I guess that’s expected for someone who doesn’t present often and is just getting started. My list of the main points to improve is:

  • The introduction should be much shorter and more focussed. A bit like a sales pitch, not as in being obnoxious and fake, but as in focussed on getting the audience’s attention and appreciation for the topic.
  • Too many sentences didn’t flow properly. Simply doing one or two more dry runs should fix that.
  • Have some more visualizations such as diagrams on the slides.

On the other hand, I was pleased with a few things about my presentation: the style of having little text on the slides and more verbal explanation worked well, the code samples seemed to be the right size to digest during a talk, and the questions at the end showed that people had gotten the key points.

Before my presentation, I got to see Renee Baecker‘s talk about Perl::Critic. I’m using it on my code and thus knew the basics, but I appreciated the advanced example towards the end, where Renee walked us through writing our own critic rules. This works via PPI, so you can find patterns in the AST that match the constructs you want to check. I also found it interesting to hear Renee’s personal experience with the severity levels: he’s typically on 3, sometimes 2, but 1 is too harsh.

Other than that, I was mainly hanging out at the Perl booth, a first for me! The booth was staffed by Renee and Roman from Winterthur (CH), two really nice guys whom I had a great time with, discussing everything from Perl modules to freelancing.

BTW, remember the blurb from the FrOSCamp website I quoted at the top about creative commons licensed beer? That wasn’t a joke. FreeBeer is an organic beer, produced by an independent brewery near Zurich, and the recipe is online under a CC license. And it tastes great! A cloudy, full blonde just how I like it :-)


Get every new post delivered to your Inbox.