I arrived in Portland last night for OSCON. From a conference standpoint, it feels very similar to when I attended in 2005. The attendance doesn't appear to have grown much and the exhibition hall is about the same size, though there appear to be a few more "community" exhibitors than before.
As far as hardware goes, there appears to be an abundance of Apple laptops (probably around 70%). Of the remainder, it seems that a good 2/3 are running Windows. Perhaps I'm too much of a Linux bigot, but I find it amazing that at a conference where Mark Shuttleworth keynotes, one with "Open Source" in its name, most attendees are using proprietary operating systems! Oh well, I digress.
The keynotes this morning were pretty good. They were of the short/sweet variety: each person talked for only 15 minutes. Tim O'Reilly started off by talking about how to see the future of technology (by watching alpha-geeks). He also talked about how Web 2.0 is removing some of the openness of the web. Data lock-in and losing the ability to "fork" networked applications were some of the issues he mentioned.
Some Intel folks did a clever trick by dressing up in suit and tie and pretending to pitch/sell their new Threading Building Blocks library. Then they ended by saying they were open sourcing it.
Continuing on the threading thread, Microsoft researcher Simon Peyton Jones explained transactional memory and its basic building blocks: "atomic", "retry", and "orElse". After explaining these, the need for them, and the problems they solve, he indicated that they had been implemented in one of his projects, Haskell, for over a year. It should be interesting to see whether constructs such as these gain adoption, given that Moore's law these days seems to deliver more cores rather than faster ones.
Tim then interviewed Mark Shuttleworth. The main point I gathered from their 15 minutes was that Ubuntu has been really successful at "innovation in and through collaboration". Their Launchpad tool is an example of a hub in a collaboration engine for end developers as well as upstream component providers.
Solr - Kimsal
I went to a session on Solr that demoed setting it up and adding some data to the search engine. Solr is a standalone search server that provides an easy REST-style interface.
What does "Open Source" mean?
This was an interesting panel led by Danese Cooper. At issue was SugarCRM's and Socialtext's use of attribution licenses that weren't approved by OSI.
Two announcements were made: one, that a non-vanity attribution license, CPAL, was being accepted by OSI (after Socialtext went through the wringer for it), and two, that SugarCRM is going attribution-free by adopting GPLv3 for their V5 project.
There was a lot of discussion about the impracticality of placing a large amount of attribution text in small GIF ads on the web pages of derived projects.
Hadoop - Doug Cutting
Hadoop is an open source implementation of Google's MapReduce toolset in Java. Originally it was part of the Nutch search engine, but it has now been pulled out and is mainly developed by Yahoo.
The basic idea is that petabyte-scale datasets will not fit in memory. You need to put the data on disk, and that can be problematic since disk seek speeds haven't improved much over the years. To deal with this, you build a cluster of machines that can churn through data at maximum disk I/O.
Yahoo, for example, is using Hadoop to perform queries on their large dataset. These are batch-mode queries, and some are queries that weren't really thought of when the data was placed on disk. The MapReduce paradigm allows for this kind of ad hoc querying.
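To make the MapReduce paradigm a little more concrete, here is a minimal single-process sketch in Python. It is not Hadoop's API and it was not shown in the talk; the function names (map_phase, shuffle, reduce_phase, map_reduce) and the word-count job are my own illustration. A real cluster would run the map and reduce steps on many machines, each reading its own slice of the data from local disk.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle, and reduce in sequence (hypothetical driver, one process)."""
    pairs = chain.from_iterable(map_phase(r) for r in records)
    groups = shuffle(pairs)
    return dict(reduce_phase(k, v) for k, v in groups.items())


if __name__ == "__main__":
    records = ["open source search", "open source conference"]
    print(map_reduce(records))
```

An ad hoc query in this model is just a different pair of map and reduce functions run over the same records, which is why questions nobody planned for can still be asked later.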
Foundations of Open Source - Waugh
I then went to Jeff Waugh's talk about the foundations of open source. His belief was that too much focus on open source licensing has led many to neglect the finer and more important points of what it really means to be "open source", such as community and governance. He then described his experiences with Ubuntu and the GNOME project. Ubuntu was lucky enough to start on the shoulders of giants, other open source projects that had been in existence for a while, and had structured itself to avoid some of the hurdles that these projects ran into.
He was using Clutter for his slide show (fancy OpenGL), which I hadn't heard of previously.
Atom Publishing Protocol - Gregorio
Joe Gregorio presented on the Atom Publishing Protocol. Through some nifty deception he was able to guide simpletons through some of the finer complexities of the protocol.
Generating Gorgeous Word Documents - Koziarski
I liked the metaphor used during this preso. He compared document generation to web mockup generation. Basically, the designer gives the developer some HTML layout filled with static text ("Lorem ipsum, foo..."), and the dev then leaves the pretty formatting in place but replaces the static elements with the appropriate business objects.
Koziarski needed to create doc and PDF files, but using COM/MS Office APIs wasn't an option (it needed to run on Linux). RTF, LaTeX and PDF generation also didn't work because the designers couldn't visually "design" with them. The solution was to have designers create documents in OpenOffice, and then replace the static text in the resulting XML.
Because they ran into issues with OOo requiring X11 on Linux (headless X servers gave errors), they ended up complicating the process by throwing Amazon's S3 and SQS into the mix.
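The "designer makes the template, developer swaps in the data" idea above can be sketched in a few lines of Python. This is my own illustration, not Koziarski's code: the {{name}} placeholder convention, the fill_template function, and the sample XML fragment are all assumptions. The one important detail is escaping the inserted values so the XML stays well-formed.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical placeholder convention: {{field_name}} typed by the designer.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_template(content_xml, values):
    """Replace each {{name}} placeholder with the XML-escaped value for name.

    Raises KeyError if the template mentions a field that values lacks.
    """
    def substitute(match):
        key = match.group(1)
        return escape(str(values[key]))

    return PLACEHOLDER.sub(substitute, content_xml)


if __name__ == "__main__":
    # A made-up fragment in the style of an OpenDocument paragraph.
    template = '<text:p text:style-name="Heading">Invoice for {{customer}}</text:p>'
    print(fill_template(template, {"customer": "Smith & Sons"}))
```

In a real OpenDocument file the XML to process is the content.xml entry inside the zip archive, so a fuller version would read and rewrite that entry (for example with Python's zipfile module) before handing the result off for conversion to doc or PDF.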