Dreams of a Rarebit Fiend: December 2005 Archives

December 30, 2005

All My RSS Subscriptions

"Hi, my name is John, and I'm addicted to news." It's true. I used Netscape's old portal because they allowed you to pull RSS feeds and put them in boxes that were of the same status as blocks of information they supplied. After Netscape lost interest in that, I wrote my first RSS aggregator, HotSheet, because I wanted to pull a dozen or more feeds and get them in one "stream of news". Now it's about five years down the road from the earliest RSS stuff I can remember and RSS feeds are available for just about every website that updates regularly. Many people get them automatically with their weblogs simply because the tools they use (WordPress, Blogger, etc.) create them by default. So I monitor a large number of weblogs and other news sites via RSS and occasionally people ask me for a complete list of everything I subscribe to. So now you are going to see the depth of my addiction...

I roughly divide these up into the following categories: Comics, Java, Podcasting, Weblogs, and Misc. I'm using JetBrains Omea Reader 2.0 to read them right now but I'm not really enamoured with it. It's just better than several others I've tried. How's that for an endorsement :)

Here's the complete list of everything I'm subscribed to as well as an OPML file containing all the URLs to subscribe to the various RSS feeds:

Is it any wonder I want to create next generation applications that filter down this mess to something you can actually read? Some channels barely have any traffic at all, others have a dozen or more items per day. As an aggregate I'd guess there are more than a hundred items per day from them at the moment, and it's slow because of the holidays. If you add a feed like JavaBlogs to this mix you can figure on several hundred more just from it because it aggregates more than 1500 weblogs together itself.

Paper Bookmarks

I like these little page corner bookmarks. They are easy to make and you could do some clever custom variations with little effort.

I printed some on 110lb. cardstock and while they seem nice and very durable, I think they might be more likely to fall off the corner of a page because they are so stiff and solid. In this case a flimsier paper alternative may be a better choice. I intend to manufacture some to see.

While I'm talking about paper I'd like to plug two more paper project related things. One is the awesome pair of scissors I use. They are made by Fiskars and you can see them in the photo. There is nothing unique about small-bladed, sharp scissors intended for cutting small projects; what sets these apart is that the handles are actually large enough for my hands. I don't know who they intend most fine point scissors for, but they are positively medieval in their design for non-Lilliputian people. There is never any padding, just a tiny metal loop for you to squeeze a finger into and cut cut cut. I've used little detail scissors like that before for projects like the paper automata cupid but my beloved Fiskars would make that same experience faster and ten times less painful to do all over again.

The other thing I'd like to plug is Jaime Zollar's Paper Forest. It's a cool weblog devoted to paper automata, cardstock models, etc. with regular postings of neat projects. It might not be as complete as some of the link farms you see out there for paper projects but he makes up for it by highlighting some of the most interesting things and offering pictures of most everything he talks about.

December 28, 2005

New Version of Audacity Will Be More Podcast Friendly And Thoughts On Podcasting "Helpers"

A lot of people already use Audacity for sound recording and/or for editing of podcasts. So any new release, especially one with some new features for podcasters, is likely to be a big deal. The new version is still considered to be very early and too crash happy to be really used; it's intended more as a technology preview of new features including:

  • Various changes to make editing easier and the UI friendlier
  • FTP upload of files directly from within Audacity

Sadly though, there's still no way to set the genre to "podcast" in the ID3v2 data from within Audacity when you are exporting your MP3 files. Thus most podcasters will still end up using iTunes or some other program to set the type, etc. and that invalidates the idea of doing FTP directly from Audacity. It's just not a one stop shop.

So that brings us to podcasting tools like EasyPodcast. You pick the MP3 file, it applies tags to the MP3 as well as a logo, it creates the RSS, it uploads both the RSS and the MP3 file.

Personally, I think this is a great idea, sort of. I don't need the RSS generation; that's being handled for me by Blogger and FeedBurner. But I can see a whole host of little utilities that could all be hooked together for you to progress through to prevent errors and streamline the process. Right now I record with Audacity, then edit (also with Audacity), use iTunes to apply ID3v2 tags to the resulting MP3 file, gather the times of various events in my show, create a Blogger entry to go with the show, upload my show via FTP, and finally post my new Blogger entry to my weblog and the show is done. In the background FeedBurner notices the new weblog entry and updates the RSS feed it is providing to anyone interested in the show. Seven steps plus an automated one. I'll bet others have even more.

Software to take you through all of that could be cool. Especially if you have multiple people having to upload things or several people working on the same show. But do I want a special tool for podcasters that includes a dozen different podcasting steps and I select and order only the ones I want (not just the three that the EasyPodcast guy thought I needed) or would I be better off with a tool that was all about creating the workflow?

But then that thought leads to the Automator software that ships with Mac OS X. Because, I mean, it's all about the workflow. They just realize that a simple linear workflow is not that complicated a thing, and with steps that can be written by anybody we can have anything from simple ones (rename this MP3 file) to highly sophisticated ones (scan the file for dead air and isolated "um" and "uh" noises and strip them from the file). Then I can mix those into any workflow I can come up with. Perhaps even one that includes show prep or post show activities. Why doesn't a simple workflow tool like that exist? Doing one in Java that would work on any major OS would be fairly trivial, and tools like the Java Plugin Framework should make it pretty easy. Good question.
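
To make the "simple linear workflow" idea concrete, here is a minimal sketch in Python (not the Java version I have in mind, and not how Automator itself works). Everything in it is made up for illustration: the Workflow class, the step functions, and the dictionary keys. The steps only touch a dictionary; a real step would rename files, write ID3 tags, or talk to an FTP server.

```python
from typing import Callable, Dict, List, Tuple

# The shared state handed from step to step (hypothetical keys: episode, file, ...).
Context = Dict[str, str]
# A step is any function that takes the context and returns the updated context.
Step = Callable[[Context], Context]


class Workflow:
    """A linear workflow: named steps that run one after another."""

    def __init__(self) -> None:
        self._steps: List[Tuple[str, Step]] = []

    def add(self, name: str, step: Step) -> "Workflow":
        self._steps.append((name, step))
        return self  # returning self lets calls be chained

    def run(self, context: Context) -> Tuple[Context, List[str]]:
        completed: List[str] = []
        for name, step in self._steps:
            # Hand each step a copy so the caller's dictionary is never modified.
            context = step(dict(context))
            completed.append(name)
        return context, completed


# Illustrative steps only: they edit the context rather than real files.
def rename_episode(ctx: Context) -> Context:
    ctx["file"] = "show-" + ctx["episode"] + ".mp3"
    return ctx


def tag_as_podcast(ctx: Context) -> Context:
    ctx["genre"] = "Podcast"
    return ctx


def draft_post(ctx: Context) -> Context:
    ctx["post"] = "Episode " + ctx["episode"] + ": " + ctx["file"]
    return ctx


def build_example() -> Workflow:
    # Pick and order only the steps you want.
    return (
        Workflow()
        .add("rename", rename_episode)
        .add("tag", tag_as_podcast)
        .add("draft post", draft_post)
    )
```

Calling build_example().run({"episode": "12", "file": "raw.mp3"}) hands back the final context plus the list of steps that ran. Because a step is just a function, anybody could write one, and a plugin system would only need a way to discover them.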

December 22, 2005

Celestron SkyScout Answers "What Star Is That?"

It will likely be insanely expensive (it incorporates optics, a small computer, a GPS, and additional sensors) and it might not even work, but the SkyScout is an insanely great idea for a product.

With it you can just look at a star or other sky object and it will tell you about it using both text and audio, or you can reverse the process, picking the object you are interested in from a built-in list of several thousand, and the SkyScout will guide you to it. USB connectivity and the ability to take memory cards (SD format) allow it to be updated and expanded.

December 21, 2005

Nostalgia Looks Better On Paper Sometimes

Since the Atari 2600 represented not only the first real game console that was a mega-seller but also thousands of hours of gameplay for me personally, I jumped at the chance to buy an Atari Flashback 2 recently (under $30 retail). It has two original looking joysticks; the feel and weight don't seem quite the same but they look right. The new console is much smaller and replaces the old metal toggles with plastic buttons, but it even incorporates the small strip of faux wood that was on the original! In general it resembles a 2600 which went through a shrinking machine.

It comes with 40 games built in (20 old and 20 new) and even though the various paddle games were some of the best, there are no paddles or paddle games. Why we needed 20 new games rather than 20 more classics to try and improve the overall selection is my first problem with the console. More Activision titles and some more 2600 classics would have boosted the quality. As it is though, you can play Combat, Missile Command, Pitfall, Asteroids, Adventure and many others. They don't include a full manual in the box, just a few pages to tell you how to plug it in and get started, but Atari has a full manual in PDF form on their web page that describes each game individually.

Now, before I say what I'm about to say, I want you to know that there are lots of old games which are just as much fun today as they were 20+ years ago. Galaga, Ms. Pac-man, Donkey Kong, etc. are still great games so just age alone doesn't make a game bad. But most of the games on the 2600 are just bad. The console had such profound limitations in the size of game programs that gameplay really suffered. It's not the crude sound capability or the even cruder graphics which hold it back, it's the games themselves which are weak.

I played Sonic The Hedgehog recently and it was just as much fun today as when it came out. Most of the 2600 games I played were not. I guess we were far more desperate for entertainment then and we saw more of the potential in the medium than we saw what was actually produced.

Sometimes something is just much better in your memory than it really was... My Flashback 2 went back to the store and I got my money back. I guess I'll have to wait for the Super Nintendo in a box instead.

December 20, 2005

Data Processing On A Huge Scale: Google's Story

Years ago, I naively thought that Google somehow had amazing machines and software that managed to do most everything in real-time even though the huge amounts of data they process pretty much preclude doing any such thing if I had bothered to think about it rationally. I imagined that they were processing each site they crawled as soon as they found it and into the search engine it went. Each news item from RSS was similarly fed straight into an index and made available immediately and no batch processing of reams of data was done.

Fortunately, such magical thinking has not persisted. Google does not use elves in a hollow tree to produce their results; they use intelligent engineers and many of the same tools available to you and me. They have developed all kinds of innovative solutions in order to deal with the huge amounts of data they have. Those solutions include:

  • Building a truly enormous array of commodity PCs on which they run Linux to handle the computing needs for all of Google. When individual computers fail, their software simply shifts the workload to other functional machines. Supposedly, they buy large quantities of parts in bulk and make their purchases in a variety of ways to avoid being gouged by vendors.

  • Creating a distributed filesystem that keeps copies of every file on hard drives on three separate machines in order to reduce the chance of a failure causing loss of data.

  • Building software that makes it easy to handle machine failures, distribute computing tasks across a large number of CPUs, etc.

The best thing about all of this is that they haven't been particularly quiet about how they do a lot of it. For example, if you go to their Research Publications site you'll see papers about The Google File System and Web Search for a Planet: The Google Cluster Architecture.

Now, I'm not going to snow you on this, if you aren't of a technical bent, this stuff is going to be a hard boring slog. Michael Chabon it's not. But, if data analysis of truly ginormous data sets interests you, then you want to read their paper on MapReduce: Simplified Data Processing on Large Clusters [PDF].

It's all about how they split up many data analysis jobs in such a way that it is easy to write the algorithm to process the data and not spend time worrying about hardware failures, how many machines you might be allocated to run your software, or how to optimally use those machines to get the data processed in the least amount of time. Instead, it forms a kind of support system that reminded me of using the genetic programming package JGAP. I'll talk in a future entry about how JGAP can make it easy to find optimal or near optimal solutions for problems that would be tedious or impossible for humans. But the important thing it did was to make it easy for me to focus on the specifics of my problem and not on the mechanics of a framework. MapReduce is one of Google's means to achieve that same kind of focus and I think it makes for a really interesting read.
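
If you want a feel for the programming model before tackling the paper, here is a toy sketch in Python. It runs in a single process, so it shows none of the distribution or failure handling that makes the real thing interesting, and the function names (map_reduce, count_words_mapper, sum_reducer) are my own inventions rather than Google's or Nutch's API. The point is only the division of labor: you write a mapper and a reducer, and the framework does the grouping in between.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_reduce(
    inputs: Iterable[str],
    mapper: Callable[[str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Toy, single-process stand-in for the framework part of the model."""
    # Map phase: every input record may emit any number of (key, value) pairs.
    grouped: Dict[str, List[int]] = defaultdict(list)
    for record in inputs:
        for key, value in mapper(record):
            # Grouping values by key is the "shuffle" a real system does across machines.
            grouped[key].append(value)
    # Reduce phase: each key's values are combined independently of all other keys.
    return {key: reducer(key, values) for key, values in grouped.items()}


def count_words_mapper(line: str) -> Iterator[Tuple[str, int]]:
    # Emit (word, 1) for every word on the line.
    for word in line.lower().split():
        yield word, 1


def sum_reducer(word: str, counts: List[int]) -> int:
    # Add up all the 1s emitted for this word.
    return sum(counts)
```

Running map_reduce(["the cat", "the dog"], count_words_mapper, sum_reducer) gives a word count for the two lines. Since neither the mapper nor the reducer knows anything about where it runs, a real framework is free to spread the work over as many machines as it likes and rerun pieces that fail.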

The Java Nutch project includes a Java version of MapReduce and a distributed file system that you could use as part of your own huge data set processing so reading these articles isn't just an academic exercise. You can actually put this to use if you have a project that needs it. Be sure to check out the wiki for the Nutch project for more helpful information.

Jive Messenger Becomes Wildfire Server And Gets A Speed Boost

I recently mentioned that we installed Jive Messenger at work to get a good instant messaging server that we could control and which didn't result in important conversations leaving the building to talk to distant servers. In the time since I wrote that, Jive Messenger has been renamed to Wildfire Server and it has had a dramatic speed improvement. Jive Software: Wildfire Optimization is an article briefly detailing the optimization Jive Software did for the new version of the server and might be instructive if you haven't done optimization on a Java project before.

It's Not Highbrow Humor But Google Video Can Be Seriously Funny

The Internet has seen a large set of text, image, and video items which circulate around through email and forums for years and years. Some of it is urban myth, obscene, funny, strange, amazing, and everything in between. But videos typically don't get passed around as much in email just due to their sheer size. Google Video has become a catch all for these videos so you can just point to them and everybody can enjoy. Here's a few of my favorites.

December 15, 2005

Macintosh Folklore

The original Macintosh was a righteously cool machine that was massively different than any IBM PC or Apple II of the time. Folklore.org: Macintosh Stories is a really cool website that is working to preserve stories about the creation of the machine and the software on it written by the people who actually worked on it.

December 14, 2005

PDF Christmas Cards You Can Print

I don't like all of them but there's at least a handful of really cool Christmas cards at this site: Happy Holidays!

They are giving away the PDF files you need to print the cards and they are adding more over time.

December 9, 2005

Intel's Sour Grapes Over $100 Laptop

Nicholas Negroponte wants to get laptop computers to kids throughout the world who do not have computer access today. His means to do so is a $100 laptop which the MIT Media Lab has designed and which various manufacturers and other contributors have lined up to help shepherd into existence.

But not everybody is happy about this. Take for example Intel: Wired News: Intel: Poor Want 'Real' Computers

The gist of Intel's remarks is basically that a laptop with a megapixel screen, 500MHz processor, Wi-Fi networking, and 1GB of memory isn't a "real" computer. After all, processors in desktops these days start around 2GHz (4x), they run Windows rather than Linux, and they have hard drives. Best of all, about half of them have Intel Inside! (insert bing bing bing bum notes).

This is about the saddest form of sour grapes I've ever seen. Both Apple and Microsoft approached the group about running OS X or Windows rather than Linux because if this succeeds it will obviously be a big boon for Linux. In the course of just a few years, Linux's share of the world OS market could increase dramatically and it could be the second nail in the coffins of these two OS powerhouses (the first nail being the success that Linux has already had in the server environment). But Apple and Microsoft haven't chosen to make disparaging comments about the project.

But Intel is feeling threatened because not only are their processors not in the laptops, but their major competitor AMD has also signed on as a wholehearted backer of the project and has dumped a couple of million dollars in as well: The $100 Laptop Moves Closer to Reality. Likewise Dell has chimed in with their negative comments, "It's important that a computer prepare students for the applications they'll be using after they get out of school."

What computers are those going to be in developing countries like Thailand, Brazil, Sri Lanka, etc.? The latest Alienware PC, a high end Dell desktop, or what? Are the kids going to turn up their noses at a laptop because it won't play Quake 3 or because the unit doesn't include a hard drive to hold their MP3 collection?

In a word, this is malarkey. Any box that meets these specs will be ten times what my Palm handheld is, and that is a useful device already. This will have a full sized screen and a real operating system that will handle browsing the web where one connection can be gotten and shared among many students, as well as editing a paper, sending an email, playing a game, and many other uses. With the USB connections built into the box you can hook it to external storage so the kids can save their work. This is a real computer and real people will be better off when it is available.

December 6, 2005

NBC Selling Shows Through iTunes

Following ABC's lead, NBC has decided to make available a variety of shows including part of season one and season 16 of Law and Order, Battlestar Galactica, old episodes of Dragnet, etc. It's good that they are starting to realize that they need to be more flexible than their current, "You watch the TV and the commercials and you get it all free, and if you have a TiVo we'll try to screw with the show times to mess it up," way of doing things.

The problems are ones of DRM and pricing. Apple unwisely seems to have settled with the networks on a pricing of $2 for an episode even though most people won't watch it more than once, it doesn't come down in a quality suitable for viewing on a TV, and even if it did come down with enough resolution they don't allow you to burn it to a DVD. So it's just like buying a song from iTunes, only it's completely not. Because you should be able to do so much less with video, right?!?

Oh, and when the $2/episode thing doesn't make them enough money to suit them, as with the Battlestar Galactica mini-series where it would cost you $4 to get started watching the series... OH MY, that's not priced nearly high enough. It'll have to be a package deal and we'll price it at a $16 price we pulled out of thin air. Seem fair? OK, so we're pricing via perceived value, I see. Then the 20 minute episodes of the comedy The Office are just a buck then. Right. One thin $1? Ha hA! No.

I guess I'll continue to get my TV via the antenna where the opportunity to re-negotiate the exchange rate for watching exists only in the form of a gradual slide of commercials from 15 minutes/hour 30 years ago to 20 minutes per hour today. Or I'll check out the DVDs from my local library where they get enough use from the discs that I guess they don't mind paying the ridiculous disparities in DVD prices (Band of Brothers - on sale at only $6/episode, Lost Season 1 - currently about $1.60/episode, cheaper than downloading and you can watch them on TV too) because a lot of people get to use them.

December 1, 2005

Creating Reusable Components Requires Extensive Experience

I've rewritten this entry because I was told by multiple people not the least of whom was my wife (a person who can actually write sentences that make sense) that an instant message chatlog between myself and Don Thorp didn't make for a readable weblog entry.

It all started when Don pointed me to this entry about what should and should not go into Ruby on Rails: http://www.loudthinking.com/arc/000407.html. The way he put it, it gave him heartburn.

I read it and agree. The position of the author is that higher level constructs have no place in Rails and that in general it's futile to try and construct higher level software components for websites. I consider that to be a big mistake. What they are missing is that the underlying model of, for example, ZWNews and MovableType and WordPress is fundamentally the same. Most blogging software, forums, comment software, link counting, polls, etc. can be reduced to a basic subset of features which are in just about any of the software that various sites are using. If there's that much commonality then you can produce that subset of functionality, offer a few simple hooks into it so it can be extended in a couple of places, and it'll probably serve the needs of 80% of the people who are building sites.

I would argue that they make this mistake because:

  • They don't spend their time building one weblog after another or five or six forums in succession so they don't see the common elements which underlie all of them. This is similar to the fallacy of so many game developers who start off by building a "reusable" sprite library or 3D engine before they begin their first game. They either fail completely or they succeed at building something but pronounce reuse impossible when they cannot then reuse it themselves later for another project or persuade anyone else to reuse it. It's a poor fit because they didn't grow the API based upon the needs of multiple projects; they instead attempted to divine the interface and capabilities based upon what they thought it would need.

  • They imagine both a model and a UI which goes with it. That never works. You can have a UI which is a starting point or an example but the focus of the code has to be on the model and the administration for that model.

Kasai is an excellent example of this. Its design is a good one for an authentication and authorization system which will serve the needs of 80% of all web applications. I know what I've needed in the past and I can look at it and assess whether this would have met the needs of a large number of sites I've worked on, and the answer is yes. In fact it really could stand to be reduced or simplified in a few areas and it would still have served my needs.

I think what most people fail to see is what is really a reusable component in real life. In the world of IC components a digital micromirror is a reusable component. It's not a TV all by itself. It has no tuner, memory, light source, etc. etc., yet it's a reusable piece. It's going into all manner of TVs today, all different in subtle or large ways, and for all I know somebody is building a huge array of them for a high resolution video wall.

So when I switch to the world of software, specifically web applications, I should be able to identify reusable pieces that occur over and over again with variations. In the world of websites you frequently have comment systems that have the following characteristics. There are a large number of unique conversations. The conversations are not linked to each other. Each one is a straight linear series of comments. Each comment needs to be attributed to an individual which could easily be referenced via a unique ID. Each comment may have some numeric rating associated with it or a pointer to another set of properties. That is a reusable component. I can build it and you could drop it into the user ratings at the bottom of Amazon products or the file comments at Stock.Xchng or Fark or half a dozen other places and you wouldn't notice that it had changed. Even if the first generation of the component isn't a great fit and needs work (like Kasai) the second or third will be because it was adapted to the real needs of a lot of users.
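
To show how small that model really is, here is a rough sketch of it in Python. It is a model only, with no UI and no persistence, and every name in it (Comment, CommentStore, properties_id, the conversation and user IDs) is hypothetical rather than taken from any existing package. It covers just the characteristics listed above: many unrelated conversations, each a linear series of comments, each comment tied to an author ID with an optional rating or pointer to other properties.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Comment:
    author_id: str                       # unique ID of the person who wrote it
    text: str
    rating: Optional[int] = None         # optional numeric rating
    properties_id: Optional[str] = None  # optional pointer to another set of properties


class CommentStore:
    """Holds many independent conversations, each a straight linear series of comments."""

    def __init__(self) -> None:
        self._conversations: Dict[str, List[Comment]] = {}

    def add(
        self,
        conversation_id: str,
        author_id: str,
        text: str,
        rating: Optional[int] = None,
        properties_id: Optional[str] = None,
    ) -> Comment:
        comment = Comment(author_id, text, rating, properties_id)
        # Conversations are created on first use and never reference each other.
        self._conversations.setdefault(conversation_id, []).append(comment)
        return comment

    def thread(self, conversation_id: str) -> List[Comment]:
        # Return a copy, in posting order, so callers cannot reorder the stored list.
        return list(self._conversations.get(conversation_id, []))

    def average_rating(self, conversation_id: str) -> Optional[float]:
        ratings = [
            c.rating
            for c in self._conversations.get(conversation_id, [])
            if c.rating is not None
        ]
        return sum(ratings) / len(ratings) if ratings else None
```

The conversation ID is just an opaque string, so the same store could sit behind product reviews ("product-1"), photo comments ("photo-9"), or forum threads without knowing which one it is serving. The hooks for extension would go around this, not inside it.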

It's good that there is a major emphasis on making the infrastructure solid in Rails, but saying that there shouldn't be a set of libraries that go with it to provide some reusable components for counting links clicked on, comments, authentication/authorization, etc., etc. is like saying that Java would have been better off being just like C++: a language without a huge supplementary library. Java's library, a well designed one which provides pieces we use every day for things like collections, XML parsing, regex, etc., determined well what some "high-level" components were which could be widely reused. Rails doing the same would only strengthen the framework, not weaken it.