I've been playing with podcasting and hacking my home-brew (PHP) RSS aggregator to support it. After 3 weeks or so, this is what I think I've learned. (If you don't know what podcasting is, go here)

- Most podcasts right now are like amateur talk radio. Well, my life involves words. If I'm not programming or reading web pages, I'm reading a book or a newspaper. So I have very little time in a typical day when I'm doing an activity where I can allow another source of words to distract me. It's just not possible to read text and listen to people talking at the same time. So I just don't listen to talk radio, which means that podcasting as it currently stands has very little appeal to me. However, using the enclosure tag in RSS feeds has a lot of possibilities for distributing sound, video and graphic images. It may not be central to podcasting as it's currently being implemented, but the technology is the same. A good example of this would be the Chillosophy group distributing chill out mixes they had developed. Or a B3ta group distributing Flash mashes they'd done.

- My first experiment was just to build a handler for enclosures that grabbed anything that appeared in a feed and copied it to the local disk. Now my aggregator frequently took hours to run instead of seconds. So the next trick was to kick off a background job to do the collection. This is trivial in Linux but a pain in the neck under XP. It involved using AT and SOON to create batch jobs that run bat files in the background a minute later. Yuk!

- After running this for a couple of weeks I had 2GB of MP3 files, none of which I had listened to. So clearly it's really important for the aggregator to manage this and expire the enclosures. I also added a switch for each feed to choose whether the enclosure should be fetched or just linked to in the display. One of the fun bits of glue and string I've put together is to grab the enclosures from a Flickr feed of Everyone's pictures on a 3 day expiry. So I've got a directory containing a rolling 250 or so JPGs of random stuff people have posted to Flickr. I've then pointed the "My Pictures Slideshow" screen saver at this directory so I have a screensaver that shows random photos.

- Several podcasters are being hit by significant bandwidth needs from all the enclosure collection going on. As I've written before, there's a natural synergy between aggregators running in the background, podcasting and BitTorrent. This brings up some issues.

- Firstly, take one look at Azureus and you'll realize that writing a BT client is non-trivial. It makes far more sense to offload collection of a Torrent to a dedicated BT client than to write a bad client built into the aggregator. Unfortunately Azureus is not completely transparent about this and tends to pop up a UI window when the request is made. It doesn't necessarily save the result where you'd expect. And because of the indirection, it's that much more difficult to get the aggregator to expire the resulting files.

- BitTorrent is still a little awkward to use for the publisher. It takes just a bit too much effort to create the torrent, post its existence on a tracker and then do the initial seeding. So at the moment relatively few podcasts use BT. This is a shame given that it's such a good technology. We need to get to the point where publishing an MP3 via BT is a one-click operation.

- Like a lot of P2P tech, BT doesn't always just work when the client is behind a NAT firewall. So again there is an issue working against everyone using it.

- At this point, we probably need feeds to provide the same enclosure in plain JPG, MP3 or whatever and also in BT, with the BT one taking precedence if the particular client can handle it. But RSS2 doesn't have a good mechanism for saying "Use this BT enclosure if you can or that MP3 if not, or stream it in high quality here and low quality there. They're all the same." Meanwhile Atom and RSS 1 are still arguing about how to code enclosures at all, at all. Given the current state of RSS and Atom standards control, I'm not sure how we get through this one.

- Going back to my approach of adding a link to the enclosure in the aggregator: at this point, why not simply stream the file rather than downloading it? Well, podcasting originally involved syncing the file onto an iPod so that it could be listened to at times and places where streaming was impossible. This is still valid, but downloading where streaming is possible appears to smack of scarcity thinking. Except that streaming puts a nasty load on publishers that get popular. Scaling streaming is hard and expensive. At this point I remember Swarmcast and all the work being done in that area at the height of the P2P boom. Where are you now? It's also worth delving into what Skype are doing with automated supernodes.

- I've been looking at turning group Skype Conference calls into podcasts as a way of doing low tech, low effort podcasts from a group of experts. I think I've found a way of doing it, but it's actually quite hard without being professionally involved in sound production. You can just about do it with audio grabbing software and the plain old Windows mixer, but it's tricky to get it all right. If anyone has some tips I'd like to hear them.

- Anyone can write text. Most people can take a half decent photograph. Few people can produce quality audio. Almost no-one can produce quality video. It's great that podcasting is democratising distribution of high bandwidth media, but it's a few orders of magnitude harder than blogging to actually create the content.
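
For anyone wanting to try the enclosure handler from the second point above, here is a rough sketch of the first step: pulling the enclosure URLs out of an RSS 2.0 feed. It's in Python rather than the PHP my aggregator uses, and the sample feed, the example.com URL and the function name `find_enclosures` are all made up for illustration. Actually fetching the files is left out, since that's the slow part you'd hand to a background job.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, not taken from any real podcast.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Show 1</title>
      <enclosure url="http://example.com/show1.mp3"
                 length="1234567" type="audio/mpeg"/>
    </item>
    <item>
      <title>Text-only post</title>
    </item>
  </channel>
</rss>"""


def find_enclosures(feed_xml):
    """Return a (url, mime_type, length) tuple for every enclosure in the feed."""
    root = ET.fromstring(feed_xml)
    found = []
    for item in root.iter("item"):
        for enc in item.findall("enclosure"):
            url = enc.get("url")
            if url:  # skip malformed enclosures with no url attribute
                found.append((url, enc.get("type", ""), enc.get("length", "")))
    return found


if __name__ == "__main__":
    for url, mime, length in find_enclosures(SAMPLE_FEED):
        print(url, mime, length)
```

The length is left as a string because feeds in the wild don't always put a sensible number in there.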
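
The expiry idea from the third point (the 3 day rolling Flickr directory) can be sketched along these lines. This is a minimal version that assumes age is judged by file modification time, which is a choice made here for simplicity and not necessarily how my aggregator does it. The function name `expire_enclosures` and its parameters are hypothetical.

```python
import os
import time


def expire_enclosures(directory, max_age_days, now=None):
    """Delete regular files older than max_age_days; return the names removed."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds in a day
    removed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Only touch plain files, and only those last modified before the cutoff.
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```

Run after each collection pass with something like `expire_enclosures("flickr", 3)`, this keeps the directory down to the last three days' worth of files. The directory name there is just an example.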
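
On the "use this BT enclosure if you can or that MP3 if not" problem, the client side of the choice is the easy bit. The sketch below assumes the aggregator has somehow already been told that a set of enclosures are alternatives for the same content, which is exactly the part RSS2 doesn't give us. The function name `pick_enclosure` is invented for illustration, and `application/x-bittorrent` is the MIME type commonly used for .torrent files.

```python
TORRENT_TYPE = "application/x-bittorrent"


def pick_enclosure(alternatives, can_handle_torrents):
    """Pick one (url, mime_type) pair from alternatives carrying the same content.

    The torrent wins if this client can handle it; otherwise the first
    plain (non-torrent) enclosure is used. Returns None if nothing is usable.
    """
    if can_handle_torrents:
        for url, mime in alternatives:
            if mime == TORRENT_TYPE:
                return (url, mime)
    for url, mime in alternatives:
        if mime != TORRENT_TYPE:
            return (url, mime)
    return None


if __name__ == "__main__":
    # Hypothetical alternatives for a single show.
    alts = [
        ("http://example.com/show1.mp3", "audio/mpeg"),
        ("http://example.com/show1.mp3.torrent", TORRENT_TYPE),
    ]
    print(pick_enclosure(alts, can_handle_torrents=True))
    print(pick_enclosure(alts, can_handle_torrents=False))
```

The high quality versus low quality streaming case would need more than a MIME type to decide on, so it isn't covered here.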

So at the end of all that, Podcasting looks like being a driver for all sorts of technologies and there's still plenty of work to do. And while we're looking at this distribution method, it's worthwhile looking at other P2P distribution of dense media such as Mercora or 3Degrees. I think they're all related. [from: JB Ecademy]


[ 21-Nov-04 12:45pm ]