The Blog




Results from the Defcon Wireless Shootout

35 miles from a mountain down the Great Basin Highway. "On Friday, August 1, the antenna was built completely from scratch in the desert, on the side of the mountain, in the rain. The large horn was comprised of metal pipes and window screen wire mesh. It had a transmitting element made from cardboard, duct tape and aluminum foil -- and both components worked spectacularly. A bath towel was used to provide shade over the laptop screen, which was otherwise unreadable in the glare of the desert, even with the clouds overhead."

And 5 miles with a pair of cantennas. [from: JB Wifi]

The Register : Some 20 respondents said a 'Wi-Fi hot-spot' is something that has been left out in the sun too long and gone a bit rank.

But at least 30% got it right.

Methinks the hotspot operators could do with a little marketing in the name of basic education. If customers don't know what you're selling, or even what the product is, then they won't buy it. [from: JB Wifi]




Google News Alerts breaks cover from the labs.

But come on Google, where's the XML or RSS version of Google News and News Search? We know you're thinking about it. [from: JB Ecademy]




Richard Allan's Weblog is from another UK MP. Who's next? [from: JB Ecademy]

Now this I like. Stupid Security: Exposing Fake Security Since 2003. Does exactly what it says on the tin. I hate security, it just gets in the way. And there's always a way round it. [from: JB Ecademy]

One of those re-installing Windows horror stories. dive into mark tells it with humour. [from: JB Ecademy]




The XML spec says that you must reject a file that is bad XML. This is good for XML because it encourages people to produce well-formed XML. But our goal is the data, not the spec. We're also likely to be collecting embedded RDF in web pages among other things, where the overall file may not be XML at all. So there's a need here for an "Ultra-Liberal RDF parser" like the one Mark Pilgrim has produced for RSS (see http://diveintomark.org/projects/rss_parser/). Most current RDF parsers are built on top of XML parsers, which will reject bad XML, so you'll never see the data.
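As a rough illustration of the liberal approach, here is a minimal sketch that pulls embedded RDF out of a page without ever handing the whole page to an XML parser. The function name is made up for this example, and it assumes the block uses the conventional rdf: prefix; a real ultra-liberal parser would need to cope with far more than this.

```python
import re

# Matches an <rdf:RDF ...> ... </rdf:RDF> block anywhere in the text.
# Assumption: the author used the conventional "rdf:" namespace prefix.
RDF_BLOCK = re.compile(r"<rdf:RDF\b.*?</rdf:RDF\s*>", re.DOTALL)


def extract_rdf_blocks(text):
    """Return every embedded rdf:RDF block, whether or not the page is XML."""
    return RDF_BLOCK.findall(text)
```

Each block returned can then be passed to whatever RDF parser you use, so one broken tag elsewhere in the page no longer costs you the data.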

If you find bad XML or bad RDF then tell the author. Don't just ignore it or dump it. They will probably be very thankful. And if you're generating RDF then check it with the W3C validator http://www.w3.org/RDF/Validator/

We've learnt a lot from RSS about how to play nicely if you're writing aggregators, spiders and scutters. All this is relevant to FOAF along with some other issues.

Bandwidth
The first thing to be aware of is bandwidth, both yours and that of the sites you're collecting FOAF from. The simplest tweak is to support gzip encoding. FOAF, RDF and XML files are essentially text, so they compress well. And most web servers out there support gzip encoding if the client tells them that it understands gzip. This should be completely transparent to your code, but make sure your HTTP collection toolkit supports gzip and uses it.
See Mark Pilgrim's page about this.
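If your toolkit doesn't do this for you, a minimal sketch with Python's standard library might look like the following. The function names are invented for this example; the idea is just to send Accept-Encoding: gzip and decompress only when the server says it compressed the body.

```python
import gzip
import urllib.request


def build_request(url):
    """Create a request that tells the server we accept gzip-compressed bodies."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(raw, content_encoding):
    """Decompress the body only if the server marked it as gzip-encoded."""
    if content_encoding and "gzip" in content_encoding.lower():
        return gzip.decompress(raw)
    return raw


def fetch(url):
    """Fetch url, transparently handling a gzip-encoded response."""
    with urllib.request.urlopen(build_request(url)) as resp:
        return decode_body(resp.read(), resp.headers.get("Content-Encoding"))
```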

There are two standard ways for a server to let a client know that data hasn't changed: ETag and Last-Modified. Your scutter should store these against the feed data and send them back in the If-None-Match and If-Modified-Since headers when requesting data. If the server is properly set up it should use these to return a 304 Not Modified response rather than the full file. See this tutorial on using them.
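A minimal sketch of the bookkeeping, with hypothetical function names and plain dicts standing in for whatever your HTTP toolkit uses for headers:

```python
def conditional_headers(etag=None, last_modified=None):
    """Build request headers from the validators stored on the last visit."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def remember_validators(response_headers):
    """Pull the validators out of a response so they can be stored with the feed."""
    return response_headers.get("ETag"), response_headers.get("Last-Modified")
```

Store the pair returned by remember_validators alongside the cached file and pass it to conditional_headers on the next request. One thing to watch for: Python's urllib.request reports a 304 by raising HTTPError with code 304 rather than returning a response, so catch that and treat it as "unchanged".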

Even if you don't get these headers, it's useful to know if a feed has changed. One way is to generate an MD5 hash and store it. Then compare it next time you visit.
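A minimal sketch of the hash check, using hashlib from the standard library (the function names are just for illustration):

```python
import hashlib


def fingerprint(body):
    """Return the MD5 hex digest of the raw bytes of a fetched file."""
    return hashlib.md5(body).hexdigest()


def has_changed(body, stored_fingerprint):
    """True if the body differs from the one whose fingerprint was stored."""
    return fingerprint(body) != stored_fingerprint
```

This doesn't save any bandwidth, since you still have to download the file to hash it, but it does let you skip re-parsing and tells you when to lengthen the refresh period.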

Use HTTP Codes
There is a set of well-specified HTTP codes that servers generate. Use them. Particularly important are 301 Moved Permanently and 404 Not Found. If the server is telling you that the data has permanently moved, use that fact and update the URL you store. And if the file is no longer there, don't go on asking for it repeatedly.
Mark Pilgrim has a set of test files for all the different codes.
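One way to keep this straight is a small table that turns a status code into what the scutter should do next. This is only a sketch: the action names are invented, and treating 410 Gone the same as 404 is my addition rather than anything from the test files above.

```python
def action_for_status(status):
    """Map an HTTP status code to what a polite scutter should do next."""
    if status == 200:
        return "store"         # new data: parse it and save the validators
    if status == 304:
        return "unchanged"     # keep the cached copy
    if status == 301:
        return "update-url"    # permanent move: remember the new location
    if status in (404, 410):
        return "drop"          # gone: stop asking for it
    return "retry-later"       # anything else: back off and try again
```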

Play Nicely
You're writing a robot. So pay attention to robots.txt
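Python ships a robots.txt parser in urllib.robotparser, so there's little excuse. A minimal sketch, where the helper names and the "ExampleScutter" user agent are placeholders:

```python
from urllib.robotparser import RobotFileParser


def load_rules(robots_txt):
    """Parse the text of a robots.txt file that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser, url, user_agent="ExampleScutter"):
    """True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)
```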

Some sites, like Ecademy, serve large numbers of FOAF files. It's not fair on the server to hammer requests at them as fast as you can. This is effectively a denial of service attack, and webmasters who notice it may well ban you permanently. It's good practice to wait a second or two between successive requests to the same host. Note that some hosts use sub-domains like mine.domain.com and your.domain.com, so wait at least a second between calls to anything under domain.com.
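A minimal sketch of a per-site delay follows. The class and function names are invented, and grouping sub-domains by the last two labels of the hostname is a naive assumption: it works for domain.com but gets suffixes like .co.uk wrong.

```python
import time
from urllib.parse import urlsplit


def site_key(url):
    """Group sub-domains together by keeping the last two labels of the host.

    Naive assumption: this is wrong for suffixes such as .co.uk.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


class Politeness:
    """Enforce a minimum gap between requests to the same site."""

    def __init__(self, min_delay=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock
        self._sleep = sleep
        self._last_request = {}  # site key -> time of the last request

    def wait(self, url):
        """Sleep if the previous request to this site was too recent."""
        key = site_key(url)
        last = self._last_request.get(key)
        if last is not None:
            remaining = self.min_delay - (self._clock() - last)
            if remaining > 0:
                self._sleep(remaining)
        self._last_request[key] = self._clock()
```

Call wait(url) just before each fetch. The clock and sleep arguments are only there so the class can be tested without really sleeping.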

Refresh times
RSS can change quite quickly whereas FOAF probably won't. So a refresh strategy is less important. One good approach is to start with a refresh of say 6 hours. If the data hasn't changed (see etag, last-modified) then double the refresh to 12 hours up to a maximum of a few days. If the data does change go back to your minimum refresh period.
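The doubling rule is only a few lines. In this sketch the 72-hour cap is my reading of "a few days", not a recommended value:

```python
MIN_REFRESH_HOURS = 6
MAX_REFRESH_HOURS = 72  # assumption: "a few days" taken as three


def next_refresh(current_hours, changed):
    """Return the next refresh interval in hours for one FOAF file."""
    if changed:
        # The data moved, so go back to checking at the minimum interval.
        return MIN_REFRESH_HOURS
    # Nothing changed: double the wait, up to the cap.
    return min(current_hours * 2, MAX_REFRESH_HOURS)
```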

There's an RSS problem where lots of client aggregators all collect feeds at :00:00 after each hour. If their clocks are reasonably well synced, popular sites will get hit by large numbers of requests at :00:00. So when you set up your cron job, pick a random(ish) time in the hour to start.
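A minimal sketch of picking that time once, at install time, and writing it into the crontab entry. The helper names and the "scutter" command are placeholders:

```python
import random


def pick_start_minute(rng=random):
    """Pick a minute of the hour (0 to 59) for this installation to run at."""
    return rng.randrange(60)


def cron_line(minute, command):
    """Format an hourly crontab entry that starts at the given minute."""
    return f"{minute} * * * * {command}"
```

Pick the minute once and keep it, rather than re-rolling on every run, so your own schedule stays regular while still differing from everyone else's.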

Caching the data and default pages
I think a lot of FOAF applications are going to collect the data on the fly. This means that every hit on your application is going to generate a hit on the data source. Now given that you probably have lots of alternate pages to look at, I don't think this is going to be that much of a problem. But think carefully about the bandwidth implications for the source. If you get slashdotted and you've coded well to withstand it, you don't want to slashdot the source in turn. Probably the most obvious danger is having a default page with a link to a default source.

The alternative to this is to cache the data locally. On the surface this looks like a good approach for several reasons. But it leads you into a set of issues to do with how current the data is. And these issues are still being worked out. Take an example. Say you collect data that says that A knows B. A week later A falls out with B and they both remove the foaf:knows statement. Your data is now wrong.

Mirrored from the Ecademy FOAF club




TypePad News: TypePad Features and Pricing. Note the auto-FOAF support. [from: JB Ecademy]

Here's one to worry about. What legal responsibilities do hotspot owners have for activity conducted by their customers?

This is prompted by the discovery that Wayport run an open email server for their customers and T-Mobile intercept email destined for other servers and route it via their own email server (I think). Both of these are reactions to a real need for their hotspot customers to send email. But both are effectively open relays with no authentication. How much information and logging do they really have in place to positively identify the customer?

The real answer here is for the mainstream ISPs to support SSL and SMTP-AUTH so that their paid-up customers have access to email services no matter how or where they connect to the internet: at home, at work or from a hotspot. Until then hotspot providers need to provide some solution so that hotspot users can send email. But just giving blanket access to anyone using the hotspot is not the answer. [from: JB Wifi]

How we all laughed at the Americans and their litigious society when the RIAA started sending out subpoenas to their customers and asking for, and getting, private information from the ISPs. And at their politicians proposing that if you share music files you should go to jail and your computer should be fair game for the RIAA to hack and destroy. Thankful of course that "it could never happen here".

Well folks, the legal underpinnings for exactly the same crass behaviour designed to protect dinosaurs from extinction will be in place in the EU and UK in October 2003. The BPI are already rubbing their hands at the possibility of being able to go after their customers using legal means.

When music ("with a repetitive beat" c 1984, M Thatcher) is made illegal, the goal of total criminalization will be complete.




A website idea. Start with a map and location search. Add the ability for people to upload a photo from their cameraphone with the street and number. Have a Wiki like history showing the current photo and the archive of photos of that location. Allow people to annotate the photo with text about that location. So I could then go to the website find 56 Camden High Street and then look at photos people had taken of that shop and the other shops near it in the street.

Could we use this to harness the power of the masses to build a photographic record of the city?

This is one of those "ought to exist" ideas that I don't have the time (or maybe the ability) to make happen. Feel free to steal the idea and make it your own. I've picked London purely because I live near there. It could easily be extended to other cities and towns. [from: JB Ecademy]

WRBB Wireless Rural BroadBand - Sunshine Wideband Internet is a WISP starting up in Rutland, radiating from Stamford and using 802.11g. [from: JB Wifi]




If a standard remains write-only and it's too hard to use, it will remain academic wanking and never reach critical mass. To my eyes, this is currently true of RDF and almost every individual RDF namespace with the possible exception of DC (Dublin Core). For application writers (as opposed to tool makers) each namespace is a standard in its own right that also has to reach critical mass.

If all you want to do is encode data that you and only you are going to consume later, then RDF is extraordinarily powerful. Mozilla is a good example of this. It uses RDF extensively internally to manage defaults. But if you want to try and create something for really widespread adoption, RDF is so flexible that it encourages fragmentation. And so I don't believe you will ever get that widespread adoption. Like perl, there are always multiple ways of saying the same thing so people use multiple ways of saying the same thing. And so the people building apps to consume that data have an impossible task.

RSS is the perfect counter-example. You can view source and derive enough hints to write your own. But even though 50% of it is bad XML, and 50% of that is bad RSS (I exaggerate) and we needed ultra-liberal, non-validating parsers, it's simple enough to use. And despite the RDF people trying to generalise it, we never lost sight of channel, item, title, link, description; the bits that are actually just plain XML. The end result was *the* most widely implemented metadata standard. If FOAF is ever going to get similar adoption it needs to learn from this.

I keep going round and round this problem. I understand that the authors of FOAF are trying to satisfy many different use cases. And I understand that the people generating RDF files with some FOAF are busy using 50 or so namespaces for their own private requirements and extending them as they need to. And it's RDF not XML so it's only the triples that matter so what appears to be awkward layout is irrelevant. But the barrier to entry and the difficulty of being able to parse and extract the expected data from some arbitrary RDF file with some FOAF tags means that I don't think FOAF is going to go mainstream.

And I really don't have the stomach for a breakaway in the style of PIE/Atom/Necho.

What to do? FOAF is too good an idea to waste.




Update on Linksys WRT54G hacking attempts. O'Reilly Network: Is Linksys shirking the GPL? [July 30, 2003]
- The firmware production and update process is now open.
- They've got a shell.
- It looks like the WRT54G contains and uses modified GPL code, but the modified source doesn't appear to be available, which the GPL requires.

"I hope we can get through to these large manufacturers that while they are certainly welcome to use Open Source software in their products (and thereby benefit from the tremendous public effort spent on them), the community expects them to likewise respect the terms of licensing."

One quick thought about all this. There are people who are clearly getting deeper and deeper into the WRT54G code. What happens if they discover an exploit? [from: JB Wifi]




The American authorities are fond of requiring warning stickers on everything. Take for instance, McDonald's coffee cups with "Warning: May contain hot fluids". And yet cars seem to be relatively free of this apart from the famous "Objects in mirror may be closer than they appear".

I was leafing through the latest Wired just now, admiring the adverts for GM and the Chevrolet Suburban among others, and it occurred to me that there really ought to be warning stickers on SUVs. Perhaps something like the old WWII Jeep stencils "Sudden turns equals sudden death". Or maybe "This vehicle's excessive fuel consumption may be responsible for war in the Middle East. Please use responsibly". Or how about "This vehicle is not a weapon".

Why do people buy SUVs? [from: JB Ecademy]




Bruce Sterling has an interesting article: Barbarians at the gate in this month's Wired. He cites 4 problems for Europe.
  • It lacks a real government. Managing Europe by remote control through 15 national authorities was unwieldy, but 25 looks downright impossible.
  • European countries that haven't yet been absorbed are in steep decline.
  • Europe is lightly armed. Although Europeans in general scorn the American cowboy tradition of blowing the living daylights out of bandits, the EU's fringe hosts its share of vigilante bloodletting - thanks to church-burning Balkan bandits, tin-pot dictators in Belarus, Albanian heroin gangsters, and cold-eyed al Qaeda theology students.
  • Today's children are the citizens of the future, and Europe has very few of them. While Asia's population spills out of its own borders to colonize the West, Europe's is aging and shrinking.

He finishes with a comment I've raised before. "And it doesn't have to stay put on any particular continent, either." Where are the natural borders of the EU? I'd say the Urals, The Sahara and the Gulf. What would Europe be like if everything inside that was absorbed? [from: JB Ecademy]

Chris Lydon runs a radio show in the USA. He's been interviewing the A-List bloggers and putting the results up on his site as MP3s. Worth a listen if you have a few hours to spare. [from: JB Ecademy]


