03 Aug 2003
The XML specs say that you must reject a file that is bad XML. This is good for XML because it encourages people to produce good, well-formed XML. But our goal is the data, not the spec. We're also likely to be collecting embedded RDF in web pages, among other things, where the overall file may not be XML at all. So there's a need here for an "Ultra-Liberal RDF parser" like the one Mark Pilgrim has produced for RSS (see http://diveintomark.org/projects/rss_parser/). Most current RDF parsers are built on top of XML parsers, which will reject bad XML, so you'll never see the data.
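To make the problem concrete, here is a minimal sketch in Python. It is not Mark Pilgrim's parser and nowhere near a full liberal RDF parser; it only shows a strict XML parser refusing a sloppy page, and a crude first step towards being liberal: cutting the embedded rdf:RDF block out with a regular expression. The function names and the regex are my own assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Assumption: embedded RDF sits between literal <rdf:RDF ...> and </rdf:RDF>
# tags. Real pages may use a different prefix or hide the block in comments.
RDF_BLOCK = re.compile(r"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL | re.IGNORECASE)


def strict_parse_ok(text: str) -> bool:
    """Return True only if a strict XML parser accepts the whole text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def extract_rdf_blocks(text: str) -> List[str]:
    """Pull out anything that looks like an embedded rdf:RDF block."""
    return RDF_BLOCK.findall(text)
```

The point is only that the surrounding page can be hopeless as XML while the block inside it is still perfectly usable. A block that is itself malformed would need real recovery logic, which this sketch doesn't attempt.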
If you find bad XML or bad RDF, tell the author. Don't just ignore it or dump it. They will probably be very thankful. And if you're generating RDF, check it with the W3C validator: http://www.w3.org/RDF/Validator/
[ 03-Aug-03 1:04pm ]
We've learnt a lot from RSS about how to play nicely if you're writing aggregators, spiders and scutters. All this is relevant to FOAF, along with some other issues.
Bandwidth
The first thing to be aware of is bandwidth, both yours and that of the sites you're collecting FOAF from. The simplest tweak is to support gzip encoding. FOAF, RDF and XML files are essentially text, so they compress well, and most web servers out there support gzip encoding if the client tells them that it understands gzip. This should be completely transparent to your code, but make sure your HTTP collection toolkit supports gzip and uses it. See Mark Pilgrim's page about this.
There are two standard ways for a server to let a client know that data hasn't changed: ETag and Last-Modified. Your scutter should store these against the feed data and put them back into the headers when requesting data (as If-None-Match and If-Modified-Since). If the server is properly set up, it should use these to return a 304 Not Modified response rather than the full file. See this tutorial on using them.
Even if you don't get these headers, it's useful to know if a feed has changed. One way is to generate an MD5 hash and store it, then compare it next time you visit.
Use HTTP Codes
There is a set of well-specified HTTP status codes that servers generate. Use them. Particularly important are 301 Moved Permanently and 404 Not Found. If the server is telling you that the data has permanently moved, use that fact. And if the file is no longer there, don't go on asking for it repeatedly. Mark Pilgrim has a set of test files for all the different codes.
Play Nicely
You're writing a robot, so pay attention to robots.txt. Some sites, like Ecademy, serve large numbers of FOAF files. It's not fair on the server to hammer requests at them as fast as you can. This is effectively a denial of service attack, and webmasters who notice it may well ban you permanently. It's good practice to wait a second or two between successive requests to the same host. Note that some hosts use sub-domains like mine.domain.com and your.domain.com, so apply that wait between calls to domain.com as a whole.
Refresh times
RSS can change quite quickly whereas FOAF probably won't, so a refresh strategy is less important. One good approach is to start with a refresh of, say, 6 hours. If the data hasn't changed (see ETag and Last-Modified), double the refresh to 12 hours, and so on up to a maximum of a few days. If the data does change, go back to your minimum refresh period.
There's an RSS problem where lots of client aggregators all collect feeds at :00:00 after each hour. If their clocks are reasonably well synced, popular sites will get hit by large numbers of requests at :00:00. So when you set up your cron job, pick a random(ish) time in the hour to start.
Caching the data and default pages
I think a lot of FOAF applications are going to collect the data on the fly. This means that every hit on your application is going to generate a hit on the data source. Now, given that you probably have lots of alternate pages to look at, I don't think this is going to be that much of a problem. But think carefully about the bandwidth implications for the source. If you get slashdotted and you've coded well to withstand it, you don't want to slashdot the source in turn. Probably the most obvious danger is having a default page with a link to a default source.
The alternative to this is to cache the data locally. On the surface this looks like a good approach for several reasons. But it leads you into a set of issues to do with how current the data is, and these issues are still being worked out. Take an example. Say you collect data that says that A knows B. A week later A falls out with B and they both remove the foaf:knows statement. Your data is now wrong.
Mirrored from the Ecademy FOAF club
02 Aug 2003
TypePad News: TypePad Features and Pricing. Note the auto-FOAF support. [from: JB Ecademy]
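Going back to the scutter notes above (handing ETag and Last-Modified back to the server, spotting changes with an MD5 hash, and doubling the refresh period while nothing changes): the bookkeeping can be sketched roughly like this in Python. This is a sketch under assumptions, not a finished scutter. The function names and the shape of the per-source `state` record are hypothetical, and the 96-hour cap is just my reading of "a few days".

```python
import hashlib
from typing import Dict, Optional

MIN_REFRESH_HOURS = 6    # starting interval suggested in the notes
MAX_REFRESH_HOURS = 96   # "a few days"; the exact cap is an assumption


def conditional_headers(etag: Optional[str],
                        last_modified: Optional[str]) -> Dict[str, str]:
    """Build request headers that hand stored validators back to the server."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def content_hash(body: bytes) -> str:
    """MD5 of the fetched bytes, used only to detect change, not for security."""
    return hashlib.md5(body).hexdigest()


def next_refresh_hours(current_hours: int, changed: bool) -> int:
    """Double the interval while data is unchanged; reset when it changes."""
    if changed:
        return MIN_REFRESH_HOURS
    return min(current_hours * 2, MAX_REFRESH_HOURS)


def update_source(state: dict, body: bytes) -> dict:
    """Return the new per-source record after a successful fetch.

    `state` is a hypothetical record: {"hash": str, "refresh_hours": int}.
    """
    new_hash = content_hash(body)
    changed = state.get("hash") != new_hash
    current = state.get("refresh_hours", MIN_REFRESH_HOURS)
    return {"hash": new_hash,
            "refresh_hours": next_refresh_hours(current, changed)}
```

A 304 response would count as "unchanged" without any hashing at all; the hash is the fallback for servers that send neither header.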
[ 02-Aug-03 3:10pm ]
Here's one to worry about. What legal responsibilities do hotspot owners have for activity conducted by their customers?
This is prompted by the discovery that Wayport run an open email server for their customers, and T-Mobile intercept email destined for other servers and route it via their own email server (I think). Both of these are reactions to a real need for their hotspot customers to send email. But both are effectively open relays with no authentication. How much information and logging do they really have in place to positively identify the customer? The real answer here is for the mainstream ISPs to support SSL and SMTP-AUTH so that their paid-up customers have access to email services no matter how or where they connect to the internet: at home, at work or from a hotspot. Until then, hotspot providers need to provide some solution so that hotspot users can send email. But just giving blanket access to anyone using the hotspot is not the answer. [from: JB Wifi]
[ 02-Aug-03 9:40am ]
How we all laughed at the Americans and their litigious society when the RIAA started sending out subpoenas to their customers and asking for, and getting, private information from the ISPs. And at their politicians proposing that if you share music files you should go to jail and that your computer was fair game for the RIAA to hack and destroy. Thankful, of course, that "it could never happen here".
Well folks, the legal underpinnings for exactly the same crass behaviour, designed to protect dinosaurs from extinction, will be in place in the EU and UK in October 2003. The BPI are already rubbing their hands at the possibility of being able to go after their customers using legal means. When music ("with a repetitive beat" c 1984, M Thatcher) is made illegal, the goal of total criminalization will be complete.
[ 02-Aug-03 8:12am ]
01 Aug 2003
A website idea. Start with a map and location search. Add the ability for people to upload a photo from their cameraphone with the street and number. Have a wiki-like history showing the current photo and the archive of photos of that location. Allow people to annotate the photo with text about that location. So I could then go to the website, find 56 Camden High Street, and look at photos people had taken of that shop and the other shops near it in the street.
Could we use this to harness the power of the masses to build a photographic record of the city? This is one of those "ought to exist" ideas that I don't have the time (or maybe the ability) to make happen. Feel free to steal the idea and make it your own. I've picked London purely because I live near there. It could easily be extended to other cities and towns. [from: JB Ecademy]
[ 01-Aug-03 3:10pm ]
WRBB Wireless Rural BroadBand - Sunshine Wideband Internet is a WISP starting up in Rutland, radiating from Stamford and using 802.11g. [from: JB Wifi]
[ 01-Aug-03 1:10am ]
31 Jul 2003
If a standard remains write-only and it's too hard to use, it will remain academic wanking and never reach critical mass. To my eyes, this is currently true of RDF and almost every individual RDF namespace, with the possible exception of DC (Dublin Core). For application writers (as opposed to tool makers) each namespace is a standard in its own right that also has to reach critical mass.
If all you want to do is encode data that you and only you are going to consume later, then RDF is extraordinarily powerful. Mozilla is a good example of this. It uses RDF extensively internally to manage defaults. But if you want to try and create something for really widespread adoption, RDF is so flexible that it encourages fragmentation. And so I don't believe you will ever get that widespread adoption. Like Perl, there are always multiple ways of saying the same thing, so people use multiple ways of saying the same thing. And so the people building apps to consume that data have an impossible task.
RSS is the perfect counter-example. You can view source and derive enough hints to write your own. But even though 50% of it is bad XML, and 50% of that is bad RSS (I exaggerate), and we needed ultra-liberal, non-validating parsers, it's simple enough to use. And despite the RDF people trying to generalise it, we never lost sight of channel, item, title, link, description; the bits that are actually just plain XML. The end result was *the* most widely implemented metadata standard. If FOAF is ever going to get similar adoption it needs to learn from this.
I keep going round and round this problem. I understand that the authors of FOAF are trying to satisfy many different use cases. And I understand that the people generating RDF files with some FOAF are busy using 50 or so namespaces for their own private requirements and extending them as they need to. And it's RDF, not XML, so it's only the triples that matter and what appears to be awkward layout is irrelevant. But the barrier to entry, and the difficulty of parsing and extracting the expected data from some arbitrary RDF file with some FOAF tags, mean that I don't think FOAF is going to go mainstream. And I really don't have the stomach for a breakaway in the style of PIE/Atom/Necho. What to do? FOAF is too good an idea to waste.
[ 31-Jul-03 4:37pm ]
30 Jul 2003
Update on Linksys WRT54G hacking attempts.
O'Reilly Network: Is Linksys shirking the GPL? [July 30, 2003]
- The firmware production and update process is now open.
- They've got a shell.
- It looks like the WRT54G contains and uses modified GPL code, but the code updates don't appear to be available, as the GPL requires.
"I hope we can get through to these large manufacturers that while they are certainly welcome to use Open Source software in their products (and thereby benefit from the tremendous public effort spent on them), the community expects them to likewise respect the terms of licensing."
One quick thought about all this. There are people who are clearly getting deeper and deeper into the WRT54G code. What happens if they discover an exploit? [from: JB Wifi]
29 Jul 2003
The American authorities are fond of requiring warning stickers on everything. Take, for instance, McDonald's coffee cups with "Warning: May contain hot fluids". And yet cars seem to be relatively free of this, apart from the famous "Objects in mirror may be closer than they appear".
I was leafing through the latest Wired just now, admiring the adverts for GM and the Chevrolet Suburban among others, and it occurred to me that there really ought to be warning stickers on SUVs. Perhaps something like the old WWII Jeep stencils: "Sudden turns equals sudden death". Or maybe "This vehicle's excessive fuel consumption may be responsible for war in the Middle East. Please use responsibly". Or how about "This vehicle is not a weapon"? Why do people buy SUVs? [from: JB Ecademy]
[ 29-Jul-03 1:40pm ]
27 Jul 2003
Bruce Sterling has an interesting article, "Barbarians at the gate", in this month's Wired. He cites 4 problems for Europe.
He finishes with a comment I've raised before. "And it doesn't have to stay put on any particular continent, either." Where are the natural borders of the EU? I'd say the Urals, the Sahara and the Gulf. What would Europe be like if everything inside that was absorbed? [from: JB Ecademy]
Chris Lydon runs a radio show in the USA. He's been interviewing the A-List bloggers and putting the results up on his site as MP3s. Worth a listen if you have a few hours to spare. [from: JB Ecademy]
Flash Mob London Soho Main Site Aug 7th [from: JB Ecademy]
[ 27-Jul-03 9:05pm ]
25 Jul 2003
There are some cold, hard facts here for hotspot operators and networks. VC Returns -- Do They Exist in Wi-Fi? :: AO : The fully loaded cost of offering free Wi-Fi access is less than $6/day. Operating a billable hotspot costs over $30/day. Now, if we understand that the $6/day includes bandwidth costs, and that the bandwidth would have been available anyway for the use of the venue, we get down to maybe $4/day.
So what we have is that WiFi at a hotspot is either too cheap to charge for or too expensive to sell. Tricky. Especially if you're doing a competitive analysis of pricing. It would be easy to compare WiFi pricing with notional 3G pricing. But the real competition is the people giving it away either for free or free with a purchase from the venue (buy a sandwich, get 30 minutes). As AO put it, "By viewing Wi-Fi as an entirely new network, the mobile carriers have exaggerated Wi-Fi's profit opportunity. Realistically, Wi-Fi is simply an extension to the high-volume, low-margin IP-over-Ethernet business, not a new, large profit opportunity." [from: JB Wifi]
I was mulling over this morning's crop of emails and came up with this. There's a mismatch here that has spilled over into technical discussions about RDF, XML parsers and suchlike. I'd like to present two views of FOAF. They're not mutually exclusive and there are others.
1. FOAF is RSS for people.
I'm not the first and I won't be the last person to come to FOAF and go "Wow, this is RSS for people. That's totally awesome. I want to play." This approach says that FOAF is potentially an easily writable, easily parseable standard format for coding up information about people and the relationships between them. And since people and their relationships have huge applicability, this could end up being very widely implemented. Bigger than RSS. And that will enable people to build all sorts of apps that we haven't thought of yet. Like the potential to let us build de-centralized versions of the social networking sites that are "This Year's Big Thing", or at least to link them together. And to link all the blogs together. And to...
2. FOAF is yet another namespace.
A number of people working in the RDF metadata area found there was a need for a bunch of RDF classes and properties to describe people and the relationships between them. There didn't seem to be anything useable in existing namespaces and there weren't any that could be obviously extended in this direction. So they created one. FOAF will get used by people who find RDF useful and need a namespace like it for the data they're working on. As RDF grows and becomes more widespread, so will FOAF. And because people and their relationships appear in a lot of problem domains, the FOAF namespace will probably get used a lot in RDF data.
Incidentally, if you're following this, I can strongly recommend reading the following to help understand the issues:
Missing isn't broken: data validation and freedom on the Semantic Web
Parsing FOAF with PHP
[from: JB Ecademy]
[ 25-Jul-03 3:10pm ]
23 Jul 2003
From a press release from Airpath
Internet Magazine - News/Advice/Reviews/ISPs/Hosting
BT Openzone customers will soon be able to use WiFi access points in the US, thanks to a new deal between BT and Airpath Wireless. Airpath has already secured 1,500 hotspots in the US, and plans to have 4,600 in place within a year. BT customers will initially have access to 350 of them in September. BT customers can already access 660 hotspots in Scandinavia, France, Belgium, Thailand and London, thanks to a roaming agreement with European operator TeliaSonera. www.bt.com/openzone www.airpath.com
British Telecommunications (BT) and T-Mobile have also joined an Asian consortium called the Wireless Broadband Alliance (WBA) that offers WiFi roaming in the Far East. [from: JB Wifi]
[ 23-Jul-03 6:40pm ]
22 Jul 2003
Just had a press release from MyZones - The world's first integrated Wi-Fi Internet Service Provider
"On 25 July 2003 changes to part of the EU Regulatory Framework (or "Telecoms Package") will allow MyZones to provide full commercial Wi-Fi broadband services. Changes to this act will allow our customers to act as resellers of their own Wi-Fi broadband connections. Share the connection, share the price. Share your broadband over Wi-Fi with one person and pay under £18 each per month, share with two and pay under £12 each per month, share with three and pay only £8.81 a month! Why not have Wi-Fi installed during the summer? ….enjoy the freedom of broadband internet access without the wires, connect any device within 300 feet... Surf in the garden, connect your second PC, share your broadband with your neighbours, lose the wires, and be free." [from: JB Wifi] [ 22-Jul-03 9:40pm ] We've now got 10 people with a link to an external FOAF file in their profile. I've had to remove the data for a few people who obviously didn't understand what that is as they had put in their web home page.
If you want to play, take a look at Foaf-a-matic, then go to the profile edit page and put in the URL of the file you've just created under "External FOAF file", just above the field for fifty words. [from: JB Ecademy]
[ 22-Jul-03 2:40pm ]
21 Jul 2003
CATTERICK VILLAGE GETS BROADBAND
04 July 2003
The first phase of the community broadband network in Catterick Village is now installed and ready to offer a super-fast Internet connection to homes, businesses, and the new LearnDirect centre. Community champions, local business people, and the DigitalDales.co.uk team have worked together to raise the necessary funding locally and to bring broadband to the village using the latest satellite based services together with innovative wireless technology.
Wi-fi, the wireless technology used, eliminates the need to use the telephone system for delivering broadband to each user, and means that a community network, run by the community for the community, can be set up without requiring BT or other telcos' involvement. In Catterick Village, Purple Technology Limited, a local business, provides the broadband connection to the Internet using Aramiska's unique satellite system. A key feature of the system is its scalability, allowing upgrades as more people subscribe to use the network.
Community networks are springing up around the country in response to market failure, and such networks create local employment, keep investment within the region, and encourage new enterprises. The initial capital investment required is minimal, meaning that even the smallest rural community can create their own sustainable broadband network. The Catterick Village network is a flagship project, leading the way for other regional projects to bring broadband and its benefits to rural areas in the near future, enabling communities, and encouraging regeneration.
Phil Upton, Managing Director of Purple Technology, said "Broadband allows both business and residential communities in Catterick Village to have the same advantages as larger towns. We now have educational and training opportunities, as well as being able to compete as businesses in a global market place."
Digital Dales is holding a conference on July 17th at Harlow Carr Gardens in Harrogate entitled "Broadband in Yorkshire". This includes updates on the developments in the region, the central Government rural broadband agenda, news from industry and communities, followed by the DIY Broadband workshop. This workshop covers all the required elements for setting up your own community network - funding, business models, technology options, licensing, case studies, best practice, and much more. All are welcome. Register now on digitaldales.co.uk/broadband
The Digital Dales team spokesperson, Lindsey Annison, who is also co-founder of the Access to Broadband Campaign, said, "This is a really great achievement in Catterick Village. In just 3 months, the initial funding has been raised, the village surveyed, and the equipment installed. This should give hope to all rural communities that broadband is available to everyone, now."
The LearnDirect centre in Catterick Village main high street will be opened by the Rt Hon William Hague on Saturday 5th July at 10.30am. Purple Technology and the Digital Dales team will be on hand to answer questions, advise and encourage other communities to follow the lead of the broadband pioneers in Catterick Village.
FOR EDITORS
Purple Technology is a local business based in Catterick Village offering a range of services to the region including Sales & Marketing, Information Technology & Software Development and Management Training Courses.
Digital Dales is a community project started in the Upper Dales, with a national reputation. The team of local people is responsible for organizing national broadband seminars, promoting and educating businesses and citizens about broadband, and installing wireless broadband networks with its partner organisation Digital Fells, e.g. the DTI/NWDA-funded EdenFaster (www.edenfaster.com) project in Upper Eden valley. Investigations are now under way into providing broadband access along the Wensleydale Railway.
Lindsey Annison, Program Manager for Digital Dales, is also co-founder of the Access to Broadband Campaign (www.ABCampaign.org.uk), and will be speaking at the ABC Rural and Regional Broadband conference in London on July 17th.
Aramiska is a European Internet Service Provider for businesses. It offers Internet services via satellite, based on the advanced two-way open standard DVB-RCS (Digital Video Broadcasting-Return Channel System). Aramiska controls the end-to-end connection and is available everywhere. Aramiska's head office is located in Eindhoven, the Netherlands with local offices in UK, France and Spain.
Funding for the Catterick Village project has come from the LSC, European Objective 2 Priority 3 and the Vital Villages programme.
Contact Details:
Purple Technology - www.purple-technology.biz (info@purple-technology.biz), Phil Upton - 01748 812547
Digital Dales - www.digitaldales.co.uk (team@digitaldales.co.uk), Lindsey Annison - 07967 670759, Joel Smith - 01423 712591 / 07768 803758
Catterick Village - Community Works Office (rose@communityworks.fsnet.co.uk), Melva Steckles - 01748 810050
Aramiska Broadband - www.aramiska.com (p.gumm@aramiska.net), Peter Gumm - Marketing Communication Manager 020 8313 7717
[from: JB Wifi]
[ 21-Jul-03 9:10pm ]


