Boxes and Arrows: All About
Tuesday, December 10th, 2002
Boxes and Arrows: All About Facets & Controlled Vocabularies: just a teaser for an upcoming series. Looking good!
Ironically the MIT DSpace Business Plan PDF is a 404.
Bliss Classification Association: a fully developed faceted classification system I didn’t know about.
tima thinking outloud. : Announcing MT-Meta: A Meta Data Plugin for MovableType. If I understand this correctly it uses the title text entry field to let you enter keywords. I am planning to get Taxomita to plug into MT on release 2.0.
I’ve used Cloudmark for several months now, and it is without doubt the best anti-spam software around. Their business model is brilliant as well: the consumer version is free, in return they get the collaborative anti-spam filtering of millions of people, and they sell the enterprise version that uses that intelligence. I hope it works out and they can keep the consumer version free; if not they’ll stand no chance against MS.
Carl Linnaeus, father of all taxonomy: “Before Linnaeus, species naming practices varied. Many biologists gave the species they described long, unwieldy Latin names, which could be altered at will; a scientist comparing two descriptions of species might not be able to tell which organisms were being referred to. For instance, the common wild briar rose was referred to by different botanists as Rosa sylvestris inodora seu canina and as Rosa sylvestris alba cum rubore, folio glabro. The need for a workable naming system was made even greater by the huge number of plants and animals that were being brought back to Europe from Asia, Africa, and the Americas.” (via Ben Hammersley)
My Google pagerank is now 7/10 (it was 6/10 a while back). I think it means that good sites link to me. Google ego striking :)
I updated the XFML software page and decided to put links and stuff on XFML on this blog. The XFML.org page just contains official release announcements now.
And more reactions to Mark’s XFML post:
metaGarbage: “XFML is a new kid on the block and yet another metadata format, somewhat similar to RSS. I’m not quite sure of what use it is to me at the moment, but there’s a feed available.”
Sean McGrath: “Breaking out of rigid hierarchies with faceted classification and XFML. Doesn’t this look nice? [...] I see a bright future for XFML.”
Mickblog: “This is an interesting variation on RSS/RDF. It allows you to describe your site in a much more categorical manner and allows you to create more poweful navigation aids. I’m sure it does much more but thats good enough to be interesting.”
G10.log: “My question to Has is, “how expensive would it be to implement XML, XFML into a site to make the site’s content accessible to anyone?” There are many companies that fall into the $10,000 and under price tag for a site, is it possible to build some of these “advanced” technologies into a site without dramatically increasing the cost?” How about, like, for free?
qweb: promoting quality in Web interaction design. The only example there is very nice.
[BOT] Concept Dictionary. This is a set of topics and subtopics about drugs. A set of weighted generating terms was added manually. A spider searches the web for stories related to drugs and automatically assigns topics to these stories. The end result gets exported in this XFML feed, which is the first known case of someone using the occurrence strength concept, which lets you indicate how much trust you have in the occurrence. I say cool.
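To make the occurrence strength idea concrete, here is a rough Python sketch of how weighted generating terms could turn a story into topic assignments with a strength. This is not how the Concept Dictionary actually works (I have not seen its code): the topics, the terms, the weights, the 1-to-5 scale and the name score_story are all assumptions made up for illustration.

```python
import re

# Hypothetical topics with weighted generating terms; all weights are invented.
GENERATING_TERMS = {
    "caffeine": {"coffee": 2.0, "espresso": 3.0, "caffeine": 5.0},
    "nicotine": {"cigarette": 3.0, "tobacco": 4.0, "nicotine": 5.0},
}


def score_story(text, generating_terms, max_strength=5):
    """Assign topics to a story, each with a strength from 1 to max_strength.

    The strength is relative: the best-matching topic gets max_strength and
    the others are scaled against it. The scale itself is an assumption.
    """
    words = re.findall(r"[a-z]+", text.lower())
    raw = {}
    for topic, terms in generating_terms.items():
        # Sum the weight of every generating term that appears in the story.
        total = sum(terms.get(word, 0.0) for word in words)
        if total > 0:
            raw[topic] = total
    if not raw:
        return {}
    top = max(raw.values())
    return {
        topic: max(1, round(max_strength * total / top))
        for topic, total in raw.items()
    }
```

Each topic and strength pair returned here would become one occurrence in the exported feed, with the strength saying how much the classifier trusts its own assignment.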
XFML (through Mark’s excellent post) is getting people thinking. The word imagine crops up regularly in these posts.
Heal Your Church: “Using a format such as XFML, or at least a much smaller node-like structure based upon an XFML element, the system then goes out and pushes the necessary information into our waiting queue, emails the appropriate moderator for final approval. No forms, no typos, no fuss, no muss.”
Traumwind: “XFML, Lua and Traumtank
well, maybe not in that order… But that’s what is keeping me ticking these last days.”
Rowboat: “This has the data structuralist in me drooling! It’s like having that pile of Lego in front of me again!”
Column Two: “This is a really useful case-study that shows how faceted classification information can be converted into a range of navigation and searching tools, amongst other wonders.”
plasticbag.org: “If I was a better geek, this article on XFML at diveintomark.org would be fascinating, illuminating and revelatory. Instead I stare at it in desperation, terror and confusion as the words change and resolve themselves in front of me to read, “Rhubarb rhubarb rhubarb rhubarb rhubarb“. This is not the kind of thing I’m supposed to admit in public.”
What I’ve been working on.: “I want to build a service that allows individuals to monitor, on a daily or weekly basis, all official activity of their elected officials in Washington.”
Quiver’s QKS Classifier: “We wanted to compare the results from a Quiver test with those of a manual process and a rules-based categorization tool. This article describes the results [...]”. But the article is hidden behind a password. Frustration - they could at least give an overview of the conclusions!
Via Lee: Ten taxonomy myths
Ben Hammersley wants to know about your current taxonomy: “I’m interested in how you devised the taxonomy…is it just random words, in a flat structure, or is it based on a tree.” Me too. Go to his blog and share!
Microsoft works to create back-up brain: “Researchers at Microsoft’s laboratories in San Francisco are working on ways to create a ‘back-up brain’ that will record and catalogue every picture you take, document you write and conversation you record.
[...]
The researchers recognise, however, that the biggest challenge will come with deciding on how best to organise the material. They are currently working on developing a taxonomy that will accommodate the huge range of associations and relationships the material will require.”
Content Inventory follow up: if you want to make a local copy of a site (really useful for working from home, on the train, or just having fast access), Offline Explorer is the best product I’ve found. It works through https, through a VPN, you name it, and the tech support is pretty good. And it’s $50 per user.
At a discussion with a bunch of smart IAs a few days ago it was mentioned that IAs often have lots of ideas but aren’t very good programmers (that’s often why they became IAs - I know I did). So they can’t experiment much in real life, and ideas stay in that fuzzy cool idea-without-real-life-feedback stage.
I have the same problem developing Taxomita: I have a beta going but my coding isn’t top-notch. So now I am experimenting with the old have-a-programmer-in-a-cheap-country approach. I will blog on my experiences, but Joel on Software is proving to be a great resource to set this up.
Other people keep doing a better job of explaining XFML than I do:
Simon Willison: “Mark Pilgrim has discovered XFML. He provides an excellent description of the standard, but fails to mention XFML’s most powerful ability; sharing metadata. (I believe Simon means connecting metadata) Here’s how it works: (follows excellent and succinct description)
[...]
This is just the tip of the iceberg - apply the creative global mindset that is the blogging community and who knows what will happen :)”
*Market Research*: “‘Market Research’ is an ongoing project that captures footage by deploying smart cameras — sensors, cameras and transmitters — within products in the ‘market’. The camera systems are triggered by the interactions of the user with the device — systematically collecting ‘evidence’ of the actual conditions of use. Once captured, footage used to evaluate assumptions embedded in the design of the products and the conceptualization of the market.”
Mark’s clear explanation of XFML got people thinking. I realize now I never did a great job of explaining it.
asterisk*: Yet another interesting technology…XFML: “But I’ll need to explore it more. That or have Brian, the Web producer of our team, who is great at researching this kind of thing, do the rest of the leg work for me.”
Webgraphics: “Mark explains XFML in the clear, cohesive manner that makes his site one of the best.”
Gimle: ‘[...] RDF for example is a very effective and powerful tool. The problem is that it’s too effective and powerful for what I want.
The cool bit only struck me today as I was browsing Dive Into Mark.
XFML.
Classic lightbulb scenario.
The XFML format provides you with an easy way of creating conceptual categories and topics for your website and then associate your webpages with the various topics it touches upon.” Clifton really gets it when discussing the topic linking capabilities of XFML: “That’s what I’d call proper intertextual contextualisation. This is classic Yin kind of power. Introverted, the primary focus is to know yourself (marking your data up properly, thoroughly and with care, this part can’t really be automated). Once that is done, the rest is easier and can be automated much more effectively than the content part.”
Jonathan Delacour: “Wouldn’t it be neat to have a central registry of Myers-Briggs Type Indicators for the inhabitants of our little corner of Blogaria? If you know your Myers-Briggs type, why not reveal it in a comment or send me an email? If you don’t yet know your type, you can take the Typology Test. I could create a MySQL database that stored each blogger’s name, URL, email address, and an entirely subjective description of their blogging style then create a PHP page to list the results. Perhaps Mark Pilgrim could summarize the hierarchical faceted metadata using XFML.”
Marek: “Mark Pilgrim shares an xfmllib library for Python, and explains XFML in a way that a human can understand.”
SmarterKids.com. Another shop using facets.
Simplicity vs. Innovation? But simplicity is innovation!
Even when a company creates a well thought out classification system, things often still go wrong. People put stuff in the wrong place, add a bunch of personal folders somewhere, and at the end of the day, a lot of stuff still can’t be found because it’s been misclassified or the classification system has been corrupted. Old style classification in real cabinets had the same problem: companies addressed this by making someone responsible for classifying all incoming documents, even though everyone had free access to take stuff out. How do we model a system so that things don’t get misclassified into an (otherwise) nice classification system?
My view: categories get internalized only by using them, or even better, creating them yourself. When you create a category chances are you’ll use it more or less correctly. When someone else creates one, the probability of correct filing drops steeply. And the mental effort it takes to understand someone else’s categorization is big, especially because there are no direct rewards for you.
One approach that may work is distributed metadata. But that idea is in its infancy. So what do we do?
Amazon.com : Price “Too Low to Display” Explained: “this discount is calculated in the Shopping Cart”. Yeah right. Amazon is its usual nice self having a link to an explanation next to the marketing ploy “add to your shopping basket to see price”, but why then do they give us a dodgy lie: “Is calculated in the shopping basket”? Why don’t they just say it increases conversion rates - I’m cool with that.
John Robb’s Radio Weblog: “The reason adding P2P to weblogs will happen (perhaps sooner than most people realize) is that it will make it possible to publish original audio and video without spending the big bucks to host it.”
Jon’s Radio: “THE GOOD NEWS is that Office 11 supports XML Schema. The bad news is that XML Schema has been described even by XML experts as “confusing,” “impenetrable,” “fuzzy,” and “as user-friendly as a stick in the eye.””
The Scobleizer Weblog: “Linux copies Microsoft which copied Apple which copied Xerox. I guess when you copy UI’s too much they get ugly.”
Textism: Google Hilite in PHP. Nice.
Mark Pilgrim is writing a Python library for XFML: dive into mark - xfmllib: “XFML is a new format for providing hierarchical faceted metadata. Think of it as a way of expressing all the different cross-sections of a site. I couldn’t make heads or tails of it until a kind soul came along and mocked up an XFML representation of Dive Into Accessibility, my tutorial on web accessibility techniques. Then it all became clear.
[...]
You see, each tip in Dive Into Accessibility discusses a specific technique, the general design principles the technique embodies, the types of disabilities (expressed in the form of character sketches) that would benefit from its implementation, the web browsers involved, and (in some cases) specific instructions for implementing the tip in various publishing tools.
To express this in XFML, we first define six top-level facets: person, physical disability, technological disability, design principle, web browser, and publishing tool.
Within the person facet, we define topics: Jackie, Michael, Bill, Lillian, Marcus, and also Google, since many accessibility techniques directly impact search engine placement, and Google’s spidering bot can be thought of as a blind reader (a really, really voracious reader).
[...]
Now feed it into a portal-making script and it looks like a portal. Or feed it into a search-engine-making script and it looks like a search engine. That’s wicked cool.”
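The "one description, many views" point is easier to see in code. Below is a minimal Python sketch using plain dictionaries; it is not the XFML format and not the xfmllib API. The person topics come from Mark's post, but the web browser topics, the tip titles and their tags are hypothetical, and so are the function names portal_view and search_view.

```python
# Facet -> topics. The "person" topics are from the quoted post;
# the "web browser" topics are hypothetical examples.
FACETS = {
    "person": ["Jackie", "Michael", "Bill", "Lillian", "Marcus", "Google"],
    "web browser": ["Lynx", "Internet Explorer"],
}

# Page -> topics it touches, keyed by facet. Titles and tags are invented.
PAGES = {
    "Tip A": {"person": ["Jackie", "Google"], "web browser": ["Lynx"]},
    "Tip B": {"person": ["Michael"], "web browser": ["Internet Explorer"]},
    "Tip C": {"person": ["Jackie"], "web browser": ["Internet Explorer"]},
}


def portal_view(facets, pages):
    """Group page titles under every facet/topic pair, portal style."""
    view = {
        facet: {topic: [] for topic in topics}
        for facet, topics in facets.items()
    }
    for title, tags in sorted(pages.items()):
        for facet, topics in tags.items():
            for topic in topics:
                view[facet][topic].append(title)
    return view


def search_view(pages, topic):
    """Return the titles of pages tagged with the topic in any facet."""
    return sorted(
        title
        for title, tags in pages.items()
        if any(topic in topics for topics in tags.values())
    )
```

Both functions read the same PAGES data; only the presentation differs, which is the whole trick.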
The Pages of Now & Forever - Star Control 2 : The Ur-Quan Masters: the story so far.
Jingle-Net is kind of cool, but I want to be able to (with some help) write text myself that then gets sung by a real person. I’d pay $15 to $25 for that. Where are those investors when you need them?
Mark added an XFML file to Dive Into Accessibility that describes the content metadata, which means you can now browse his excellent accessibility series in Facetmap by Person, technological disability, web browser, physical disability, design principle, publishing tool or any combination of these facets. Enjoy.
The map was done by Mark, first version by Albert de Klein following a crazy idea of mine.
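Browsing by any combination of facets boils down to intersecting filters. Here is a small Python sketch of that idea, under loud assumptions: I do not know how Facetmap implements it, the tips and their topic assignments are invented, and browse and remaining_topics are hypothetical names.

```python
# Hypothetical tips; facet names follow the post, topic assignments are invented.
TIPS = {
    "Tip 1": {"person": {"Jackie"}, "web browser": {"Lynx"}},
    "Tip 2": {"person": {"Jackie", "Marcus"}, "web browser": {"Internet Explorer"}},
    "Tip 3": {"person": {"Marcus"}, "web browser": {"Lynx"}},
}


def browse(tips, selection):
    """Return the tips that match every selected facet -> topic pair."""
    return sorted(
        title
        for title, tags in tips.items()
        if all(topic in tags.get(facet, set()) for facet, topic in selection.items())
    )


def remaining_topics(tips, selection, facet):
    """Count the topics in `facet` still reachable after the selection."""
    counts = {}
    for title in browse(tips, selection):
        for topic in tips[title].get(facet, set()):
            counts[topic] = counts.get(topic, 0) + 1
    return counts
```

Calling browse(TIPS, {"person": "Marcus", "web browser": "Lynx"}) narrows the list one facet at a time, and remaining_topics gives the counts a facet browser would show next to each link so you never click into an empty result.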
Dave Winer: What is XFML?