Archive for November, 2003

Sunday, November 30th, 2003

Spanish blog by a friend of mine - give her some traffic!

Saturday, November 29th, 2003

Looking for a hosting company that lets me host multiple (smallish) sites, with the usual goodies but nothing special (multiple mysql databases, …), and costing significantly less than 500 US$ or 350 UK pounds a year. Recommendations welcome!

Saturday, November 29th, 2003

What happens if millions of new pictures are posted to the web every day? The pictures could be auto-connected by non-subject metadata like location or timestamp. What is the value of a picture without the story? What will happen to abandoned picture-collecting websites? Is there a standard way to embed metadata about a picture in the picture itself, so it doesn’t get lost?

Thursday, November 27th, 2003

I like Livia’s new homepage.

Thursday, November 27th, 2003

Livia (from Brazil) writes a comment to my Racial and Ethnic classifications as an example of classification challenges post: “Not because it’s a bad or a good idea, but how they simply cloned it from the US, even though it makes absolutelly no sense to our population or ethnical background. We don’t have a racial problem in Brazil (we have a wealth distribution problem) and it really upsets me that they are turning it into something it is not.”

Classification systems really have a tendency to stick around, even when they’re no longer useful or just not applicable to the situation (as in Livia’s example).

Wednesday, November 26th, 2003

TeledyN: The End of RSS: can it scale? Sure - if aggregators play nice. Can’t they be forced to play nice? Slashdot already does something like this I believe.

RFP: XML format for timelines

Wednesday, November 26th, 2003

I’d like to see a simple XML format to express timelines with events on them, and a few tools to create this XML and turn it into Flash, HTML and such. The format should allow for merging timelines.

(Just some ideas:) A timeline consists of a StartingPoint, an EndPoint, and Events in between. Each Event is optionally identified by one or more URIs (for merging), has a required StartingPoint (datetime) and an optional EndPoint (datetime, for events that happen over time). Events can be nested within events. Each event can have a URL (semantics: where to go when user clicks on event), a title, a description and an image.

We should be able to express things like conferences, or a personal timeline in this format.

The format should allow merging: time is universal. The events can be merged based on having at least one identifying URI that’s the same. We should be able to merge your description of a conference (with your blog entries as events for example) with my description of the same conference.
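To make the merging idea a bit more concrete, here is a rough Python sketch of the data model and a merge step. It is only an illustration under my own assumptions: the class name Event, the field names and the function merge_events are all made up for this example, the XML serialization itself is left out, and the rule for resolving conflicting fields (keep the first non-empty value, widen the time span) is just one possible choice.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Set


@dataclass
class Event:
    # Hypothetical in-memory model of one timeline event.
    start: datetime                                # required starting point
    end: Optional[datetime] = None                 # optional, for events that span time
    uris: Set[str] = field(default_factory=set)    # identifiers used only for merging
    title: str = ""
    description: str = ""
    url: Optional[str] = None                      # where a click on the event leads
    image: Optional[str] = None
    children: List["Event"] = field(default_factory=list)  # nested events


def merge_events(mine: List[Event], yours: List[Event]) -> List[Event]:
    """Merge two event lists; events sharing at least one URI become one event."""
    merged: List[Event] = []
    for event in mine + yours:
        match = next((m for m in merged if m.uris & event.uris), None)
        if match is None:
            # First time we see this event: store a copy so inputs stay untouched.
            merged.append(Event(
                start=event.start, end=event.end, uris=set(event.uris),
                title=event.title, description=event.description,
                url=event.url, image=event.image,
                children=list(event.children),
            ))
            continue
        # Same event described twice: pool identifiers and nested events,
        # widen the time span, and keep the first non-empty text fields.
        match.uris |= event.uris
        match.start = min(match.start, event.start)
        ends = [e for e in (match.end, event.end) if e is not None]
        match.end = max(ends) if ends else None
        match.title = match.title or event.title
        match.description = match.description or event.description
        match.url = match.url or event.url
        match.image = match.image or event.image
        match.children = merge_events(match.children, event.children)
    return sorted(merged, key=lambda e: e.start)
```

Events without any URI are never merged, and the sketch does not handle the case where one event links two previously separate events together; a real format would need to decide what to do there.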

Anyone interested in working out such a format?

Wednesday, November 26th, 2003

Some links about the international coffee trade:

The Campaign to Humanize the Coffee Trade:
- “The world trades more coffee than any commodity except petroleum (and illegal drugs).”
- “Starbucks buys a miniscule amount of its coffee from the Fair Trade system—less than 0.1 of 1 percent of all the beans that Starbucks buys. But, he says, don’t blame the company for that. Smith says the problem is that Fair Trade activists are trying to sell coffee that’s not always very good. He says Starbucks planned to buy—but then rejected—some shipments of Fair Trade coffee last year, because the beans didn’t meet Starbucks’ quality guidelines. [...] the company makes virtually the same profit, whether it sells beans stamped “Fair Trade” or not.”
- “‘One needs to choose,’ she says slowly, searching for just the right words. ‘You have only so much time in your life, and so you need to choose your issues. You need to choose the things that you want to be passionate about, the things you want to care about, give your money to, give your attention to.’”

The Campaign to Humanize the Coffee Trade:
- In order to get to that meeting, I just mentioned between Denaux and the farmers - a two-hour meeting - we had to drive for 10 hours over bone-crunching mountain roads. That trip didn’t unearth any scandals. But now the Fair Trade coffee movement has a face.

Child victims of coffee trade wars

Global Exchange : Fair Trade Coffee: “To become Fair Trade certified, an importer must meet stringent international criteria; paying a minimum price per pound of $1.26, providing much needed credit to farmers, and providing technical assistance such as help transitioning to organic farming.”

Wednesday, November 26th, 2003

Hey Googlebot!

Racial and Ethnic classifications as an example of classification challenges.

Wednesday, November 26th, 2003

If you want to learn about some of the specific challenges in developing taxonomies, have a look at the racial and ethnic classifications used in the US census. The development, evolution and discussions around this taxonomy highlight many of the problems you can encounter on a smaller scale when developing taxonomies for websites. These problems are inherent to what it means for us to classify. There’s no way around them.

In October 1997, the Office of Management and Budget (in the USA) announced the revised standards for federal data on race and ethnicity. The taxonomy is as follows:

Please choose your race (one or more):
- American Indian or Alaska Native
- Asian
- Black or African American
- Native Hawaiian or Other Pacific Islander
- White
- Some Other Race
Please choose your ethnicity (only one):
- Hispanic
- Non-Hispanic

Hispanics can be of any race (so you can choose Black and Hispanic). The Some Other Race category was introduced in the Census 2000 questionnaires; it was not originally part of the standard taxonomy.

One could write a book about this taxonomy. I’ll try to keep this short and funky.

In 1977, the taxonomy was like this:

Please choose your race (only one):
- White
- Black
- American Indian and Alaskan Native
- Asian and Pacific Islander
Please choose your ethnicity:
- Hispanic
- Non-Hispanic

Back then, the racial categories were considered scientifically valid and mutually exclusive. Obviously, things have changed since.

In 1990, “Other Race” was added, but the biggest change in the revised standards is that people can now choose more than one race. In the 1990 census, half a million people ignored the instructions and checked more than one box. Something had to be done. Imagine being a kid with parents of mixed race.

One result is that data from the 1990 census cannot easily be compared with data from the 2000 census. This is nothing new. Almost every census for the past 200 years has collected racial data differently from the one before it, and extracting racial trends is deeply problematic.

Change in taxonomies is something we need to prepare for. It means we will not always be able to effectively compare data over time. It also means we should avoid building the taxonomies we expect to change (and most will) too deeply into the infrastructure of our websites (say, URLs or database schemas).
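One way to read that advice, sketched in Python: keep category assignments as data, keyed by a taxonomy version, and build URLs from something that never changes. Everything here (the Taxonomy and Item classes, the /items/ URL pattern, the function names) is a hypothetical illustration, not a description of any particular system.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Taxonomy:
    version: str
    labels: Dict[str, str]  # stable category id -> current display label


@dataclass
class Item:
    item_id: int
    title: str
    # Category ids per taxonomy version, stored as data rather than as
    # dedicated columns or URL segments.
    categories: Dict[str, Set[str]] = field(default_factory=dict)


def item_url(item: Item) -> str:
    """The URL depends only on the item id, never on a category name."""
    return f"/items/{item.item_id}"


def labels_for(item: Item, taxonomy: Taxonomy) -> Set[str]:
    """Look up the display labels of an item under one taxonomy version."""
    ids = item.categories.get(taxonomy.version, set())
    return {taxonomy.labels[c] for c in ids if c in taxonomy.labels}
```

When the taxonomy is revised, a new Taxonomy version is added and items are re-assigned under it; old assignments stay available for comparison and no URL has to move.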

Of course, a racial taxonomy is deeply suspect. Scientists these days generally agree that race and ethnicity are social constructions. Humans cannot be categorized in a taxonomy of races based on biological information in a scientifically valid way. However, race continues to be a social reality in the US. It is this social reality that the taxonomy is trying to capture. Since the social reality changes, the categories will continue to change. Are you recognizing any of this in your own work yet?

This is one reason why people are asked to self-categorize. In the past, census enumerators were instructed to report a person’s race based on observation - you can imagine the problems.

Self-categorization of course has many problems: people may perceive their choice of race to have some influence on their future (job availability), which can affect their choice. And many people have only limited awareness of their own genealogy - they may not know what race they are supposed to be.

The race categorizations are heavily discussed and disputed every time they are changed. Many political groups argue for or against certain changes in the taxonomy.

The reason is simple: the categories have an impact on policy. If a certain group isn’t categorized in the taxonomy, they can’t be easily measured, and it becomes much harder to lobby for certain changes that should benefit that group. For example, for the 2000 census many advocacy groups for racial minorities encouraged multiracial people to check only a single race (the minority race). Classification is political, and if you’ve ever worked for a large company trying to implement an intranet, you’ll recognize this.

There is much (much!) more to say, and I feel bad for only touching briefly on such a fascinating topic, so here’s some bed-time reading to get you started:
- Recommendations from the Interagency Committee for the Review of the Racial and Ethnic Standards to the Office of Management and Budget Concerning Changes to the Standards for the Classification of Federal Data on Race and Ethnicity.
- Racial and Ethnic Classifications Used in Census 2000 and Beyond
- Using the New Racial Categories in the 2000 Census

Tuesday, November 25th, 2003

I got Joe’s picture as well.

Monday, November 24th, 2003

Like a look in the future, but happening today: anti-mega: keitai.

Monday, November 24th, 2003

Metadata Generation Research Project:
“The metadata generation research project is developing a model that will facilitate the most efficient and effective means of metadata production by integrating human and automatic processes.”

Monday, November 24th, 2003

Short film wows voters in seconds: “A surreal 15-second black movie comedy about an escapologist has won a short film contest.”

Monday, November 24th, 2003

Orange Cone: A photogeoblog sketch: I like these stories that envision how things might work.

Monday, November 24th, 2003

Sam points out on Afongen that O’Reilly have a Blog This button, for example on this page.

Clicking it shows a popup window that lists some code and suggested text to paste in your blog entry. I don’t like it.

Monday, November 24th, 2003

Bloug: “So a modest proposal: what if everyone involved in content management–the publications, the web sites, the meetings and conferences–banned CMS vendors for, say, one quarter? No vendor exhibitions at meetings, no product mentions on discussion lists, no CMS purchases, no nothing. Just discussion about all there is to content management besides the technologies.”

Lou is right: the CMS discussion is dominated by the CMS vendors - that needs to change.

Monday, November 24th, 2003

Many-to-Many: Otlet: Some ideas die because they are wrong: “The failure of universal subject classification working in concert with the mutable forces of scholarship didn’t happen because that idea fell out of fashion - it was fashionable as recently as 1998, with people being paid fabulous sums of money to pursue it. It failed because it does not work.”

Monday, November 24th, 2003

Simon Willison implemented daily links at the top of his blog. I really like the CSS treatment of the visited links: in a dense list like this it makes sense. The strikethrough links are the ones I just visited:
blogmarks.gif

Monday, November 24th, 2003

A newborn: InformationScienceTheoryWiki

Sunday, November 23rd, 2003

On drawingcenter.org, I saw these directions:

If you are traveling by subway, you can take the trains to the Canal Street station.

In the code, the trains are a bunch of image tags. This can easily be made accessible by adding ALT attributes, but I wanted to try something more semantic (not sure if it’s useful, just for fun, after my BLOCKQUOTE experiments).

My best take was to use SPAN tags and the letters of the trains, but I couldn’t get the CSS to replace the letter with the image… I’d be thrilled if someone could crack this!

Sunday, November 23rd, 2003

James pointed out that the Drawing Center have an exhibition about Mark Lombardi (runs until December 18). Check it out if you are in NYC and interested in social network mapping.
lombardi_01art.jpg

Saturday, November 22nd, 2003

(via Danny Ayers) SchemaWeb - RDF Schemas Directory: “SchemaWeb is a place for developers and designers working with RDF. It provides a comprehensive directory of RDF schemas to be browsed and searched by human agents and also an extensive set of web services to be used by RDF agents and reasoning software applications that wish to obtain real-time schema information whilst processing RDF data.”

Saturday, November 22nd, 2003

I got Tom’s picture as well now!

Saturday, November 22nd, 2003

I found a picture of Dare Obasanjo, still looking for Joe Gregorio, JayT, and Tom Hoffman.

Friday, November 21st, 2003

Center For the Ethnography of Every Day Life: “Before the abstractions of social science, there are people’s stories, the emotional worlds of disappointment and uncertainty, and the brave coping of everyday life. Established in 1998 with a grant from the Alfred P. Sloan Foundation, the Center for the Ethnography of Everyday Life fosters research and training to document the challenges of American working families. Working people, everyday lives explored in the tradition where ethnography and documentary come together.”

Friday, November 21st, 2003

Hey, so it’s ugly, but that’s all my fault. Really. Thanks to the excellent Bloglines service that I’ve been using lately (it just works for me), I’ve got a blogroll!

Friday, November 21st, 2003

Jon is experimenting with automated categorization of blog posts. XML.com: Working with Bayesian Categorizers: “There’s been some discussion in the blog world about using a Bayesian categorizer to enable a person to discriminate along various interest/non-interest axes.”

Friday, November 21st, 2003


Seb’s Open Research: “I’m sure Marc will love the way it’s crafted.” I’m not so sure, but I’d love some feedback. I suck at CSS (just look at the source of this page!).

Thursday, November 20th, 2003

If you want to order prints of your digital pictures online, I can really recommend Shutterfly. You can order 15 prints for free when you sign up, just to try them out, although I found myself ordering a lot more on my first order because it was so easy.

Thursday, November 20th, 2003

Victor coins the taxonomy dance.

Basic level categories

Thursday, November 20th, 2003

I’ve been waiting for someone to write about basic level categories as they relate to information architecture. No luck so far (apart from a 1999 Peterme post in which he says: ‘the trick would seem to be to get people to the basic-level as quickly as possible.’). So I’m picking this up again. There’s gold in them mountains folks!

Cognitive science has been making many discoveries about how humans categorize, like: that categories have fuzzy boundaries, that members of a category may be related to one another without all members having any property in common (this is called family resemblance), that some members of a category may be “better examples” than others (this is called centrality), and most interestingly, that categories are organized into a hierarchy from the most general to the most specific, but the level that is most cognitively basic is “in the middle” of the hierarchy. These categories in the middle are called basic level categories.

For example, “cat” is a basic level category, “feline” or “Siamese cat” are not.

Basic level categories have some characteristics that make them interesting for information architects:

- Things are remembered more readily at basic level.
- People name things more readily at basic level.
- The basic level name for things is learned earliest in childhood.
- Languages have simpler names at basic level.

In short, people naturally, at a deep cognitive level, deal more easily with basic level categories.

It is important to understand that basic level categories are not just easier on a superficial level, because they are shorter or something. Cognitive scientists say that basic level categories are cognitively real. They seem to be ingrained in the human mind somehow, in a way that makes it easier for us to deal with basic level categories.

Does this mean that information architects should be aware of the basic levelness of the categories they use? I think so, but I’m not sure how exactly. Remember that basic level categories are processed more easily, faster. That has got to mean something to us!

The only research I found about basic level categories in information retrieval is “Using ‘basic level categories’ to retrieve multimedia from the World-Wide-Web”, Hoenkamp, E.C.M. (1999), Proceedings of the 21st Annual Conference of the Cognitive Science Society, 796.

Other interesting things I came across while doing research for this:
- One user’s classes are another user’s attributes.
- To test whether a category member is more or less central to the category, you can ask a series of questions and compare how long it takes people to answer.
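The response-time test in the last bullet could be tallied along these lines. This is just a sketch in Python: the function name rank_by_centrality and the input shape (member name mapped to a list of answer times in seconds) are my own assumptions, and it takes the simple reading that a faster average answer suggests a more central member.

```python
from statistics import mean
from typing import Dict, List, Tuple


def rank_by_centrality(timings: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Order category members from fastest to slowest mean response time."""
    # Members without any recorded answers are skipped rather than guessed at.
    averages = {member: mean(times) for member, times in timings.items() if times}
    return sorted(averages.items(), key=lambda pair: pair[1])
```

You would feed it the measured times for questions like “is X a member of this category?”, one list per X, and read the ranking top to bottom as most to least central. A real study would of course need more care (error rates, outliers, differences between participants).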

Learn more:
- It’s all Eleanor Rosch’s fault, well explained by George Lakoff in Women, Fire, and Dangerous Things.
- More goodies.

Thursday, November 20th, 2003

Boundary objects are everywhere. Back in October, Denham Grey wrote about Boundary objects and KM. Judith Meskill adds a long and yummy list of research links.

Thursday, November 20th, 2003

Interaction design is discovering boundary objects: Shared boundaries. (Interaction Design Hub): “It is quite obvious that ours is a “community of interest” rather than a “community of practice”, and that “boundary objects” abound.” Sweet.

Wednesday, November 19th, 2003

A keeper: How to shop for a house.

Wednesday, November 19th, 2003

It just occurred to me we should have a tag to indicate where in a quote we are doing this: “[the program] crashed twice” (it’s not paraphrasing - what is this called?)

Wednesday, November 19th, 2003

Search Engine Decoder: who provides what information to who?

Wednesday, November 19th, 2003

Good Experience - The ROSE framework: “Business results are metrics that the CEO can understand.” Must be a typo!

Wednesday, November 19th, 2003

Kottke is redesigning his site, and trying out giving different types of content (book reviews, links, …) a different look on his homepage.

Jason identifies 5 content types on his site:

- Movie Review
- Book Review
- Remaindered Links (“hey look at this thing”)
- A Comment (a comment posted on another site)
- A Regular Post (Jason says this means “here’s what I think about this thing”. It’s really an ‘everything else’ category - classify here if a post fits in none of the other categories.)

Any given post can be easily classified in 1 and only 1 of these 5 categories - there are no obvious overlaps, especially if we treat A Regular Post as an ‘everything else’ category. No post will belong to more than 1 content type.

Next, apart from fields that are generic to all content (author, creation date, …), different content types each have different fields, “microcontent-specific entry fields”. For example, a movie review has a title, a link, a rating, a photo, and some text.
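Here is a rough Python sketch of how content types with their own fields might be checked in a weblog tool. The movie review fields come from the example above; everything else is an assumption made for illustration: the identifiers, the idea that the other types need only a text field, and the rule that unknown types fall back to the regular post.

```python
from typing import Dict, Set

# The five content types identified on the site (identifiers are made up).
CONTENT_TYPES = (
    "movie_review", "book_review", "remaindered_link", "comment", "regular_post",
)

# Fields generic to all content.
COMMON_FIELDS: Set[str] = {"author", "created"}

# Type-specific fields; only the movie review is spelled out in the discussion.
TYPE_FIELDS: Dict[str, Set[str]] = {
    "movie_review": {"title", "link", "rating", "photo", "text"},
}

# Assumption: types without a listed field set need just some text.
DEFAULT_FIELDS: Set[str] = {"text"}


def normalize_type(kind: str) -> str:
    """Map anything unknown to the 'everything else' category."""
    return kind if kind in CONTENT_TYPES else "regular_post"


def missing_fields(kind: str, entry: Dict[str, object]) -> Set[str]:
    """Return the required fields that an entry of this type does not have yet."""
    required = COMMON_FIELDS | TYPE_FIELDS.get(normalize_type(kind), DEFAULT_FIELDS)
    return required - set(entry)
```

Because every post has exactly one type and the regular post catches the rest, the lookup never has to deal with overlaps; an entry form could use missing_fields to decide which inputs to show.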

The discussion talks about how this can be supported by weblog tools, while staying generic enough. I’ve been doing this semantic stuff for the past year at my job, and it’s hard, but doable and useful.