Archive for April, 2007

1-page websites

Monday, April 30th, 2007

I realized today that the apps I use every day are all 1-page websites. Gmail. Bloglines. Google search. Actually, Google is the king of the 1-page websites - almost all their products consist of only 1 page.

Twitter is 1-page, because there is 1 page that you spend 90% of your time on. Flickr is multipage, although its main function (viewing photos) is 1-page. Digg is 1 page. Mmm…

Monday, April 30th, 2007

But what bothers me about these immersive worlds (isn’t there a better name?) is that they’re all supercommercial. Why would that be?

Monday, April 30th, 2007

Damn, these “immersive worlds” in the browser are popular.

Monday, April 30th, 2007

A good blogpost on what went wrong internally with the new Yahoo Mail.

links for 2007-04-29

Sunday, April 29th, 2007

Sunday, April 29th, 2007

You know this problem in IA when you design sites without real content, and before you know it there are loads of excerpts all over the page that don’t really mean anything, ending in 3 dots “…”? It leads to a homepage like this one for example, just lots of excerpted content that doesn’t really do much for anyone.

I got a word for that. Excerptitis. Maybe you have a better one?

Sunday, April 29th, 2007

Comments work :) It’s been a while!

Sunday, April 29th, 2007

Ah, I messed up the site, but now it works again, and THE COMMENTS WORK! (excuse the all-caps)

The comments work! Yey!

Sunday, April 29th, 2007

April was not only the hottest and the driest April ever in Belgium, but most likely (if there’s no rain tomorrow) it will also be the first month ever without any rain at all.

So far so good, life in Belgium :) The weather has been incredible.

A bunch of presentations on scaling websites: twitter, Flickr, Bloglines, Vox and more.

Sunday, April 29th, 2007

(I changed the title because “top 10” posts are indeed sucky. Also: looking for my Colombia travel site?)

By the way, here’s the RSS feed of my blog, in case you’d like to subscribe.

I always love to read scaling discussions, especially about popular web apps, and there are loads of them out there. Here’s my overview of the best. By the way, the best book on scaling apps I’ve ever read is Building Scalable Websites, by Cal Henderson (the Flickr guy).

It’s dog-eared on my desk, and taught me about sharding (which I used extensively for mefeedia). Sharding is when you cut a really big table into pieces, so you can put those on separate servers. It means you have to make changes to your code, and your database isn’t so database-y anymore, but it works. For example, online games use sharding to grow their virtual worlds, because there’s no way they could serve all that information from 1 db cluster.
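To make the idea concrete, here is a minimal sketch of routing rows to shards by key. It is an illustration under assumptions, not how mefeedia or Flickr actually did it: the `SHARDS` list, the `shard_for`, `save_user` and `load_user` names and the CRC32 hash are all made up for the example, and plain dicts stand in for separate database servers.

```python
import zlib

# Hypothetical shard map: in a real setup each entry would be a connection
# to a separate database server. Plain dicts stand in for them here.
SHARDS = [{}, {}, {}, {}]


def shard_for(user_id: str) -> int:
    """Map a key to a shard index with a stable hash (CRC32)."""
    return zlib.crc32(user_id.encode("utf-8")) % len(SHARDS)


def save_user(user_id: str, row: dict) -> None:
    """Store the row on the one shard that owns this key."""
    SHARDS[shard_for(user_id)][user_id] = row


def load_user(user_id: str):
    """Look the row up on the owning shard; None if it was never saved."""
    return SHARDS[shard_for(user_id)].get(user_id)


save_user("alice", {"videos": 12})
save_user("bob", {"videos": 3})
print(load_user("alice"))
```

This is also where the "not so database-y" part shows: a query across all users now has to visit every shard, and with plain modulo hashing, changing the number of shards moves most keys around.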

Scaling Twitter with Ruby.

Twitter is hot today, and they ran into some serious scaling problems, although the app itself is quite simple. It consists of messages of at most 140 characters. The lessons are the same as for most apps: memcache like crazy, and optimize the database (the biggest bottleneck most of the time).

Also, Ruby on Rails scales pretty much the same way as PHP and other similar languages: shared nothing architecture. Shared nothing means that there is no single thing that is shared by all servers, since that would become a bottleneck.

PHP, for example, has shared nothing architecture out of the box, except perhaps for sessions, but that’s easily solved by storing sessions in a db (which then has its own scaling approach) and not in the filesystem. Here’s a talk by Rasmus Lerdorf that explains scaling with PHP5. (Here’s the mp3 audio recorded by Niall Kennedy).
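As a sketch of the sessions-in-a-db idea: if session data lives in a database that every web server can reach, any server can handle any request. The example below assumes an in-memory SQLite database as a stand-in for that shared database, and the `sessions` table, `save_session` and `load_session` are names invented for the illustration (this is not PHP's session handler API).

```python
import json
import sqlite3
import uuid

# In-memory SQLite stands in for a database that every web server can reach.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, data TEXT NOT NULL)")


def save_session(session_id, data):
    """Insert or overwrite the session row."""
    db.execute(
        "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
        (session_id, json.dumps(data)),
    )
    db.commit()


def load_session(session_id):
    """Return the stored dict, or an empty one for unknown sessions."""
    row = db.execute(
        "SELECT data FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}


sid = uuid.uuid4().hex
save_session(sid, {"user": "alice"})
print(load_session(sid))
```

The web servers themselves keep no state between requests, which is the whole point of shared nothing. The session table then becomes one more thing to cache or shard when it gets big.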

Blaine Cook made this presentation:

Scaling Flickr.

Cal Henderson wrote the above book, and also has a good presentation: Scaling Flickr slides as PDFs.

One of the problems you get into when scaling something like Flickr, where you store LOTS of stuff, is that you can’t just store that on a hard drive anymore: it’s not big enough. Apart from just using Amazon’s S3 service (which rocks - I used it for mefeedia and I know lots of startups who use it), there are other solutions. A good presentation of that by Cal is this one:

Cal (he’s a busy dude) also made this presentation about scaling web apps, generally:

John Allspaw (flickr plumbr) also has a good presentation about scaling Flickr:

Scaling LiveJournal.

LiveJournal was one of the first social networks, before that word meant anything, and they partly invented how to scale standard PHP/MySQL/Apache apps. They developed memcached, which is now used by almost everyone who wants to scale their site.

Brad Fitzpatrick has a good set of slides on how they evolved the service, here’s a PDF version. And here’s the slideshow embedded:

Kevin Rose mentioned this was “the bible for scaling Digg” - and I think quite a few other web apps are based on this.

Six Apart.

The LiveJournal guys with all their scaling expertise were acquired by Six Apart, and they soon launched Vox. And of course, here’s a presentation on making Vox scalable:

Bloglines.

Bloglines’ scaling problems were slightly different from your average web app, since they are an aggregator of feeds. That means they have billions of blogposts they have to keep and serve to users, and that creates its own scaling problems. The Bloglines approach was to skip the database and just store all that stuff in a special filesystem. Today it’d be easier to do this since there are a few filesystems that do that, or you could just go with S3 again. Mark Fletcher (who also sold Onelist to Yahoo, which is now Yahoo Groups) has given a few talks on scaling Onelist and Bloglines: here’s the mp3 audio version, and here’s the PDF of that talk. And a text transcript.

Last.fm

Last.fm is one of the aggregation-type apps: they gather a lot of data about what music you listen to. Similarly to Bloglines, that causes its own scaling problems:

Slideshare.

All the slides in this post are hosted by Slideshare, an incredible service by my fellow information architect Rashmi Sinha and team. When I found out about the project, I emailed her: “brilliant and so obvious once you think of it”. Like many startups, they use S3 to serve their content, and they have the obligatory yet interesting slides to explain how:

I haven’t linked to lots of good thinking about scaling, or to technical resources and stuff. But the presentations should get you going in the world of memcached, perlbal, shared nothing and federation :) Enjoy!

PS: See also How I Unexpectedly Found Myself Doing Consulting For Startups (this is a post on my “professional” site. I haven’t been able to figure out when to post here or there, any tips on that?).

Update: more presentations.

Another great talk in video this time, from the MySQL Bay Area Community Meetup, May 2007:

Finally, Dan Pritchett has a good presentation on scaling eBay (PDF). 26 billion SQL queries per day! 300+ new features per quarter! 4 architecture versions since 1998 and some pretty crazy scaling of the search.

New: presentation on how Facebook uses PHP APC cache (PDF).

A talk on YouTube scalability: “In the summer of 2006, they grew from 30 million pages per day to 100 million pages per day, in a 4 month period. Thumbnails turn out to be surprisingly hard to serve efficiently.” (I ran into this with mefeedia too, luckily Amazon S3 came to the rescue by then.) YouTube uses Python, Apache, MySQL, Memcached.

NEW: Front end scaling is important too, and often ignored. Here’s a good presentation from the Yahoo guys:

Sunday, April 29th, 2007

Microsoft’s profits continue to be staggering: with a quarterly revenue of $14.4 billion, it takes Microsoft only:

  • 10 hours or so (yes, hours!) to exceed Red Hat’s quarterly net income of $20.5 million.
  • four days to exceed Research In Motion’s quarterly net income of $187.9 million.
  • four days to exceed Starbucks’ quarterly net income of $205 million.
  • one week to exceed Nike’s quarterly net income of $350.8 million.
  • two weeks to exceed McDonald’s quarterly net income of $762 million.
  • two weeks to exceed Apple’s quarterly net income of $770 million.
  • 18 days to exceed Google’s quarterly net income of $1 billion.
  • 23 days to exceed Coca-Cola’s quarterly net income of $1.26 billion.
  • five weeks to exceed IBM’s quarterly net income of $1.85 billion.
  • 10 weeks to exceed Wal-Mart’s quarterly net income of $3.9 billion.
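A quick sanity check of the arithmetic behind this list, using only the figures quoted above: each bullet implies a minimum amount Microsoft has to bring in per day. The 91-day quarter is an assumption, and so is reading "10 hours or so" as exactly 10 hours.

```python
DAYS_PER_QUARTER = 91          # assumption: a quarter is about 13 weeks
QUARTERLY_REVENUE = 14_400     # in millions of dollars, from the post

# (company, days quoted, net income in millions), copied from the list
claims = [
    ("Red Hat", 10 / 24, 20.5),
    ("Research In Motion", 4, 187.9),
    ("Starbucks", 4, 205),
    ("Nike", 7, 350.8),
    ("McDonald's", 14, 762),
    ("Apple", 14, 770),
    ("Google", 18, 1000),
    ("Coca-Cola", 23, 1260),
    ("IBM", 35, 1850),
    ("Wal-Mart", 70, 3900),
]

revenue_per_day = QUARTERLY_REVENUE / DAYS_PER_QUARTER
# Daily rate needed to reach each company's net income in the quoted time
implied = {name: amount / days for name, days, amount in claims}

for name, rate in implied.items():
    print("%-20s needs at least $%.1fM per day" % (name, rate))
print("revenue per day: $%.1fM" % revenue_per_day)
```

Every implied rate lands between roughly $47M and $56M per day, about a third of the $158M per day that the revenue figure gives. That suggests the list was worked out from quarterly profit rather than from the $14.4 billion in revenue, though that reading is an inference from the numbers, not something the quoted text states.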

Saturday, April 28th, 2007

This looks like a good PHP S3 API.

Friday, April 27th, 2007

Google investing 250 million Euro in a huge Belgian data center.

Friday, April 27th, 2007

In the continuing saga of illegible domain names, I’ve recently purchased wayut.com and xofy.net. Once you know what they are they’re actually easy to remember. 2 possible upcoming projects. Wanna guess?

Friday, April 27th, 2007

What’s wrong with the workhack todo list: it makes todo items disappear once they’re done. I like to see what I’ve accomplished, to get that feeling of satisfaction, of knowing you’ve done at least *something* the past 2 days.

Friday, April 27th, 2007

you have better odds of winning $5M in the NY Lottery than you do of selling your company to Google (or Yahoo) - in 2005

Friday, April 27th, 2007

Seems that prices for good developers in Bangalore are skyrocketing. That’s a good thing.

Friday, April 27th, 2007

Om: But there was a lesson learned: never be the me-too player in your business category.

Thursday, April 26th, 2007

http://www.thechickentest.com/vid/IASummit2007/Information_Architecture_and_Ethical_Design.mp3

Thursday, April 26th, 2007

http://www.thechickentest.com/vid/IASummit2007/Real_Information_Architecture_New_Mighty_Deeds.mp3

Thursday, April 26th, 2007

http://www.thechickentest.com/vid/IASummit2007/Using_Search_Analytics_to_Diagnose_Whats_Ailing_your_Information_Architecture.mp3

Thursday, April 26th, 2007

http://www.thechickentest.com/vid/IASummit2007/ProjectTouchstones-JessMcMullin.mp3

Thursday, April 26th, 2007

http://www.thechickentest.com/vid/IASummit2007/Systems_Thinking_Rich_Mapping_and_Conceptual_Models.mp3

Thursday, April 26th, 2007

http://www.thechickentest.com/vid/IASummit2007/Information_Architecture_and_Ethical_Design.mp3 I’m posting these links so they show up in my feed and I can hence get them easily in iTunes.

Wednesday, April 25th, 2007

This is a video of how RSS works, in rather plain English.

Tuesday, April 24th, 2007

A good explanation of the problem with databases. Reading from and writing to memory is much faster. Things are changing in db land - where did I read that MySQL will now have a table type that’s basically an RSS feed?

Tuesday, April 24th, 2007

Actually a good post on creating a killer team for a startup.

Tuesday, April 24th, 2007

http://startupideatr.com: startup ideas. Ideas really are almost 100% worthless :)

Tuesday, April 24th, 2007

I hear mefeedia is doing great, numbers continue to grow fast. It’s very satisfying to see that the people I sold it to are continuing to build it out in the original spirit.

Tuesday, April 24th, 2007

Here’s an interesting seminar title (via Jon Udell): How to Read 100 Million Blogs (and How to Classify Deaths without Physicians)

Tuesday, April 24th, 2007

The web globalization report card 2007: “Among the many trends we noticed: The average number of languages supported by the 200 sites has increased to 18, up from 15 last year. And the average number of languages supported by the top 20 Web sites was 45. It used to be you could support 10 languages and pretty much stand apart from the pack. Not any more.”

Monday, April 23rd, 2007

Mm, I upgraded my WordPress but comments still don’t work. Any tips?

Monday, April 23rd, 2007

Wow - shooting victim on YouTube. Reporting really has changed. For better and for worse, at the same time.

Sunday, April 22nd, 2007

Lots of great links about the architecture of conversation.

Sunday, April 22nd, 2007

And I’m still doing well for the search terms “how to make a documentary”.

Sunday, April 22nd, 2007

Hey, I’m nr 8 in Google for Peter - just below Peter Gabriel :)

Rails flash in PHP

Sunday, April 22nd, 2007

A friend told me about Rails’ flash concept, which makes it very easy to provide “success” messages and such after submitting a form.

This is some quick code that does something like that (assuming you have a session running) in PHP: (really just a test of posting code)

/**
 * Holds variables for a single page load and then destroys them (hence the "flash"),
 * so the next page load can show them in a template, for example.
 * Handy for returning a success or error message after submitting a form.
 * Assumes session_start() has already been called.
 */
class sessionFlashBucket
{
    /**
     * Write a new variable to the FlashBucket with a value.
     */
    function write($var, $val)
    {
        // put this var in the user's $_SESSION
        $_SESSION[$var] = $val;
        return TRUE;
    }

    /**
     * Read and empty a variable from the FlashBucket ('message' by default).
     * Returns NULL when nothing was set.
     */
    function read($var = "message")
    {
        // get the variable from the session, if it is there
        $result = isset($_SESSION[$var]) ? $_SESSION[$var] : NULL;

        // delete the variable from the FlashBucket
        unset($_SESSION[$var]);
        return $result;
    }
}

Sunday, April 22nd, 2007

Kiva lets you loan money ($200, $500) to a specific entrepreneur in a developing country with a specific plan. 100% of the loan goes to the entrepreneur.

As for, “will I get my money back?”:

“Your Kiva.org loan is a low-risk loan. Microfinance loans worldwide are generating repayment rates of 97%. To date, Kiva.org’s repayment rate is 100%.”

Sunday, April 22nd, 2007

You can access my blog on your mobile more easily here.

Sunday, April 22nd, 2007

I just realized, the web/internet is becoming varied enough to be truly diverse. In the beginning, we all went to the same websites (Yahoo) and used the same services (email). Now, certain groups may use blogs and feedreaders, others may use Myspace and IM, and they never have to meet. And the functionality on the net will probably become more and more varied, so that there will be more space for different groups of people using different types of services.