Tahoe Blog

Jul 10

Visual Studio SP1 takes forever to install

This is crazy. This is my favorite part:

Because VS 2005 SP1 is so large, it takes a long time - typically around 10 minutes - to load the entire image into memory in order to generate a hash over the image.

Update: Now the patch is just sitting there saying “Time Remaining: 0 seconds.” I think Microsoft has lost their collective mind.

(via christopher baus.net)


Jul 06

S3 Twitter: What is needed is quick hash append

I dug up a couple of interesting posts from ‘Al’ at Folknologist (sorry, I can’t find Al’s full name on his blog).

First is a comment on the circleshare blog regarding Twitter’s database scaling issues:

The big problem is the inserts (if the backend is a db), every tweet has to be inserted. Thus even if you have a fast messaging (in memory) the write that accompanies it is relatively slow. In such cases you need some super fast hash append system rather than a database, something that literally just writes to a log like file. (Deletes can be handle by null writes on existing keys).

If somebody has a scalable appender like this in code let me know as I could do with one, especially if I can get it working with S3

Yes indeed, if someone can produce a reliable appender in a cost-effective way using S3, I’d love to see it as well. After some research into S3, though, I don’t think it is feasible. Unlike GFS, which supports record appends, S3 does not.
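To make the "hash append" idea concrete, here is a minimal sketch of what I take Al to mean: a log that is only ever appended to, an in-memory hash index pointing at the latest record for each key, and deletes written as null records. The class name, record layout, and the use of an in-memory stream are all my own assumptions for illustration. Nothing here talks to S3, which is exactly the part that looks hard.

```python
import io
import struct


class AppendLog:
    """Append-only key/value log with an in-memory hash index (illustrative only)."""

    _HEADER = struct.Struct(">II")  # key length, value length
    _TOMBSTONE = 0xFFFFFFFF         # value length that marks a delete ("null write")

    def __init__(self, stream=None):
        # Any seekable binary stream works; BytesIO stands in for a log file.
        self._stream = stream if stream is not None else io.BytesIO()
        self._index = {}  # key -> (offset of value, length of value)

    def put(self, key, value):
        self._append(key, value)

    def delete(self, key):
        # Nothing is rewritten: a delete is just one more appended record.
        self._append(key, None)

    def get(self, key):
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self._stream.seek(offset)
        return self._stream.read(length)

    def _append(self, key, value):
        key_bytes = key.encode("utf-8")
        self._stream.seek(0, io.SEEK_END)
        value_len = self._TOMBSTONE if value is None else len(value)
        self._stream.write(self._HEADER.pack(len(key_bytes), value_len))
        self._stream.write(key_bytes)
        if value is None:
            self._index.pop(key, None)
        else:
            offset = self._stream.tell()
            self._stream.write(value)
            self._index[key] = (offset, len(value))
```

Every operation is a sequential write at the end of the log, which is why this is fast on a local disk. It also shows the dependency on an append primitive: without one, each write would mean uploading a whole new object.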

Second is the call for a database service for AWS.

AWS is built around an expectation that storage takes place using the highly redundant/reliable S3 infrastructure. This of course makes sense except in the case where one is using a database for storage as opposed to files.

There’s the real kicker. I can’t think of many significant web applications that don’t need at least some database services. Even if an app can make good use of EC2, for mass video encoding say, at some point the application must store something in a database, which makes EC2 and S3 solutions to only a subset of a web site’s problems.

(via christopher baus.net)


Jul 03

Twitter on S3: S3 objects as lists

S3 as a data store was still on my mind as I rolled out of the fog and into Marin on my bike ride today. Let me take the thought experiment a bit further. Let’s assume tweets could be reliably batched into blocks and written to S3 objects. Just to put a number on it, let’s say 1000 tweets are stored in one S3 object.

If every tweet in one object is by a different user, how does the application iterate over one user’s tweets? There needs to be an index that points to the location of the user’s next tweet. One option would be to store the previous tweet’s location (object name + offset) with the current tweet. But like a linked list, the location of the head of the list would need to be stored and updated with every tweet. As I mentioned, it would be cost prohibitive to store this in an S3 object (unless again this data could be effectively cached and flushed periodically).
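Here is a rough sketch of that linked-list layout, just to see the moving parts. A plain dictionary stands in for S3, and the class, the object naming scheme, and the fetch counter are all hypothetical. Each tweet carries the location of the same user's previous tweet, and a separate table of head pointers has to change on every post.

```python
from typing import Dict, Iterator, List, NamedTuple, Optional, Tuple

Location = Tuple[str, int]  # (object name, index within that object)


class Tweet(NamedTuple):
    user: str
    text: str
    prev: Optional[Location]  # location of this user's previous tweet


class TweetStore:
    def __init__(self, block_size: int = 1000) -> None:
        self.block_size = block_size
        self.objects: Dict[str, List[Tweet]] = {}  # stand-in for S3 objects
        self.heads: Dict[str, Location] = {}       # updated on every post
        self._pending: List[Tweet] = []            # block not yet written out
        self._next_block = 0
        self.fetches = 0                           # simulated round trips

    def _block_name(self) -> str:
        return f"tweets-{self._next_block:08d}"

    def post(self, user: str, text: str) -> None:
        location = (self._block_name(), len(self._pending))
        self._pending.append(Tweet(user, text, self.heads.get(user)))
        self.heads[user] = location
        if len(self._pending) == self.block_size:
            self.flush()

    def flush(self) -> None:
        # Write the pending block as one immutable object.
        if self._pending:
            self.objects[self._block_name()] = self._pending
            self._pending = []
            self._next_block += 1

    def timeline(self, user: str) -> Iterator[str]:
        # Assumes flush() has been called; pending tweets are not readable here.
        location = self.heads.get(user)
        while location is not None:
            name, index = location
            block = self.objects[name]  # one object fetch per tweet
            self.fetches += 1
            tweet = block[index]
            yield tweet.text
            location = tweet.prev       # next fetch depends on this one
```

Walking a timeline yields the newest tweet first, and the `fetches` counter goes up by one for every tweet read, since nothing here caches a block between steps.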

Also, with such a strategy, the application would need to do a significant amount of caching to achieve reasonable performance. Iterating over a list would require a round trip to S3 for each tweet, and those requests could not be pipelined, because the location of each tweet is only known once the one after it has been read.

The other problem with this strategy is that deletes are expensive. Because S3 objects cannot be updated in place, changing the list pointers to bypass the deleted record would require reading the entire object, updating the pointer, and writing the whole object back out to S3. Plus it would require some sort of locking strategy in case the object was being updated by two users simultaneously. That gets pretty ugly quickly.
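A small sketch of that read-modify-write cycle, again against an in-memory stand-in rather than S3 itself. The store class, the JSON record layout, and especially the version check on `put` are assumptions of mine: the version check is one possible form of the "locking strategy", and I am not claiming S3 offers anything like it.

```python
import json


class VersionConflict(Exception):
    """Raised when an object changed between our read and our write."""


class FakeObjectStore:
    """In-memory stand-in for whole-object GET/PUT; this is not the S3 API."""

    def __init__(self):
        self._data = {}          # name -> (version, body bytes)
        self.bytes_read = 0
        self.bytes_written = 0

    def get(self, name):
        version, body = self._data[name]
        self.bytes_read += len(body)
        return version, body

    def put(self, name, body, expected_version=None):
        current = self._data.get(name, (0, b""))[0]
        # Hypothetical optimistic lock: refuse the write if someone got in first.
        if expected_version is not None and expected_version != current:
            raise VersionConflict(name)
        self._data[name] = (current + 1, body)
        self.bytes_written += len(body)


def repoint(store, name, index, new_prev):
    """Change one record's 'prev' pointer so the list skips a deleted tweet."""
    version, body = store.get(name)       # read the entire object
    records = json.loads(body)
    records[index]["prev"] = new_prev     # the only field that actually changes
    store.put(name, json.dumps(records).encode("utf-8"),
              expected_version=version)   # write the entire object back
```

Changing one pointer moves the whole object across the wire twice, and a conflicting writer forces a retry of the full cycle.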

But deletes are kind of an oddball operation with Twitter. Once you tweet something you can delete it from your history, but you’ve already broadcast it to the world.

Well, that line of thought raises more questions than answers.

(via christopher baus.net)