Matt Bamberger - Five Million PCs

Five Million PCs

Thu, 01/19/2006 at 23:35

That's a lot of computing power

I recently made the pretty standard claim that the computational capacity of the human brain is equivalent to that of 5,000,000 modern PCs. There are probably many ways of building an AI that require a great deal less power than that. However, it's very likely that this level of power will be readily available within a few decades, and it's interesting to think about what kind of architecture you might come up with if you were determined to burn 5,000,000 PCs' worth of processing power on a single AI.

I present one possible high-level architecture below. It's somewhat flippant, and isn't meant as a serious, detailed design. Rather, my intention is to point out that we're talking about an incredibly huge amount of power, and that certain interesting architectural opportunities open up when you have that kind of budget.

This isn't an entirely abstract exercise: the cost of this kind of setup is currently about $5 billion (roughly $1,000 per PC). However, you can make a plausible case that 10 years from now, you'll be able to get this kind of power for $5 million. If you can cut your power needs by a factor of 10 (which still leaves you with a pretty obscene amount of power), you're looking at half a million dollars to put together a system similar to this one (or one identical to it that runs at 1/10th the speed).
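The cost figures above can be checked with a few lines of arithmetic. This is only a sketch in Python: the variable names are mine, and the per-PC price and the doubling time are derived from the post's numbers rather than stated in it.

```python
import math

PCS = 5_000_000                  # PCs in the full system (from the post)
COST_NOW = 5_000_000_000         # dollars today (from the post)
COST_IN_TEN_YEARS = 5_000_000    # dollars a decade out (from the post)

# Implied price of a single PC today.
price_per_pc_now = COST_NOW / PCS

# Implied improvement in price-performance over the decade.
improvement = COST_NOW / COST_IN_TEN_YEARS

# How often price-performance would have to double to get there.
years_per_doubling = 10 / math.log2(improvement)

# A system with one tenth of the power, bought at the future price.
reduced_cost = COST_IN_TEN_YEARS / 10

print(f"${price_per_pc_now:,.0f} per PC today")
print(f"{improvement:,.0f}x cheaper in ten years")
print(f"doubling roughly every {years_per_doubling:.2f} years")
print(f"${reduced_cost:,.0f} for a tenth of the power")
```

Put another way, the "plausible case" amounts to assuming that price-performance doubles about once a year for ten years running.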

I'm not saying that throwing horsepower at AI makes it easy, but I think it's worth thinking about whether certain classes of problems become more tractable with enough horsepower. If so, the reality is that you might reach the finish line faster by building for 2015 hardware rather than wasting engineer cycles trying to cram your AI into 2005 hardware.

In any case, let's see what 5,000,000 PCs buys you:

Low-level visual cortex: 1,000,000 PCs

The beginning of our visual processing system involves lots of low-level processing, just like the human visual cortex. Let's give our system a 10 megapixel camera (which probably outperforms the human eye, if you lay out the pixels correctly). We'll analyze our visual input in 32x32 pixel cells, doing things like edge and motion detection. If we dedicate a PC to each cell, it'll take about 10,000 PCs to process the whole visual stream. Let's have 20 different channels (for things like luminance, red/green color, blue/yellow color, slow motion, fast motion, etc.), and give each channel 5 tiers of processing. At 10,000 PCs per tier, that comes to 10,000 x 20 x 5 = 1,000,000 PCs.
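Here is the same budget as a quick Python sketch. The constant names are my own, and I'm assuming the cells tile the image without overlap, which the post doesn't actually specify.

```python
PIXELS = 10_000_000      # 10 megapixel camera
CELL_SIDE = 32           # cells are 32x32 pixels
CHANNELS = 20            # luminance, red/green, blue/yellow, motion, ...
TIERS = 5                # tiers of processing per channel
PCS_PER_TIER = 10_000    # the post's rounded one-PC-per-cell figure

# Exact number of non-overlapping cells, before rounding up to 10,000.
exact_cells = PIXELS / (CELL_SIDE * CELL_SIDE)

# One PC per cell, per channel, per tier.
low_level_visual_pcs = PCS_PER_TIER * CHANNELS * TIERS

print(f"{exact_cells:,.3f} cells, rounded to {PCS_PER_TIER:,}")
print(f"{low_level_visual_pcs:,} PCs for the low-level visual cortex")
```

The exact cell count comes out a little under 10,000, so the rounded figure leaves a couple of hundred PCs of slack per tier.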

High-level visual cortex: 1,000,000 PCs

Now that we've got a good stream of low-level data coming in, we'll want to do something with it.

  • Low-level geometry engine: 100,000 PCs. Let's turn our low-level visual data into some plausible triangle-level geometry. Let's dedicate a PC to each 32x32 pixel cell, using 10,000 PCs. Let's stack that system up ten tiers deep, just because we can.
  • Mid-level geometry engine: 500,000 PCs. Let's say a given visual scene contains 100,000 simple surfaces that we can extract from the low-level geometry. Let's dedicate a PC to each one, and stack that system 5 levels deep.
  • High-level geometry engine: 100,000 PCs. Our 100,000 surfaces comprise 10,000 potential "objects" in the world. Let's allow 10 variants of each object, and allot 1 PC to each one.
  • Shape recognizers: 100,000 PCs. Let's say you know how to recognize 100,000 things. Let's dedicate a PC to each one, so you have one PC dedicated to looking for pumpkins in the object stream, and another dedicated to looking for silver dollars, and so on.
  • Object tracking: 100,000 PCs. Once we've identified 10,000 objects in the scene, let's allocate 10 PCs to keeping an eye on each one and noticing any changes or relevant behaviors.
  • Scene analysis: 100,000 PCs. Let's pick the 100 most interesting objects in the scene, and allot 10 PCs to observing the interaction between each pair of objects (about 10,000 pairs, if you count each direction separately). So we have 10 PCs thinking full-time about the relationship between the apple and the table, and another 10 thinking about the interaction between the apple and the plate, and so on.
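The six line items above can be tallied in a short Python sketch, which also counts the object pairs in the scene analysis step. The dictionary keys and variable names are mine. Whether a pair counts once or once per direction is my reading, not something the post spells out.

```python
import math

# PCs per component, built from the multipliers given in each bullet.
high_level_visual = {
    "low-level geometry": 10_000 * 10,    # 10,000 cells, ten tiers deep
    "mid-level geometry": 100_000 * 5,    # 100,000 surfaces, 5 levels deep
    "high-level geometry": 10_000 * 10,   # 10,000 objects, 10 variants each
    "shape recognizers": 100_000,         # one PC per recognizable thing
    "object tracking": 10_000 * 10,       # 10 PCs per tracked object
    "scene analysis": 100_000,            # budget as stated in the post
}
high_level_total = sum(high_level_visual.values())

INTERESTING_OBJECTS = 100
PCS_PER_PAIR = 10

# Apple/table counted once.
unordered_pairs = math.comb(INTERESTING_OBJECTS, 2)
# Apple-on-table and table-under-apple counted separately.
ordered_pairs = INTERESTING_OBJECTS * (INTERESTING_OBJECTS - 1)

print(f"high-level visual cortex: {high_level_total:,} PCs")
print(f"unordered pairs: {unordered_pairs:,} -> {unordered_pairs * PCS_PER_PAIR:,} PCs")
print(f"ordered pairs:   {ordered_pairs:,} -> {ordered_pairs * PCS_PER_PAIR:,} PCs")
```

Counting each pair once uses only about half of the scene analysis budget. Counting both directions lands within 1% of the 100,000 figure.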

Driving the robot: 1,000,000 PCs

I could do a detailed breakdown of how you could spend 1,000,000 PCs running the body, but I don't really feel inclined to do so, and I bet you can work it out ("10 PCs to coordinate the middle knuckle of the left pinkie finger, and 10 PCs to coordinate each inch of gut, and...").

Language processing: 500,000 PCs

  • Low-level auditory processing: 10,000 PCs. Let's allocate 10,000 PCs to low-level linguistic processing of the auditory stream (in addition to basic sound processing, which is covered under "driving the robot").
  • Phoneme recognition: 10,000 PCs. Let's say there are 1,000 phonemes (there aren't), and let's dedicate 10 PCs to listening for each one in the input stream.
  • Listemes: 200,000 PCs. If you know 200,000 listemes (i.e., words, more or less), you can dedicate a PC to each one. It's in charge of listening for that word in the input stream, deciding whether it's a plausible candidate in a given sentence, remembering what it means, etc.
  • Language parsing: 100,000 PCs. As you parse a sentence, you need to consider a number of plausible interpretations of it. If there are 100,000 possibilities, you can dedicate a PC to each one.
  • Language generation: 100,000 PCs
  • Miscellaneous language tasks: 80,000 PCs. Hey, the budget still has a bunch of PCs left in it. Use it or lose it.
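The "miscellaneous" line is just whatever is left once the named tasks have been paid for. A minimal Python sketch of that remainder, with names of my own choosing:

```python
LANGUAGE_BUDGET = 500_000

# PCs for the named language tasks, using the post's multipliers.
named_tasks = {
    "low-level auditory processing": 10_000,
    "phoneme recognition": 1_000 * 10,    # 1,000 phonemes, 10 PCs each
    "listemes": 200_000 * 1,              # one PC per listeme
    "language parsing": 100_000 * 1,      # one PC per candidate parse
    "language generation": 100_000,
}

# What remains for miscellaneous language tasks.
miscellaneous = LANGUAGE_BUDGET - sum(named_tasks.values())

print(f"{miscellaneous:,} PCs left over for miscellaneous language tasks")
```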

Thinking: 1,500,000 PCs

  • Analysis: 250,000 PCs. You've got 250,000 PCs for general processing of incoming information.
  • Planning: 250,000 PCs. Ditto for planning. You can have 10 PCs analyzing each of 20,000 different possible plans, and still dedicate 50,000 to looking at the big picture.
  • Knowledge: 1,000,000 PCs. A pretty reasonable estimate is that you know 1,000,000 "chunks" of information. You can dedicate a PC to each one (remembering the information, thinking about whether it might be relevant to your current circumstances, etc.).
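As a final sanity check, here is a small Python sketch that adds up the whole design and breaks out the planning and thinking numbers. The names are mine and the figures are the ones given in the section headings.

```python
# Top-level budget, one entry per section of the design.
subsystems = {
    "low-level visual cortex": 1_000_000,
    "high-level visual cortex": 1_000_000,
    "driving the robot": 1_000_000,
    "language processing": 500_000,
    "thinking": 1_500_000,
}
grand_total = sum(subsystems.values())

# Planning: 10 PCs on each of 20,000 plans, plus 50,000 on the big picture.
planning = 20_000 * 10 + 50_000

# Thinking: analysis + planning + one PC per chunk of knowledge.
thinking = 250_000 + planning + 1_000_000

for name, pcs in subsystems.items():
    print(f"{name:<26}{pcs:>10,}  {pcs / grand_total:>4.0%}")
print(f"{'total':<26}{grand_total:>10,}")
```

By this accounting, 60% of the machine goes to seeing and moving, and only 30% to what the design calls thinking.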