Sunday, January 06, 2008

Our species has 10 fingers, and we (mostly) count in base 10. It's easy to learn. Base 10 seems natural, but it hasn't always been the favoured base. The Mayans used base 20 (vigesimal), the Babylonians used base 60 (sexagesimal), and our computers use base 2 (binary). What makes our base 10 so special, and more importantly, is 10 the optimal base we could use? Is there a better one?

First, we have to define what is meant by optimal and better in the context of a numeric base (or radix). The only definition of optimal numeric base that I know is the base at which the area A = (#digits in a given number) * (#unique symbols in base) is minimized for all numbers. That is, the optimal base is where, given all numbers that can be represented, the least "space" to represent them is the least amount of area given by the product of the number of digits in a number and the number of symbols required in that base system. [1]

Using least "area" as our metric for most optimal, the area is given by A = D * B. B = the base, which is also the number of symbols used in the given base; D = the number of digits which can be modeled with the function D(B,N) = logB(N) (or more exactly, it is the index of the leading digit needed to represent the number). Since over all numbers we also consider numbers approaching infinity, this equation for area doesn't tell us much in its present form, so instead we can work with a derivative of it (pun intended). As any intro calculus student will tell you, the maximum or minimum point of a concave surface (in this case, A = D * B) is where d2A/(dD*dB) = 0. If this can give us a solution independent of N (the value of any given number), then there is an optimal base for representing all numbers.

d2A/(dD*dB) = d/dD ( d/dB ( D(B,N) * B ) )
= d/dD ( D(B,N) + B * d/dB ( D(B,N) )
= d/dD [ D(B,N) + (-1)*D(B,N)/ln(B) ]
= 1 + (-1)/ln(B)

This is the slope of a concave up [2] curve, where the optimal base occurs at the minimum point and the slope equals zero (0).

1 + (-1)/ln(B) = 0

or

B = e

It turns out e is the optimal numeric base! I don't find this particularly surprising; e is everywhere. (Not to mention lack of surprise from having previously read the final answer) The problem with this is that we humans tend to use integer bases. Having an irrational base isn't unheard of, but we humans don't have much intuitive use for it. [3]

Optimal Integer Base
Just as Aiken and Tropp discussed [1], three (3) is thought to be the most optimal integer numeric base because it is closer to e than two (2).

Is three really the best integer base? We can do better than just saying "it's closer to e than 2". (but to be fair, that's not an unreasonable thing to say) The function is non-linear, which makes the immediate obviousness/truth of the statement questionable. [4] The original equation for the symbolic-spatial area used by representing any number is a function of the base and the value of the number, so it's hard to use it to study a trend for all numbers. The first derivative (with respect to #digits and base), however, is only a function of the base, and is a lot easier to work with.

A'(B) = 1 - 1/ln(B)

Plotting this function over the domain (2:3) shows not only the optimal base to use (where B = e, at the zero crossing), but just as importantly, it shows the rate of change in the representational area as a function of (change of) Base.

To decide whether 2 or 3 is closer to the optimal base in a more quantitative way, we can integrate A'(B) from 2 to e, and again from e to 3, and compare which of the two areas between the curve and the B-axis is smaller in magnitude. It should be obvious from the curve of A'(B), above, that the area between the curve and the B-axis is considerably smaller between e and 3. [5] This means that if we use a base of 3, there is less change of the total area required to represent numbers. So, three really is the optimal integer numeric base, assuming the original metric for the definition of optimal is reasonable.

This has implications on more than just representing numbers. Three as an optimal base also gives insight into optimal menus. It provides insight into tree traversal/creation. If we switch to a tertiary number system implemented in silicon, we may be able to implement an equal size of integer processing (e.g. the equivalent of 32 bits, or a max of 4 294 967 296, can be represented within 21 tits of ternary, with a max of 10 460 353 203) in a slightly less amount of silicon. Alternately, you might implement a larger numeric representation in a roughly equivalent silicon area. (I haven't done a formal analysis, myself, but I have seen some papers published on it) All the opinions I recall reading on this, though, have suggested that we're currently too deeply ingrained in binary physical implementations such that switching would probably be a bit of a setback given our 50ish years of optimizing for binary implementations. I suspect that if a ternary implementation really is better, it will only get explored if we hit a wall with Moore's Law for our current setup.

The Second most optimal integer base
We still don't know what the second most optimal integer numeric base is, but it's probably 2 or 4. Again, refer to the same expression of A'(B), but now over the domain (2:4), and do the same integration on either side of the optimal point of B = e.

Visually, it looks rather close, but numerically it's no contest. The area from 2->e is 0.132 and from e->4 is 0.209, so 2 is less of an area change from e and is a more optimal integer base than 4. Four (4) still gets the bronze medal for integer bases, though, and still deserves its props. Not only is base 4 (quaternary) the third most optimal integer numeric base, but it is strongly related to the second most optimal integer base, binary, like octal and hexadecimal. Base 4 is the native numeric base of DNA (it has four characters, A G C T, to represent Adenine, Guanine, Cytosine, and Thymine). Base 4 is still more optimal than the one currently in use, base 10.

Where does base 10 (decimal) rank
Where does decimal (base 10) rank in this scheme? Once more, here is the same curve of A'(B) plotted this time over the domain (2:10).

After finding the order of the first few, the rest fall in line. If you were to integrate in the same way, the ranking for optimal positive integer numeric bases are:

  1. base 3 (ternary)
  2. base 2 (binary)
  3. base 4 (quaternary)
  4. base 5 (quinary)
  5. base 6 (senary)
  6. base 7 (septenary)
  7. base 8 (octal)
  8. base 9 (nonary)
  9. base 10 (decimal)
I don't know if base 1 (unary) should get a mention here; it smells like a special case. However, based on the same evaluative criterion as above, it still doesn't rate in the top 10 - it rates between base 14 (tetradecimal) and base 15 (pentadecimal). Most don't realize that when you are counting on your fingers, you are actually representing the count value in base 1, not base 10. A bonus of this is that the base 1 symbol is space rotatable. Place-value systems (like decimal) are not space rotatable; you cannot have the digits in a nonspecific order. With unary you can hold up any consistent number of fingers in any (physically possible) order and still mean the same value. (i.e. it has non-unique representations)

For those interested in irrational bases [3], based on these calculations base π is more efficient that base 2, as well, but only by about half as much as base 3.

Burton MacKenZie http://www.burtonmackenzie.com/

[1] This is not my definition. I'm having a hard time finding where I originally found it. I thought it was in Hardy's "A Mathematician's apology", but I've been combing through that book and come up dry. At the time I read the definition (not the date it was published), the author stated "and this has been known for at least a hundred years". I'm still looking for a real reference. Please let me know if you find one. In the "Computer Oral History Collection, 1969-1973, 1977" from the Smithsonian National Museum of American History on page 140, Howard Aiken says: "...this is a much more efficient system than the binary number system...", to which Henry Tropp replies "Well, theoretically, e is the optimal phase and three is a little bit closer approximation to e than two is".

[2] On an unrelated tangent, "Concave Up" by Jesse Reklaw is a great collection of Dream comix.

[3] From Mathworld, "Bergman (1957/58) considered an irrational base, and Knuth (1998) considered transcendental bases. This leads to some rather unfamiliar results, such as equating π to 1 in base π, π = 10π.

[4] For instance, y(x) = A*x is a linear function. If x=4b, y(x)=y(4b)=A*4b = 4*(A*b) = 4*y(b). Four times its argument results in the function being four times as big.
y(x) = x*x*x is a nonlinear function. If x=4b, y(x)=y(4b)=4^3*b^3 = 64*b^3 = 64*y(b). Four times its argument results in the function being sixty-four times as big! It does not increase linearly.

[5] I couldn't be bothered to integrate A'(B) over B, as it turned out to be longer than a five minute problem. Luckily engineers are more pragmatic and we don't require an exact value. Inspection of the relative areas under the curve is sufficient. A quick numerical integration (rectangular, 1000 slices per side) gives the results of the area from 2->e is 0.132 and from e->3 is 0.0132. So, there is an order of magnitude less change using base 3 than there is using base 2.

10 comments:

Mohammed said...

Great post. I've never really thought about this, but it does seem logical that e would be the optimal value.

burton mackenzie said...

Thanks, mohammed!
Since I posted this, I've been getting worked up about irrational bases (so far bases sqrt(2) and the cubic root of 3). I'll try and work them into a future blog post, too.

Alfredo said...

Tits?

Joe Fredette said...

I'd bet that something you should take into account is compressability/redundancy in the base, This is a fairly simple metric, so -- while interesting -- it doesn't really tell me a great deal about what really is the most optimal base. Part of the "niceness" of the base-ten metric is the fact that there are several numbers which allow for easy multiplication, 0 and 1, as well as the base-value itself , are fairly easy to multiply by in any metric. But in base 10, there are more numbers which are easily multiplied by, namely 2,5, and multiples thereof, are fairly easy to multiply, because they "look" the same. That is, 5555*5 is fairly simple to calculate quickly in base 10, in base 6, it's a bit harder. There was an article about this, though its location escapes me, the basic Idea is that there are two possiblities for bases which make them "better" or "worse" than others. The first is that composite numbers are better, because they have more "easy multiplication" numbers. for instance, base 12 is comparatively easier to multiply in than base 7, because base twelve has 2,3,4,6, and 8, plus the usual 0,1 and 12 as "easy" multipliers. The other camp says that prime bases are better, since they allow for more rational numbers to be expressed, or something like that. Basically, (no pun intended), the two options differ by who they're useful for, highly composite bases are good for computers and simple arithmetic, because multiplication-- a typically hard problem-- becomes easier for many cases; however, prime bases are better for mathematicians, since they allow numbers which are irrational repeating decimals to be written as irrational non-repeating decimals. This is also true of irrational bases, like e and the like.

It would be interesting to see a more complex metric similarly optimized. This is really a great example of how some simple calculus and a little intuition can go a long way in determining some interesting results.

Good stuff, keep it up!

burton mackenzie said...

Great comment Joe. This metric for which is the optimum base is only that - a metric devised by humans to measure a specific trait of numeric representation. (i.e. optimal space consumed by digits*characters in a given base)

This metric has been around for over a hundred years (and I'm still looking for my original reference), but different needs would demonstrate different optimum bases for those specific needs. For instance, the Babylonians used base 60, which may be functionally optimal for math that doesn't use digits to the right of the radix point, but heavily uses fractions. In base 60, it is trivially easy to represent and work with the fractions 1/2, 1/3, 1/4, 1/5, 1/6, 1/10, 1/15, 1/20, 1/30, and 1/60. If that's what's important to you, base 60 is much more optimal.

Ternary is great for many things, but one of my favourites is for intuitively showing that Cantor's dust exists. Balanced ternary (-1, 0, +1) is wonderful for other applications.

I know you didn't mean complex numbers in your last paragraph, but Knuth explored using a complex base as well. In it, all positive and negative numbers are represented without using a sign. (not to mention the complex ones)

Anyway, as I'm sure you know, "optimal" is defined by what you need it to do. :-) Thanks for the comment.

Brian Glanz said...

All this great work you're doing and I've been casually mouthing off for years that 12 is the better base. I almost joined a dozenal society (I think they called it). Ha!

For social, human use I'm sticking with lecturing about 12 to anyone who will listen, for its everyday computational value. With its factor edge over 10 it would be handy in a market, for example. Speaking of handy, there is also, still an organic aid with three parts on each of our four fingers and a thumb to point for counting, making 12 even handier than using both to count 10. There are also not too many digits in common values, as there would be in base 60, balancing simplicity in common fractional math. There is a more apparent symmetry in some basic calculations useful in math education, for 12 more than 10.

For machine use though, also in the context of Joe's comment, I want to look into e. 2, 10, 16, or otherwise for machine calculations ought to be less efficient than base e in some cases, and machines are notably less confused with its use or less prejudiced than people ;) Should I find any practical implications, I will be sure to recall your insight. BG

burton mackenzie said...

Thanks, Brian. Lately I've been interested in hardware implementations in base golden ratio. It has some interesting characteristics for power operations, but right now it looks like the translation to and from would diminish any computational advantage from using it.

I'd be happy to hear back on any hardware base-e implementations you muse on/design/build!

Jeff Q said...

Interesting idea, though I think the metric you are using is maybe a little bit simplistic. It might be good for computing applications where you are representing numbers as arrays for example. But if you are thinking more organically, perhaps about a time when we still wrote numbers on paper from time to time, I think a different metric would be preferable. One idea is essentially the number of pen strokes required to write a number. For example the number 10 is fairly simple to write in our base 10 system -- 1 straight line and 1 circle. If we were to write it in binary, we would have 1010 which requires twice as much work to write. Of course it is less obvious how one can decide how to optimally design written numbers. Certainly if we went to higher base systems, we would have much more complicated symbols. I think the number of symbols required to represent a given base would probably not be simply proportional to the size of the base (if it were your metric would be essentially sufficient). Maybe base 10 turns out to be a nice compromise where the symbols are still quite basic but we don't require too many digits per number.

c said...

You might have originally seen this here: http://www.americanscientist.org/issues/id.3268,y.0,no.,content.true,page.6,css.print/issue.aspx
I've always loved the concept.

Arthur said...

I never thought about determining the efficiency of a base in this way. I'll have to give it some thought.

Thus far, my thoughts on efficiency have been to consider cognitive utility. To begin with, I'm interested in the relevance of "the crow" (a term in philosophy pertaining to the maximum number of units/concretes the human mind can simultaneously hold, which serves as the basis for a great deal of conclusions in epistemology), which for most people is 7 plus or minus 2. This already leads me to believe that cognitively, 7 is an efficient base.

I think it's also relevant that 7 is a prime number. I suspect (though I haven't worked anything out) that you would be able to do all sorts of neat tricks with a prime base, relying on a lot of the facts that make the integers mod a prime number a field.

What are your thoughts on this?