April 2010

# On the asymmetry of mktime()

[Updated 4/27/2010]

Unix represents time in one of two ways: a
time_t value, which is the number of seconds since some arbitrary start; and a struct tm value, which represents time in calendar form: mm/dd/yyyy hh:mm:ss {dst}. The mktime() function converts a calendar form to a time_t value.

There are two ways to move forward and backward in time. One is to add or subtract seconds from a
time_t value. Four hours from now is time_t + 14,400. The other is to use mktime(). For example, given an initialized struct tm value, this code could be used to move ahead 4 hours:

tm.tm_hour += 4;
tm.tm_isdst = -1;
t = mktime(&tm);

Since 4 hours is 14,400 seconds, one might also consider writing:

tm.tm_sec += 14400;
tm.tm_isdst = -1;
t = mktime(&tm);

# The End of Time

This isn't about 2012, or the Second Advent, or 2038. Well, 2038 will be very briefly mentioned.

Many computer systems consider time to be the number of seconds since an arbitrary beginning. Time is represented by the
time_t data type. Unix/Linux declares that the start of time, t = 0, to be January 1, 1970 @ 00:00:00 GMT. The specification for the Lisp lanuage uses January 1, 1900 @ 00:00:00 GMT. The CableCard specification chose January 6, 1980 @ 00:00:00 GMT. The date where time begins is known as the "epoch".

If time 0 is January 1, 1970 @ 00:00:00, when does time end? If
time_t is a signed 32-bit quantity, then the largest positive value is 0x7fffffff which corresponds to 1/19/2038 @ 3:14:07 GMT. One second later, time will "wrap" from 1/19/2038 @ 3:14:07 GMT to 12/13/1901 @ 20:45:52 GMT. This is the basis for the "year 2038" problem, not unlike the Y2K problem.

If
time_t is an unsigned 32-bit value, then the largest possible value is 0xffffffff which corresponds to 2/07/2106 @ 6:28:15.

On Mac OS X,
time_t can also be a signed 64-bit value. When asked "what is the date that corresponds to a time_t of 0x7fffffffffff?", there is no answer (although the time functions don't tell you there is no answer. Some of the struct tm values are silently not set). Lisp says that it should be 12/04/292277026596 15:30:07. Yes, that's over 292 billion years in the future. Approximately 20 times the current age of the universe. Lisp says it will be a Sunday.

Why can't Mac OS X handle the largest positive 64-bit
time_t value, but Lisp can? When converting from time_t to calendar form, Unix represents the year as an int (cf. tm_year in the tm struct). This imposes an upper bound on the value that can represent a year. A signed 32-bit integer has a maximum positive value of 2,147,483,647. Since mktime() and timegm() take year - 1900 as input, the largest year that can be represented is 2,147,483,647 + 1900 = 2,147,485,547. Converting 12/31/2147485547 @ 23:59:59 gives the time_t value 0xf0c2ab7c54a97f. So 0xf0c2ab7c54a97f is the largest 64-bit time_t value that can be used with a signed 32-bit tm.tm_year element. That's over 2 billion years in the future. On a Wednesday.

# On Packing Data

[updated 16 April 2010]

The number of bits needed to store an integer, a, a > 0, is:

Using this formula, 13 bits are needed for the value 7333. In binary, 7333 is 1110010100101, which is indeed 13 bits.

The number of bits needed to store two values, a and b, independently would be:

Using the identity:

The number of bits needed to store the product of a and b is:

Therefore, we can save one bit by multiplying a and b when

As a trivial example, the values 5 (101) and 6 (110) each require three bits so storing them independently would take 6 bits. However, 6*5 is 30, which takes 5 bits.

Just for curiosity, this graph shows the fractional part of

using the equation

It's an interesting "sawtooth" pattern, with zero values at 2n. The larger version of the graph also shows lines of constant slope between the endpoints of each tooth (in blue), and the point where the tooth equals 0.5 (√2*2n - in red).

Click on the graph for a larger version.

Let's consider a practical application. The start and end of Daylight Savings Time can be specified as "the nth day of the week in month at hour and minute". For example, in 2010, DST in the US begins on the second Sunday in March at 2AM. Let:

d = day of week, Sunday..Saturday = 0..6
r = day rank, first..fifth = 0..4
m = month, January..December = 0..11
h = hour, 0..59
μ = minute, 0..23

Naively, this can be stored in 3+3+4+6+5 = 21 bits.

When multiplying 6 * 5 to save one bit, one of the numbers is needed in order to determine the other. Since we don't a priori know what values were multiplied together, we can't unpack the result. So while the possibility of saving a bit by multiplication made for an interesting analysis, it doesn't help here. There's no point in storing the product of n numbers if you have to know n-1 numbers to recover the nth number.

If we're going to conserve space, it will have to be through application of the first equation. Given two integers, a and b, a requires fewer bits than b when:

So we want to minimize the maximum value of the number being stored.

For the packing algorithm, construct a set, ω, of an upper limit for each item.

Then the largest number that can be encoded is:

To pack the values vi which correspond to the upper limit values ωi, evaluate:

To unpack rn into values vi which correspond to the upper limit values ωi, evaluate:

Consider the decimal number 12,345. For base 10 encoding, ωi would be (10, 10, 10, 10, 10), and the largest number that could be encoded would be 99,999. But suppose we knew that the largest possible value of the first digit was one. Then let ωi be (2, 10, 10, 10, 10). The largest value that could be encoded would be 19,999. Since 19,999 < 99,999, it requires fewer bits.

For the daylight savings time values, construct a set, ω, of the maximum value of each item + 1. For (d, r, m, h, μ) this would be (7, 5, 12, 60, 24).

Packing the start times would requlre 20 bits:

Packing both the start and end times would use 39 bits, for a savings of 3 bits.

# God, The Universe, Dice, and Man

In the realm of the very small, the universe is non-deterministic. Atomic decay, for example, is random. Given two identical atoms, one might decay after a minute, another might take hours. Elementary particles have a property called "spin", which is an intrinsic angular momentum. Electrons, for example, have spin "up" or spin "down", but it is impossible to predict which orientation an individual election will have when it is measured.

John G. Cramer, in
The Transactional Interpretation of Quantum Mechanics, writes:

[Quantum Mechanics] asserts that there is an intrinsic randomness in the microcosm which precludes the kind of predictivity we have come to expect in classical physics, and that the QM formalism provides the only predictivity which is possible, the prediction of average behavior and of probabilities as obtained from Born's probability law....

While this element of the [Copenhagen Interpretation] may not satisfy the desires of some physicists for a completely predictive and deterministic theory, it must be considered as at least an adequate solution to the problem unless a better alternative can be found. Perhaps the greatest weakness of [this statistical interpretation] in this context is not that it asserts an intrinsic randomness but that it supplies no insight into the nature or origin of this randomness. If "God plays dice", as Einstein (1932) has declined to believe, one would at least like a glimpse of the gaming apparatus which is in use.

As a software engineer, were I to try to construct software that mimics human intelligence, I would want to construct a module that emulated human imagination. This "imagination" module would be connected as an input to a "morality" module. I explained the reason for this architecture in this article:

When we think about what ought to be, we are invoking the creative power of our brain to imagine different possibilities. These possibilities are not limited to what exists in the external world, which is simply a subset of what we can imagine.

From the definition that morality derives from a comparison between "is" and "ought", and the understanding that "ought" exists in the unbounded realm of the imagination, we conclude that morality is subjective: it exists only in minds capable of creative power.

I would use a random number generator, coupled with an appropriate heuristic, to power the imagination.

On page 184 in
Things A Computer Scientist Rarely Talks About, Donald Knuth writes:

Indeed, computer scientists have proved that certain important computational tasks can be done much more efficiently with random numbers than they could possibly ever be done by deterministic procedure. Many of today's best computational algorithms, like methods for searching the internet, are based on randomization. If Einstein's assertion were true, God would be prohibited from using the most powerful methods.

Of course, this is all speculation on my part, but perhaps the reason why God plays dice with the universe is to drive the software that makes us what we are. Without randomness, there would be no imagination. Without imagination, there would be no morality. And without imagination and morality, what would we be?

# Georgia Wild Animal Safari

Becky, Rachel, and I visited the Wild Animal Safari in Pine Mountain, Georgia. A 3.5 mile road winds through the park; you can ride a bus, rent a van, or drive your car. There is also a walkabout section. We bought a bag of food for each of us and opted for the bus. The animals come right up expecting a treat. We fed buffalo, pigs, deer, and a giraffe. The big cats were caged. Some more pictures and a video of a bear playing with a tire after the fold.

# Kashka-Suu

Yesterday I noted the violence in Kyrgyzstan. I visited there in June of '99. I managed to find this picture of Kashka-Suu. Unfortunately, it doesn't do justice to the beauty of the area.

# A Moldy Easter Atheist

Previously, I wrote about dealing with mold at church on Easter. But that wasn't the only place I encountered fungus. Easter was being celebrated over at Vox Popoli when "DT" dropped in and asked us if we were "sure we've got our story straight?" DT then proceeded to list a number of supposed contradictions in the New Testament account of the Resurrection. The first alleged problem is that Mark 15:25 says that Jesus was crucified "at the third hour" (KJV), while John 19:14-15 states that Jesus was led away to be crucified "at about the sixth hour" (KJV). Obviously, the times don't agree, the authors didn't have their story straight, and so the Resurrection is but a fabrication.

# Violence in Kyrgyzstan

Today brings news of violence in Bishkek, the capital of Kyrgyzstan. I've been there and I think I recognize some of the locations in these pictures on CNN.

I hope my friends are ok. If I weren't paranoid, I'd post a link to a picture of me in the mountains in Kyrgyzstan. Maybe if I blurred the faces of my Kyrgyz companions...

# Another Short Conversation...

In Who Needs Christianity, I wrote, "Man is the biological machine that doesn't do what it ought to do." Someone named "Cabal" responded, "Excuse me but exactly what should Man be doing and and [sic] according to who...and please no vacuous, meaningless answers along the lines of 'obey God and according to God.'"

The answer, of course, is evident via a little self-reflection. We don't do what we ourselves think we ought to do.

Cabal wasn't heard from again.

# Inigo Montoya vs. Humpty Dumpty

Montoya: You keep using that word. I do not think it means what you think it means.
Dumpty: When I use a word it means just what I choose it to mean; neither more nor less.