Welcome back to our "Binary Tree" week!

Today we will discuss the enum function. It can be used to emulate arrays (as you might have noticed, PicoLisp doesn't have an array data type). But what do arrays have to do with binary trees...?!

The enum function is only available in pil21. If you get a enum -- Undefined error when you executing the examples below, you probably haven't installed the latest version.

The `enum` function

Let's see what we can learn from the documentation about the enum function.

(enum 'var 'cnt ['cnt ..]) -> lst
(enum 'var) -> lst

Enumerates cells by maintaining a binary tree in var. The keys are implicit from the enumerated cnts, and the resulting tree is balanced (independent from the insertion order). (...)

enum can be used to emulate (possibly sparse) arrays.

The enum function expects a cnt - which means a (small) integer - which specifies the position of a cell in a balanced tree. Like for any cell, the corresponding value could be defined with the set function.

The cell is returned if it already exists, otherwise it is created as a new node in the tree. A typical use case would be the conversion of a list, as in the following example:

: (for (I . S) '(a b c d e f g)
   (set (enum 'E I) S) )
-> o

The for loop takes the index (I) and value (S) of each list item. I is stored in the enum tree, then its value is set to S.

Let's check the result:

: (view E T)
      g
   c
      e
a
      f
   b
      d

We see a perfectly balanced tree. To read out some values, we use val (enum ...) ):

: (val (enum 'E 1))
-> a

: (val (enum 'E 3))
-> c

We can check if a cell already exists using the enum? function. If it exists, the whole subtree is returned, otherwise NIL.

: (enum? E 3)
-> (c (e) g)

: (enum? E 10)
-> NIL

`cache` and `enum` - what's the difference?

enum doesn't store any keys. The key is implicit due to its position in the tree.
On the other hand, cache is creating the index using hash function, which implies that any key can be stored. The backdraw is that the hashing step needs to be done everytime an item should be looked up.
cache accepts a program as argument. If the program has already been evaluated, the value is returned without re-evaluating. enum? can only check if an index exists or not.
enum is creating balanced trees, cache is creating probably balanced trees (in real applications, this gap should be rather small)

Array emulation

Probably you can already guess in which regards enum is similar to arrays: Arrays can usually be indexed directly by a syntax like myArr[1]. enum allows a very similar handling by (val (enum 'E 1)).

But how does it look like in terms of efficiency? Let's take a list of 1 Mio. items and assume we want to look up the item at position 900.000.

Arrays. Usually arrays take up consecutive space in the memory. If we know the size per entry, the computer "knows" the required position by calculating "starting point + 900.000*size". --> approx. 1 calculation
Lists. In PicoLisp, lists are connected via pointers. Therefore we need to jump from pointer to pointer to get to our desired cell --> approx. 900.000 calculations.
enum/binary tree. The enum function guarantees that the tree is balanced, so we should be able to find our item within 20 steps --> approx. 20 calculations.

Multidimensional Arrays

We can even create multidimensional arrays, since enum accepts more than one cnt variable. Let's create a 4x4 Matrix:

: (off A)

: (for I 4
      (for J 4
          (set (enum 'A I J) (pack I "-" J)) ))

To get the item in the "second line, second row":

: (val (enum 'A 2 2))
-> "2-2"

Good old Fibonacci, revisited

And now finally, let's consider our Fibonacci example again and compare it to the cache version. We can expect a difference, because we can store and access the variables directly under the iteration number N.

However, it requires some further work from us. While cache accepts a program as parameter and evaluates it only in case the value has not been stored yet, we need to do this manually for enum:

check the value of enum '(NIL) N and set it to E. If it has already been calculated, E will be non-NIL, otherwise it will be NIL.
If (val E) evaluates to non-NIL (i. e. has already been calculated), this value is returned.
Otherwise we set E to the result of the recursive calculation.

(de fiboEnum (N)
   (let E (enum '(NIL) N)
      (or
         (val E)
         (set E
            (if (>= 2 N)
               1
               (+ (fiboEnum (dec N)) (fiboEnum (- N 2))) ) ) ) ) )

In the previous post, we showed that the cached version can calculate 10.000 Fibonacci numbers in 0.1 s (of course depending on the respective hardware). How many can we do with fiboEnum in the same time?

: (bench (fiboEnum 14500))
0.098 sec

We get close to it at Fibonacci number 14500. This a speed increase of almost 50% compared to the cache version!

If you still haven't seen enough of Fibonacci, stay tuned for the next Rosetta-Code post which will include analytic and iterative approaches, and the second Euler Project challenge where we will sum up the even Fibonacci numbers.

After that we will start a new topic - Web Application Programming 😃

Sources

picolisp.com/wiki/?ArrayAbstinence
software-lab.de/doc/index.html

PicoLisp Explored: The enum function

The enum function

cache and enum - what's the difference?