Introducing Tcl 8.7 Part 13: more list enhancements

Published

This is the thirteenth in a series of posts about new features in the upcoming version 8.7 of Tcl. New commands for list manipulation in Tcl 8.7 were described in a previous post. Since then, some additional enhancements related to lists have been implemented. These are described in this post.

To take Tcl 8.7 for a spin, you can download a pre-alpha binary for your platform. Alternatively, you can build it yourself from the core-8-branch branch in the Tcl fossil repository.

The enhancements described here are not available in an officially released Tcl 8.7 build at the time of writing. You will need to build from source in the core-8-branch in the Tcl fossil source repository. On Windows, you can download a binary distribution from magicsplat

The lseq command

The lseq command creates a list containing a sequence of numeric values. It can take one of several syntactic forms. The first of these generates a sequence starting at 0 and containing a specified number of values.

lseq COUNT ?by STEP?

The optional argument STEP is the difference between successive values in the sequence and defaults to 1. It may be negative as well.

% lseq 5
0 1 2 3 4
% lseq 5 by 2
0 2 4 6 8
% lseq 5 by -1
0 -1 -2 -3 -4

The second form of the command is similar except that it permits the initial value to be specified.

lseq START count COUNT ??by? STEP??

The generated sequence will begin at START instead of at 0.

% lseq -2 count 5
-2 -1 0 1 2
% lseq 2 count 5 by -1
2 1 0 -1 -2
% lseq 2 count 5 -1
2 1 0 -1 -2

Note the by keyword is optional as shown above, but the count keyword must be specified to distinguish this form from the third form of the command described below.

The final form of the command differs from the first two in that it specifies the range covered by its end value as opposed to a count.

lseq START ?..|to? END ??by? STEP?

The sequence is generated by incrementing each prior element by STEP until it crosses the END value.

% lseq 1 to 5
1 2 3 4 5
% lseq 1 .. 5
1 2 3 4 5
% lseq 1 5
1 2 3 4 5
% lseq 1 to 5 by 2
1 3 5
% lseq 1 to 5 by 3
1 4
% lseq 1 to -5 by -2
1 -1 -3 -5

The .. and to keywords are synonyms and optional as is by. I prefer to use them for readability. Note in the above examples that the END value may or may not be present in the sequence depending on whether it differs from START by an integral number of steps.

In this third form, if the STEP argument is not specified, it defaults to an absolute value of 1 with the sign of the step determined by the relative values of START and END. For example,

% lseq -2 to 2
-2 -1 0 1 2
% lseq 2 to -2
2 1 0 -1 -2

In all the above examples, the numeric values passed to the command were constants. However, lseq will also accept expressions in the same syntax as the Tcl expr command. For example,

% set n 5
5
% lseq {$n-2} {$n+2}
3 4 5 6 7

The lseq command is not restricted to integer values. It can be used with floating point values as well. However, you need to be aware of issues around floating point computations that are not specific to lseq, or even Tcl, but computers in general. For example,

% lseq 0 2 by 0.5
0.0 0.5 1.0 1.5 2.0
% lseq 0 2 by 0.4
0.0 0.4 0.8 1.2000000000000002 1.6

Note in the second case how the last element is 1.6, and not 2.0. That's floating point math for you. Because of the imprecision of floating point computation, the value after 1.6 is calculated as greater than 2.0 albeit by a very minute amount.

Below are a couple of examples to illustrate how lseq can make code more concise. You may want to implement the same functionality in Tcl 8.6 for a comparison.

In many cases, lseq replaces uses of for with more intuitive list based iterations. Generate a list of first 5 squares:

% set l [lmap i [lseq 1 count 5] { expr {$i*$i} }]
1 4 9 16 25

Remove elements at even indices from a list:

% set l {a b c d e f g}
a b c d e f g
% lremove $l {*}[lseq [llength $l] by 2]
b d f

lseq can also be useful within expressions. Check if an integer is odd and within a specified range:

% expr {$i in [lseq 1 99 by 2]}
1

One final important point to keep in mind about lseq is that it is very efficient in terms of memory as unlike "traditional" lists it does not need to actually store values of every element. Thus even [lseq 100000000] takes up less than a hundred bytes of storage.

The ledit command

The new ledit command is very similar to the lreplace command in Tcl 8.6 except that instead of operating directly on a list value, it operates on a list value contained in a variable and stores it back in that variable. The command syntax is

ledit LISTVAR FIRST LAST ?ELEM ...?

The FIRST and LAST arguments are indices that specify the range of elements to be deleted (inclusive). The optional ELEM arguments specify the new elements to be inserted at index FIRST.

% set l [list a b c d e]
a b c d e
% ledit l 2 3 X Y Z
a b X Y Z e
% set l
a b X Y Z e

Because of Tcl's reference counting mechanism, commands that operate on variables are generally more efficient than those operating on values. This is as true for ledit as it is for other list commands like lappend that operate on variables and is the reason for prefering it to lreplace when the existing value does not need to be preserved. The following performance measurement samples illustrate the difference.

% timerate -calibrate {}
0.03035015489656804 µs/#-overhead 0.031086 µs/# 62792591 # 32168335 #/sec
% set l [lrepeat 1000 x]; llength $l
1000
% timerate {set l [lreplace $l 499 500 X Y]}
2.956879 µs/# 334759 # 338194 #/sec 989.842 net-ms
% set l [lrepeat 1000 x]; llength $l
1000
% timerate {ledit l 499 500 X Y}
0.240055 µs/# 3698156 # 4165711 #/sec 887.761 net-ms

Like lreplace, the command can be used not just for replacing a range of elements but for pure insertions and deletes as well. The command can be used in lieu of linsert by specifying LAST as a value less than FIRST. Moreover, for the same reasons as stated above it can be more efficient as well.

% set l [lrepeat 1000 x]; llength $l
1000
% timerate {set l [linsert $l 500 X]}
44.4620 µs/# 22477 # 22491.1 #/sec 999.373 net-ms
% set l [lrepeat 1000 x]; llength $l
1000
% timerate {ledit l 500 499 X}
0.337232 µs/# 2720481 # 2965318 #/sec 917.433 net-ms

For large lists, the performance improvement is substantial as seen above. Note: similar performance can be achieved with linsert as well with use of the K operator. That is however less obvious and clear from a readability perspective.

Similarly, ledit can be used in place of lremove when a contiguous range of elements are to be removed leaving out any ELEM arguments.

% set l [list a b c d e]
a b c d e
% ledit l 2 3
a b e
% set l
a b e

Performance improvements

The internal representation of lists has changed in Tcl 8.7 resulting in significant improvements in performance even for existing commands in some common scenarios.

Repeated insertion at the front of a list is faster, order of magnitude as lists get longer.

Tcl 8.6:

% set l {}
% timerate {set l [linsert $l 0 X]}
79.5236 µs/# 40010 # 12574.9 #/sec 3181.741 net-ms

Tcl 8.7:

% set l {}
% timerate {set l [linsert $l 0 X]}
0.193749 µs/# 4447614 # 5161315 #/sec 861.721 net-ms

When writing for Tcl 8.6, appending to a list using lappend was often favoured over prepending as the former was so much faster. This need no longer be the case in 8.7.

Range operations on large lists are also signficantly faster.

Tcl 8.6:

% set l [lrepeat 1000 X] ; llength $l
1000
% timerate {lrange $l 1 end-1}
3.350451 µs/# 295636 # 298467 #/sec 990.514 net-ms

Tcl 8.7:

% set l [lrepeat 1000 X] ; llength $l
1000
% timerate {lrange $l 1 end-1}
0.115875 µs/# 6804325 # 8630012 #/sec 788.449 net-ms

The difference is even more pronounced for larger lists. There are also benefits in terms of reduced memory usage.

The lassign command also benefits from faster range operations. Other less pronounced improvements are listed in TIP 625.

References

TIP 629: Add a lseq (formally "range") command to the core of list commands

TIP 631: ledit - a generalized insert/delete command for list variables

TIP 625: Re-implementation of lists