Using async/await to parallelize sequential code

The latest post in our series on promises introduced the async and await commands. That post focused on how these commands further simplify asynchronous programming with promises. This post takes a different angle on their utility - how they can be used to speed up sequential code with minimal effort.

NOTE: The code samples here assume V1.1 of the promise package.

The example I'm using is based on a real utility script I use to check the status of my code repositories, simplified to remove extraneous details. The original sequential version of my script looked as follows.

proc get_status {} {
    global argv vcs_status
    foreach dir $argv {
        set dir [file nativename [file normalize $dir]]
        if {[file exists [file join $dir .hg]]} {
            set vcs_status($dir) [exec cmd /c cd $dir && hg status]
        } elseif {[file exists [file join $dir .git]]} {
            set vcs_status($dir) [exec cmd /c cd $dir && git status]
        } else {
            set vcs_status($dir) "Error: could not recognize VCS for $dir."
        }
    }
}

get_status

foreach {dir status} [array get vcs_status] {
    puts [string repeat = 40]
    puts "STATUS for $dir:"
    puts $status
}

This simple script should be self-explanatory. It logs the status for the list of repositories passed on the command line.

Given that I have more than a dozen repositories and the actual commands do more than the simple status call used in the simplified script above, the script took more than a few seconds to run. Patience not being one of my virtues, I have long wanted to speed up the script but couldn't be bothered to use exec or an open pipeline asynchronously, hook up the event handlers, and so on. Not that it's hugely difficult, but still...

However, the async / await commands from the promise package made speeding up the above sequential code almost trivial. The slowness in the script above stems from two factors. First, the sequential script does not utilize multiple processors. Second, even on a single-processor system, time is wasted while child processes wait for I/O. Both factors can be addressed very simply by using async / await in combination with pexec, the promise-based equivalent of exec.

Here is the modified script.

package require promise
namespace path promise

async get_status {} {
    global argv vcs_status
    while {[llength $argv]} {
        set argv [lassign $argv dir]
        set dir [file nativename [file normalize $dir]]
        if {[file exists [file join $dir .hg]]} {
            set vcs_status($dir) [await [pexec cmd /c cd $dir && hg status]]
        } elseif {[file exists [file join $dir .git]]} {
            set vcs_status($dir) [await [pexec cmd /c cd $dir && git status]]
        } else {
            set vcs_status($dir) "Error: could not recognize VCS for $dir."
        }
    }
}

eventloop [all* [get_status] [get_status] [get_status] [get_status]]

foreach {dir status} [array get vcs_status] {
    puts [string repeat = 40]
    puts "STATUS for $dir:"
    puts $status
}

The changes we have made are:

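  • Obviously, we first need to load the promise package itself. The namespace path line lets us call its commands (async, await, pexec, eventloop and so on) without the promise:: prefix.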

  • We then define the get_status procedure using the async command rather than proc.

  • The foreach loop is replaced with a while since there are now effectively multiple parallel loops that will be picking elements off the argv list.

  • The exec call is replaced by the await and pexec combination.

  • Finally, we add the eventloop line to start four asynchronous routines each of which will execute the equivalent of our original code. (I picked 4 because that's the number of processors on my machine.)
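The shared-queue idiom in the while loop may be easier to see in isolation. The following is a minimal sketch in plain Tcl with no promise package involved; the queue contents and the take_one helper are made up for illustration. lassign stores the first element in dir and returns the remaining elements, so each call removes exactly one item from the shared list, whichever loop makes it.

```tcl
# Hypothetical stand-in for the argv list shared by all workers.
set queue {repo-a repo-b repo-c}

# Pop one item off the shared queue, as each get_status loop does.
proc take_one {worker} {
    global queue taken
    # lassign stores the first element in dir and returns the rest
    set queue [lassign $queue dir]
    lappend taken($worker) $dir
}

# Two "workers" taking turns never see the same item twice.
take_one w1
take_one w2
take_one w1

puts "w1: $taken(w1)"   ;# w1: repo-a repo-c
puts "w2: $taken(w2)"   ;# w2: repo-b
puts "left: [llength $queue]"
```

A foreach loop, by contrast, would hand every worker its own pass over the complete list, so each repository would be processed once per worker.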

NOTE: The eventloop command is new in V1.1 of the promise package. It enters the Tcl event loop waiting for a promise to be settled. In older versions of the package, the equivalent would be something along the lines of

set gate [all* [get_status] [get_status] [get_status] [get_status]]
$gate done {set completed true}
vwait completed

Though there are several modifications, they are mechanical and close to being trivial. Perhaps the most important characteristic of this transformation is that the parallelized code closely resembles the structure of the original sequential version and is as easy to follow while performing significantly better, executing roughly three times as fast in my case.
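The number of workers does not have to be hard-coded. As a rough sketch, assuming the same V1.1 commands used in this post (async, await, ptimer, all, eventloop), the workers can be started in a loop and their promises collected in a list. The run_workers helper, the demo_worker command and the queue contents are hypothetical, and a 50 ms timer stands in for the pexec call so that the sketch runs without any repositories.

```tcl
package require promise
namespace path promise

# Hypothetical work queue standing in for argv.
set queue {a b c d e f g h}
set results {}

# Same loop shape as get_status, with a timer in place of pexec.
async demo_worker {} {
    global queue results
    while {[llength $queue]} {
        set queue [lassign $queue item]
        await [ptimer 50]
        lappend results $item
    }
}

# Hypothetical helper: start n workers and wait for all of them.
proc run_workers {n} {
    set workers {}
    for {set i 0} {$i < $n} {incr i} {
        lappend workers [demo_worker]
    }
    eventloop [all $workers]
}

run_workers 4
puts "processed [llength $results] items"
```

Note the use of all, which takes a single list of promises, rather than all*, which takes them as separate arguments.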

A second example

This post was actually motivated by a user's query regarding a test framework. The following Tcl pseudocode describes the general structure of scripts that emulate a client running against the server under test.

proc test_client {server iterations} {
    set conn [connect_to_server $server]
    
    for {set i 0} {$i < $iterations} {incr i} {
        check_response [query $conn QUERY1] 
        after [random_think_time] ;# Simulate thinking time
        check_response [query $conn QUERY2] 
        after [random_think_time]
        ...
        ...
    }

    close_connection $conn
}

For load testing that emulates multiple clients, the test harness runs the script in multiple threads or processes. This limits how many clients can be emulated on a system. The question was whether promises could be used to improve on this by running multiple clients within each Tcl interpreter without significant changes to the code structure.

Again, the async / await idiom is ideally suited for this and the equivalent code is shown below.

async test_client {server iterations} {
    set conn [connect_to_server $server]
    
    for {set i 0} {$i < $iterations} {incr i} {
        check_response [query $conn QUERY1] 
        await [ptimer [random_think_time]] ;# Simulate thinking time
        check_response [query $conn QUERY2] 
        await [ptimer [random_think_time]] ;# Simulate thinking time
        ...
        ...
    }

    close_connection $conn
}

Basically, the blocking after command is replaced by await on timers created with ptimer. This allows multiple test_client invocations to run concurrently. A single Tcl interpreter can then potentially emulate hundreds of clients, greatly increasing test scalability, with the following simple snippet.

for {set i 0} {$i < 100} {incr i} {
    lappend clients [test_client $server 10]
}
eventloop [all $clients]
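To try this out without a real server, the pieces can be stubbed. The sketch below assumes V1.1 of the promise package; connect_to_server, query, check_response, random_think_time and close_connection are hypothetical stand-ins that do no real work, and the finished counter and the demo-server name exist only for the demonstration.

```tcl
package require promise
namespace path promise

# Hypothetical stand-ins for the test framework's helpers.
proc connect_to_server {server} { return "conn-$server" }
proc query {conn q} { return "reply to $q" }
proc check_response {reply} {
    if {$reply eq ""} { error "empty reply" }
}
# Think time between 10 and 49 milliseconds.
proc random_think_time {} { expr {10 + int(rand() * 40)} }
proc close_connection {conn} {}

set finished 0

async test_client {server iterations} {
    set conn [connect_to_server $server]
    for {set i 0} {$i < $iterations} {incr i} {
        check_response [query $conn QUERY1]
        await [ptimer [random_think_time]] ;# Simulate thinking time
        check_response [query $conn QUERY2]
        await [ptimer [random_think_time]]
    }
    close_connection $conn
    incr ::finished
}

set clients {}
set start [clock milliseconds]
for {set i 0} {$i < 100} {incr i} {
    lappend clients [test_client demo-server 10]
}
eventloop [all $clients]
puts "$finished clients finished in [expr {[clock milliseconds] - $start}] ms"
```

Each client waits on 20 timers in total. Because the waits of different clients overlap, the elapsed time should be close to that of a single client rather than a hundred times as long.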

Hopefully this short post provides motivation for you to explore promises in more depth.