Performance benchmarks of Transd

There is a proven approach to achieving good software performance: make the software run well on old hardware, and on modern hardware it will fly.

Here are two performance benchmarks. The first is in Transd's area of specialization: processing structured data. The second is a lengthy numeric computation: the multiplication of two matrices.

The benchmarks were run on a machine with the following configuration:

Processor: Intel Celeron 220, 1.2 GHz (a single-core model released in 2007).
Memory: 2 GB.
Operating system: Linux with a reasonably recent 4.19 kernel.

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 22
model name	: Intel(R) Celeron(R) CPU          220  @ 1.20GHz
stepping	: 1
microcode	: 0x36
cpu MHz		: 1200.060
cache size	: 512 KB

$ lsmem
Memory block size:       128M
Total online memory:       2G
Total offline memory:      0B

The benchmarks are run with the "TREE3" Transd interpreter ("Transd Expression Evaluator"), which can be downloaded here.

Benchmark 1: SELECT query on a table

The Transd program with the benchmark can be obtained here.

Benchmark: a table containing 3.7 million values in 100,000 rows and 37 columns is read from a CSV file. A query is then run on the table, selecting rows based on the values in two columns.
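
For orientation, the shape of this workload (load a CSV, build an index on each of two numeric columns, then select rows by a range on both) can be sketched in Python. This is not the Transd program and says nothing about how TREE3 is implemented internally. The column names "age" and "salary", the sample rows, and the range bounds are all hypothetical and chosen only for illustration.

```python
import bisect
import csv
import io

# Hypothetical miniature dataset; the real benchmark reads a 100,000-row file.
SAMPLE_CSV = """name,age,salary
Ann,36.35,40119
Bob,52.10,41078
Cid,37.62,41724
Dee,35.06,99000
Eve,38.39,41796
"""


def load_table(text, numeric_columns):
    """Read CSV text into a list of dicts, converting the named columns to float."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for col in numeric_columns:
            row[col] = float(row[col])
        rows.append(row)
    return rows


def build_index(rows, column):
    """Return a list of (value, row_number) pairs sorted by value."""
    return sorted((row[column], i) for i, row in enumerate(rows))


def range_lookup(index, low, high):
    """Return the set of row numbers whose indexed value lies in [low, high]."""
    start = bisect.bisect_left(index, (low, -1))
    stop = bisect.bisect_right(index, (high, float("inf")))
    return {row_number for _, row_number in index[start:stop]}


def select(rows, indexes, conditions):
    """Intersect the range lookups for every (column, low, high) condition."""
    matches = None
    for column, low, high in conditions:
        found = range_lookup(indexes[column], low, high)
        matches = found if matches is None else matches & found
    return [rows[i] for i in sorted(matches or ())]


rows = load_table(SAMPLE_CSV, ["age", "salary"])
indexes = {col: build_index(rows, col) for col in ("age", "salary")}
result = select(rows, indexes,
                [("age", 35.0, 39.0), ("salary", 40000.0, 42000.0)])
for row in result:
    print(row["name"], row["age"], row["salary"])
```

The sketch separates the same three phases that the benchmark times separately: loading, index building, and the query itself. Once the sorted indexes exist, each range lookup is a pair of binary searches, which is one plausible reason why a query phase can be far cheaper than the load phase.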

The dataset for the test is taken from a site offering datasets for testing purposes:
The link to the dataset file:

To operate on the values in the table as numbers rather than strings, the value type of a column needs to be specified, at least for the columns that we are going to query. So, in the first line of the dataset file, we change the headers of two columns:

After that, we need to specify where our dataset is located. For this, we edit the program file and set the value of the tabfile variable to the path to our dataset file:

MainModule: {
    tabfile: "/mnt/dnl/100000-Records.csv",
    tabstr: "",

Then we perform the test:

$ tree3
Loading database...
20.44338 sec. elapsed
Building indexes...
1.698761 sec. elapsed
Perfoming query...
0.006182 sec. elapsed
UT[Drs., Cameron, Diggs, 36.35, 40119]
UT[Mr., Cory, Coyle, 37.62, 41078]
UT[Mr., Carol, Vangundy, 36.59, 41724]
UT[Mrs., Kristi, Beliveau, 38.39, 41796]
UT[Ms., Particia, Blair, 35.06, 41819]
UT[Mr., Wilber, Ransome, 37.67, 41994]
UT[Ms., Cathern, Pettit, 36.36, 42453]
UT[Mr., Lamar, Parson, 35.41, 42458]

We see that our little interpreter, running on a venerable 1.2 GHz Celeron, delivers industry-grade query performance: once the indexes are built, the query itself completes in about 6 milliseconds. This is not a trick or a hack. All values in the table and in the indexes are boxed and handled in a completely ordinary way, without any raw-data hackery.

This demonstrates one of the main design goals of Transd: to be as declarative as possible. The higher the level of a function from the user's point of view (such as a SELECT data query, a distinctive feature of some declarative languages), the more heavily its internal implementation is optimized.

To further illustrate this principle, we run a benchmark at the opposite end of the abstraction scale: lengthy number crunching in a hand-written implementation of matrix multiplication.

Benchmark 2: "Manual" matrix multiplication

The Transd program with the benchmark can be obtained here.

Benchmark: Two square matrices A and B are multiplied and the result is stored in matrix C.

Matrices are implemented as two-dimensional vectors of numbers of the 'double' type. In Transd, this type is written as 'Vector<Vector<Double>>'.

This benchmark is run in the same hardware environment as described above.

As a baseline for assessing the results, the same task is run in Python. The Python benchmark can be found here. The Python version used in the test is 3.7.4.
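
The linked Python program is not reproduced here. As a rough sketch of what a "manual" multiplication over nested lists looks like, the following assumes random input values, fixed seeds, and the helper names make_matrix and multiply. None of these come from the original benchmark, so timings from this sketch are not comparable with the figures reported in this article.

```python
import random
import time


def make_matrix(n, seed):
    """Build an n x n matrix as a list of lists of floats (hypothetical input)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n)] for _ in range(n)]


def multiply(a, b):
    """Multiply two square matrices with three explicit nested loops."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


n = 200  # the matrix size used for the results reported in this article
a = make_matrix(n, seed=1)
b = make_matrix(n, seed=2)

start = time.perf_counter()
c = multiply(a, b)
elapsed = time.perf_counter() - start
print(f"Computation time was: {elapsed:.6f} sec")
```

For n = 200 the innermost statement executes 200 x 200 x 200 = 8,000,000 times, so the measurement is dominated by interpreter overhead on element access and arithmetic rather than by any library routine.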

For matrices of size n = 200 (that is, 200 x 200), the following results were observed:

$ python
Computation time was: 5.470508 sec

$ tree3
Computation time was: 5.655717 sec

As we see, even in low-level data manipulation our interpreter performs respectably: its running time is within a few percent of Python's.
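
To put a number on the comparison, the two timings reported above can be turned into a ratio. The variable names below are illustrative; the values are copied from the output shown in this section.

```python
# Timings copied from the benchmark output above (seconds, n = 200).
python_seconds = 5.470508
transd_seconds = 5.655717

ratio = transd_seconds / python_seconds
slowdown_percent = (ratio - 1.0) * 100.0

print(f"TREE3 / Python time ratio: {ratio:.3f}")
print(f"TREE3 was slower by {slowdown_percent:.1f}%")
```

This works out to a difference of about 3.4%. Each figure comes from a single run, so part of that gap may simply be run-to-run variation.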