Performance test of Transd


The Transd speed benchmarks described on this page give a rough assessment of the language's performance. The modest specs of the hardware chosen for the benchmarks show that the language can be used on a very wide range of machines. One benchmark processes structured data; the other multiplies two matrices.

The benchmarks are run on a machine with the following specs:

Processor: Intel Celeron 1.2GHz from 2002.
Memory: 2 GB.
Operating system: Linux with a reasonably modern 4.19 kernel.


$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 22
model name	: Intel(R) Celeron(R) CPU          220  @ 1.20GHz
stepping	: 1
microcode	: 0x36
cpu MHz		: 1200.060
cache size	: 512 KB

$ lsmem
Memory block size:       128M
Total online memory:       2G
Total offline memory:      0B

The benchmarks are run with the "TREE3" Transd interpreter ("Transd Expression Evaluator"), which can be downloaded here.


Benchmark 1: SELECT query on a table


Benchmark: a table containing 3.7 million values in 100,000 rows and 37 columns is read from a CSV file. A query is then performed on the table, selecting rows based on the values in two columns.

The Transd program with the benchmark can be obtained here.

The dataset for the test is taken from a site offering datasets for testing purposes:
https://eforexcel.com/wp/downloads-16-sample-csv-files-data-sets-for-testing/
The link to the dataset file:
https://eforexcel.com/wp/wp-content/uploads/2017/07/100000-Records.zip

To operate on numbers in the table (as opposed to strings), the type of the values in a column needs to be specified, at least for the columns we are going to query. So, in the first line of the dataset file, we change the headers of two columns:

After that, we need to specify where our dataset is located. For this, we edit the program file and set the value of the tabfile variable to the path to our dataset file:

MainModule: {
    tabfile: "/mnt/dnl/100000-Records.csv",
    tabstr: "",

Then we perform the test:

$ tree3 perftest.td
 
Loading database...
20.44338 sec. elapsed
Building indexes...
1.698761 sec. elapsed
Perfoming query...
0.006182 sec. elapsed
UT[Drs., Cameron, Diggs, 36.35, 40119]
UT[Mr., Cory, Coyle, 37.62, 41078]
UT[Mr., Carol, Vangundy, 36.59, 41724]
UT[Mrs., Kristi, Beliveau, 38.39, 41796]
UT[Ms., Particia, Blair, 35.06, 41819]
UT[Mr., Wilber, Ransome, 37.67, 41994]
UT[Ms., Cathern, Pettit, 36.36, 42453]
UT[Mr., Lamar, Parson, 35.41, 42458]

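We see that, for this class of hardware, the Transd interpreter performs well: loading the 3.7 million values takes about 20 seconds, building the indexes takes about 1.7 seconds, and the query itself completes in about 6 milliseconds.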
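
For readers who want to see the shape of the three timed phases (load, index, query), here is a minimal Python sketch. It is not the Transd program and says nothing about how Transd implements its tables. The column names, the query ranges, the tiny inline dataset and the sorted-list index are all assumptions made for illustration.

```python
import csv
import io
import time
from bisect import bisect_left, bisect_right

# Tiny stand-in for the real dataset; the column names are hypothetical.
SAMPLE_CSV = """First Name,Age,Salary
Ann,36.5,41000
Bob,52.0,41500
Cy,37.1,90000
Dee,35.2,42000
"""


def load_table(fileobj, numeric_columns):
    """Read CSV rows into dicts, converting the listed columns to float."""
    rows = []
    for row in csv.DictReader(fileobj):
        for col in numeric_columns:
            row[col] = float(row[col])
        rows.append(row)
    return rows


def build_index(rows, column):
    """Return (sorted keys, row positions in the same order) for one column."""
    pairs = sorted((row[column], pos) for pos, row in enumerate(rows))
    return [key for key, _ in pairs], [pos for _, pos in pairs]


def range_lookup(index, low, high):
    """Positions of rows whose indexed value lies in [low, high]."""
    keys, positions = index
    return set(positions[bisect_left(keys, low):bisect_right(keys, high)])


def timed(label, func):
    """Run func(), print the label and elapsed time, return the result."""
    start = time.perf_counter()
    result = func()
    print(f"{label}\n{time.perf_counter() - start:.6f} sec. elapsed")
    return result


def run_demo():
    rows = timed("Loading database...",
                 lambda: load_table(io.StringIO(SAMPLE_CSV), ["Age", "Salary"]))
    indexes = timed("Building indexes...",
                    lambda: {c: build_index(rows, c) for c in ("Age", "Salary")})
    # Hypothetical query: rows matching a range on each of the two columns.
    hits = timed("Performing query...",
                 lambda: sorted(range_lookup(indexes["Age"], 35, 39)
                                & range_lookup(indexes["Salary"], 40000, 42500)))
    return [rows[pos]["First Name"] for pos in hits]


if __name__ == "__main__":
    print(run_demo())
```

With an index of this kind, the query cost depends on the number of matching rows rather than on the table size, which is one plausible reason a query phase can be far shorter than the load phase.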


Benchmark 2: "Manual" matrix multiplication


Benchmark: Two square matrices A and B are multiplied and the result is stored in matrix C.

Matrices are implemented as two-dimensional vectors of numbers of type 'double'. In Transd, this type is written as 'Vector<Vector<Double>>'.

The Transd program with the benchmark can be obtained here.

This benchmark is run in the same hardware environment as described above.

As a basis for assessing the benchmark results, we use a comparison with the same task implemented in Python. The Python benchmark can be found here. The Python version used in the test is 3.7.4.
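
To make the task concrete, here is a minimal sketch of what a "manual" matrix multiplication benchmark might look like in Python, with matrices stored as lists of lists. This is not the original matrices.py: the function names, the random fill and the seeds are assumptions, and only the triple loop and the n = 200 size follow the description on this page.

```python
import random
import time


def make_matrix(n, seed):
    """Build an n x n matrix of pseudo-random floats as a list of lists."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n)] for _ in range(n)]


def multiply(a, b):
    """Multiply two square matrices with an explicit triple loop."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def run_benchmark(n=200):
    """Time only the multiplication, not the matrix setup."""
    a = make_matrix(n, seed=1)
    b = make_matrix(n, seed=2)
    start = time.perf_counter()
    c = multiply(a, b)
    elapsed = time.perf_counter() - start
    return c, elapsed


if __name__ == "__main__":
    _, elapsed = run_benchmark()
    print(f"Computation time was: {elapsed:.6f} sec")
```

The loop performs n^3 multiply-add steps (8 million for n = 200), so the benchmark mostly measures the interpreter's overhead on element access and arithmetic rather than any library routine.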

For n = 200 the following results were observed:

$ python matrices.py
Computation time was: 5.470508 sec

$ tree3 matrices.td
Computation time was: 5.655717 sec

As we see, in low-level data manipulation the speed of the Transd interpreter is close to Python's: in this run Transd was about 3% slower (5.66 sec versus 5.47 sec).