The ongoing progress report


2022-May-31 | May results

Containers
Two new container types have been added: Set and HashIndex.


2022-Apr-30 | April results

Data importing
Data importing functionality has been added. Data importing (or object deserialization) allows quick creation and initialization of program objects of custom classes using external textual data. A familiar example of this feature in other languages is deserializing JSON data into objects of custom classes.

Data importing in Transd is similar to JSON object deserialization, but is enhanced with some essential features, so that whole hierarchies of custom objects with complex structure can be loaded into a program from data files with minimal coding.
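To illustrate the general idea of data importing outside Transd, here is a minimal Python sketch that deserializes nested JSON into a small hierarchy of custom classes. It is only an analogy: the class names (Engine, Car), the field names and the car_from_dict helper are hypothetical, and nothing here reflects how Transd implements the feature.

```python
import json
from dataclasses import dataclass


@dataclass
class Engine:          # hypothetical inner class
    power_kw: int


@dataclass
class Car:             # hypothetical outer class that owns an Engine
    name: str
    engine: Engine


def car_from_dict(d: dict) -> Car:
    # Build the nested object first, then the object that contains it.
    return Car(name=d["name"], engine=Engine(**d["engine"]))


# Made-up sample data standing in for an external data file.
text = """
[
  {"name": "A", "engine": {"power_kw": 90}},
  {"name": "B", "engine": {"power_kw": 150}}
]
"""

cars = [car_from_dict(d) for d in json.loads(text)]
print(cars[1].engine.power_kw)
```

The point of the sketch is the shape of the task: textual records go in, fully initialized objects (including nested ones) come out, with very little hand-written glue.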


2022-Feb-28 | February results

'locals:' keyword
The keyword locals: placed just after the function signature in the lambda declaration introduces local function variables, whose scope is the function body.

Built-in methods
The (String::strip) method removes the specified characters from one or both ends of a string. Example.
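The exact Transd call syntax is not shown in this entry, so the behaviour is sketched here with Python's built-in str.strip, str.lstrip and str.rstrip, which perform a comparable operation. Treating them as equivalent to (String::strip) is an assumption made for illustration only.

```python
s = "--==Hello==--"

# The argument is a set of characters to remove, not a literal substring.
both = s.strip("-=")    # trim both ends
left = s.lstrip("-=")   # trim the left end only
right = s.rstrip("-=")  # trim the right end only

print(both, left, right)
```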


2022-Jan-31 | January results

REDUCE data query
The data-processing functionality of the language has been expanded with the "REDUCE" data query type. Like the other query types (SELECT and UPDATE), REDUCE works the same way on large multi-column datasets and on one-dimensional vectors, which are treated as one-column datasets.

#lang transd

MainModule: {
    v: ["apple", "orange", "avocado", "peach"],
    n: 0,

    _start: (λ 
        (tsd v reduce: ["(size col1)"] using: (λ i Int() (+= n i)))
        (textout n)
    )
    // Output:
    // 23
}
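For readers more familiar with other languages, the same computation can be sketched in Python with functools.reduce. This is an analogy to the Transd query above, not a description of how REDUCE is implemented: the accumulator here is passed explicitly instead of being a module field.

```python
from functools import reduce

v = ["apple", "orange", "avocado", "peach"]

# Fold the "column" into a single value: the sum of the string sizes.
n = reduce(lambda acc, s: acc + len(s), v, 0)

print(n)  # 23
```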

Lambdas: capturing
The Lambda type has been improved: lambda objects can now capture variables (including other lambdas) from the environment at the point of lambda definition, which enables lambdas to act as closures.

#lang transd
MainModule: {

Lii: typealias(Lambda<Int Int>()),

la: Lii(λ z Int() (ret (+ z 100))),

makeclos1: (λ fn Lii()
    (with clos Lii(λ [[fn]] i Int() 
            (textout (exec fn i)))
        (ret clos)
)   ),

_start: (λ
    (with clos (makeclos1 la)
        (exec clos 7) // prints 107
)   )
}

More examples can be found in the reference test suite: Lambdas.
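As a point of comparison, the closure example above can be mirrored in Python, where inner functions capture variables implicitly. The names follow the Transd sample. One deliberate difference, made so that the result is easy to check, is that the inner function also returns the value it prints.

```python
def la(z: int) -> int:
    return z + 100


def makeclos1(fn):
    # clos captures fn from the enclosing scope, as [[fn]] does in Transd.
    def clos(i: int) -> int:
        result = fn(i)
        print(result)
        return result
    return clos


clos = makeclos1(la)
value = clos(7)  # prints 107
```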

Accurate floating point calculations
The (incr) and (decr) methods were added to the Double type for enabling accurate calculations with interval arithmetic. For more information and for an example of using these methods see here.
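The idea behind this kind of interval arithmetic can be sketched in Python, assuming that (incr) and (decr) step a Double to the adjacent representable value (this entry does not spell out their semantics, so that is an assumption). The sketch uses math.nextafter, available from Python 3.9, and the helper name interval_add is hypothetical.

```python
import math
from fractions import Fraction


def interval_add(a, b):
    """Add two intervals (lo, hi), widening the result outward by one step."""
    lo = math.nextafter(a[0] + b[0], -math.inf)  # step down, like a "decr"
    hi = math.nextafter(a[1] + b[1], math.inf)   # step up, like an "incr"
    return lo, hi


x = (0.1, 0.1)
y = (0.2, 0.2)
lo, hi = interval_add(x, y)

# Fraction(float) is exact, so this checks that the true sum of the two
# stored doubles is enclosed by the computed interval.
exact = Fraction(0.1) + Fraction(0.2)
print(Fraction(lo) <= exact <= Fraction(hi))
```

Widening each bound outward guarantees that the rounding error of the addition cannot push the true result outside the interval.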

Built-in functions
Several built-in functions have been added into the core language.

Examples of using these functions can be seen here and here. As usual, they can also be found in the test suite with the command:

grep -r <FUNC_NAME> <TEST_SUIT_DIRECTORY>


2021-Dec-31 | December results

Lambdas
Transd supports anonymous function objects ("lambda" functions), which can be used in a great many places, for example:

   (with v ["C", "a", "D", "b", "A"]
       (sort v (lambda l String() r String() -> Bool()
              (ret (less (tolower l) (tolower r)))))
       (lout v)) //<= [a, A, b, C, D]

Now, anonymous lambdas are supplemented with named variables of Lambda type. Lambda objects can be copied, passed as arguments to functions, returned from functions, etc. That is, they are just another data type.

   (with lam Lambda<String Null>(λ s String() 
         (textout "Hello, " s "!") )

         (exec lam "World") //<= Hello, World!
   )

Pipeline calling order
A pipeline evaluation operator has been added. It makes it possible to compose functions by chaining function calls instead of nesting them. For example, suppose we have a string as data and want to perform a sequence of operations on it:

  1. Split it into words;
  2. Sort words in alphabetical order;
  3. Print sorted words to the screen.

In the usual way, this data processing flow can be arranged as follows:

(textout 
  (sort 
    (split someStr " ")))

The pipeline operator -| allows us to avoid deep nesting and to write function calls in the same order as operations follow logically:

(-| (split someStr " ") 
    (sort) 
    (textout))

The necessary condition for combining function calls in this way is that the return value of a function must be admissible as an argument to the next function.
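The chaining rule just described (each stage receives the return value of the previous one) can be sketched in Python with a small helper. The pipe function and the sample string are hypothetical and serve only to illustrate the evaluation order; they are not part of Transd.

```python
from functools import reduce


def pipe(value, *stages):
    # Feed the value through the stages left to right; each stage must
    # accept what the previous stage returned.
    return reduce(lambda acc, stage: stage(acc), stages, value)


some_str = "pear apple fig"

result = pipe(
    some_str,
    lambda s: s.split(" "),  # 1. split into words
    sorted,                  # 2. sort alphabetically
)
print(result)                # 3. print the sorted words
```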

The inclusion of the pipeline operator among the language constructs can make parentheses-grouping syntax arguably one of the cleanest and clearest:

Rosetta task: Anagrams

Formatted output
Formatting capabilities of text streams have been expanded with the following manipulators: fill:, :left, :right, :internal, prec:, :fixed, :boolalpha, :noboolalpha.

(with pr1 3.7 pr2 1.5
    (lout width: 20 fill: "." :left "Fried chicken" 
             prec: 2 :fixed pr1 " USD" )
    (lout width: 20 "Ice-cream" pr2 " USD" )
    (lout width: 28 fill: "-" )
    (lout width: 20 fill: " " :right "Total: " (+ pr1 pr2) " USD" )
 )
OUTPUT:

Fried chicken.......3.70 USD
Ice-cream...........1.50 USD
----------------------------
             Total: 5.20 USD
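For comparison, the same receipt can be produced with Python's format specification mini-language, where fill, alignment, width and precision are packed into one spec per value instead of being separate manipulators. This is an analogy for readers coming from other languages, not a description of the Transd implementation.

```python
pr1, pr2 = 3.7, 1.5

lines = [
    # '.<20' = fill with '.', left-align, width 20; '.2f' = fixed, 2 digits
    f"{'Fried chicken':.<20}{pr1:.2f} USD",
    f"{'Ice-cream':.<20}{pr2:.2f} USD",
    "-" * 28,
    # '>20' = right-align in width 20, padded with spaces
    f"{'Total: ':>20}{pr1 + pr2:.2f} USD",
]
print("\n".join(lines))
```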

Container methods
Another method, (coincide), has been added to the generic container methods. It takes two containers and returns the length of their common prefix (or suffix), that is, the number of equal elements counted from the beginning (or the end) of the containers.

(with v1 [0,1,2,3,4,5,6,7] v2 [0,1,2,3,4,4,5,6,7]
  (textout (coincide v1 v2)) // <= 5
)
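The computation described above can be sketched in Python as follows. The function names are hypothetical, and how (coincide) selects between prefix and suffix mode is not shown in this entry, so the sketch simply uses two separate functions.

```python
def common_prefix_len(a, b):
    # Count equal elements from the start until the first mismatch.
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def common_suffix_len(a, b):
    # The common suffix is the common prefix of the reversed sequences.
    return common_prefix_len(a[::-1], b[::-1])


v1 = [0, 1, 2, 3, 4, 5, 6, 7]
v2 = [0, 1, 2, 3, 4, 4, 5, 6, 7]
print(common_prefix_len(v1, v2))  # 5
print(common_suffix_len(v1, v2))  # 4
```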


2021-Nov-30 | November results

Expanding data processing capabilities
As the core of Transd has acquired its stable long-term shape, the emphasis of development has shifted to strengthening and expanding specialized parts of the language.

The data processing section has been supplemented with the Table class, and now includes the following classes:

  1. Object - represents a text block of name/value pairs, treated as a single object. Such objects in Transd are called "TSD objects" ("TSD" is short for "Transd Structured Data").

  2. TSDBase - a collection of TSD objects, which can be viewed as an ad-hoc "NoSQL" database, and which supports "SELECT" and "UPDATE" data queries.

  3. Table - a class for working with tabular data (e.g. CSV files). Table objects can be viewed as one-table databases, and they support "SELECT" and "UPDATE" data queries.

Performance increase
A good amount of laborious profiling and optimization has paid off in the form of considerable performance gains in both low-level and high-level operations of the language. Results can be seen here.

Portability increase
With the addition of Clang to the list of toolchains with which Transd builds seamlessly, the portability of the language has reached a practically ultimate level: on both Linux and Windows, Transd can be compiled using basically the same command:

Linux:

$ clang++ -std=c++14 -O3 src/transd.cpp src/main.cpp -D__LINUX__ -lpthread -o tree3

Windows:

PS clang++ -std=c++14 -O3 src/transd.cpp src/main.cpp -DWIN32 -o tree3


2021-Oct-31 | October results

(tsd-query) : UPDATE
Data query functionality has been expanded with the "UPDATE" query type. An example of using the UPDATE query can be seen here: Merge aggregate datasets.

DateTime type
Another type has been added to the type system: DateTime. Among other uses, this type is indispensable in data processing. An example of its usage can be found via the link in the previous paragraph.

Example Transd program
Audio Flow Combiner, written in Transd, has reached the first level of maturity. It plays the most intricate and interwoven flows smoothly for hours. The memory usage is astoundingly low (around 5 MB of committed memory on Linux).

2021-Sep-23 | "AFC": a program in Transd

For demonstration purposes, the first program in Transd has been created: "Audio Flow Combiner". This program illustrates one way of using Transd as a front-end language. Serving as a front-end for the popular and venerable "SoX" audio program, AFC can create finely grained audio flows from one or several audio files.

This program demonstrates many features of the language and the main principles of structuring a Transd program. Its source code can be used as a tutorial in Transd programming, and as a reference for the particular task of scripting a program's behaviour through the command line. The analysis of the program's source code can be found here.



2021-Aug-28 | TSD objects

A new Object type has been added to the type system. This type represents a Transd Structured Data (TSD) object: a named block of text data structured in the form of name/value pairs:

"order_255" : {
    "Orange Juice": 0.95,
    "Lunch Herb Crusted Salmon": 3.95,
    "Orange Chicken": 1.95,
    "Side of French Fries": 0.95, 
    total: 7.80
}

A text file with many such objects can be processed with Transd in the following way:

objs: Index<String Vector<Object>>(),

(rebind objs (group-by 
                (read-tsd-file "restaurant_orders") 
                (λ ob Object() -> String() 
                    (get-String ob "name"))))

Now the objs variable holds an Index of TSD objects, addressable by name (e.g. "order_255") and ready to be placed into a TSDBase or processed in some other way.
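The grouping step can be pictured with a short Python analogy, in which plain dictionaries stand in for TSD objects. The sample records and the group_by helper are hypothetical; in Transd the objects would come from (read-tsd-file).

```python
from collections import defaultdict

# Hypothetical stand-ins for TSD objects read from a file.
records = [
    {"name": "order_255", "total": 7.80},
    {"name": "order_256", "total": 4.10},
    {"name": "order_255", "total": 2.50},
]


def group_by(items, key):
    # Map each key to the list of items that produced it.
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)


objs = group_by(records, lambda ob: ob["name"])
print(len(objs["order_255"]))
```

As with Index<String Vector<Object>>, each name maps to a list, so several objects that share a name are kept together rather than overwritten.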

What gives this feature much more power is that it can be used for initializing Transd objects with text data.

For example, suppose we have a program that defines several classes and uses objects of those classes. We can then initialize these objects from a single text file, which can play the role of a database, an advanced configuration file, etc. Our program can look like this:

class ClassA : {
    field1: String(),
    field2: Int(),
    meth1: (lambda ... )
}

class ClassB : {
    field1: Vector<String>(),
    field2: Double(),
    meth1: (lambda ... )
}

MainModule: {
    objs: Index<String Vector<Object>>(),
    objA_1: ClassA(),
    objA_2: ClassA(),
    objB: ClassB(),

    _start: (λ (rebind objs (group-by 
                (read-tsd-file "database1") 
                (λ ob Object() -> String() 
                    (get-String ob "name"))))
    (load-from-object objA_1 (get (snd (get objs "objA_1")) 0))
    (load-from-object objA_2 (get (snd (get objs "objA_2")) 0))
    (load-from-object objB (get (snd (get objs "objB")) 0))
    ...

And the program's data file will look like this:

"objA_1": {
    class: "ClassA",
    field1: "string1",
    field2: 25
}
"objA_2": {
    class: "ClassA",
    field1: "string2",
    field2: 37
}
"objB": {
    class: "ClassB",
    field1: ["string3", "string4", "string5"],
    field2: 14.1
}

Thus, with the addition of the TSD Object type, it is now possible in Transd to implement the chain "TEXT_DATA --> TSD Object --> TSDBase database" with a minimum of code. That is, with a small amount of code we can create objects of custom structures, read them in from a text file, and work with them in TSDBase in a database way: sorting, querying, selecting, etc.



2021-Jul-08 | Type system

The type system has come much closer to its production shape. All fundamental types now have fixed sizes instead of platform-dependent ones, as was the case when long int was 4 bytes on Windows and 8 bytes on Linux. Strings remain platform-dependent, since wchar_t is irreplaceable for Unicode handling, and its size differs between Windows and Linux.

The built-in types are:

Byte        - unsigned, 1 byte
Int         - signed, 4 bytes
Long        - signed, 8 bytes
ULong       - unsigned, 8 bytes
String      - UTF-16 on Windows, UTF-32 on Linux
ByteArray   - container for native (unboxed) unsigned bytes
Vector<>    - generic sequence container
Index<>     - generic associative array
HashIndex<> - generic hash table
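The difference between fixed and platform-dependent sizes can be observed from Python's struct module, used here purely as an illustration. The mapping of Transd type names to struct format codes is an assumption based on the sizes listed above.

```python
import struct

# The "<" prefix selects standard sizes, which are the same on every platform.
sizes = {
    "Byte": struct.calcsize("<B"),   # unsigned, 1 byte
    "Int": struct.calcsize("<i"),    # signed, 4 bytes
    "Long": struct.calcsize("<q"),   # signed, 8 bytes
    "ULong": struct.calcsize("<Q"),  # unsigned, 8 bytes
}

# Without a prefix, the native C size is used: for "long" this is typically
# 4 bytes on 64-bit Windows and 8 bytes on 64-bit Linux.
native_long = struct.calcsize("l")

print(sizes, native_long)
```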

In other news, a new Rosetta task has been implemented, where the first usage of formatted output, mentioned in the previous blog entry, can be seen (as well as the new type system features): AKS test for primes



2021-Jun-30 | Formatted output

As the release is nearing, design decisions that were postponed until later stages are now being made. The concept of how to do formatted output in Transd has been worked out in detail.

I decided to go with the C++ model, which uses stream manipulators for output formatting, since the concept of manipulators aligns ideally with Transd's existing markers. Manipulators define how the nearest item after them should be formatted. So, the Transd code for formatted output of, e.g., a string and a double can look like this:

(textout width: 6 "Pi:" :sign prec: 4 PI)

Which will produce the output:

Pi:   +3.1416

This illustrates how important it is to make design decisions at the proper time. Markers appeared relatively late in the language, and if some other model (e.g. Python's) had been chosen for formatted output at an early stage, it certainly would not look and act so uniformly with the rest of the syntax. Compare the above example with a call of the 'substr' function:

(substr s from: after: last: "/" to: last: ".")


2021-Jun-23 | Data processing

Containers received another upgrade: now Transd supports the pipeline semantics for staged data processing.

An example of pipelined data processing is an implementation of the "Anagrams" Rosetta task. The contents of an English dictionary are read into the 'words' string, then this data is passed through several stages of processing, and as a result the list of words with the maximum number of anagrams is output:

#lang transd

MainModule: {
    _start: (λ 
        (with fs FileStream() words String()
            (open fs "/mnt/proj/tmp/unixdict.txt")
            (textin words fs)
            (textout 
                (snd (max-element
                (regroup-by 
                (group-by 
                (split words)
                    (λ s String() -> String() (sort s)))
                    (λ v Vector<String>() -> Int() (size v))))))))
}

Output:

[[abel, able, bale, bela, elba], 
[caret, carte, cater, crate, trace], 
[angel, angle, galen, glean, lange], 
[alger, glare, lager, large, regal], 
[elan, lane, lean, lena, neal], 
[evil, levi, live, veil, vile]]
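The stages of this pipeline (split, group by sorted letters, pick the largest groups) can be followed step by step in a Python sketch. The short word list is made up so that the example is self-contained; the Transd program above reads a full dictionary file instead.

```python
from collections import defaultdict

# A tiny made-up word list standing in for the dictionary file.
words = "abel able bale bela elba evil live veil cat act dog".split()

# Stage 1: group words by their sorted letters; anagrams share this key.
groups = defaultdict(list)
for w in words:
    groups["".join(sorted(w))].append(w)

# Stage 2: keep only the groups of maximum size.
largest = max(len(g) for g in groups.values())
result = [g for g in groups.values() if len(g) == largest]

print(result)
```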



2021-Jun-10 | Streams

Transd now supports three types of streams:

StringStream - for Unicode text data;

ByteStream - for raw bytes;

FileStream - file I/O.

All stream types work in a uniform way, support automatic conversion between strings and bytes, and can serve as source or destination for data. For example, the following call:

(textout to: MyStream "Some text")

can output text to any of the three stream types as well as to StdOut. For additional examples see the "Type system/Streams" folder in the test suite.
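A rough Python analogy of such a uniform output call is sketched below, with io.StringIO and io.BytesIO standing in for StringStream and ByteStream. The textout function here is a hypothetical Python helper, and the choice of UTF-8 for the string-to-bytes conversion is an assumption, not something stated about Transd.

```python
import io
import sys


def textout(text, to=sys.stdout):
    # Text streams take the string as is; byte streams get an encoded copy.
    if isinstance(to, io.TextIOBase):
        to.write(text)
    else:
        to.write(text.encode("utf-8"))  # assumed encoding


string_stream = io.StringIO()
byte_stream = io.BytesIO()

textout("Some text", to=string_stream)
textout("Some text", to=byte_stream)

print(string_stream.getvalue(), byte_stream.getvalue())
```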



2021-Mar-14 | Type aliases

Another new feature has been added: type aliases. Type aliasing creates an additional name for a data type without creating a new type. It is used to simplify the declaration of complex compound types or to give types descriptive names in a specific context.

Example:

#lang transd

MainModule: {
    Tuis : typealias( Tuple( Int() String() ) ),
    v: Vector( Tuis() ),  // declaration of v, which is used in _start below
    v1: Vector( Tuis() ),
    uv1: [[6, "a"]],
    uv2: [[1, "c"]],
    uv3: [[3, "h"]],
    uv4: [[2, "e"]],
    _start: (λ (with abc 1)
        (add v uv1) (add v uv2)(add v uv3)(add v uv4)
        (set v1 0 uv1) (set v1 1 uv2)(set v1 2 uv3)(set v1 3 uv4)
        (textout "v: " v "\n")
        (textout "v1: " v1 "\n")
        (textout (sort v :asc 
            (lambda l Tuis() r Tuis() -> Bool()
                (ret (less<Int> (get l 0) (get r 0))))) "\n")
    )
}
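For comparison, the same idea looks like this in Python (3.9 or later), where an alias is simply another name bound to an existing type. The alias name follows the Transd example; the rest is an illustrative sketch.

```python
# An alias: a second name for an existing type, not a new type.
Tuis = tuple[int, str]

v: list[Tuis] = [(6, "a"), (1, "c"), (3, "h"), (2, "e")]

# Sort ascending by the Int component, as the Transd lambda does.
by_first = sorted(v, key=lambda t: t[0])
print(by_first)
```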