The ongoing progress report

2021-Sep-23 | "AFC": a program on Transd

For demonstration purposes, the first program in Transd has been created: "Audio Flow Combiner" (AFC). This program illustrates one way of using Transd as a front-end language. Serving as a front end for the popular and venerable "SoX" audio program, AFC can create fine-grained audio flows from one or several audio files.

This program demonstrates many features of the language and the main principles of structuring a Transd program. Its source code can be used as a tutorial in Transd programming, and as a reference for the particular task of scripting a program's behaviour through the command line. The analysis of the program's source code can be found here.

2021-Aug-28 | TSD objects

A new Object type has been added to the type system. This type represents a Transd Structured Data (TSD) object: a named block of text data structured in the form of name/value pairs:

"order_255" : {
    "Orange Juice": 0.95,
    "Lunch Herb Crusted Salmon": 3.95,
    "Orange Chicken": 1.95,
    "Side of French Fries": 0.95, 
    total: 7.80
}

A text file with many such objects can be processed with Transd in the following way:

objs: Index<String Vector<Object>>(),

(rebind objs (group-by 
                (read-tsd-file "restaurant_orders") 
                (λ ob Object() -> String() 
                    (get-String ob "name"))))

The objs variable now holds an Index of TSD objects, addressable by name (e.g. "order_255") and ready to be placed into a TSDBase or processed in some other way.
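To make the grouping step concrete, here is a rough Python sketch of the same idea. It is only an analogue, not Transd code: it assumes the TSD objects have already been parsed into plain dicts that carry their name under a "name" key, and the group_by helper and the sample orders are hypothetical stand-ins.

```python
from collections import defaultdict

def group_by(objects, key_fn):
    """Group items into a dict of lists, keyed by key_fn(item)."""
    groups = defaultdict(list)
    for ob in objects:
        groups[key_fn(ob)].append(ob)
    return dict(groups)

# Hypothetical stand-in for the result of reading a TSD file.
orders = [
    {"name": "order_255", "total": 7.80},
    {"name": "order_256", "total": 1.95},
]

# Each name maps to a list of objects, mirroring Index<String Vector<Object>>.
objs = group_by(orders, lambda ob: ob["name"])
print(sorted(objs))
```

Every key maps to a list rather than to a single object, which is why the Transd examples take element 0 of the group when they need one particular object.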

What gives this feature much more power is that it can be used for initializing Transd objects with text data.

For example, suppose we have a program that defines several classes and uses objects of those classes. We can then initialize these objects from a single text file, which can play the role of a database, an advanced configuration file, etc. Our program can look like this:

class ClassA : {
    field1: String(),
    field2: Int(),
    meth1: (lambda ... )
}

class ClassB : {
    field1: Vector<String>(),
    field2: Double(),
    meth1: (lambda ... )
}

MainModule: {
    objs: Index<String Vector<Object>>(),
    objA_1: ClassA(),
    objA_2: ClassA(),
    objB: ClassB(),

    _start: (λ (rebind objs (group-by 
                (read-tsd-file "database1") 
                (λ ob Object() -> String() 
                    (get-String ob "name"))))
    (load-from-object objA_1 (get (snd (get objs "objA_1")) 0))
    (load-from-object objA_2 (get (snd (get objs "objA_2")) 0))
    (load-from-object objB (get (snd (get objs "objB")) 0))
    ...

And the program's data file will look like this:

"objA_1": {
    class: "ClassA",
    field1: "string1",
    field2: 25
}
"objA_2": {
    class: "ClassA",
    field1: "string2",
    field2: 37
}
"objB": {
    class: "ClassB",
    field1: ["string3", "string4", "string5"],
    field2: 14.1
}

Thus, with the addition of the TSD Object type, it is now possible in Transd to implement the chain "TEXT_DATA --> TSD Object --> TSDBase database" with a minimum of code. That is, with a small amount of code we can define objects of custom structure, read them in from a text file, and work with them in TSDBase in a database way: sorting, querying, selecting, etc.
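As a rough illustration of the "text data to program objects" step, the following Python sketch fills class instances from name/value data. It is an analogue under stated assumptions, not the Transd implementation: this load_from_object function is a hypothetical helper, the data is given as already parsed dicts, and the check of the "class" entry against the target's class is an assumption about how such a loader might behave.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ClassA:
    field1: str = ""
    field2: int = 0

@dataclass
class ClassB:
    field1: List[str] = field(default_factory=list)
    field2: float = 0.0

def load_from_object(target, data):
    """Copy name/value pairs from a parsed data object into target's fields."""
    # Assumption: the "class" entry must name the target's class.
    if data.get("class") != type(target).__name__:
        raise TypeError("data object does not describe " + type(target).__name__)
    for name, value in data.items():
        if name == "class":
            continue
        if not hasattr(target, name):
            raise AttributeError("unknown field: " + name)
        setattr(target, name, value)
    return target

# Hypothetical parsed form of part of the data file shown above.
database = {
    "objA_1": {"class": "ClassA", "field1": "string1", "field2": 25},
    "objB": {"class": "ClassB",
             "field1": ["string3", "string4", "string5"], "field2": 14.1},
}

obj_a1 = load_from_object(ClassA(), database["objA_1"])
obj_b = load_from_object(ClassB(), database["objB"])
print(obj_a1)
print(obj_b)
```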

2021-Jul-08 | Type system

The type system has moved much closer to its production shape. All fundamental types now have a fixed size instead of platform-dependent sizes: previously, for example, long int was 4 bytes on Windows and 8 bytes on Linux. Strings remain platform-dependent, since wchar_t is irreplaceable for Unicode handling and its size differs between Windows and Linux.

The built-in types are:

Byte        - unsigned, 1 byte
Int         - signed, 4 bytes
Long        - signed, 8 bytes
ULong       - unsigned, 8 bytes
String      - UTF-16 on Windows, UTF-32 on Linux
ByteArray   - container for native (unboxed) unsigned bytes
Vector<>    - generic sequence container
Index<>     - generic associative array
HashIndex<> - generic hash table
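The difference between fixed and platform-dependent sizes can be observed from Python, used here only as a convenient probe and not as part of Transd. The sketch prints the standard sizes that match the list above, then the native sizes of C long and wchar_t, whose values depend on the platform it runs on.

```python
import ctypes
import struct

# The "=" prefix selects standard sizes, identical on every platform.
print(struct.calcsize("=B"))  # unsigned, 1 byte  (like Byte)
print(struct.calcsize("=i"))  # signed, 4 bytes   (like Int)
print(struct.calcsize("=q"))  # signed, 8 bytes   (like Long)
print(struct.calcsize("=Q"))  # unsigned, 8 bytes (like ULong)

# Native sizes vary: C long and wchar_t differ between Windows and Linux.
native_long = ctypes.sizeof(ctypes.c_long)
native_wchar = ctypes.sizeof(ctypes.c_wchar)
print("long:", native_long, "wchar_t:", native_wchar)
```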

In other news: a new Rosetta task has been implemented, which shows the first use of the formatted output mentioned in the previous blog entry, as well as the new type system features: AKS test for primes



2021-Jun-30 | Formatted output

As the release nears, the design decisions that were postponed until the later stages are being made. The concept of how to do formatted output in Transd has now been worked out in detail.

I decided to go with the C++ model, which uses stream manipulators for output formatting, since the concept of manipulators aligns very well with Transd's already existing markers. Manipulators define how the nearest item after them should be formatted. So, the Transd code for formatted output of, e.g., a string and a double can look like this:

(textout width: 6 "Pi:" :sign prec: 4 PI)

Which will produce the output:

Pi:   +3.1416
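The rule "a manipulator affects the nearest item after it" can be sketched in a few lines of Python. This is a toy model, not Transd's implementation: the Width, Prec and Sign classes and this textout function are hypothetical, and the sketch assumes that padding is added on the right and that every setting is reset after one item.

```python
import math

class Width:
    def __init__(self, n):
        self.n = n

class Prec:
    def __init__(self, digits):
        self.digits = digits

class Sign:
    pass

def textout(*args):
    """Format args; each marker applies only to the next data item."""
    out = []
    width, prec, sign = 0, None, False
    for arg in args:
        if isinstance(arg, Width):
            width = arg.n
        elif isinstance(arg, Prec):
            prec = arg.digits
        elif isinstance(arg, Sign):
            sign = True
        else:
            if isinstance(arg, float):
                spec = "+" if sign else ""
                if prec is not None:
                    spec += ".%df" % prec
                text = format(arg, spec)
            else:
                text = str(arg)
            out.append(text.ljust(width))
            # Reset pending settings once they have been consumed.
            width, prec, sign = 0, None, False
    return "".join(out)

line = textout(Width(6), "Pi:", Sign(), Prec(4), math.pi)
print(line)
```

With these assumptions the call reproduces the output line shown above.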

This illustrates how important it is to make design decisions at the proper time. Markers appeared relatively late in the language, and if some other model (e.g. Python's) had been chosen for formatted output at an early stage, it certainly would not look and act as uniform with the rest of the syntax. Compare the above example with a call of the 'substr' function:

(substr s from: after: last: "/" to: last: ".")


2021-Jun-23 | Data processing

Containers received another upgrade: now Transd supports the pipeline semantics for staged data processing.

An example of pipelined data processing is an implementation of the "Anagrams" Rosetta task. The contents of an English dictionary are read into the 'words' string, then this data is passed through several stages of processing, and as a result the list of word groups with the maximum number of anagrams is printed:

#lang transd

MainModule: {
    _start: (λ 
        (with fs FileStream() words String()
            (open fs "/mnt/proj/tmp/unixdict.txt")
            (textin words fs)
            (textout 
                (snd (max-element
                (regroup-by 
                (group-by 
                (split words)
                    (λ s String() -> String() (sort s)))
                    (λ v Vector<String>() -> Int() (size v))))))))
}

Output:

[[abel, able, bale, bela, elba], 
[caret, carte, cater, crate, trace], 
[angel, angle, galen, glean, lange], 
[alger, glare, lager, large, regal], 
[elan, lane, lean, lena, neal], 
[evil, levi, live, veil, vile]]
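For readers who do not know Transd yet, the same stages can be written out in Python. This is an analogue for comparison only: the group_by helper is hypothetical, and a short inline word list stands in for the dictionary file.

```python
from collections import defaultdict

def group_by(items, key_fn):
    """Group items into a dict of lists, keyed by key_fn(item)."""
    groups = defaultdict(list)
    for item in items:
        groups[key_fn(item)].append(item)
    return dict(groups)

def largest_anagram_groups(text):
    words = text.split()                                        # split
    by_letters = group_by(words, lambda w: "".join(sorted(w)))  # group-by
    by_size = group_by(by_letters.values(), len)                # regroup-by
    return by_size[max(by_size)]                                # max-element, snd

# Stand-in for the dictionary contents.
sample = "abel able bale caret carte cater evil live"
print(largest_anagram_groups(sample))
```

Each stage consumes the container produced by the previous one: words are first grouped by their sorted letters, the resulting groups are regrouped by size, and the entry with the largest size is selected.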



2021-Jun-10 | Streams

Transd now supports three types of streams:

StringStream - for Unicode text data;

ByteStream - for raw bytes;

FileStream - file I/O.

All stream types work in a uniform way, support automatic conversion between strings and bytes, and can serve as source or destination for data. For example, the following call:

(textout to: MyStream "Some text")

can output text to any of the three stream types as well as to StdOut. For additional examples see the "Type system/Streams" folder in the test suite.
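The idea of one output call serving several stream types can be sketched in Python with the standard io module. This is an analogue, not Transd code: the textout function below is a hypothetical helper, and encoding to UTF-8 for byte streams is an assumption made for the sketch.

```python
import io
import sys

def textout(text, to=None):
    """Write text to a text stream or a byte stream; default is stdout."""
    if to is None:
        to = sys.stdout
    if isinstance(to, (io.RawIOBase, io.BufferedIOBase)):
        # Byte-oriented stream: convert the string to bytes first.
        to.write(text.encode("utf-8"))
    else:
        to.write(text)

string_stream = io.StringIO()
byte_stream = io.BytesIO()

textout("Some text", to=string_stream)
textout("Some text", to=byte_stream)
textout("Some text\n")

print(string_stream.getvalue())
print(byte_stream.getvalue())
```

A file opened with open(path, "w") or open(path, "wb") could be passed as the destination in the same way.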



2021-Mar-14 | Type aliases

Another new feature has been added: type aliases. Type aliasing creates an additional name for a data type without creating a new type. It is used for simplifying the syntax of declaring complex compound types or for giving types descriptive names in a specific context.

Example:

#lang transd

MainModule: {
    Tuis : typealias( Tuple( Int() String() ) ),
    v: Vector( Tuple( Int() String() ) ),
    v1: Vector( Tuis() ),
    uv1: [[6, "a"]],
    uv2: [[1, "c"]],
    uv3: [[3, "h"]],
    uv4: [[2, "e"]],
    _start: (λ (with abc 1)
        (add v uv1) (add v uv2)(add v uv3)(add v uv4)
        (set v1 0 uv1) (set v1 1 uv2)(set v1 2 uv3)(set v1 3 uv4)
        (textout "v: " v "\n")
        (textout "v1: " v1 "\n")
        (textout (sort v :asc 
            (lambda l Tuis() r Tuis() -> Bool()
                (ret (less<Int> (get l 0) (get r 0))))) "\n")
    )
}
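The same notion exists in many languages. As a comparison only, a Python sketch of an alias for a tuple of an integer and a string could look like this; the sort_by_number function is a hypothetical helper, and the data mirrors the values used above.

```python
from typing import List, Tuple

# An alias: a second name for an existing type, not a new type.
Tuis = Tuple[int, str]

def sort_by_number(items: List[Tuis]) -> List[Tuis]:
    """Sort pairs in ascending order of their integer component."""
    return sorted(items, key=lambda pair: pair[0])

v1: List[Tuis] = [(6, "a"), (1, "c"), (3, "h"), (2, "e")]
result = sort_by_number(v1)
print(result)

# The alias and the spelled-out type compare equal.
print(Tuis == Tuple[int, str])
```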