SON-OF-UNBABTIZED

Welcome to SON-OF-UNBABTIZED, or SOU for short. It's not really related to UNBABTIZED; it is more of an illegitimate child of Sorted!. But since Sorted! is not faithful to the Roman Catholic church, it is fitting to call this language SON-OF-UNBABTIZED, even though that is the wrong spelling.

It's not an Essies candidate, either, because I feel it has too much of a Sorted! feel to it.

So, without further ado, here are SOU's

FEATURES

People want readable code, I give you readable code: code that looks like your good old COBOL on steroids, code that makes you want to memorize it and recite it to your as yet hopefully unborn grandchildren. Here is the obligatory sample code, proof that the compiler is 100% bug-free and all that.

QUICKSORT IS A FUNCTION OF 3 PARAMETERS THAT IMPLEMENTS IDispatch.
IT USES 10 LOCAL VARIABLES.
THE STATEMENT CALLING QUICKSORT NOT 5 NOT 1 NOT 0 IS 
LABELED ACID. THE STATEMENT CALLING QUICKSORT NOT 2 
NOT 4 NOT 0 IS LABELED JUNKIES.
THE STATEMENT IGNORE IS NOT NOT 0 IS LABELED PHUTURE.
THE STATEMENT IGNORE IS + NOT 4 IS LABELED WILL.
THE STATEMENT NOT 3 IS NOT NOTHING IS LABELED STAY.
THE STATEMENT NOTHING IS <= NOT 3 IS LABELED ALIVE.
THE STATEMENT NOT 9 IS NOT NOTHING IS LABELED RICHIE.
THE STATEMENT IGNORE IS / 2 IS LABELED KEN.
THE STATEMENT NOT 4 IS <= NOT 5 IS LABELED ISHII.
THE STATEMENT NOT 4 IS NOT NOT 1 IS LABELED FRANKIE.
THE STATEMENT NOT 1 IS >= NOT 5 IS LABELED KNUCKLES.
THE STATEMENT NOT 4 IS >= NOT 2 IS LABELED JUAN.
THE STATEMENT NOTHING IS NOT NOT 6 IS LABELED ATKINS.
THE STATEMENT NOTHING IS >= NOT 3 IS LABELED HAWTIN.
THE STATEMENT NOT 5 IS NOT NOT 2 IS LABELED LFO.
THE STATEMENT IGNORE IS + NOT 5 IS LABELED KEVIN.
THE STATEMENT IGNORE IS + NOT 2 IS LABELED SAUNDERSON.
THE STATEMENT NOT 4 IS > NOT 5 IS LABELED DAVE.
THE STATEMENT NOT 6 IS NOT NOTHING IS LABELED CLARKE.
THE STATEMENT NOTHING IS NOT NOT 9 IS LABELED AUX88.
THE STATEMENT IGNORE IS + NOT 0 IS LABELED BLAKE.
THE STATEMENT IGNORE IS NOT NOT 1 IS LABELED BAXTER.
THE STATEMENT THAT RETURNS IGNORE IS LABELED DETROIT.
THE STATEMENT THAT RETURNS 0 IS LABELED HOUSE.
THE STATEMENT THAT RETURNS 1 IS LABELED TECHNO.
THE STATEMENT IGNORE IS NOT NOT 4 IS LABELED PHOTEK.
THE STATEMENT IGNORE IS + 1 IS LABELED MORALES.
THE STATEMENT NOT 4 IS NOT IGNORED IS LABELED RANDOMXS.
THE STATEMENT IGNORE IS NOT NOT 5 IS LABELED TODDTERRY.
THE STATEMENT IGNORE IS - 1 IS LABELED CJBOLLAND.
THE STATEMENT NOT 5 IS NOT IGNORED IS LABELED DARRENEMERSON.
THE STATEMENT STATING HOUSE,STAY,BLAKE,KEN,SAUNDERSON,
BAXTER,LFO,FRANKIE IS LABELED ANDYC.
THE STATEMENT STATING DETROIT,HAWTIN,WILL,PHUTURE IS 
LABELED GROOVERIDER. THE STATEMENT STATING TECHNO,
RANDOMXS,MORALES,PHOTEK IS LABELED TECHNICAL.
THE STATEMENT STATING DETROIT,ALIVE,KEVIN,PHUTURE IS
LABELED ITCH. THE STATEMENT STATING TECHNO,
DARRENEMERSON,CJBOLLAND,TODDTERRY IS LABELED DANNY.
THE STATEMENT STATING DETROIT,DAVE IS LABELED BREAKS.
THE STATEMENT STATING HOUSE,DARRENEMERSON,CJBOLLAND,
TODDTERRY,RANDOMXS,MORALES,
PHOTEK,ATKINS,KEVIN,PHUTURE,AUX88,WILL,PHUTURE,RICHIE,
KEVIN,PHUTURE,CLARKE,WILL,PHUTURE IS LABELED HARDCORE.
THE STATEMENT STATING DETROIT,ISHII IS LABELED JUNGLE.
THE STATEMENT STATING DETROIT,KNUCKLES IS LABELED DMXKREW.
THE STATEMENT STATING HOUSE,ACID IS LABELED SPICELAB.
THE STATEMENT STATING DETROIT,JUAN IS LABELED HUMATE.
THE STATEMENT STATING TECHNO,JUNKIES IS LABELED HARDFLOOR.
THE STATEMENT STATING HOUSE IS LABELED FORCEINC.
THE STATEMENT GOING FROM ANDYC TO 12 IS LABELED JOHN.
THE STATEMENT GOING FROM GROOVERIDER TO 3 IS LABELED DIGWEED.
THE STATEMENT GOING FROM TECHNICAL TO 1 IS LABELED MRC.
THE STATEMENT GOING FROM ITCH TO 5 IS LABELED METALHEADZ.
THE STATEMENT GOING FROM DANNY TO 3 IS LABELED GOLDIE.
THE STATEMENT GOING FROM BREAKS TO 7 IS LABELED SPEEDFREAK.
THE STATEMENT GOING FROM HARDCORE TO 12 IS LABELED SVENVAETH.
THE STATEMENT GOING FROM JUNGLE TO 1 IS LABELED MODELL500.
THE STATEMENT GOING FROM DMXKREW TO 10 IS LABELED DJUNGLEFEVER.
THE STATEMENT GOING FROM SPICELAB TO 12 IS LABELED ALEC.
THE STATEMENT GOING FROM HUMATE TO 12 IS LABELED EMPIRE.
THE STATEMENT GOING FROM HARDFLOOR TO 12 IS LABELED A1PEOPLE.
THE STATEMENT GOING FROM FORCEINC TO 12 IS LABELED KLUBRADIO.
THE STATEMENT COMING FROM KLUBRADIO,A1PEOPLE,EMPIRE,
ALEC,DJUNGLEFEVER,MODELL500, SVENVAETH,SPEEDFREAK,GOLDIE,
METALHEADZ,MRC,DIGWEED,JOHN IS LABELED IDispatch.
MAIN IS A FUNCTION OF NO PARAMETERS THAT IMPLEMENTS IUnknown.
IT USES 1 LOCAL VARIABLE.
THE STATEMENT NOT 0 IS NOT 0 IS LABELED DERRICKMAY.
THE STATEMENT IGNORE IS NOT NOT 0 IS LABELED STACEY.
THE STATEMENT NOTHING IS NOT ANYTHING IS LABELED YOUNG.
THE STATEMENT NOT 0 IS NOT IGNORED IS LABELED ACID.
THE STATEMENT SAYING NOTHING IS LABELED MILLS.
THE STATEMENT SAYING "BEFORE QUICKSORT:" IS LABELED DJRUSH.
THE STATEMENT SAYING "AFTER QUICKSORT:" IS LABELED CLAUDE.
THE STATEMENT IGNORE IS < 100 IS LABELED PULLEN.
THE STATEMENT IGNORE IS + 1 IS LABELED JEFF.
THE STATEMENT IGNORE IS NOT 0 IS LABELED WARRIOR.
THE STATEMENT CALLING QUICKSORT 99 0 0 IS LABELED BELTRAM.
THE STATEMENT SAYING 13 AS CHAR IS LABELED JOEY.
THE STATEMENT THAT RETURNS IGNORE IS LABELED UNDERGROUND.
THE STATEMENT THAT RETURNS 0 IS LABELED RESISTANCE.
THE STATEMENT STATING RESISTANCE,DERRICKMAY IS LABELED ANDYC.
THE STATEMENT STATING UNDERGROUND,PULLEN,ACID,JEFF,YOUNG,
STACEY IS LABELED GROOVERIDER. THE STATEMENT STATING RESISTANCE,
ACID,WARRIOR,DJRUSH IS LABELED TECHNICAL. THE STATEMENT STATING
UNDERGROUND,PULLEN,ACID,JEFF,MILLS,STACEY IS LABELED ITCH. THE 
STATEMENT STATING RESISTANCE,DERRICKMAY,CLAUDE,BELTRAM,
JOEY IS LABELED DANNY. THE STATEMENT STATING UNDERGROUND,
PULLEN,ACID,JEFF,MILLS,STACEY IS LABELED BREAKS.
THE STATEMENT STATING RESISTANCE,JOEY IS LABELED HARDCORE.
THE STATEMENT GOING FROM ANDYC TO 0 IS LABELED JOHN.
THE STATEMENT GOING FROM GROOVERIDER TO 1 IS LABELED DIGWEED.
THE STATEMENT GOING FROM TECHNICAL TO 0 IS LABELED MRC.
THE STATEMENT GOING FROM ITCH TO 3 IS LABELED METALHEADZ.
THE STATEMENT GOING FROM DANNY TO 0 IS LABELED GOLDIE.
THE STATEMENT GOING FROM BREAKS TO 5 IS LABELED SPEEDFREAK.
THE STATEMENT GOING FROM HARDCORE TO 0 IS LABELED SVENVAETH.
THE STATEMENT COMING FROM SVENVAETH,SPEEDFREAK,GOLDIE,METALHEADZ,
MRC,DIGWEED,JOHN IS LABELED IUnknown.

It's pretty obvious that this is yer old recursive quicksort algorithm. The instructions are not logically grouped, but can be written *in any order* you like. The program thus separates the statements from the order of their logical execution.

People wanted comments, I give you comments. You can use the whole Malayalam character set from 0xD00 to 0xD7F for comments. See http://www.unicode.org/charts/PDF/U0D00.pdf. Incidentally, the only feasible way to insert comments seems to be using a hex editor. Not quite so incidentally, I've been rather involved in a hex editor project recently, called "The Free Hexeditor" (see http://www.kibria.de/frhed.html).
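For those without a hex editor at hand, here is a minimal Python sketch of how a tool might drop such comments before parsing. It assumes only what is stated above, namely that every character from 0xD00 to 0xD7F counts as a comment; the function name strip_comments is made up for illustration and is not part of the SOU distribution.

```python
import re

# Every code point in the range 0xD00..0xD7F is treated as comment text.
COMMENT_CHARS = re.compile("[\u0D00-\u0D7F]+")


def strip_comments(source: str) -> str:
    """Return the source text with all comment characters removed."""
    return COMMENT_CHARS.sub("", source)


if __name__ == "__main__":
    commented = "IT USES 1 LOCAL VARIABLE.\u0D15\u0D2E\u0D28"
    print(strip_comments(commented))
```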

People wanted input, I give you input. Besides hardcoded input, which is next to godliness, SOU supports user input: the code listens for data at the absolute address 0xbaadbeef. It is a beginner's task to write a debugger that will catch read attempts to this invalid address and return the user data instead, which, because it is so trivial, is left as an exercise to the reader.

IMPLEMENTATION DETAILS

The SOU C Compiler is a compiler, not an interpreter. I think the only other compiler I have written so far was for Sorted!, so it's not exactly rocket science. It has an option to produce optimized code, which results in randomly inserted printf statements: "I feel great", "Life is good", ":)" and "Happy Happy Joy Joy" (a small reference to Ren & Stimpy).

The SOU C Compiler uses GOTOs exclusively. The generated code uses GOTOs exclusively. I think it's called "Proof of Concept".

The SOU C compiler understands source code in UTF-16LE and nothing else. If the file doesn't have a BOM that indicates UTF-16LE, the compiler will fail. Note that the *syntax* would work with plain 7-bit 1960-style ASCII (see http://www.bobbemer.com/BEMER-CV.HTM); it's just that the SOU C compiler refuses to load anything else.
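Since a missing BOM is fatal, it may help to check a file before feeding it to the compiler. The following Python sketch shows one way to test for the UTF-16LE byte order mark (the bytes FF FE) and to write a source file that carries it. The helper names are hypothetical, and whether the real compiler checks anything beyond these two bytes is not known.

```python
import codecs


def has_utf16le_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-16LE byte order mark (FF FE)."""
    return data.startswith(codecs.BOM_UTF16_LE)


def encode_sou_source(text: str) -> bytes:
    """Encode source text as UTF-16LE with an explicit leading BOM."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


if __name__ == "__main__":
    encoded = encode_sou_source("IT USES 1 LOCAL VARIABLE.")
    print(has_utf16le_bom(encoded))                         # BOM present
    print(has_utf16le_bom(b"IT USES 1 LOCAL VARIABLE."))   # plain ASCII, no BOM
```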

The SOU C compiler is written using Win32 functions and is not portable. Being portable is way too fashionable to be of real value. Lately, all new languages have been portable, so I think this is a nice individualistic turn away from that. A Python interpreter might follow, but I will do my very best to make sure it's a nonportable version in Python (how is that for an achievement).

DOWNLOAD

You can download the whole SOU distribution here. Note that you need a) Win32, and b) the Microsoft Visual C++ 6.0 compiler to use this tool.

TUTORIAL

The code is a wild bunch of statements, separated by dots. That is about all the UNBABTIZED heritage there is in SOU.

CODE = { STATEMENT '.' }

Function headers

Actually, it's more like a series of functions that are implemented by statements. Each function has a prologue (if that word exists) and a body. The following STATEMENT will declare a function

FUNC_DEC = NAME 'IS A FUNCTION OF' (NUMBER|'NO')
    ('PARAMETER'|'PARAMETERS') 'THAT IMPLEMENTS' NAME.

So, for example,

MAIN IS A FUNCTION OF NO PARAMETERS THAT IMPLEMENTS IUnknown.

is a valid statement. In fact, it's a required statement for any SOU code, because that is what will be executed. Another example would be

QUICKSORT IS A FUNCTION OF 3 PARAMETERS THAT IMPLEMENTS IDispatch.

which is pretty self-explanatory. All statements following this statement up to the next function declaration belong to the same function. They can be in any order you like.
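As an illustration of the FUNC_DEC rule, here is a small Python sketch that recognizes a function header with a regular expression. It is not how the SOU C compiler does it. It assumes that CHAR in the NAME rule means an ASCII letter and that keywords are separated by ordinary whitespace; parse_func_dec is a made-up name.

```python
import re

# NAME = CHAR{DIGIT|CHAR}; CHAR is assumed here to mean an ASCII letter.
_NAME = r"[A-Za-z][A-Za-z0-9]*"

FUNC_DEC = re.compile(
    rf"\s*(?P<name>{_NAME})\s+IS\s+A\s+FUNCTION\s+OF\s+(?P<count>\d+|NO)\s+"
    rf"PARAMETERS?\s+THAT\s+IMPLEMENTS\s+(?P<interface>{_NAME})\s*\.\s*"
)


def parse_func_dec(statement):
    """Return (name, parameter_count, interface), or None if this is no function header."""
    match = FUNC_DEC.fullmatch(statement)
    if match is None:
        return None
    count = 0 if match["count"] == "NO" else int(match["count"])
    return match["name"], count, match["interface"]


if __name__ == "__main__":
    print(parse_func_dec("MAIN IS A FUNCTION OF NO PARAMETERS THAT IMPLEMENTS IUnknown."))
    print(parse_func_dec("QUICKSORT IS A FUNCTION OF 3 PARAMETERS THAT IMPLEMENTS IDispatch."))
```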

You can say that a function uses local variables.

LV_DECL = 'IT USES' NUMBER ('LOCAL VARIABLE'|'LOCAL VARIABLES').

So,

IT USES 10 LOCAL VARIABLES.

is pretty self-explanatory, again. If omitted, the function has no local variables other than IGNORE.

Function bodies

Most statements that do stuff look like this

STMT_DECL = 'THE STATEMENT' STMT 'IS LABELED' NAME.

For example, in

THE STATEMENT NOTHING IS NOT ANYTHING IS LABELED S8.

NOTHING IS NOT ANYTHING is the do-stuff-part of the statement, so to speak, and S8 is the name.

You can do assignments:

ASSIGNMENT = VARIABLE 'IS NOT' (LITERAL | VARIABLE).

That is philosophically true, because at the time of writing that statement, the left side IS NOT equal to the right side, otherwise the statement would be redundant and should be omitted in the first place. So you probably already guessed that NOTHING IS NOT ANYTHING is an assignment of ANYTHING to the variable NOTHING.

Variables

There are several types of variables you can use.

Local variables are allocated as an array (see the IT USES n LOCAL VARIABLES statement above), so they must be (zero-based) indexed. The syntax is

LOCAL_VARIABLE = 'NOT' (NUMBER|'ANYTHING').

So, NOT 3 is the fourth local variable, and NOT ANYTHING is the first local variable. Every function has one hardcoded local helper variable, called IGNORE. So, in

THE STATEMENT IGNORE IS NOT NOT 0 IS LABELED S2.

local variable 0 (NOT 0) is assigned to the helper variable IGNORE. It's pretty obvious. You can also use IGNORED instead of IGNORE; SOU treats both as equal.

Every SOU program has a global memory of 2817 integers. You can access the memory only indirectly in reference to the helper variable IGNORE. So, if you want to access the third memory location, you must first load IGNORE with the integer literal 2, and then write

NOTHING.

So, for example, the statement

THE STATEMENT NOTHING IS NOT NOT 0 IS LABELED S8.

will assign the first local variable to the memory cell indexed by the helper variable IGNORE.

Literals

There are two kinds of literals. Normal integer literals are just numbers: 0, 1, and so on.

Plus, you can use ANYTHING to get a random number. Note that

THE STATEMENT IGNORE IS NOT ANYTHING IS LABELED S8.

will load the helper value with a random number, while

THE STATEMENT IGNORE IS NOT NOT ANYTHING IS LABELED S8.

will load the helper value with the first local variable.
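To make the operand rules above concrete, here is a rough Python model of how the operands of an IS NOT assignment could be resolved. It is a sketch, not the compiler's actual behaviour. The class name Frame, the zero-initialized locals, and the range of the random numbers produced by ANYTHING are all assumptions; only the memory size of 2817 and the meaning of NOT n, NOT ANYTHING, IGNORE, IGNORED, NOTHING and ANYTHING are taken from the text.

```python
import random

MEMORY_SIZE = 2817  # size of the global memory, as stated in the text


class Frame:
    """Hypothetical model of one function's variables plus the shared memory."""

    def __init__(self, local_count, memory, rng=None):
        self.locals = [0] * local_count   # assumption: locals start out as 0
        self.ignore = 0                   # the helper variable IGNORE
        self.memory = memory              # shared global memory
        self.rng = rng or random.Random()

    @staticmethod
    def _local_index(operand):
        # 'NOT 3' is local 3; 'NOT ANYTHING' is local 0.
        index = operand[len("NOT "):]
        return 0 if index == "ANYTHING" else int(index)

    def read(self, operand):
        """Evaluate the right-hand side of 'X IS NOT <operand>'."""
        if operand in ("IGNORE", "IGNORED"):
            return self.ignore
        if operand == "NOTHING":
            return self.memory[self.ignore]
        if operand == "ANYTHING":
            return self.rng.randrange(32768)  # assumption: the real range is unknown
        if operand.startswith("NOT "):
            return self.locals[self._local_index(operand)]
        return int(operand)                   # plain integer literal

    def assign(self, target, source):
        """Model '<target> IS NOT <source>'."""
        value = self.read(source)
        if target in ("IGNORE", "IGNORED"):
            self.ignore = value
        elif target == "NOTHING":
            self.memory[self.ignore] = value
        elif target.startswith("NOT "):
            self.locals[self._local_index(target)] = value
        else:
            raise ValueError(f"cannot assign to {target!r}")


if __name__ == "__main__":
    memory = [0] * MEMORY_SIZE
    frame = Frame(4, memory)
    frame.assign("IGNORE", "2")     # IGNORE IS NOT 2
    frame.assign("NOTHING", "42")   # NOTHING IS NOT 42: writes the third memory cell
    print(memory[2])
```

Note how IGNORE IS NOT 2 followed by NOTHING IS NOT 42 writes the third memory cell, matching the description of indirect memory access above.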

Arithmetic functions

The arithmetic functions are of the do-stuff sort, and modify the left variable. The following, pretty self-explanatory operations are supported.

<, <=, >=, >, !=, ==, +, -, *, /, %

For example,

THE STATEMENT IGNORE IS + 1 IS LABELED S4.

will increment the local helper variable by 1, and

THE STATEMENT IGNORE IS < 100 IS LABELED S3.

will check if it is less than 100.
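Here is a hedged Python sketch of what the two-operand operations might compute. Two things are assumptions rather than facts from this document: that a comparison stores 1 (true) or 0 (false) in the left variable, which is what the sample program appears to rely on when it returns IGNORE right after IGNORE IS < 100, and that / and % truncate toward zero the way C integer arithmetic does. The names apply_two_op, c_div and c_mod are invented for this example.

```python
import operator


def c_div(a, b):
    """Integer division truncating toward zero (assumed C-like behaviour)."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def c_mod(a, b):
    """Remainder matching c_div, so that a == b * c_div(a, b) + c_mod(a, b)."""
    return a - b * c_div(a, b)


# Comparisons are assumed to yield 1 for true and 0 for false.
TWO_OP = {
    "<": lambda a, b: int(a < b),
    "<=": lambda a, b: int(a <= b),
    ">=": lambda a, b: int(a >= b),
    ">": lambda a, b: int(a > b),
    "!=": lambda a, b: int(a != b),
    "==": lambda a, b: int(a == b),
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": c_div,
    "%": c_mod,
}


def apply_two_op(left, op, right):
    """Return the new value of the left variable after 'left IS op right'."""
    return TWO_OP[op](left, right)


if __name__ == "__main__":
    ignore = 99
    ignore = apply_two_op(ignore, "+", 1)    # IGNORE IS + 1
    ignore = apply_two_op(ignore, "<", 100)  # IGNORE IS < 100
    print(ignore)
```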

Grouping Statements

You can group a series of statements by their name. The syntax is:

FUNCGROUP = 'STATING' NAME { ',' NAME }.

Note that the order of statements is reversed, so the first statement must be the last in the list, and so on. For example,

THE STATEMENT STATING C,B,A IS LABELED ACIDHOUSE.

represents the order-of-execution A(), B(), C().
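The reversed ordering is easy to get wrong, so here is a tiny Python sketch of it. Statements are modelled as plain Python callables stored under their labels; run_group is a hypothetical helper, and returning the value of the last executed statement is an assumption made for this illustration.

```python
def run_group(names, statements):
    """Execute the labels of a STATING clause in reverse order of appearance.

    names      -- labels as written in the clause, e.g. ["C", "B", "A"]
    statements -- mapping from label to a zero-argument callable
    Returns the result of the statement executed last (the first one listed).
    """
    result = None
    for name in reversed(names):
        result = statements[name]()
    return result


if __name__ == "__main__":
    trace = []
    statements = {
        "A": lambda: trace.append("A"),
        "B": lambda: trace.append("B"),
        "C": lambda: trace.append("C"),
    }
    run_group(["C", "B", "A"], statements)  # STATING C,B,A
    print(trace)
```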

Associating Statements with numbers

You can associate a function group with a number to go to if the last function returns a non-zero value (true). The syntax is

FUNCASSOC = 'GOING FROM' NAME 'TO' LITERAL.

For example,

THE STATEMENT GOING FROM BREAKS TO 7 IS LABELED SPEEDFREAK.

means: if the function group BREAKS fails (that is, returns a non-zero value), go to function group number 7 of the implemented interface.

Interface definition

You can define an interface that represents the order of function statements to be executed. This, in combination with the associating-statements-with-numbers-feature, is the only way to do looping in SOU. The syntax is:

FUNCCOME = 'COMING FROM' NAME { ',' NAME }.

For example,

THE STATEMENT COMING FROM SVENVAETH,SPEEDFREAK,GOLDIE,METALHEADZ,
    MRC,DIGWEED,JOHN IS LABELED IUnknown.

is the main program logic in the Quicksort test program above.
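How GOING FROM and COMING FROM combine into loops can be sketched in Python as follows. This is a reading of the sample program, not a description of the compiler's internals. It assumes that the COMING FROM list is reversed just like a STATING list (in the sample, JOHN comes last and runs first), that targets are counted from 0, that a non-zero group result triggers the jump, and that execution ends after the last entry. The function and parameter names are invented.

```python
def run_interface(coming_from, associations, groups):
    """Run the GOING FROM associations of an interface until execution falls off the end.

    coming_from  -- labels as written in the COMING FROM clause (assumed reversed)
    associations -- label -> (group name, target index), one per GOING FROM statement
    groups       -- group name -> zero-argument callable returning an integer
    """
    order = list(reversed(coming_from))
    position = 0
    while position < len(order):
        group_name, target = associations[order[position]]
        result = groups[group_name]()
        # Non-zero result: jump to the target; zero: continue with the next entry.
        position = target if result != 0 else position + 1


if __name__ == "__main__":
    state = {"i": 0}

    def init():
        state["i"] = 0
        return 0

    def loop():
        state["i"] += 1
        return int(state["i"] < 3)  # non-zero while the loop should repeat

    groups = {"INIT": init, "LOOP": loop, "DONE": lambda: 0}
    associations = {
        "JOHN": ("INIT", 0),     # GOING FROM INIT TO 0
        "DIGWEED": ("LOOP", 1),  # GOING FROM LOOP TO 1 (jumps back to itself)
        "MRC": ("DONE", 0),      # GOING FROM DONE TO 0
    }
    run_interface(["MRC", "DIGWEED", "JOHN"], associations, groups)
    print(state["i"])
```

Under these assumptions, a group that targets its own index repeats until it returns 0, which is the pattern DIGWEED, METALHEADZ and SPEEDFREAK follow in MAIN.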

Doing I/O

You can write stuff:

OUTPUT = OUTPUT_INTEGER | OUTPUT_STRING.
OUTPUT_INTEGER = 'SAYING' VARIABLE ['AS CHAR'].
OUTPUT_STRING = 'SAYING' '"' string '"'.

And use either hardcoded input, or read stuff:

INPUT = 'READING' VARIABLE.

AsyncIO is reserved for a future revision of this spec.

Syntax overview

This is it:

CODE = { EXPRESSION }.
EXPRESSION = FUNC_DEC | LV_DECL | STATEMENT.
FUNC_DEC = NAME 'IS A FUNCTION OF' (NUMBER|'NO')
    ('PARAMETER'|'PARAMETERS') 'THAT IMPLEMENTS' NAME.
LV_DECL = 'IT USES' NUMBER ('LOCAL VARIABLE'|'LOCAL VARIABLES').
STATEMENT = 'THE STATEMENT' STMT 'IS LABELED' NAME.
STMT = ASSIGNMENT | TWOOPFUNC | RETURN | FUNCCALL | FUNCGROUP |
    FUNCASSOC | FUNCCOME | OUTPUT | INPUT.
FUNCCOME = 'COMING FROM' NAME { ',' NAME }.
FUNCASSOC = 'GOING FROM' NAME 'TO' LITERAL.
FUNCGROUP = 'STATING' NAME { ',' NAME }.
FUNCCALL = 'CALLING' NAME {(VARIABLE|LITERAL)}.
RETURN = 'THAT RETURNS' (VARIABLE|LITERAL).
INPUT = 'READING' VARIABLE.
OUTPUT = OUTPUT_INTEGER | OUTPUT_STRING.
OUTPUT_INTEGER = 'SAYING' VARIABLE ['AS CHAR'].
OUTPUT_STRING = 'SAYING' '"' string '"'.
ASSIGNMENT = VARIABLE 'IS NOT' (LITERAL | VARIABLE).
VARIABLE = LOCAL_VARIABLE | HELPER_VALUE | RANDOM_NUMBER |
    MEMORY_OF_HELPER_VALUE.
LOCAL_VARIABLE = 'NOT' (NUMBER|'ANYTHING').
HELPER_VALUE = 'IGNORE'|'IGNORED'.
RANDOM_NUMBER = 'ANYTHING'.
MEMORY_OF_HELPER_VALUE = 'NOTHING'.
TWOOPFUNC = VARIABLE 'IS' TWOOPCODE (LITERAL | VARIABLE).
TWOOPCODE = '<' | '<=' | '>=' | '>' | '!=' | '==' | '+'
    | '-' | '*' | '/' | '%'.
NAME = (CHAR{DIGIT|CHAR})|'IT'.
LITERAL = <decimal integer>.
WHITESPACES = ' '|'\t'|'\r'|'\n'.
COMMENTS = <All Malayalam characters are treated as comments.>