
Tuesday, January 3, 2012

Scripting Plans

I've taken quite a few formal programming classes over the years, but none of them even mentioned the concept of scripting, much less emulating one language with another.  I find myself asking the question "Where do I start?"  Well, it seems reasonable to search the Internet and see if anyone else has tried what I'm about to do.  As it turns out, if anyone has, they aren't advertising it to the world at large.  No big surprise there, though.

Well, I came up with a plan, a roadmap if you will, for how I'm going to pull this all together to make it work.  It starts with a formal definition of exactly what language I'm implementing.  This isn't just a matter of picking a language that already exists; I have to define what commands are going to be available, what their syntax is, and exactly how they are to function.  I've never thought about it before, but programming languages are complex, and very difficult to describe in precise terms.

For this project, I've chosen to implement my own version of C.  I'm going to try to make it as close to ANSI C as possible, but there are going to be differences, for sure.  I've typed up a document that formally defines every data type, command, and structure, along with the syntax for each.  It helps that I'm comfortable with C, so I didn't have to do very much research for this task.  Here's a brief overview of my work so far.

Primitive Data Types:
void - indicates that a function returns no data.
int - 32-bit signed integer.
long - 64-bit signed integer.
float - 64-bit IEEE double-precision floating point number.
char - 2 byte Unicode character.
cstr - a string, similar to a C-style string, but 2 bytes per character (for Unicode support), and terminated with two 0 bytes (\0\0).
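
To make the sizes concrete, here's a rough sketch of how these types might map onto a host C compiler.  The script_ type names and the cstr_length function are made up for illustration and aren't part of my specification, and I'm assuming a host where double is a 64-bit IEEE value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side names for the script's primitive types. */
typedef int32_t  script_int;    /* int:   32-bit signed integer */
typedef int64_t  script_long;   /* long:  64-bit signed integer */
typedef double   script_float;  /* float: 64-bit IEEE double on most hosts */
typedef uint16_t script_char;   /* char:  one 2-byte character */

/* Counts the characters in a cstr, stopping at the 2-byte zero terminator. */
size_t cstr_length(const script_char *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != 0) {
        n++;
    }
    return n;
}
```

Because every character is a 16-bit unit, the \0\0 terminator is simply one character whose value is 0, so the scan works the same way strlen does for ordinary C strings.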

User-Defined Data Types:
enum - an enumeration of values.  Internally, this is the same as an int, but you can assign monikers to represent numbers.
struct - a collection of other data types that can be stored in a single variable, or used as an array.
Arrays - a collection of values of the same data type stored in a single variable, using subscripts to access individual elements of the array.  Can be made of any primitive or user-defined data type.

Declaring variables:
primitive data types: datatype moniker [ = initial value];
enums: enum moniker { [member1 [ = value] [, member2 [ = value] ...]] } [moniker];
structs: struct moniker { [type1 moniker1; [type2 moniker2;]] } [moniker];
arrays: datatype moniker[arraybound1][arraybound2]...;

Declaring functions:
prototype function: datatype moniker([datatype [,datatype...]]);
define function: datatype moniker ([datatype param1 [, datatype param2...]]) { statements; };
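
Here's a small sketch of a prototype followed by its definition, written as plain standard C (so without the trailing semicolon after the closing brace).  The average function is just a made-up example, and the assert is only there to guard against dividing by zero.

```c
#include <assert.h>

/* Prototype: only the return type and the parameter types are listed. */
int average(int, int);

/* Definition: the parameters get names and the body goes in braces. */
int average(int total, int count)
{
    assert(count != 0);     /* dividing by zero is undefined */
    return total / count;   /* integer division, so any fraction is dropped */
}
```

A call such as average(10, 4) returns 2, because both operands are ints.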

Now for some command statements.  These fall into a few categories.

Looping:
for ([init_expression]; [condition_expression]; [loop_expression]) {statements;}
do {statements;} while (condition);
while (condition) {statements;}

Conditional Execution:
if (condition) {statements;} [else {statements;}]
switch (expression) { case const_value1 : [statements;] [break;] [ case const_value2 : [statements;] [break;] ] ... [default : [statements;] [break;]] }


For those of you who are not as familiar with C-style syntax, here are a few examples to help clarify what all these definitions mean.

Variable declaration and assignment.
int a = 4;    /* creates int var a, assigns value 4 */
int b;        /* creates int var b, default value is 0 */
b = a;        /* assigns var b the value in var a (4) */
a = 5 + b;    /* assigns var a the value of 5 plus the value of var b (4), so 9 altogether */

Array definition and handling.
int list[2];   /* creates an array variable list, with two elements */
list[0] = 2;  /* assigns the first element, 0, the value 2 */
list[1] = 5;  /* assigns the second element, 1, the value 5 */

Enum definition and handling.
enum DaysofWeek { Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday };
DaysofWeek Today = DaysofWeek.Tuesday;
DaysofWeek Tomorrow = Today + 1;   /* Today is Tuesday (2), so Today + 1 is 3, which makes Tomorrow equal to Wednesday */

Struct definition and handling.
struct coordinates {
    int X;
    int Y;
};   /* creates a struct called coordinates, that contains two data fields, an int called X, and an int called Y */
coordinates Location;   /* defines a variable called Location of struct type coordinates.  You C guys will notice the lack of the struct keyword in this definition.  I'm leaving it out of my specification because it's not necessary. */
Location.X = 3;   /* assigns the X component of the Location coordinates structure the value 3 */
Location.Y = 2;   /* assigns the Y component of the Location coordinates structure the value 2 */
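
Since structs can also be used as arrays, here's a sketch of that in standard C.  Standard C needs a typedef to get the same "no struct keyword" effect my version has built in; the sum_x function and its parameter names are just made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* The typedef lets standard C write "coordinates" without the struct keyword. */
typedef struct coordinates {
    int X;
    int Y;
} coordinates;

/* Adds up the X field of every element in an array of coordinates. */
int sum_x(const coordinates *points, int count)
{
    int total = 0;
    int i;
    assert(points != NULL);
    for (i = 0; i < count; i++) {
        total += points[i].X;   /* subscript first, then pick the field */
    }
    return total;
}
```

Declaring coordinates path[2]; and then setting path[0].X and path[1].X works just like the single-variable case above, only with a subscript in front of the dot.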


Looping example:
For loops are generally used when you need to run a block of code a fixed number of times.  Before the loop starts, the init_expression is evaluated.  This is generally used to create a local variable for a loop counter.  Then the condition_expression is evaluated to decide whether the statements get executed.  If the condition_expression evaluates to non-zero, the statements are executed; otherwise the block of statements is skipped and the for loop ends.  After the statements execute, the loop_expression is evaluated, and the process starts over at testing the condition_expression.  All three expressions are optional; if condition_expression is omitted it is assumed to be non-zero, and an infinite loop is created.  A for loop can be exited early with the break; statement, or sent straight to the next iteration (just as if it had reached the end of the statements) with a continue; statement.
int total = 0;
for (int count = 1; count < 5; count ++) {
    total += count;
}     /* total will be equal to 10 when this finishes */
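
The example above doesn't use break; or continue;, so here's a sketch that does.  The sum_odds function and its limit and cap parameters are made up for illustration.

```c
#include <assert.h>

/* Sums the odd numbers from 1 up to limit, stopping once the total passes cap. */
int sum_odds(int limit, int cap)
{
    int total = 0;
    int count;
    assert(limit >= 0);
    for (count = 1; count <= limit; count++) {
        if (count % 2 == 0) {
            continue;   /* skip even numbers; jumps straight to count++ */
        }
        total += count;
        if (total > cap) {
            break;      /* leave the loop early */
        }
    }
    return total;
}
```

With a limit of 10 and a cap of 100 the loop runs to completion and returns 25 (1 + 3 + 5 + 7 + 9).  With a cap of 8 it breaks out as soon as the total reaches 9.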

Do loops are generally used when you don't know how many times a block of code needs to run, but you know it needs to run at least once.  The loop condition is tested at the end of each iteration.  Do loops can be exited with a break; command.
int a = 1;
int b = 32;
do {
    a += 1;
} while (a < b);

While loops are generally used when you don't know how many times a block of code needs to run, and in fact, it might not run a single time.  The loop condition is tested before entering each iteration.  While loops can be exited with a break; command.
int a = 3;
while (a < 100) {
    a += 1;
}

Conditional execution example.
If statements test a condition and execute the first block of statements if the condition is non-zero.  If an else block is present, its statements execute when the condition is zero.
int a = 3;
int b = 2;
if (a > b) {
    /* a is bigger than b */
} else {
    /* a is smaller than or equal to b */
}

Switch statements are used when testing an expression against multiple possible values.  Switch is preferable to a series of if statements, but can only be used to test against constant values.  Switch evaluates the expression, then begins searching each case clause.  When a case matches, execution begins following the : and continues until a break; statement is reached.  break; sends execution to the end of the statement block.  An optional default clause starts execution after its : if none of the case values match the expression.
int a = 3;
switch (a) {
    case 1 :
        /* a = 1 */
        break;
    case 2 :
        /* a = 2 */
        break;
    default :
        /* a did not equal 1 or 2 */
        break;
}
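
Since break; is optional after each case, execution can run from one case straight into the next.  Here's a sketch of that, assuming my version keeps ANSI C's fall-through behavior; the is_weekend function is made up, and it uses plain ints (0 = Sunday through 6 = Saturday) to match the DaysofWeek numbering.

```c
#include <assert.h>

/* Returns 1 for weekend day numbers (0 = Sunday, 6 = Saturday), otherwise 0. */
int is_weekend(int day)
{
    int result;
    assert(day >= 0 && day <= 6);
    switch (day) {
        case 0 :            /* no break here, so this falls through to case 6 */
        case 6 :
            result = 1;
            break;
        default :
            result = 0;
            break;
    }
    return result;
}
```

Leaving the break; out on purpose like this lets two case values share one block of statements.  Leaving it out by accident is a classic C bug, which is why every case in the example above ends with one.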

Well, that's all the examples I have for now.  If anyone has any thoughts or suggestions about this, please feel free to post them in the comments below.

Next up, how to read this programmatically.
