NEWCMD commandName {argumentVector} [#["optional arg name list"]] {@"categoryName"} [?"descriptionString"]

NEWCMD declares and defines a subroutine that can be invoked later in the user's Statistics101/Resampling Stats program. A subroutine is a set of commands that is given a name and is treated as a unit. The subroutine consists of all the commands between the NEWCMD command and its corresponding END command. A subroutine is essentially a user-defined command. Using subroutines can make a program shorter and easier to understand.

Subroutines have three aspects: the "declaration", the "definition", and the "invocation". Declaring a subroutine means specifying its name and argument profile. Defining a subroutine means associating its name and profile with the actual commands that comprise the subroutine. Invoking a subroutine means using it as a command and giving it the actual arguments that it is supposed to operate on. The NEWCMD command performs both the declaration and the definition functions. The DECLARE command performs only the declaration function.

Subroutine definition

The first argument of NEWCMD assigns the "commandName" to the subroutine. The "commandName" can be any legal Resampling Stats name except that of an existing command. No two subroutines in the same program may have the same name. As usual, its typographical case doesn't matter.

Immediately following the commandName are zero or more "dummy" arguments that can pass vectors into or out of the subroutine. They are called "dummy" arguments because they are just placeholders for the real arguments that will be used when the subroutine is later invoked. Unless the '#' option described under "Optional arguments" is used, the number of dummy arguments in the NEWCMD command's definition must be matched by the number of real arguments in its invocation.

For example, we can define a new command that computes the average of the values in a vector like this (the MEAN command already does this; the subroutine is only a simple illustration):

NEWCMD AVERAGE inputVector result
   SUM inputVector inputSum
   SIZE inputVector inputSize
   DIVIDE inputSum inputSize result
END

In this example, AVERAGE becomes the name of the new command or subroutine. The argumentVector names, inputVector and result, are placeholders, or "dummy variables", that are used within the subroutine.

Subroutine invocation

When the subroutine is invoked at run time, the actual arguments will be substituted for the dummy arguments. The first actual argument will be substituted for the first dummy argument, the second for the second, and so forth. The dummy's name may be the same as that of a vector outside the subroutine, but it does not refer to that outside vector. It is only used in the subroutine to refer to whatever program variable is in its position. Here is an example of an invocation in a portion of a program:

COPY 1,100 myVector
AVERAGE myVector average  '<==  invocation
PRINT average

Here, myVector will be used wherever inputVector is referred to inside the subroutine and average will be used wherever result is referred to inside the subroutine. Statistics101 knows that the variable average is distinct from the subroutine AVERAGE because of their positions. That is, AVERAGE is a command name because it is first on a line and average is a variable name because it is not first on the line. Their typographical case has nothing to do with the distinction. The name average was used for the argument's name, but any valid name could be used, such as result or avg or anything else.

Any vector name that is referred to inside a subroutine but is not named in the argument list is considered to be "local" to the subroutine. "Local" means that it is not available outside the subroutine to other parts of the program. Thus inputSum and inputSize are locals. They may be given values and referred to only inside the subroutine.

Similarly, any vector defined outside the subroutine is not visible within the subroutine unless that vector is listed in a GLOBAL command.
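
As a minimal sketch of the GLOBAL rule, the following follows the NAME and GLOBAL pattern used in the tennis example elsewhere on this page. The names percentBase, TOFRACTION, myPercents and fractions are hypothetical and chosen only for illustration.

```
'Sketch only: all names here are made up for illustration.
NAME 100 percentBase
GLOBAL percentBase          'makes percentBase visible inside subroutines

NEWCMD TOFRACTION percents result
   DIVIDE percents percentBase result   'percentBase is found because it is global
END

COPY 25 50 75 myPercents
TOFRACTION myPercents fractions
PRINT fractions
```

Without the GLOBAL line, percentBase would presumably not be visible inside TOFRACTION, and it would have to be passed in as an argument instead.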

Optional arguments

Following the dummy arguments on the command line is an optional code, the number-sign ('#'). If this is present it indicates that the subroutine's invocation is allowed to have an unspecified number of additional, or "optional" arguments. If the '#' is not present, then the invocation must have exactly as many arguments as does the NEWCMD command that defines the subroutine. The '#' mark may be followed by a double-quote-enclosed string which is a comment to describe the optional argument list. The comment will appear in the Syntax Help Bar near the top of Statistics101's main window. If you omit the comment, the default comment "{variable}" will appear in the Syntax Help Bar.

Here is an example of a subroutine definition that takes all of its arguments through the optional argument list. The comment in quotes after the '#' indicates that the subroutine expects one or more input vector arguments followed by a result.

NEWCMD MINIMUMS #"inVector {inVector} result"
   . . .
END

The above NEWCMD defines a subroutine named MINIMUMS that (syntactically) will accept zero or more arguments. The quoted comment accompanying the '#' indicates that it must (semantically, that is, at run time) have at least two arguments, inVector and result, but may have any number of input vector arguments between the first inVector and result.
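
As a sketch of what invocations might look like, assume that the body of MINIMUMS has been filled in as its comment describes. The vector names a, b, c and result are hypothetical. The same subroutine can then be invoked with different numbers of input vectors:

```
'Sketch only: assumes MINIMUMS is defined as above with its body filled in.
COPY 3 9 4 a
COPY 5 2 8 b
COPY 7 6 1 c

MINIMUMS a b result      'two input vectors, then the result
MINIMUMS a b c result    'three input vectors, then the result
```

Because the '#' makes the argument count open-ended, both invocations pass the syntax check. It is up to the subroutine itself to verify at run time that it received at least two arguments.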

Inside a subroutine definition the dummy variables are referred to by their names. The optional variables (those allowed by the '#' mark) are instead referred to by number, using two other commands: ARGCOUNT and GETARG. ARGCOUNT tells you how many optional arguments are in the invocation. GETARG returns one of the optional arguments, chosen by number. See the SORTCOORD example below and the discussion and examples at ARGCOUNT and GETARG.

Category assignment

A subroutine can be placed in one or more categories using the "@" sign followed by a category name. If the category name, such as "financial", is a single word, it doesn't need to be enclosed in double quotes. If it is multiple words, such as "vector operations", then it must be enclosed in double quotes. The reason for assigning categories to a subroutine is so that the subroutine can be listed along with others in the same category in the "Commands and Subroutines (by category)" menu list of the Edit menu and the Edit Window popup menu. Statistics101 builds the menu lists based on the subroutines it finds in the /lib directory. If you are not going to save your subroutine in /lib, then there's no need to use this feature.

If you assign a category that is not currently in the menus' lists, that category will be added to the menus.

If you don't assign a category, the subroutine will be assigned the default category, "other". If you assign more than one category, the subroutine's name will appear in each category's list in the menus.

If you want a subroutine's name to not appear in the menus, you can use the tag "@HIDE". If that tag is used, then all other tags on the same NEWCMD line are ignored.
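
For illustration, a helper subroutine that should stay out of the menus might be tagged like this. The name HELPERSTEP is hypothetical, and the sketch assumes the tag is written exactly as shown above.

```
'Sketch only: HELPERSTEP is a made-up name.
NEWCMD HELPERSTEP var @HIDE
   ADD 1 var var
END
```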

Here is the above example of a subroutine definition extended to assign the subroutine to two categories.

NEWCMD MINIMUMS #"inVector {inVector} result" @math @"vector operations"
   . . .
END

Help bar description

NEWCMD also accepts an optional one-line description of the subroutine. That description is introduced by a question mark, "?" (as a symbol for "help"). The description must be enclosed within quotation marks. The quoted description may be any length, but it must be all on one line. After the first syntax check, the description will appear in the Description Help Bar when the edit cursor is on a line that invokes the subroutine.

If the description is omitted, the default description "Subroutine invocation." will appear in the Description Help Bar.

Here is the above example of a subroutine definition extended to add the help bar description.

NEWCMD MINIMUMS #"inVector {inVector} result" @math @"vector operations" \
?"Copies minimum element of all input vectors into same position in result"
   . . .
END

Important points

Here are some important points to remember about "new commands" or "subroutines":

  • A new command or subroutine may be declared anywhere in the program, except inside another subroutine's definition, as long as its declaration appears in the text before it is used. (Remember that both the NEWCMD and the DECLARE command perform the declaration function.) It is good practice to put all the subroutines near the beginning of a program, after any NAMEd constant definitions but ahead of the "main program".

  • The subroutine's name and all its arguments will be used as help text in the Statistics101 main window's syntax help bar. This help text will become effective after the first syntax check is performed on your program that includes the subroutine. If you are in the process of writing your program and the help text doesn't appear when the cursor is on a line containing a subroutine invocation, then just run a syntax check by selecting menu Run>Check Program Syntax. Even if there are errors, the subroutine's syntax help text will be established.

  • A subroutine's arguments must be vectors. String literals and string variables (created by the STRING command) are not allowed. The descriptors following "#", "@" and "?" must be literal strings enclosed in quotes, except that a single-word category name after "@" may omit the quotes.

  • A subroutine may invoke other subroutines but only if the invoked subroutines are declared earlier in the program text. A subroutine may not invoke itself (recursion) and two subroutines may not each invoke the other.

  • If the '#' option is not used, a subroutine must be invoked with exactly the same number of actual arguments as there are dummy arguments in its declaration. If the declaration has three dummy arguments, then every invocation of that subroutine must have three actual arguments. Use the '#' option to allow a subroutine to have a variable or unspecified number of arguments. The '#' must be placed after the last of any required dummy arguments.

  • If more than one of the three optional codes ('#', '@', and '?') is present on the NEWCMD command, the codes and their arguments must appear in the order just listed.

  • All arguments to a subroutine are passed "by reference". This means that if the dummy vector representing an actual vector argument is changed within the subroutine, then that change will occur in the actual argument also.

  • If a subroutine dummy argument has the same name as a global constant or variable, the subroutine argument has precedence within the subroutine. So any use of that name inside the subroutine will refer to the argument, not the global. In other words, a dummy name will hide a global name within a subroutine if the names are the same.

  • By default, a subroutine cannot directly access constants or variables outside itself. If it must access some vector from the main program or from a calling subroutine, then that vector must be passed to the subroutine in its argument list, or that vector must be declared to be global with the GLOBAL command.

  • All local variables are cleared when a subroutine completes execution. A subroutine cannot hold a value between invocations.

These last two rules ensure that the general behavior of subroutines is the same as that of all the other Resampling Stats commands. That is,

  • A command has no state (no memory). This means that executing the same command on the same variables always has the same effect.

  • A command has no side effects. This means that executing a command affects only its arguments; it cannot affect any variable not in its argument list. You can purposely violate this "no side effects" condition by declaring one or more variables to be global and designing your subroutine to affect one or more of those global variables. That is called a "side effect" and makes program correctness more difficult to achieve. But sometimes it might be necessary or desirable if it simplifies the program. Declaring NAMEd constants to be global does not cause side effects because their values cannot be changed.
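
The points about pass-by-reference and local variables can be sketched together as follows. The names SHIFTUP, offset and scores are hypothetical; the sketch uses only commands that appear elsewhere on this page.

```
'Sketch only: all names here are made up for illustration.
NEWCMD SHIFTUP var
   COPY 10 offset         'offset is local to SHIFTUP
   ADD offset var var     'var is passed by reference, so the caller's vector changes
END

COPY 1 offset             'a main-program vector that happens to share the local's name
COPY 1,5 scores
SHIFTUP scores
PRINT scores offset
```

Under the rules above, the invocation should change scores in the main program, while the main program's offset should keep its own value because the offset inside SHIFTUP is a separate local that is cleared when the subroutine ends.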

See also: ARGCOUNT, DECLARE and GETARG

Examples

The following is a subroutine that increments a variable by one. It also shows the use of the optional category feature (using the "@") and the optional descriptive comment (introduced by the question mark).

'Subroutine to increment a variable.
NEWCMD INCR var @math ?"Increments given variable" 
   ADD 1 var var 
END

'Main program snippet using above subroutine.
COPY 0 successCount
COPY 1000 numTrials
NAME false true
REPEAT numTrials
'. . .(code to give "someVariable" a value)
   IF someVariable = true
      INCR successCount  'subroutine invocation
   END
'. . .
END

Here's a simple example of a subroutine that computes the log base 16 of a vector. Compare this to the similar example at GLOBAL.

'Subroutine to compute log base 16
NEWCMD LOG16 input result ?"Computes log base 16 of input vector"
   LOG input logInput ' log base e of input
   LOG 16 loge16      ' log base e of 16
   DIVIDE logInput loge16 result
END

'Main program using the subroutine
COPY 1,15 A
LOG16 A log16A
PRINT A log16A

Here's a more elaborate example:

/*
From "Fifty Challenging Problems in Probability" by Frederick Mosteller
"To encourage Elmer's promising tennis career, his father offers
him a prize if he wins (at least) two tennis sets in a row in a 
three-set series to be played with his father and the club champion
alternately: father-champion-father or champion-father-champion,
according to Elmer's choice. The champion is a better player than
Elmer's father. Which series should Elmer choose?"
*/

'(Call the champion "pro".) 
'Assign arbitrary probabilities to the pro and the father vs. the son, 
'with the restriction that the pro's probability of winning is 
'higher than the father's. Create two universes based on those
'probabilities, one for the pro vs. the son and one for the 
'father vs. the son.

NAME sonLoses sonWins
GLOBAL sonLoses sonWins

COPY 7#sonLoses 3#sonWins proGames     'p = 0.7 (pro has higher prob than father)
COPY 6#sonLoses 4#sonWins fatherGames  'p = 0.6 

'Subroutine to determine and score the results of 
'a three-set tournament.
NEWCMD TOURNAMENT firstGame secondGame thirdGame successCount 
   IF firstGame = sonWins
      IF secondGame = sonWins
         ADD 1 successCount successCount
      END
   ELSEIF secondGame = sonWins
      IF thirdGame = sonWins
         ADD 1 successCount successCount
      END
   END
END

'Main program
NAME 10000 rptCount
COPY 0 successCount
'Compute probability of set with father in middle
REPEAT rptCount
   SAMPLE 1 proGames firstGame
   SAMPLE 1 fatherGames secondGame
   SAMPLE 1 proGames thirdGame
   TOURNAMENT firstGame secondGame thirdGame successCount
END
DIVIDE successCount rptCount probFatherMiddle
PRINT probFatherMiddle

'Compute probability of set with pro in middle
COPY 0 successCount
REPEAT rptCount
   SAMPLE 1 fatherGames firstGame
   SAMPLE 1 proGames secondGame
   SAMPLE 1 fatherGames thirdGame
   TOURNAMENT firstGame secondGame thirdGame successCount
END
DIVIDE successCount rptCount probProMiddle
PRINT probProMiddle

If your subroutine is subject to possible error conditions, you can use the DEBUG command with its optional message string to report the error. For example:

'Computes the factorial of the first element 
'of the input vector and returns it in the
'result vector. number must be between zero
'and 170 (inclusive).
NEWCMD FACTORIAL1 number result @math ?"Computes factorial of number."
   IF number = 0
      COPY 1 result
   ELSEIF number > 0
      PRODUCT number,1 result
   ELSE
      DEBUG "Error while computing factorial: Number must not be negative."
   END
END

If this subroutine is invoked with a negative number, its DEBUG command will print the error message to the Output Window, then the program will enter debug mode.

Here is an example using the '#' feature of optional arguments. Note the use of ARGCOUNT and GETARG:

'Subroutine to do an ascending coordinated sort, in place, of two or more vectors
'based on the first vector (keyVariable) as the key. All vectors must be the same length.
NEWCMD SORTCOORD keyVariable #"variable {variable}"  ?"Coordinated sort, in place, of two or more vectors"
   ARGCOUNT numberOfArgs
   IF numberOfArgs > 0
      TAGSORT keyVariable tags
      TAKE keyVariable tags keyVariable
      FOREACH argNum 1,numberOfArgs
         GETARG argNum arg
         TAKE arg tags arg
      END
   ELSE
      DEBUG "ERROR: Incorrect number of arguments in SORTCOORD."
   END
END

Here is an example invoking the above SORTCOORD subroutine with three arguments:

NAME male female
DATA (62   68  73 58  66) height
DATA (120 165 198 99 115) weight
DATA (female male male female female) sex

SORTCOORD height weight sex
PRINT height weight sex

Result:

height: (58.0 62.0 66.0 68.0 73.0)
weight: (99.0 120.0 115.0 165.0 198.0)
sex: (female female female male male)