A whole lot of extensions I want to see in C++
By Derek Ross
I began this list a long time ago, on a computer far, far away. I decided that it would be helpful if I kept a "bug-log", or a list of programming errors that caused problems for me and for other people working around me. I thought that if I kept a record of bugs, I would be less likely to repeat the bug as I wrote more new source code. Unfortunately, that was not so! It seems that minor memory lapses during programming still allow the same bugs to creep in, even if I had duly noted them in my logs.
Apparently, the human brain (or my brain, at least) wasn't designed to check source code with the same rigidity as that of a compiler. Thus, it seems inevitable that a certain number of programming errors will show up in source code.
The search for a solution to this problem led to the development of this list.
This is a list of language features that would eliminate or reduce common programming errors, by preventing the program from compiling if such an error is encountered.
I'm not sure if you can classify a bunch of extensions as a new language. I think you can; just look at C++.
This list of features is divided into eight sections. The heading of each section describes the rationale for the features in that section.
Each feature is explained and discussed in up to 5 labeled subsections, such as DESCRIPTION, REASON, SYNTAX, PROBLEMS and ALTERNATIVES.
WARNING: The list is pretty inconsistent in format, explanation, detail, etc. Basically, the better stuff is near the top of the list.
"for" loops may operate on only one variable
DESCRIPTION: Prevent for-loops from containing more than one variable. IE
for(j=1;j<10;i++)
would become illegal.
REASON: A common programming error is to start writing a for loop using
an uncommon variable name, such as "k", and then finish the for loop using
the common variable name 'i'. This happens because programmers
are so accustomed to writing for-loops that use 'i' that typing reflex kicks
in and results in an incorrect for-loop.
PROBLEMS: Lots of programmers like to embed multiple operations in for loops,
IE for(i=0;i<10;i++,j++), which is perfectly valid code.
ALTERNATIVE: It is possible to define a macro that names the loop variable
only once and creates a for-loop with only one variable:
#define FOR(var, startval, endval) for(var=startval;var<=endval;var++)
thus, the code
FOR(i,1,10){...
will expand to
for(i=1;i<=10;i++){...
However, this is somewhat inflexible and also ugly.
Automatic prevention of dereferencing NULL pointers
Always initialize pointers to NULL
Deleted pointers are always assigned a NULL
DESCRIPTION: Prevent member function calls and assignments to the contents
of pointers if they are NULL. Automatically set pointers to NULL at the beginning
of time. If a pointer is deleted, it is set to NULL.
REASON: I find that frequently when I am about to dereference a pointer,
either to access a variable or call a member function, I call the "assert"
macro just before the operation to ensure that the pointer is not NULL when
I use it. Since this is a lot of needless typing that could be automated,
it should be a built-in feature of the language. Also, I try to set
all my pointers to NULL at the beginning of time. That way, if I try to use
an uninitialized pointer, an "assert" should fail somewhere. Finally, if
I delete a pointer, I try to set it to NULL, so that if I later try to
dereference the pointer, an "assert" should fail. All this could easily be
automated.
PROBLEMS: I cannot think of any good reasons why someone would actually want
to do things to a NULL pointer. The only problem is a speed hit from the
extra checking and assignments. Maybe a new keyword, "safe" could be used
for pointers that are automatically checked before use IE
safe char* the_ptr;
After the program is debugged, the "safe" keyword could be removed to speed
things up.
ALTERNATIVE: A class may be created that holds a pointer (a
"pointer-holder") and overloads the "->" operator so that checking
on the contained pointer may be performed. The problem with this is that
calling new and delete for this object would require a different syntax,
because the "pointer-holder" class would itself be a direct object, not a
pointer to an object.
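A minimal sketch of such a pointer-holder, assuming a C++11 compiler. The names CheckedPtr, destroy, is_null and Point are hypothetical, and throwing an exception (rather than calling "assert") is just one possible way to report the failure.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical "pointer-holder": owns one heap object and checks for NULL
// on every dereference. The name CheckedPtr is an assumption of this sketch.
template <typename T>
class CheckedPtr {
public:
    CheckedPtr() : ptr_(nullptr) {}            // always starts out as NULL
    explicit CheckedPtr(T* p) : ptr_(p) {}
    ~CheckedPtr() { delete ptr_; }

    // Copying is disabled so two holders never delete the same object.
    CheckedPtr(const CheckedPtr&) = delete;
    CheckedPtr& operator=(const CheckedPtr&) = delete;

    // Deleting the object also resets the holder to NULL.
    void destroy() {
        delete ptr_;
        ptr_ = nullptr;
    }

    bool is_null() const { return ptr_ == nullptr; }

    // Checked access: throws instead of dereferencing NULL.
    T* operator->() const {
        if (ptr_ == nullptr) throw std::logic_error("NULL pointer dereferenced");
        return ptr_;
    }
    T& operator*() const {
        if (ptr_ == nullptr) throw std::logic_error("NULL pointer dereferenced");
        return *ptr_;
    }

private:
    T* ptr_;
};

// Small type used only to exercise the holder.
struct Point {
    int x;
    int y;
};

// Returns true if dereferencing the given holder throws.
bool dereference_throws(const CheckedPtr<Point>& p) {
    try {
        (void)p->x;
        return false;
    } catch (const std::logic_error&) {
        return true;
    }
}
```

As the text notes, the holder is a direct object: it is declared as CheckedPtr<Point> p(new Point{3, 4}); and released with p.destroy(); rather than with the usual new/delete syntax.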
Prevent an assignment within a conditional.
REASON: To eliminate the common typo of typing an assignment "=" when one
actually intends to type the equality operator "==". IE
typing if(a=b)
instead of if(a==b)
PROBLEM: Assignments within a conditional are extremely common. Any change
will break a lot of code.
ALTERNATIVES: define a macro "EQ" that expands to "==" and use that instead.
Or, read the compiler warnings.
Illegal to take the address of a function unless '&' is used
DESCRIPTION: Currently, the address of a function may be taken by typing
the name of the function minus the parentheses and arg list. This feature
would alter the syntax so that an ampersand is required to get the address.
I.E.
void my_function();
pfunc= my_function; // old style, would be illegal
pfunc=&my_function; // new style, requires '&'
REASON: A common bug among beginners is to use the address of a function
instead of the return value. I.E.
while( kbhit )// intended to be "while(kbhit())"
// the while loop will never exit, because "kbhit" is a constant non-zero value.
PROBLEMS: Could break too much code.
ALTERNATIVES: Read compiler warnings; use lint.
Illegal to derive a class if any member variables have duplicate names in the base class.
Illegal to derive a class if any member functions have duplicate names in the base class, unless the base class functions are "virtual".
REASON: The behavior of duplicate names in base and derived classes is erratic
when direct objects, pointers to the base classes and pointers to derived
classes are mixed. Duplicate names are usually unintentional, and happen
when adding to code that's old or unfamiliar.
PROBLEMS: Not sure. May break some code.
ALTERNATIVES: Read the compiler warnings; use lint.
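The behavior described above can be seen in a small sketch. The classes Base and Derived and their members are hypothetical; only the "virtual" function is resolved by the real type of the object, while the duplicated non-virtual function and the duplicated variable depend on the type of the pointer used.

```cpp
#include <cassert>
#include <string>

struct Base {
    int count;                                        // hidden by Derived::count
    Base() : count(1) {}
    std::string name() const { return "Base"; }       // non-virtual
    virtual std::string vname() const { return "Base"; }
    virtual ~Base() {}
};

struct Derived : Base {
    int count;                                        // duplicate member variable
    Derived() : count(2) {}
    std::string name() const { return "Derived"; }    // duplicate name, hides Base::name
    std::string vname() const override { return "Derived"; }
};
```

With a single Derived object d, the call ((Base*)&d)->name() yields "Base" while ((Derived*)&d)->name() yields "Derived"; the virtual vname() yields "Derived" through either pointer.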
Compile time checking to prevent recursive functions
REASON: Recursive functions (functions that call themselves) are rarely
used. In C, it's fairly easy to determine if a function is recursive or
not. I've had problems with unintentionally recursive functions in C++, always
with large class hierarchies and virtual functions. If you have a virtual
function calling another virtual function, the number of possible call
combinations grows as O(N^2), where N is the number of virtual functions, so
it's hard to check by hand.
SYNTAX: If a function is meant to be recursive, it requires the keyword
"recursive", I.E.
recursive void my_func(); // if "recursive" is not specified for recursive functions, it will be a compile-time error.
PROBLEMS: Will break some code. Also, compile time checking for possible
recursion could be time consuming.
ALTERNATIVES: Use lint.
Default preconditions in arg lists.
REASON: I type a lot of code like:
void my_func(int i, char* ptr){
assert(0<i && i<123); // "0<i<123" would parse as "(0<i)<123", which is always true
assert(ptr!=NULL);
I think it would be "nicer" if I could embed those preconditions in the arglist,
I.E.
void my_func( int 0<i<123, char* ptr!=NULL ){
etc...
PROBLEMS: None that I can see. Maybe readability would be compromised.
ALTERNATIVES: use asserts, as shown above.
Default preconditions in variable declarations.
REASON: Sometimes a variable is desired which should NEVER exceed certain
limits. Instead of checking the variable every time it is assigned to, it
would be simpler to specify the limits for the variable when it is created,
and automatically check that it is within those limits during runtime.
SYNTAX: If an integer named my_value must always be between 10 and 20, the
declaration would be:
int 10 <= my_value <= 20;
PROBLEMS: Reduced readability. No conflicts with existing code otherwise.
The extra checking during runtime could take more time.
ALTERNATIVE: Create a "range" template class that fails if it is assigned
a value beyond its preset limits.
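A minimal sketch of such a "range" template class, assuming a C++11 compiler. The names RangedInt and rejected are hypothetical, and the failure is reported here by throwing std::out_of_range; an "assert" would serve equally well.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical range-checked integer; the name RangedInt is an assumption.
template <int Lo, int Hi>
class RangedInt {
    static_assert(Lo <= Hi, "lower limit must not exceed upper limit");
public:
    RangedInt(int v) : value_(check(v)) {}    // checked on creation
    RangedInt& operator=(int v) {             // checked on every assignment
        value_ = check(v);
        return *this;
    }
    operator int() const { return value_; }   // reads behave like a plain int
private:
    static int check(int v) {
        if (v < Lo || v > Hi) throw std::out_of_range("value outside preset limits");
        return v;
    }
    int value_;
};

// Returns true if assigning v to a variable limited to 10..20 is rejected.
bool rejected(int v) {
    RangedInt<10, 20> my_value(10);
    try {
        my_value = v;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

The declaration RangedInt<10, 20> my_value(15); then plays the role of the proposed syntax, at the cost of the run-time check mentioned under PROBLEMS.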
Allow member initialization in the class definition
REASON: To simplify initialization of variables, and make visible the initial
values simply by viewing the header file. Also, it seems more intuitive when
programming. Somewhat similar to default arguments. It would prevent any
forgotten initializations if the class has more than one constructor.
SYNTAX: Consider a class with members "int A", "float B" and "char* C", in
which the initial values are A=123, B=3.141 and C=NULL.
class MyClass{
int A=123;
float B=3.141;
char* C=NULL;
};
PROBLEMS: ???
ALTERNATIVE: None
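C++11 and later accept essentially this syntax, under the name "default member initializers". The sketch below assumes such a compiler; the "public:" label and the second constructor are hypothetical additions, included to show that a member not mentioned by a constructor keeps the value written in the class definition.

```cpp
#include <cassert>
#include <cstddef>

class MyClass {
public:
    MyClass() {}                       // uses all three defaults
    explicit MyClass(int a) : A(a) {}  // overrides A; B and C keep their defaults

    int A = 123;
    float B = 3.141f;
    char* C = NULL;
};
```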
Supplement to the "?:" operator... the "(?# :)" operator
The "?:" operator allows an inline "if" statement to be written. IE
condition ? op_if_true : op_if_false
The statement will return either "op_if_true" or "op_if_false" depending
on the logical value of "condition".
The new operator, "(?# :)" allows a limited inline switch statement. It would
save a lot of typing, and needless repetition of the "break" keyword.
SYNTAX (for 5 cases):
( integer_value ?# op_if_0 : op_if_1 : op_if_2 : op_if_3 : op_if_4_or_other )
EXAMPLE:
Conversion of an integer "num" between 0 and 3 to an ascii string, and streamed
to "cout"
int num;
cout << ( num ?# "zero" : "one" : "two" : "three" : "other" );
PROBLEM: Would interfere with the preprocessor's handling of the "#"
symbol.
ALTERNATIVE: Switch statements.
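For the specific case in the example, where every branch is just a value, a table lookup is another possible alternative. This is only a sketch under that assumption; the helper names pick, kNames and name_of are hypothetical, and the last table entry is treated as the "other" case.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns table[index], or the last entry ("other")
// when index is negative or does not name one of the earlier entries.
template <std::size_t N>
const char* pick(int index, const char* const (&table)[N]) {
    static_assert(N > 0, "table needs at least a fallback entry");
    if (index < 0 || static_cast<std::size_t>(index) >= N - 1) {
        return table[N - 1];
    }
    return table[index];
}

static const char* const kNames[] = {"zero", "one", "two", "three", "other"};

// Converts an integer to its name, as in the example above.
std::string name_of(int num) {
    return pick(num, kNames);
}
```

Unlike the proposed operator, this only selects among values; it cannot select among statements to execute.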
Allow switch statements to operate on strings
REASON: The following code is illegal:
char* my_string; ...etc...
switch( my_string ){
case "hello": hello_func(); break;
case "there": there_func(); break;
case "lalala": lalala_func(); break;
}
And must be replaced by:
if ( streq(my_string, "hello"))
hello_func();
else if ( streq(my_string, "there"))
there_func();
else if ( streq(my_string, "lalala"))
lalala_func();
Which is wordier and therefore more error-prone. It's also possible to break
the structure, yet avoid a static error, by forgetting an "else" or an "if"
in the code.
The problem with the switch is that it compares constants to decide which
case to execute. Comparing strings would require a function call for each
case.
One solution would be a switch statement that operates on function calls.
SYNTAX: switchex, or extended switch statement.
switchex ( streq(my_string, case) ){
case "hello": hello_func() ; break; /* each case calls the function,
with the case value as the other arg of the function.*/
... etc ...
The switchex structure would require template style arguments, to permit more
than just string comparisons. This structure would also allow objects with
equality operators to be switched, IE
CMyClass obj1; etc ...
switchex( obj1==case ){ etc...
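Without a new language construct, one way to approximate a switch on strings is a lookup table that maps each case string to a handler. The sketch below assumes a C++11 compiler and uses std::map; the handler bodies, the last_called variable and the dispatch function are hypothetical stand-ins added so the example can be checked.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-ins for the handlers named in the text; each records that it ran.
static std::string last_called;
void hello_func()  { last_called = "hello_func"; }
void there_func()  { last_called = "there_func"; }
void lalala_func() { last_called = "lalala_func"; }

// Looks the string up in a table of handlers and calls the match.
// Returns false when no case matches (the "default" branch).
bool dispatch(const std::string& my_string) {
    typedef void (*Handler)();
    static const std::map<std::string, Handler> cases = {
        {"hello",  hello_func},
        {"there",  there_func},
        {"lalala", lalala_func},
    };
    std::map<std::string, Handler>::const_iterator it = cases.find(my_string);
    if (it == cases.end()) {
        return false;
    }
    it->second();
    return true;
}
```

Each case is one table row, so there is no "else" or "break" to forget, although the comparison is fixed to string equality rather than an arbitrary function as switchex would allow.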
Two-dimensional if statement
DESCRIPTION: An if statement that takes the form of a two-dimensional table,
and operates on two parameters.
REASON: Looks neater in code. Might prevent logic errors
SYNTAX: Would require an editor that permits tables, like HTML.
EXAMPLE:
float A;
int B;
int counter;
if       | B==0                | B<0                 | B>0        | default
A==2.0   | func1();            | func2();            | func3();   |
A>=36.1  | func2();            | cout << "An Error"; | func2();   | // nothing
A<=-1000 | func1();            | func1();            | counter++; |
default  | cout << "An Error"; | // nothing          |            | cout << "An Error";
If this if statement were done using conventional notation, it would be a lot bigger than this!
Ability to derive from built-in types
Bjarne was going to add this feature to C++, but he "restrained himself".
Ability to overload "<" and ">" as unary postfix
REASON: This would permit an improved type of variable argument list. It would
also permit a shorthand for "opening" and "closing" or "initializing" and
"finalizing" objects. Combined with the comma operator, the old "C" style
of variable argument lists could be completely replaced in a type-safe manner.
EXAMPLE: A variable argument "max" function.
class CMaxFunc{
public:
    double max_value;
    CMaxFunc& operator<(double val){ // initialize the function
        max_value=val;
        return *this; // return a reference for the next operator to work on
    }
    CMaxFunc& operator,(double val){ // evaluate another argument in the list
        if(val>max_value)
            max_value=val;
        return *this; // return a reference for the next operator to work on
    }
    double operator>(){ // proposed unary postfix form: finish the list, return the max value
        return max_value;
    }
};
CMaxFunc max; // a global instance of the function/object
int main(){
    cout << (max<1,3,4,5,6>); // will print out "6"; parentheses needed because "<<" binds tighter than "<"
}
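The type-safe variable argument list that the example above aims for can be approximated in C++11 and later with a variadic function template, which is a different technique from the proposed operators. A minimal sketch, with the hypothetical name max_of:

```cpp
#include <algorithm>
#include <cassert>

// Base case: a single argument is its own maximum.
inline double max_of(double only) {
    return only;
}

// Recursive case: compare the first argument with the max of the rest.
// Every argument must convert to double, which is checked at compile time.
template <typename... Rest>
double max_of(double first, Rest... rest) {
    return std::max(first, max_of(rest...));
}
```

A call such as max_of(1, 3, 4, 5, 6) takes any number of arguments. For arguments of a single type, the standard library's std::max({1, 3, 4, 5, 6}) overload for initializer lists does the same job.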
Ability to stream a "void".
Ability to create templates with "void" arg.
Templates with variable numbers of args.
Macros with variable argument lists.
These ideas occurred when I was trying to design a template
class/macro that simplified parser design. I wanted to create a technique
where one could convert any function to a parseable function with a minimum
of typing. The source code would ideally look like, for example:
// FUNCTIONS THAT I WANT TO MAKE "PARSEABLE"
double hypot(double a, double b);
void my_func(int a);
long the_time(void );
// DESIRED SYNTAX TO DO IT
PARSEABLE(double, hypot, double, double) ;
PARSEABLE(void, my_func, int);
PARSEABLE(long, the_time, void);
// The PARSEABLE macro takes the return type of the function, the pointer
// to the function, and the arg(s) of the function, and allows the arg(s) to
// be streamed in (IE cin), with the result streamed out (IE cout).
After spending some time working on this problem, I came to the following conclusions:
1. Would be easier if you could stream in and out a "void" type. The result of such an operation would be no action at all. Treating the "void" as a distinct type, equal in importance to all the other built-in types, would add a little more consistency to the language.
2. Would be easier if templates and macros allowed variable argument lists. The syntax for such a feature is unclear, however.
3. Would be easier if template classes could be instantiated with a "void" arg. See point "1", above.
A new type of define, "definex", which allows non-alpha characters, such as '{', to be made into macros.
Built in branch coverage checking, for testing.
An auto_increment macro type. Allows testing structures, such as branch coverage testing.
The following is a wish list of features that I would like to see in an editor.
REPLACE THE ASCII TEXT STANDARD FOR EDITING C/C++ WITH HTML.
Functions could be hotlinks to the source code for the function. Comments could have hotlinks to more extensive comments. Graphics and charts could easily be included in source code. Tables would help format new types of logical control structures. Keywords and types could have different fonts, colors, styles. Every class/variable/function declaration has a "pointer" to a location in a help file, or a document, making it easy to update help files/documents. Frames could allow screen splitting.
Editor with curly-bracket collapsing/expanding. Hierarchical.
Auto spacing is part of auto formatting. A<-100 becomes A < -100. A<=100 becomes A <= 100.
Auto parenthesizing, or some indication to show precedence of a complicated expression.
Count parentheses, make sure they are evened up. Either a '(' or ')' at right side of screen if parentheses don't line up.
When a new brace is generated, a // is added for the user to put in a comment. This comment is visible if the user collapses the block.
Editor has a list of functions at top, easy to jump to a function.
Editor can alphabetically order functions in the file.
Header files have automatic "sentries"
Window PIP (picture in picture): the top right corner of the window has a smaller window, looking at a different area of the file. May have a different font (smaller usually).
Editor: warning if different variables have names that are very similar. IE "Counter" and "Counters". '!' at right side of screen. R button click on the '!' to get the warning information.
Text checkers for warnings, etc are interpreted from a file. Makes it easy to add new checkers.
On the fly compiling, gets typos as you type them. Common errors: Misspelled names, unincluded headers. This compiling is done in a background thread.
Button for "add #include " so you can add a header without losing your spot in the source file.
Editor: Cool formatting options for printing: double columns, landscape, numbered, index (for functions and classes). Summary info.
Help with casting int to uint to uchar to char etc. Interpreter lets you run test cases.
Automatically sticks prototypes into header files.
Auto arrange enums.
Button that automatically searches for which header file a function is in, and includes it.
For some editor checking functions, less than 100% accuracy is ok.
Syntax highlighting indicates if a word typed has not been typed previously, and is not in header files. Question mark at right side of screen at that line.
Ability to parse a single function and test it with arbitrary test parameters. Within the editor.
Dynamic dispatch option as well as regular virtual functions, via a "dynamic" keyword.
Simplify usage of return values of functions, especially enums. Allow enums to be automatically printed as their source string instead of an integer. Also, defines.
Able to specify a scope to access the members of an object, similar to Pascal's "with".
ability to use binary literals IE char ch= 0b10010111;
logical xor. ^^
Functions act like namespaces. The local variables may be accessed by func::var
Ability to "delete" functions, thus freeing up RAM. These functions have to be placed on the heap.
Ability to get the size of a function, the number of bytes it requires.
To put a function on the heap: heap void func(a,b,c);
Functions can have public and private members. Publics are "static", privates are "stack". Functions have ctors and dtors. Ctors are passed all the args. Able to derive a function from another function, and the derived function inherits the dtor and ctor.
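Two of the smaller wishes in the list above can be illustrated briefly. Binary literals are standard since C++14, so the first line of the sketch assumes such a compiler. A logical xor operator "^^" does not exist, but negating both operands and comparing the results gives the same effect; the function name logical_xor is hypothetical.

```cpp
#include <cassert>

// Binary literal (standard since C++14): 1001 0111 in binary is 0x97.
const unsigned char ch = 0b10010111;

// Logical xor: true when exactly one operand is non-zero.
// Negating both sides first maps every non-zero value to the same bool,
// which the bitwise "^" operator does not do.
inline bool logical_xor(int a, int b) {
    return !a != !b;
}
```

For example, 2 ^ 4 is non-zero even though both operands are "true", whereas logical_xor(2, 4) is false.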
EMAIL: derekr@escape.ca