Latest Version: Beta-1 | Copyright © 2003, Ross A. Beyer
The idea behind the classes in this library is to facilitate parsing the command line passed into your program and using the tokens that result. The basic layout is that a user (you, the user of this library) constructs a few argument objects of various kinds, optionally creates a matcher object and a policy object, and then creates a parser object. Once the parse() method on the parser object is called (usually with ARGV and ARGC as arguments), the parser does its job: the argument objects, which until then held nothing or only default values, now hold values based on the parsing.
One thing I am rather proud of is that I think I have boiled argument parsing down to its most basic components, and as such this library can accommodate any number of different parsing styles. If I've done the design work correctly, then even for parsing styles that I don't have explicit classes for, you should be able to tweak values or subclass my abstract base classes and get things set up to your liking. What do I mean by different parsing styles? Here are some permutations of a program named "foo" with an option named "bar". This option may or may not take a value.
    foo b
    foo -b
    foo -b blah
    foo --bar
    foo --bar=blah
    foo -b file1.txt file2.txt file3.txt
These are just a few examples, and there are many, many more. I'm going to define a few terms that I use to abstract the parsing process, and you'll find that I use these terms consistently throughout this library. Programs have arguments. In my examples above, the program "foo" has an argument with a key named "bar"; this is often also called an option. In the terms of this library, an argument is an abstract property (an object) that affects your program and that you expect the user of your program to set in some way. On the command line itself there are several tokens that will get parsed into an argument object. For each argument, these tokens are the prefix, the key, the assignment operator, and the value. In the example "foo -b blah", the prefix is "-", the key is "b", the assignment operator is the space, and the value is "blah". In the example "foo --bar=blah", the prefix is "--", the key is "bar", the assignment operator is "=", and the value is "blah", and so on.
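To make the four tokens concrete, here is a minimal C++ sketch that splits a single command-line word into prefix, key, assignment operator, and value. It is not commandl's code: `token_parts` and `split_token` are hypothetical names, and the sketch assumes that the only prefixes are "--" and "-" and that the only assignment operator inside a word is "=". The space-as-assignment case ("foo -b blah") spans two words and is deliberately left out.

```cpp
#include <string>

// Hypothetical helper type, not part of commandl.
struct token_parts {
    std::string prefix;      // "--", "-", or empty
    std::string key;         // e.g. "bar"
    std::string assignment;  // "=" or empty
    std::string value;       // e.g. "blah", or empty
};

// Split one word such as "--bar=blah" into its parts.
// Assumes the prefixes "--" and "-" and the assignment operator "=".
token_parts split_token(const std::string& word) {
    token_parts parts;
    std::string rest = word;

    // Check the longer prefix first so "--bar" is not read as "-" plus "-bar".
    if (rest.compare(0, 2, "--") == 0) {
        parts.prefix = "--";
        rest.erase(0, 2);
    } else if (rest.compare(0, 1, "-") == 0) {
        parts.prefix = "-";
        rest.erase(0, 1);
    }

    const std::string::size_type eq = rest.find('=');
    if (eq == std::string::npos) {
        parts.key = rest;  // no assignment operator inside this word
    } else {
        parts.key = rest.substr(0, eq);
        parts.assignment = "=";
        parts.value = rest.substr(eq + 1);
    }
    return parts;
}
```

Calling `split_token("--bar=blah")` fills all four fields, while `split_token("-b")` leaves the assignment and value empty, so a caller would have to look at the next word for a value.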
This is a good time to talk about the four major kinds of classes that this library uses to get its job done. They are the argument classes, the matcher classes, the policy classes, and the parser classes. When you construct argument objects, you are creating the various things that will affect what your program does. When you create an argument object you define its keys and whether it requires a value. When you construct a matcher object to use in your parsing (this is optional; there is a default), you are setting up how tokens on the command line will be matched to the keys of your arguments (for example, if the key is "bar" but you want just a "b" to match, you'd use the char_matcher class). The policy object that you construct (this is also optional; there is a default) sets the various policies that the parser will use when it is trying to make decisions: what the prefix (or prefixes) is, what the assignment operator(s) should be, and what to do if the parser runs into unexpected circumstances, like something that doesn't match a prefix or a key. Finally, the parser object that you create does the work of parsing the tokens that you give to it. It is constructed with your argument objects, your matcher, and your policy object. When you are ready, you call its parse() method (usually with ARGV and ARGC as the arguments to that method). It then parses your tokens, using information from the policy object to break those tokens down into potential prefix-key-assignment-value groups. It then uses the matcher object to determine whether the potential key matches one of the keys of your arguments. Once it identifies a match, it passes the prefix-key-assignment-value tokens into the argument object.
The argument class is an abstract base class that covers the various kinds of things that you want to be modifiable by the user of your program. If these objects were just special-purpose commandl objects, then you'd have to go through a second step to get the information that you want out of them to use in your program. However, through the magic of multiple inheritance and overloading, these objects are so much more. For example, the commandl::int_arg class inherits from the abstract base class argument, but also functions just like an int. You can use it just like you would use an int anywhere in your code; it just has a few extra methods that it inherits from commandl::argument so that it can be used by the commandl::parser object. For example, once the parser has used the matcher to determine a match, it passes the prefix, key, assignment, value, and other information to the argument, and the argument at this point can take those values and fill itself out, or it can throw an exception (if you try to pass the value "blah" to an int_arg, it will do just that).
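The following C++ sketch illustrates the general idea of an argument object that also behaves like an int. It is an illustration under assumptions, not the commandl implementation: `argument_sketch`, `int_arg_sketch`, and the `assign` method are hypothetical names, and the sketch uses a conversion operator to get int-like behavior, which may differ from how commandl::int_arg actually does it.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an abstract argument base class.
class argument_sketch {
public:
    virtual ~argument_sketch() = default;
    // Called once a key has matched; fills the object from the value token.
    virtual void assign(const std::string& value) = 0;
};

// Hypothetical int-like argument: usable wherever an int is expected.
class int_arg_sketch : public argument_sketch {
public:
    explicit int_arg_sketch(int default_value = 0) : value_(default_value) {}

    void assign(const std::string& value) override {
        std::size_t consumed = 0;
        // std::stoi throws std::invalid_argument when no digits are found.
        const int parsed = std::stoi(value, &consumed);
        if (consumed != value.size()) {
            throw std::invalid_argument("trailing characters in: " + value);
        }
        value_ = parsed;
    }

    // Implicit conversion lets the object stand in for a plain int.
    operator int() const { return value_; }

private:
    int value_;
};
```

With this sketch, an object constructed as `int_arg_sketch n(5)` can appear in an expression such as `n + 1`, and `n.assign("blah")` throws rather than silently storing a bad value, which leaves the previous value in place.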
In addition to things like int_arg, float_arg, and string_arg that function like ints, floats, and strings, there are also a few special-purpose argument classes, like usage_arg and stop_arg, which essentially just interrupt the parsing process and cause the parser object to do some specific things (this is done via exceptions, so that argument objects don't have to know about the parser directly). The commandl::usage_arg argument object, when matched, causes the parser to emit a usage message either on STDERR or on an output stream of your choosing. The commandl::stop_arg, when matched, causes the parser to stop parsing the tokens that it was given and, depending on the commandl::policy object being used, will most likely put any extra tokens into a vector for later use.
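The exception-based interruption described above can be sketched in C++ as follows. Everything here is hypothetical and only shows the control flow: `stop_signal`, `handle_token`, and `parse_until_stop` are invented names, and using "--" as the stop key is an assumption made for the example, not something commandl prescribes.

```cpp
#include <string>
#include <vector>

// Hypothetical exception type thrown by a stop-style argument when matched.
struct stop_signal {};

// Hypothetical per-token handler; here "--" plays the role of the stop key.
void handle_token(const std::string& token, std::vector<std::string>& seen) {
    if (token == "--") {
        throw stop_signal{};
    }
    seen.push_back(token);
}

// Hypothetical parse loop: returns the tokens left over after a stop.
std::vector<std::string> parse_until_stop(const std::vector<std::string>& tokens,
                                          std::vector<std::string>& seen) {
    std::vector<std::string> extras;
    for (auto it = tokens.begin(); it != tokens.end(); ++it) {
        try {
            handle_token(*it, seen);
        } catch (const stop_signal&) {
            // Everything after the stop token is kept, unparsed, for later use.
            extras.assign(it + 1, tokens.end());
            break;
        }
    }
    return extras;
}
```

The handler never refers to the loop that calls it; it only throws, and the loop decides what stopping means. That is the decoupling the exception approach buys.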
Also, a commandl::argument object is what I think of as a distal class. It isn't involved in performing any of the parsing or matching activities directly (other than as it is used by the other classes).
The concept of a separate matcher arose because I felt that an argument shouldn't really be concerned with that activity. It also lets you change whether your program matches on the first character of a key or on the whole string of the key just by changing which matcher object you use, without changing all of the argument objects. This also allows you to write your own matchers, which can be arbitrarily complex. An external matcher just separates and encapsulates the activity of determining whether a given string matches any of the keys of the various arguments that the matcher knows about.
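As a rough illustration of swapping matching behavior without touching the arguments, here is a C++ sketch with two interchangeable matchers. The class names and the `matches` signature are hypothetical; in particular, `first_char_matcher_sketch` is only a guess at the kind of behavior a first-character matcher might have, and it is not the library's char_matcher.

```cpp
#include <string>

// Hypothetical matcher interface: does a command-line candidate match a key?
class matcher_sketch {
public:
    virtual ~matcher_sketch() = default;
    virtual bool matches(const std::string& candidate,
                         const std::string& key) const = 0;
};

// Matches only when the candidate equals the whole key ("bar" matches "bar").
class string_matcher_sketch : public matcher_sketch {
public:
    bool matches(const std::string& candidate,
                 const std::string& key) const override {
        return candidate == key;
    }
};

// Matches when the candidate is just the key's first character
// ("b" matches "bar").
class first_char_matcher_sketch : public matcher_sketch {
public:
    bool matches(const std::string& candidate,
                 const std::string& key) const override {
        return candidate.size() == 1 && !key.empty() && candidate[0] == key[0];
    }
};
```

Code that holds a `const matcher_sketch&` can be handed either object, so the choice between "b" and "bar" matching is made in one place rather than in every argument.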
A commandl::matcher object knows about and uses commandl::argument objects. Conversely, it is used by commandl::parser objects.
Policy objects are pretty much just data storage objects. They contain a number of boolean values and other basic information, like what the acceptable prefix tokens are (or whether no prefix token is required), what the assignment operators are (or whether no assignment operator is needed), whether the parser should keep going if a token doesn't match, and whether it should keep track of the tokens that don't match. Handy but simple was the idea here.
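A data-only policy of this kind might look like the C++ sketch below. The struct, its field names, and its default values are all assumptions made for illustration; they are not the members of commandl::policy. The small `find_prefix` helper is likewise hypothetical and shows how a parser could consult such a record.

```cpp
#include <string>
#include <vector>

// Hypothetical policy record; field names and defaults are illustrative only.
struct policy_sketch {
    // Acceptable prefix tokens, longest first so "--" wins over "-".
    std::vector<std::string> prefixes{"--", "-"};
    bool prefix_required = true;        // false would allow bare keys
    // Acceptable assignment operators.
    std::vector<std::string> assignments{"=", " "};
    bool assignment_required = true;
    bool continue_on_unmatched = true;  // keep parsing after a miss
    bool record_unmatched = true;       // remember tokens that did not match
};

// Returns the first prefix in the policy that starts `word`, or "" if none.
std::string find_prefix(const policy_sketch& policy, const std::string& word) {
    for (const std::string& prefix : policy.prefixes) {
        if (word.compare(0, prefix.size(), prefix) == 0) {
            return prefix;
        }
    }
    return "";
}
```

Because the record holds only data, changing the parsing style is a matter of editing fields, for example replacing the prefix list with a single "/" entry.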
The parser is the workhorse of the commandl library. Once you have set up all the other kinds of objects, it is the parser that uses them all to do the work. It is highly flexible (because of the other classes that it depends on) and configurable. For example, it can be configured either to throw exceptions or not. My guess is that the commandl::parser class will be the least subclassed, but I could very well be wrong.
Last Updated: 12 Oct 2003