Parsing command line options
Almost every command or application has to parse command line options in some way, and there are far too many home-made argument parsers out there. Because so many programs need to parse options from the command line, this facility is encapsulated in a standard library function, getopt(3).
The GNU C library also provides a more sophisticated API for parsing the command line, argp() (its entry point is argp_parse()), which is described in the glibc manual [1]. However, this interface is not portable.
There are also many third-party libraries that provide such facilities, but let's stick to what glibc provides.
Command line options
A typical UNIX command takes options in the following form
command [option] arguments
An option has the form of a hyphen (-) followed by a unique character and possibly an argument. If an option takes an argument, the argument may be separated from the option by whitespace. When multiple options are specified, they can be grouped after a single hyphen, and only the last option in the group may take an argument.
Example of a single option
ls -l
Example of grouped options
ls -lI *hidden* .
In the example above, -l (long listing format) does not take an argument, but -I (ignore) takes *hidden* as its argument.
Long options
It is not unusual for a command to accept both a short (-I) and a long (--ignore) option syntax. A long option begins with two hyphens, and the option itself is identified by a word. If the option takes an argument, the argument may be separated from the option by whitespace or by a =.
To parse such options, use the glibc function getopt_long(3) or the (non-portable) argp().
Example using getopt_long()
getopt_long() is quite simple to use. First we create an array of struct option, where each element defines the following fields:
- name
- is the name of the long option.
- has_arg
- is: no_argument (or 0) if the option does not take an argument; required_argument (or 1) if the option requires an argument; or optional_argument (or 2) if the option takes an optional argument.
- flag
- specifies how results are returned for a long option. If flag is NULL, then getopt_long() returns val. (For example, the calling program may set val to the equivalent short option character.) Otherwise, getopt_long() returns 0, and flag points to a variable which is set to val if the option is found, but left unchanged if the option is not found.
- val
- is the value to return, or to load into the variable pointed to by flag.
The last element of the array has to be filled with zeros.
The next step is to iterate through all options and take care of the arguments.
Example code
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <getopt.h>

struct arguments
{
    int a;
    int b;
    int c;
    int area;
    int perimeter;
};

void print_usage(void)
{
    printf("Usage: triangle [Ap] -a num -b num -c num\n");
}

int main(int argc, char *argv[])
{
    int opt = 0;
    struct arguments arguments;

    /* Default values. */
    arguments.a = -1;
    arguments.b = -1;
    arguments.c = -1;
    arguments.area = 0;
    arguments.perimeter = 0;

    static struct option long_options[] = {
        {"area",       no_argument,       0, 'A' },
        {"perimeter",  no_argument,       0, 'p' },
        {"hypotenuse", required_argument, 0, 'c' },
        {"opposite",   required_argument, 0, 'a' },
        {"adjecent",   required_argument, 0, 'b' },
        {0,            0,                 0, 0   }
    };

    int long_index = 0;
    while ((opt = getopt_long(argc, argv, "Apa:b:c:",
                              long_options, &long_index)) != -1) {
        switch (opt) {
        case 'A':
            arguments.area = 1;
            break;
        case 'p':
            arguments.perimeter = 1;
            break;
        case 'a':
            arguments.a = atoi(optarg);
            break;
        case 'b':
            arguments.b = atoi(optarg);
            break;
        case 'c':
            arguments.c = atoi(optarg);
            break;
        default:
            print_usage();
            exit(EXIT_FAILURE);
        }
    }

    if (arguments.a == -1 || arguments.b == -1 || arguments.c == -1) {
        print_usage();
        exit(EXIT_FAILURE);
    }

    if (arguments.area) {
        arguments.area = (arguments.a * arguments.b) / 2;
        printf("Area: %d\n", arguments.area);
    }

    if (arguments.perimeter) {
        arguments.perimeter = arguments.a + arguments.b + arguments.c;
        printf("Perimeter: %d\n", arguments.perimeter);
    }

    return 0;
}
Usage examples
Full example with short options
[13:49:00]marcus@little:~/tmp/cmdline$ ./getopt -Ap -a 3 -b 4 -c 5
Area: 6
Perimeter: 12
Missing -c option
[14:07:37]marcus@little:~/tmp/cmdline$ ./getopt -Ap -a 3 -b 4
Usage: triangle [Ap] -a num -b num -c num
Full example with long options
[14:09:38]marcus@little:~/tmp/cmdline$ ./getopt --area --perimeter --opposite 3 --adjecent 4 --hypotenuse 5
Area: 6
Perimeter: 12
Invalid options
[14:10:14]marcus@little:~/tmp/cmdline$ ./getopt --area --perimeter --opposite 3 --adjecent 4 -j=3
./getopt: invalid option -- 'j'
Usage: triangle [Ap] -a num -b num -c num
Full example with mixed syntaxes
[14:09:38]marcus@little:~/tmp/cmdline$ ./getopt -A --perimeter --opposite=3 -b4 -c 5
Area: 6
Perimeter: 12
Variants
getopt_long_only() is like getopt_long(), but '-' as well as "--" can indicate a long option. If an option that starts with '-' (not "--") doesn't match a long option, but does match a short option, it's parsed as a short option instead.
Example using argp()
argp() is more flexible and powerful than getopt() and friends, but it is not part of the POSIX standard and is therefore not portable between different POSIX-compatible operating systems. However, argp() provides a few interesting features that getopt() does not.
These features include automatically producing output in response to the ‘--help’ and ‘--version’ options, as described in the GNU coding standards. Using argp makes it less likely that programmers will neglect to implement these additional options or keep them up to date.
The implementation is pretty straightforward and similar to getopt(), with a few notes.
const char *argp_program_version = "Triangle 1.0";
const char *argp_program_bug_address = "<marcus.folkesson@combitech.se>";
These two variables are used when automatically generating the output of the --help and --version options.
struct argp_option
This structure specifies a single option that an argp parser understands, as well as how to parse and document that option. It has the following fields:
- const char *name
- The long name for this option, corresponding to the long option --name; this field may be zero if this option only has a short name. To specify multiple names for an option, additional entries may follow this one, with the OPTION_ALIAS flag set. See Argp Option Flags.
- int key
- The integer key provided by the current option to the option parser. If key has a value that is a printable ASCII character (i.e., isascii (key) is true), it also specifies a short option ‘-char’, where char is the ASCII character with the code key.
- const char *arg
- If non-zero, this is the name of an argument associated with this option, which must be provided (e.g., with the --name=value or -char value syntaxes), unless the OPTION_ARG_OPTIONAL flag (see Argp Option Flags) is set, in which case it may be provided.
- int flags
- Flags associated with this option, some of which are referred to above. See Argp Option Flags.
- const char *doc
- A documentation string for this option, for printing in help messages.
If both the name and key fields are zero, this string will be printed tabbed left from the normal option column, making it useful as a group header. This will be the first thing printed in its group. In this usage, it’s conventional to end the string with a : character.
Example code
Example code with a few more comments
#include <stdio.h>
#include <stdlib.h>
#include <argp.h>

const char *argp_program_version = "Triangle 1.0";
const char *argp_program_bug_address = "<marcus.folkesson@combitech.se>";

/* Program documentation. */
static char doc[] = "Triangle example";

/* A description of the arguments we accept. */
static char args_doc[] = "ARG1 ARG2";

/* The options we understand. */
static struct argp_option options[] = {
    {"area",       'A', 0,       0, "Calculate area"},
    {"perimeter",  'p', 0,       0, "Calculate perimeter"},
    {"hypotenuse", 'c', "VALUE", 0, "Specify hypotenuse of the triangle"},
    {"opposite",   'b', "VALUE", 0, "Specify opposite of the triangle"},
    {"adjecent",   'a', "VALUE", 0, "Specify adjecent of the triangle"},
    { 0 }
};

/* Used by main to communicate with parse_opt. */
struct arguments
{
    int a;
    int b;
    int c;
    int area;
    int perimeter;
};

/* Parse a single option. */
static error_t parse_opt (int key, char *arg, struct argp_state *state)
{
    struct arguments *arguments = (struct arguments*)state->input;

    switch (key) {
    case 'a':
        arguments->a = atoi(arg);
        break;
    case 'b':
        arguments->b = atoi(arg);
        break;
    case 'c':
        arguments->c = atoi(arg);
        break;
    case 'p':
        arguments->perimeter = 1;
        break;
    case 'A':
        arguments->area = 1;
        break;
    default:
        return ARGP_ERR_UNKNOWN;
    }

    return 0;
}

/* Our argp parser. */
static struct argp argp = { options, parse_opt, args_doc, doc };

int main (int argc, char **argv)
{
    struct arguments arguments;

    /* Default values. */
    arguments.a = -1;
    arguments.b = -1;
    arguments.c = -1;
    arguments.area = 0;
    arguments.perimeter = 0;

    /* Parse our arguments; every option seen by parse_opt will
     * be reflected in arguments.
     */
    argp_parse (&argp, argc, argv, 0, 0, &arguments);

    if (arguments.a == -1 || arguments.b == -1 || arguments.c == -1) {
        exit(EXIT_FAILURE);
    }

    if (arguments.area) {
        arguments.area = (arguments.a * arguments.b) / 2;
        printf("Area: %d\n", arguments.area);
    }

    if (arguments.perimeter) {
        arguments.perimeter = arguments.a + arguments.b + arguments.c;
        printf("Perimeter: %d\n", arguments.perimeter);
    }

    return EXIT_SUCCESS;
}
Usage examples
This application gives the same output as the getopt() usage, with the following extra features:
The options --help, --usage and --version are automatically generated
[15:53:04]marcus@little:~/tmp/cmdline$ ./argp --help
Usage: argp [OPTION...] ARG1 ARG2
Triangle example

  -a, --adjecent=VALUE       Specify adjecent of the triangle
  -A, --area                 Calculate area
  -b, --opposite=VALUE       Specify opposite of the triangle
  -c, --hypotenuse=VALUE     Specify hypotenuse of the triangle
  -p, --perimeter            Calculate perimeter
  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.

Report bugs to <marcus.folkesson@combitech.se>.
Version information
[15:53:08]marcus@little:~/tmp/cmdline$ ./argp --version
Triangle 1.0
Conclusion
Parsing command line options is simple. argp() provides a lot of features that I really appreciate.
When portability is not an issue, I always go for argp() since, besides the extra features, its interface is more appealing.