Parsing command line options

Parsing command line options

Parsing command line options is something almost every command or application needs to handle in some way, and there are too many home-made argument parsers out there. Because so many programs need to parse options from the command line, this facility is encapsulated in the standard library function getopt(3).

The GNU C library provides an even more sophisticated API for parsing the command line, argp(), which is described in the glibc manual [1]. However, this function is not portable.

There are also many libraries that provide such facilities, but let's stick to what glibc provides.

Command line options

A typical UNIX command takes options in the following form

command [option] arguments

Options take the form of a hyphen (-) followed by a unique character and a possible argument. If an option takes an argument, it may be separated from that argument by white space. When multiple options are specified, they can be grouped after a single hyphen, and only the last option in the group may take an argument.

Example of a single option

ls -l

Example of grouped options

ls -lI *hidden* .

In the example above, -l (long listing format) does not take an argument, but -I (ignore) takes *hidden* as its argument.

Long options

It is not unusual for a command to allow both a short (-I) and a long (--ignore) option syntax. A long option begins with two hyphens, and the option itself is identified by a word. If the option takes an argument, it may be separated from that argument by an =.

To parse such options, use the getopt_long(3) glibc function, or the (nonportable) argp().

Example using getopt_long()

getopt_long() is quite simple to use. First we create an array of struct option and define the following elements:

  • name
    is the name of the long option.
  • has_arg
    is: no_argument (or 0) if the option does not take an argument; required_argument (or 1) if the option requires an argument; or optional_argument (or 2) if the option takes an optional argument.
  • flag
    specifies how results are returned for a long option. If flag is NULL, then getopt_long() returns val. (For example, the calling program may set val to the equivalent short option character.) Otherwise, getopt_long() returns 0, and flag points to a variable which is set to val if the option is found, but left unchanged if the option is not found. (See the sketch just below.)
  • val
    is the value to return, or to load into the variable pointed to by flag.

The last element of the array has to be filled with zeros.
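
As a side note, here is a minimal sketch (with made-up option names, not part of the triangle example below) of the two modes of the flag field: when flag is non-NULL, getopt_long() stores val in the pointed-to variable and returns 0 instead of the option character.

static int verbose_flag = 0;

static struct option long_options[] = {
    /* flag points to verbose_flag: getopt_long() sets it to 1 and returns 0. */
    {"verbose", no_argument,       &verbose_flag, 1  },
    /* flag is NULL: getopt_long() returns 'o' when --output is seen. */
    {"output",  required_argument, 0,             'o'},
    {0,         0,                 0,             0  }
};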

The next step is to iterate through all options and take care of the arguments.

Example code

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <getopt.h>

struct arguments
{
    int a;
    int b;
    int c;
    int area;
    int perimeter;
};


void print_usage() {
    printf("Usage: triangle [Ap] -a num -b num -c num\n");
}

int main(int argc, char *argv[]) {
    int opt = 0;
    struct arguments arguments;

    /* Default values. */
    arguments.a = -1;
    arguments.b = -1;
    arguments.c = -1;
    arguments.area = 0;
    arguments.perimeter = 0;


    static struct option long_options[] = {
        {"area",      no_argument,       0,  'A' },
        {"perimeter", no_argument,       0,  'p' },
        {"hypotenuse",required_argument, 0,  'c' },
        {"opposite",  required_argument, 0,  'a' },
        {"adjecent",  required_argument, 0,  'b' },
        {0,           0,                 0,  0   }
    };

    int long_index = 0;
    while ((opt = getopt_long(argc, argv,"Apa:b:c:",
                   long_options, &long_index )) != -1) {
        switch (opt) {
            case 'A':
                arguments.area = 1;
                break;
            case 'p':
                arguments.perimeter = 1;
                break;
            case 'a':
                arguments.a = atoi(optarg);
                break;
            case 'b':
                arguments.b = atoi(optarg);
                break;
            case 'c':
                arguments.c = atoi(optarg);
                break;
            default:
                print_usage();
                exit(EXIT_FAILURE);
        }
    }


    if (arguments.a == -1 || arguments.b == -1 || arguments.c == -1) {
        print_usage();
        exit(EXIT_FAILURE);
    }

    if (arguments.area) {
        arguments.area = (arguments.a*arguments.b)/2;
        printf("Area: %d\n",arguments.area);
    }

    if (arguments.perimeter) {
        arguments.perimeter = arguments.a + arguments.b + arguments.c;
        printf("Perimeter: %d\n",arguments.perimeter);
    }

    return 0;
}

Examples of usage

Full example with short options

[13:49:00]marcus@little:~/tmp/cmdline$ ./getopt  -Ap -a 3 -b 4 -c 5
Area: 6
Perimeter: 12

Missing -c option

[14:07:37]marcus@little:~/tmp/cmdline$ ./getopt  -Ap -a 3 -b 4
Usage: triangle [Ap] -a num -b num -c num

Full example with long options

[14:09:38]marcus@little:~/tmp/cmdline$ ./getopt  --area --perimeter --opposite 3 --adjecent 4 --hypotenuse 5
Area: 6
Perimeter: 12

Invalid options

[14:10:14]marcus@little:~/tmp/cmdline$ ./getopt  --area --perimeter --opposite 3 --adjecent 4 -j=3
./getopt: invalid option -- 'j'
Usage: triangle [Ap] -a num -b num -c num

Full example with mixed syntaxes

[14:09:38]marcus@little:~/tmp/cmdline$ ./getopt  -A --perimeter --opposite=3 -b4 -c 5
Area: 6
Perimeter: 12

Variants

getopt_long_only() is like getopt_long(), but '-' as well as "--" can indicate a long option. If an option that starts with '-' (not "--") doesn't match a long option, but does match a short option, it is parsed as a short option instead.
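
As a rough sketch (reusing the long_options array and option string from the example above), only the call itself changes; with getopt_long_only(), -area would be accepted as well as --area:

    /* Same signature as getopt_long(); single-dash long options are accepted too. */
    while ((opt = getopt_long_only(argc, argv, "Apa:b:c:",
                   long_options, &long_index)) != -1) {
        /* ...the same switch statement as before... */
    }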

Example using argp()

argp() is more flexible and powerful than getopt() and friends, but it is not part of the POSIX standard and is therefore not portable between different POSIX-compatible operating systems. However, argp() provides a few interesting features that getopt() does not.

These features include automatically producing output in response to the ‘--help’ and ‘--version’ options, as described in the GNU coding standards. Using argp makes it less likely that programmers will neglect to implement these additional options or keep them up to date.

The implementation is pretty much straightforward and similar to getopt(), with a few notes.

const char *argp_program_version = "Triangle 1.0";
const char *argp_program_bug_address = "<marcus.folkesson@combitech.se>";

These are used in the automatic generation of the --help and --version output.

struct argp_option

This structure specifies a single option that an argp parser understands, as well as how to parse and document that option. It has the following fields:

  • const char *name
    The long name for this option, corresponding to the long option --name; this field may be zero if this option only has a short name. To specify multiple names for an option, additional entries may follow this one, with the OPTION_ALIAS flag set. See Argp Option Flags.
  • int key
    The integer key provided by the current option to the option parser. If key has a value that is a printable ASCII character (i.e., isascii (key) is true), it also specifies a short option ‘-char’, where char is the ASCII character with the code key.
  • const char *arg
    If non-zero, this is the name of an argument associated with this option, which must be provided (e.g., with the --name=value or -char value syntaxes), unless the OPTION_ARG_OPTIONAL flag (see Argp Option Flags) is set, in which case it may be provided.
  • int flags
    Flags associated with this option, some of which are referred to above. See Argp Option Flags.
  • const char *doc
    A documentation string for this option, for printing in help messages.

If both the name and key fields are zero, this string will be printed tabbed left from the normal option column, making it useful as a group header. This will be the first thing printed in its group. In this usage, it’s conventional to end the string with a : character.
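
For illustration, a hypothetical group header entry (not present in the example below) could be added to an options table like this:

static struct argp_option options[] = {
    /* name and key are zero, so this string is printed as a group header in --help. */
    { 0, 0, 0, 0, "Triangle dimensions:" },
    { "hypotenuse", 'c', "VALUE", 0, "Specify hypotenuse of the triangle" },
    /* ...remaining options... */
    { 0 }
};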

Example code

Example code with a few more comments

#include <stdio.h>
#include <stdlib.h>
#include <argp.h>

const char *argp_program_version = "Triangle 1.0";
const char *argp_program_bug_address = "<marcus.folkesson@combitech.se>";

/* Program documentation. */
static char doc[] = "Triangle example";

/* A description of the arguments we accept. */
static char args_doc[] = "ARG1 ARG2";

/* The options we understand. */
static struct argp_option options[] = {
    {"area",        'A',    0,  0,  "Calculate area"},
    {"perimeter",   'p',    0,  0,  "Calculate perimeter"},
    {"hypotenuse",  'c',    "VALUE",  0,  "Specify hypotenuse of the triangle"},
    {"opposite",    'b',    "VALUE",  0,  "Specify opposite of the triangle"},
    {"adjecent",    'a',    "VALUE",  0,  "Specify adjecent of the triangle"},
    { 0 }
};

/* Used by main to communicate with parse_opt. */
struct arguments
{
    int a;
    int b;
    int c;
    int area;
    int perimeter;
};

/* Parse a single option. */
static error_t parse_opt (int key, char *arg, struct argp_state *state)
{
    struct arguments *arguments = (struct arguments*)state->input;

    switch (key) {
        case 'a':
            arguments->a = atoi(arg);
            break;
        case 'b':
            arguments->b = atoi(arg);
            break;
        case 'c':
            arguments->c = atoi(arg);
            break;
        case 'p':
            arguments->perimeter = 1;
            break;
        case 'A':
            arguments->area = 1;
            break;

        default:
            return ARGP_ERR_UNKNOWN;
    }
    return 0;
}

/* Our argp parser. */
static struct argp argp = { options, parse_opt, args_doc, doc };

int
main (int argc, char **argv)
{
    struct arguments arguments;

    /* Default values. */
    arguments.a = -1;
    arguments.b = -1;
    arguments.c = -1;
    arguments.area = 0;
    arguments.perimeter = 0;

    /* Parse our arguments; every option seen by parse_opt will
     *      be reflected in arguments. */
    argp_parse (&argp, argc, argv, 0, 0, &arguments);


    if (arguments.a == -1 || arguments.b == -1 || arguments.c == -1) {
        exit(EXIT_FAILURE);
    }

    if (arguments.area) {
        arguments.area = (arguments.a*arguments.b)/2;
        printf("Area: %d\n",arguments.area);
    }

    if (arguments.perimeter) {
        arguments.perimeter = arguments.a + arguments.b + arguments.c;
        printf("Perimeter: %d\n",arguments.perimeter);
    }

    return EXIT_SUCCESS;
}

Examples of usage

This application gives the same output as the getopt() usage, with the following extra features:

The options --help, --usage and --version are automatically generated.

[15:53:04]marcus@little:~/tmp/cmdline$ ./argp --help
Usage: argp [OPTION...] ARG1 ARG2
Triangle example

  -a, --adjecent=VALUE       Specify adjecent of the triangle
  -A, --area                 Calculate area
  -b, --opposite=VALUE       Specify opposite of the triangle
  -c, --hypotenuse=VALUE     Specify hypotenuse of the triangle
  -p, --perimeter            Calculate perimeter
  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.

Report bugs to <marcus.folkesson@combitech.se>.

Version information

[15:53:08]marcus@little:~/tmp/cmdline$ ./argp --version
Triangle 1.0

Conclusion

Parsing command line options is simple. argp() provides a lot of features that I really appreciate.

When portability is not an issue, I always go for argp(), as, besides the extra features, the interface is more appealing.

Embedded Linux Conference 2019

Embedded Linux Conference 2019

Here we go again! This trip got exciting even before it began. I checked my passport the day before we were leaving and noticed that it had expired. Ouch. Fortunately, I was able to get a temporary passport at the airport. I must admit that I'm not traveling that often and do not have these 'must-checks' in my muscle memory.

This time we were heading to Lyon in France. The weather is not the best, but at least it is not freezing cold like in Sweden at this time of the year.

The conference

The conference this year is good as usual. Somehow, my focus has shifted from the technical talks to actually connecting and talking to people. Of course, there is a group of people that I always meet (it is mostly the same people that show up at the conferences, after all), but I have met far more people than I used to. Am I beginning to be social? Anyway, as said before, it is fun to put a face to the patches I've reviewed or gotten comments on.

The talks

I mostly go for the "heavy technical" talks, but the talk I appreciated most this year had a very low technical level. The talk was given by Gardena [1], which makes gardening tools. Yes, water hoses and stuff. They described their journey from a product family that historically had no software at all to a full-blown embedded Linux system, with all the legal implications that you can encounter with open source licenses. Gardena was sued for violating the GPL, which could have been a very costly story. What Gardena did was absolutely the best way to handle it, and it was really nice to hear about. The outcome is that Gardena now has a public GitHub account [2] containing the software for their Garden Gateway products [3]. Support for the SoC they are using is not only published, but also mainlined(!!).

Gardena hired contractors from Denx [4] for mainlining U-Boot and the Linux kernel, and also hired the maintainer of the radio chip that they were using. Thanks to this, all open parts of their product are mainlined, and Gardena even managed to get the radio certified, which has helped at least two other companies.

Hiring the right folks for the right tasks was really the best thing Gardena could do. The radio chip maintainer fixed their problem in 48 man-hours, something that could have taken months for Gardena to fix. The estimated cost of all this mainlining work was only 10% of their budget, which is really nothing. It also made it possible for Gardena to put their products on the market in time. One big bonus is that maintenance is far easier when the code is mainlined.

This is also how we work at Combitech. Linux is not part of our customers' "core competence", and it really should not be. Linux is our core competence; that is why our customers let us take care of "our thing", in other words Linux development.

But why does all this make me so happy? First of all, the whole Open Source community is really a big thing to me. It has influenced both my career choices and my view on software. In fact, I'm not even sure I would have enjoyed programming without Open Source.

So, Gardena, Keep up the good work!

/media/elce2019.jpg

libsegfault.so

libsegfault.so

The dynamic linker [1] in a Linux system uses several environment variables to customize its behavior. The most commonly used is probably LD_LIBRARY_PATH, which is a list of directories where it searches for libraries at execution time. Another variable I use quite often is LD_TRACE_LOADED_OBJECTS, which makes the program list its dynamic dependencies, just like ldd(1).

For example, consider the following output

$ LD_TRACE_LOADED_OBJECTS=1 /bin/bash
    linux-vdso.so.1 (0x00007ffece29e000)
    libreadline.so.7 => /usr/lib/libreadline.so.7 (0x00007fc9b82d1000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fc9b80cd000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fc9b7d15000)
    libncursesw.so.6 => /usr/lib/libncursesw.so.6 (0x00007fc9b7add000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fc9b851f000)
    libtinfo.so.6 => /usr/lib/libtinfo.so.6 (0x00007fc9b78b1000)

LD_PRELOAD

LD_PRELOAD is a list of additional shared objects that should be loaded before all other dynamic dependencies. When the loader resolves symbols, it sequentially walks through the list of dynamic shared objects and takes the first match. This makes it possible to override functions in other shared objects and change the behavior of the application completely.
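
As a small illustration (my own made-up example, not from libSegFault), here is a minimal shared object that overrides getpid() when preloaded:

/* fakepid.c - build with: gcc -shared -fPIC -o libfakepid.so fakepid.c */
#include <sys/types.h>

/* This definition shadows the getpid() in libc for programs that preload us. */
pid_t getpid(void)
{
    return 4711;
}

Running a program with LD_PRELOAD=./libfakepid.so then makes every getpid() call resolve to this function instead of the one in libc (as long as the program does not use the raw system call).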

Consider the following example

$ LD_PRELOAD=/usr/lib/libSegFault.so LD_TRACE_LOADED_OBJECTS=1 /bin/bash
    linux-vdso.so.1 (0x00007ffc73f61000)
    /usr/lib/libSegFault.so (0x00007f131c234000)
    libreadline.so.7 => /usr/lib/libreadline.so.7 (0x00007f131bfe6000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f131bde2000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f131ba2a000)
    libncursesw.so.6 => /usr/lib/libncursesw.so.6 (0x00007f131b7f2000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f131c439000)
    libtinfo.so.6 => /usr/lib/libtinfo.so.6 (0x00007f131b5c6000)

Here we have preloaded libSegFault, and it is listed in second place. In first place we have linux-vdso.so.1, which is a Virtual Dynamic Shared Object provided by the Linux kernel. The VDSO deserves its own separate blog post; it is a cool feature that maps kernel code into a process's context as a .text segment in a virtual library.

libSegFault.so

libSegFault.so is part of glibc [2] and comes with your toolchain. The library is for debugging purposes and is activated by preloading it at runtime. It does not actually override any functions but registers signal handlers in a constructor (yes, you can execute code before main) for the specified signals. By default only SIGSEGV (see signal(7)) is registered. The registered handlers print a backtrace for the application when the signal is delivered. See its implementation in [3].
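
To give a rough idea of how this works, here is a simplified, hypothetical sketch (not the actual glibc implementation) of a constructor that registers such a handler:

#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

static void handler(int sig)
{
    void *frames[64];
    int n = backtrace(frames, 64);

    /* Print the backtrace to stderr, then re-raise with the default action. */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Runs before main() when the shared object is loaded. */
__attribute__((constructor))
static void setup(void)
{
    signal(SIGSEGV, handler);
}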

Set the environment variable SEGFAULT_SIGNALS to explicitly select which signals you want to register a handler for.

/media/libsegfault.png

This is a useful feature for debugging. The best part is that you don't have to recompile your code.

libSegFault in action

Our application

Consider the following real-life application, taken directly from the local nuclear power plant:

void handle_uranium(char *rod)
{
    *rod = 0xAB;
}

void start_reactor()
{
    char *rod = 0x00;
    handle_uranium(rod);
}

int main()
{
    start_reactor();
}

The symptom

We are seeing a segmentation fault when operating on a particular uranium rod, but we don't know why.

Use libSegFault

Start the application with libSegFault preloaded and examine the dump:

$ LD_PRELOAD=/usr/lib/libSegFault.so ./powerplant
*** Segmentation fault
Register dump:

 RAX: 0000000000000000   RBX: 0000000000000000   RCX: 0000000000000000
 RDX: 00007ffdf6aba5a8   RSI: 00007ffdf6aba598   RDI: 0000000000000000
 RBP: 00007ffdf6aba480   R8 : 000055d2ad5e16b0   R9 : 00007f98534729d0
 R10: 0000000000000008   R11: 0000000000000246   R12: 000055d2ad5e14f0
 R13: 00007ffdf6aba590   R14: 0000000000000000   R15: 0000000000000000
 RSP: 00007ffdf6aba480

 RIP: 000055d2ad5e1606   EFLAGS: 00010206

 CS: 0033   FS: 0000   GS: 0000

 Trap: 0000000e   Error: 00000006   OldMask: 00000000   CR2: 00000000

 FPUCW: 0000037f   FPUSW: 00000000   TAG: 00000000
 RIP: 00000000   RDP: 00000000

 ST(0) 0000 0000000000000000   ST(1) 0000 0000000000000000
 ST(2) 0000 0000000000000000   ST(3) 0000 0000000000000000
 ST(4) 0000 0000000000000000   ST(5) 0000 0000000000000000
 ST(6) 0000 0000000000000000   ST(7) 0000 0000000000000000
 mxcsr: 1f80
 XMM0:  00000000000000000000000000000000 XMM1:  00000000000000000000000000000000
 XMM2:  00000000000000000000000000000000 XMM3:  00000000000000000000000000000000
 XMM4:  00000000000000000000000000000000 XMM5:  00000000000000000000000000000000
 XMM6:  00000000000000000000000000000000 XMM7:  00000000000000000000000000000000
 XMM8:  00000000000000000000000000000000 XMM9:  00000000000000000000000000000000
 XMM10: 00000000000000000000000000000000 XMM11: 00000000000000000000000000000000
 XMM12: 00000000000000000000000000000000 XMM13: 00000000000000000000000000000000
 XMM14: 00000000000000000000000000000000 XMM15: 00000000000000000000000000000000

Backtrace:
./powerplant(+0x606)[0x55d2ad5e1606]
./powerplant(+0x628)[0x55d2ad5e1628]
./powerplant(+0x639)[0x55d2ad5e1639]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f9852ec6f6a]
./powerplant(+0x51a)[0x55d2ad5e151a]

Memory map:

55d2ad5e1000-55d2ad5e2000 r-xp 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e1000-55d2ad7e2000 r--p 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e2000-55d2ad7e3000 rw-p 00001000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ada9c000-55d2adabd000 rw-p 00000000 00:00 0                          [heap]
7f9852c8f000-7f9852ca5000 r-xp 00000000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ca5000-7f9852ea4000 ---p 00016000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea4000-7f9852ea5000 r--p 00015000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea5000-7f9852ea6000 rw-p 00016000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea6000-7f9853054000 r-xp 00000000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853054000-7f9853254000 ---p 001ae000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853254000-7f9853258000 r--p 001ae000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853258000-7f985325a000 rw-p 001b2000 00:13 13975885                   /usr/lib/libc-2.26.so
7f985325a000-7f985325e000 rw-p 00000000 00:00 0
7f985325e000-7f9853262000 r-xp 00000000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853262000-7f9853461000 ---p 00004000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853461000-7f9853462000 r--p 00003000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853462000-7f9853463000 rw-p 00004000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853463000-7f9853488000 r-xp 00000000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853649000-7f985364c000 rw-p 00000000 00:00 0
7f9853685000-7f9853687000 rw-p 00000000 00:00 0
7f9853687000-7f9853688000 r--p 00024000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853688000-7f9853689000 rw-p 00025000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853689000-7f985368a000 rw-p 00000000 00:00 0
7ffdf6a9b000-7ffdf6abc000 rw-p 00000000 00:00 0                          [stack]
7ffdf6bc7000-7ffdf6bc9000 r--p 00000000 00:00 0                          [vvar]
7ffdf6bc9000-7ffdf6bcb000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

At first glance, the information may feel overwhelming, but let's go through the most important lines.

The backtrace lists the call chain at the point when the signal was delivered to the application. The first entry is the top of the stack:

Backtrace:
./powerplant(+0x606)[0x55d2ad5e1606]
./powerplant(+0x628)[0x55d2ad5e1628]
./powerplant(+0x639)[0x55d2ad5e1639]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f9852ec6f6a]
./powerplant(+0x51a)[0x55d2ad5e151a]

Here we can see that the last executed instruction is at address 0x55d2ad5e1606. The tricky part is that the address is not an offset within the application, but a virtual address in the process. In other words, we need to translate the address into an offset within the application's .text segment. If we look at the memory map we see three entries for the powerplant application:

55d2ad5e1000-55d2ad5e2000 r-xp 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e1000-55d2ad7e2000 r--p 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e2000-55d2ad7e3000 rw-p 00001000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant

Why three? Most ELF files (applications or libraries) have at least three memory-mapped sections:

  • .text, the executable code
  • .rodata, read-only data
  • .data, read/write data

With help of the permissions it is possible to figure out which mapping correspond to each section.

The last mapping has rw- permissions and is probably our .data section, as it allows both read and write. The middle mapping has r-- and is a read-only mapping - probably our .rodata section. The first mapping has r-x, which is read-only and executable. This must be our .text section!

Now we can take the address from our backtrace and subtract the start address of our .text section: 0x55d2ad5e1606 - 0x55d2ad5e1000 = 0x606

Use addr2line to get the corresponding line in our source code:

$ addr2line -e ./powerplant -a 0x606
    0x0000000000000606
    /home/marcus/tmp/segfault/main.c:3

If we go back to the source code, we see that line 3 in main.c is

*rod = 0xAB;

Here we have it. Nothing more to say.

Conclusion

libSegFault.so has been a great help to me over a long time. The biggest benefit is that you don't have to recompile your application when you want to use it. However, you cannot get the line number from addr2line if the application is not compiled with debug symbols, but it is often not that hard to figure out the context from a disassembly of your application.

Embedded Linux Conference 2018

Embedded Linux Conference 2018

Ok, time for another conference. This time in Edinburgh, Scotland. My travel is limited to Edinburgh, but this city has a lot of things to see, including Edinburgh Castle, the Royal Botanic Garden, the clock that is always 3 minutes wrong [1] and lots more. A side note: yes, I've tried haggis, as it is a must-try thing and so should you. But be prepared to buy a backup meal.

The conference this year has a few talks already on Sunday. I'm going to Michael Kerrisk's talk about cgroups (control groups) [2]. Michael is a great speaker as usual. His book, The Linux Programming Interface [3], is the only book you need about system programming as it covers everything.

I listened to a talk about cgroups that I appreciated a lot.

Control groups (cgroups)

cgroups is a hierarchy of processes with several controllers applied to it. These controllers restrict resources such as memory usage and CPU utilisation for a certain group.

Worth mentioning is that there are currently two versions of cgroups, cgroupv1 and cgroupv2. These are documented in Documentation/cgroup-v1/* and Documentation/admin-guide/cgroup-v2.rst respectively. I've been using cgroupv1 in some system configurations and I can just say that I'm glad that we have a v2. There is no consistency between controllers in v1, and the support for multiple hierarchies is just messy, to mention just a few issues with v1. v2 is different. It has a design in mind (v1 was more of a design by implementation) and a set of design rules. Even better - it has maintainers that make sure the design rules are followed. With all that in mind we hopefully won't end up with a v3... :-)

However, v2 does not have all functionality that v1 provides, but it is on its way into the kernel. v1 and v2 can coexist though, as long as they do not use the same controllers.

The conference

I enjoyed the conference and met a lot of interesting people. It is really fun to put a face to those patches I have reviewed on email :-)

/media/edinburgh.jpg

Lund Linux Conference 2018

Lund Linux Conference 2018

It is just two weeks from now until the Lund Linux Conference (LLC) [1] begins! LLC is a two-day conference with the same layout as the bigger Linux conferences - just smaller, but just as nice.

There will be talks about PCIe, the serial device bus, security in cars and a few more topics. My highlight this year is to hear about XDP (eXpress Data Path) [2], which gives really fast packet processing with standard Linux. Over the last six months, XDP has made great progress and is a technically cool feature.

Here we are back in 2017:

/media/lund-linuxcon-2018.jpg

OOM-killer

OOM-killer

When the system is running out of memory, the Out-Of-Memory (OOM) killer picks a process to kill based on its current memory footprint. In case of OOM, a badness score between 0 (never kill) and 1000 is calculated for each process in the system. The process with the highest score will be killed. A score of 0 is reserved for unkillable tasks such as the global init process (see [1]) or kernel threads (processes with the PF_KTHREAD flag set).

/media/oomkiller.jpg

The current score of a given process is exposed in procfs, see /proc/[pid]/oom_score, and may be adjusted by setting /proc/[pid]/oom_score_adj. The value of oom_score_adj is added to the score before it is used to determine which task to kill. The value may be set between OOM_SCORE_ADJ_MIN (-1000) and OOM_SCORE_ADJ_MAX (+1000). This is useful if you want to guarantee that a process is never selected by the OOM killer.
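
For example, a minimal sketch (my own illustration, using the procfs file mentioned above) of a process exempting itself from the OOM killer by writing -1000 to its own oom_score_adj:

#include <stdio.h>

/* Write OOM_SCORE_ADJ_MIN (-1000) to our own oom_score_adj.
 * Lowering the value typically requires root/CAP_SYS_RESOURCE. */
static int oom_protect(void)
{
    FILE *f = fopen("/proc/self/oom_score_adj", "w");

    if (!f)
        return -1;
    fprintf(f, "-1000");
    fclose(f);
    return 0;
}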

The calculation is simple (nowadays): if a task is using all its allowed memory, the badness score is calculated to 1000. If it is using half of its allowed memory, the badness score is calculated to 500, and so on. By setting oom_score_adj to -1000, the badness score sums up to <= 0 and the task will never be killed by the OOM killer.

There is one more thing that affects the calculation: if the process is running with the capability CAP_SYS_ADMIN, it gets a 3% discount, but that is simply it.
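
To make the arithmetic concrete, a rough worked example on the normalized 0-1000 scale described above (the numbers are made up for illustration):

A task uses 60% of its allowed memory           -> score 600
It runs with CAP_SYS_ADMIN (3% discount)        -> 600 - 18 = 582
Its oom_score_adj is set to -1000               -> 582 - 1000 <= 0, never killed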

The old implementation

Before v2.6.36, the calculation of the badness score tried to be smarter; besides looking at the total memory usage (task->mm->total_vm), it also considered:

  • whether the process creates a lot of children
  • whether the process has been running for a long time, or has used a lot of CPU time
  • whether the process has a low nice value
  • whether the process is privileged (CAP_SYS_ADMIN or CAP_SYS_RESOURCE set)
  • whether the process is making direct hardware access

At first glance, all these criteria look valid, but if you think about it a bit, there are a lot of pitfalls here which make the selection not so fair. For example: a process that creates a lot of children and consumes some memory could be a leaky web server. Another process that fits the description is the session manager for your desktop environment, which naturally creates a lot of child processes.

The new implementation

This heuristic selection has evolved over time. Instead of looking at mm->total_vm for each task, the task's RSS (resident set size, [2]) and swap space are used instead. RSS and swap space give a better indication of the amount of memory we will be able to free if we choose this task. The drawback of using mm->total_vm is that it includes overcommitted memory (see [3] for more information), which is pages that the process has claimed but that have not been physically allocated.

The process is now only counted as privileged if CAP_SYS_ADMIN is set, not CAP_SYS_RESOURCE as before.

The code

The whole implementation of OOM killer is located in mm/oom_kill.c. The function oom_badness() will be called for each task in the system and returns the calculated badness score.

Let's go through the function.

unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
              const nodemask_t *nodemask, unsigned long totalpages)
{
    long points;
    long adj;

    if (oom_unkillable_task(p, memcg, nodemask))
        return 0;

Looking for unkillable tasks such as the global init process.

p = find_lock_task_mm(p);
if (!p)
    return 0;

adj = (long)p->signal->oom_score_adj;
if (adj == OOM_SCORE_ADJ_MIN ||
        test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
        in_vfork(p)) {
    task_unlock(p);
    return 0;
}

If /proc/[pid]/oom_score_adj is set to OOM_SCORE_ADJ_MIN (-1000), do not even consider this task.

points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
    atomic_long_read(&p->mm->nr_ptes) + mm_nr_pmds(p->mm);
task_unlock(p);

Calculate a score based on RSS, page tables and used swap space.

if (has_capability_noaudit(p, CAP_SYS_ADMIN))
    points -= (points * 3) / 100;

If it is a root process (CAP_SYS_ADMIN), give it a 3% discount. We are not mean people, after all.

adj *= totalpages / 1000;
points += adj;

Normalize and add the oom_score_adj value

return points > 0 ? points : 1;

Finally, never return 0 for an eligible task, as 0 is reserved for non-killable tasks.

}

Conclusion

The OOM logic is quite straightforward and seems to have been stable for a long time (v2.6.36 was released in October 2010). The reason why I was looking at the code was that I did not think the behavior I saw when experimenting corresponded to what was written in the man page for oom_score. It turned out that the man page was not updated when the new calculation was introduced back in 2010.

I have updated the man page and it is available in version 4.14 of the Linux man-pages project [4].

commit 5753354a3af20c8b361ec3d53caf68f7217edf48
Author: Marcus Folkesson <marcus.folkesson@gmail.com>
Date:   Fri Nov 17 13:09:44 2017 +0100

    proc.5: Update description of /proc/<pid>/oom_score

    After Linux 2.6.36, the heuristic calculation of oom_score
    has changed to only consider used memory and CAP_SYS_ADMIN.

    See kernel commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10.

    Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com>
    Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

diff --git a/man5/proc.5 b/man5/proc.5
index 82d4a0646..4e44b8fba 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -1395,7 +1395,9 @@ Since Linux 2.6.36, use of this file is deprecated in favor of
 .IR /proc/[pid]/oom_score_adj .
 .TP
 .IR /proc/[pid]/oom_score " (since Linux 2.6.11)"
-.\" See mm/oom_kill.c::badness() in the 2.6.25 sources
+.\" See mm/oom_kill.c::badness() in pre 2.6.36 sources
+.\" See mm/oom_kill.c::oom_badness() after 2.6.36
+.\" commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10
 This file displays the current score that the kernel gives to
 this process for the purpose of selecting a process
 for the OOM-killer.
@@ -1403,7 +1405,16 @@ A higher score means that the process is more likely to be
 selected by the OOM-killer.
 The basis for this score is the amount of memory used by the process,
 with increases (+) or decreases (\-) for factors including:
-.\" See mm/oom_kill.c::badness() in the 2.6.25 sources
+.\" See mm/oom_kill.c::badness() in pre 2.6.36 sources
+.\" See mm/oom_kill.c::oom_badness() after 2.6.36
+.\" commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10
+.RS
+.IP * 2
+whether the process is privileged (\-);
+.\" More precisely, if it has CAP_SYS_ADMIN or (pre 2.6.36) CAP_SYS_RESOURCE
+.RE
+.IP
+Before kernel 2.6.36 the following factors were also used in the calculation of oom_score:
 .RS
 .IP * 2
 whether the process creates a lot of children using
@@ -1413,10 +1424,7 @@ whether the process creates a lot of children using
 whether the process has been running a long time,
 or has used a lot of CPU time (\-);
 .IP *
-whether the process has a low nice value (i.e., > 0) (+);
-.IP *
-whether the process is privileged (\-); and
-.\" More precisely, if it has CAP_SYS_ADMIN or CAP_SYS_RESOURCE
+whether the process has a low nice value (i.e., > 0) (+); and
 .IP *
 whether the process is making direct hardware access (\-).
 .\" More precisely, if it has CAP_SYS_RAWIO

Embedded Linux course in Linköping

Embedded Linux course in Linköping

I teach our Embedded Linux course on a regular basis, this time in Linköping. It's a fun course with interesting labs where you write your own Linux device driver for a custom board.

The course itself is quite moderate, but with talented participants we easily slip over to more interesting things like memory management, the MTD subsystem, how ftrace works internally and how to use it, different contexts, an introduction to perf and much more.

This time was no exception.

/media/embedded-linux-course.jpg

FIT vs legacy image format

FIT vs legacy image format

U-Boot supports several image formats when booting a kernel. However, a Linux system usually needs multiple files for booting. Such files may be the kernel itself, an initrd and a device tree blob.

A typical embedded Linux system has all these files in at least two or three different configurations. It is not uncommon to have

  • Default configuration
  • Rescue configuration
  • Development configuration
  • Production configuration
  • ...

These four configurations alone may involve twelve different files. Or maybe the device tree is shared between two configurations. Or maybe the initrd is... Or maybe the..... you get the point. It has a fairly good chance of ending up quite messy.

This is a problem with the old legacy formats and is somewhat addressed by the FIT (Flattened Image Tree) format.

Let's first look at the old formats before we get into FIT.

zImage

The most well-known format for the Linux kernel is the zImage. The zImage contains a small header followed by self-extracting code and finally the payload itself. U-Boot supports this format with the bootz command.

Image layout

zImage format
Header
Decompressing code
Compressed Data

uImage

When talking about U-Boot, the zImage is usually encapsulated in a file called uImage, created with the mkimage utility. Besides the image data, the uImage also contains information such as OS type, loader information, compression type and so on. Both data and header are checksummed with CRC32.

The uImage format also supports multiple images. The zImage, initrd and device tree blob may therefore be included in one single monolithic uImage. The drawbacks of this monolith are that it has no flexible indexing, no hash integrity and no support for security at all.

Image layout

uImage format
Header
Header checksum
Data size
Data load address
Entry point address
Data CRC
OS, CPU
Image type
Compression type
Image name
Image data

FIT

The FIT (Flattened Image Tree) format has been around for a while, but is, for unknown reasons, rarely used in real systems based on my experience. FIT is a tree structure like Device Tree (before they changed the format to YAML: https://lwn.net/Articles/730217/) and handles images of various types.

These images are then used in something called configurations. One image may be used in several configurations.

This format has many benefits compared to the others:

  • Better solutions for multi-component images
    • Multiple kernels (productions, feature, debug, rescue...)
    • Multiple devicetrees
    • Multiple initrds
  • Better hash integrity of images. Supports different hash algorithms like
    • SHA1
    • SHA256
    • MD5
  • Supports signed images
    • Only boot verified images
    • Detect malware

Image Tree Source

The file describing the structure is called .its (Image Tree Source). Here is an example of a .its file:

/dts-v1/;

/ {
    description = "Marcus FIT test";
    #address-cells = <1>;

    images {
        kernel@1 {
            description = "My default kenel";
            data = /incbin/("./zImage");
            type = "kernel";
            arch = "arm";
            os = "linux";
            compression = "none";
            load = <0x83800000>;
            entry = <0x83800000>;
            hash@1 {
                algo = "md5";
            };
        };

        kernel@2 {
            description = "Rescue image";
            data = /incbin/("./zImage");
            type = "kernel";
            arch = "arm";
            os = "linux";
            compression = "none";
            load = <0x83800000>;
            entry = <0x83800000>;
            hash@1 {
                algo = "crc32";
            };
        };

        fdt@1 {
            description = "FDT for my cool board";
            data = /incbin/("./devicetree.dtb");
            type = "flat_dt";
            arch = "arm";
            compression = "none";
            hash@1 {
                algo = "crc32";
            };
        };


    };

    configurations {
        default = "config@1";

        config@1 {
            description = "Default configuration";
            kernel = "kernel@1";
            fdt = "fdt@1";
        };

        config@2 {
            description = "Rescue configuration";
            kernel = "kernel@2";
            fdt = "fdt@1";
        };

    };
};

This .its file has two kernel images (default and rescue) and one FDT image. Note that the default kernel is hashed with md5 and the other with crc32. It is also possible to use several hash functions per image.

It also specifies two configurations, Default and Rescue. The two configurations use different kernels (well, it is the same kernel in this case since both point to ./zImage...) but share the same FDT. It is easy to build up new configurations on demand.

The .its file is then passed to mkimage to generate an itb (Image Tree Blob):

mkimage -f kernel.its kernel.itb

which is bootable from U-Boot.

Look at ./doc/uImage.FIT/ in the U-Boot source code for more examples of what a .its file can look like. It also contains examples of signed images, which are worth a look.

Boot from U-Boot

To boot from U-Boot, use the bootm command and specify the physical address for the FIT image and the configuration you want to boot.

Example of booting config@1 from the .its above:

bootm 0x80800000#config@1

Full example with bootlog:

U-Boot 2015.04 (Oct 05 2017 - 14:25:09)

CPU:   Freescale i.MX6UL rev1.0 at 396 MHz
CPU:   Temperature 42 C
Reset cause: POR
Board: Marcus Cool board
fuse address == 21bc400
serialnr-low : e1fe012a
serialnr-high : 243211d4
       Watchdog enabled
I2C:   ready
DRAM:  512 MiB
Using default environment

In:    serial
Out:   serial
Err:   serial
Net:   FEC1
Boot from USB for mfgtools
Use default environment for                              mfgtools
Run bootcmd_mfg: run mfgtool_args;bootz ${loadaddr} - ${fdt_addr};
Hit any key to stop autoboot:  0
=> bootm 0x80800000#config@1
## Loading kernel from FIT Image at 80800000 ...
   Using 'config@1' configuration
   Trying 'kernel@1' kernel subimage
     Description:  My cool kernel
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x808000d8
     Data Size:    7175328 Bytes = 6.8 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x83800000
     Entry Point:  0x83800000
     Hash algo:    crc32
     Hash value:   f236d022
   Verifying Hash Integrity ... crc32+ OK
## Loading fdt from FIT Image at 80800000 ...
   Using 'config@1' configuration
   Trying 'fdt@1' fdt subimage
     Description:  FDT for my cool board
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x80ed7e4c
     Data Size:    27122 Bytes = 26.5 KiB
     Architecture: ARM
     Hash algo:    crc32
     Hash value:   1837f127
   Verifying Hash Integrity ... crc32+ OK
   Booting using the fdt blob at 0x80ed7e4c
   Loading Kernel Image ... OK
   Loading Device Tree to 9ef86000, end 9ef8f9f1 ... OK

Starting kernel ...

One thing to think about

Since this FIT image tends to grow in size, it is a good idea to set CONFIG_SYS_BOOTM_LEN in the U-Boot configuration.

- CONFIG_SYS_BOOTM_LEN:
        Normally compressed uImages are limited to an
        uncompressed size of 8 MBytes. If this is not enough,
        you can define CONFIG_SYS_BOOTM_LEN in your board config file
        to adjust this setting to your needs.
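
As a sketch (the value is picked arbitrarily), for a board configuration header in a U-Boot tree of this vintage it could look like:

/* include/configs/<your-board>.h: allow bootm to handle images up to 64 MiB. */
#define CONFIG_SYS_BOOTM_LEN    (64 << 20)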

Conclusion

I advocate using the FIT format because it solves many problems that can otherwise get really awkward. Just the bonus of programming one image in production instead of many is a great benefit. The configurations are well defined, and there is no chance that you will end up booting one kernel image with an incompatible device tree or whatever.

Using FIT images should be part of the board bring-up activity, not the last thing to do before closing a project. The "use separate kernel/dts/initrd for now and then move on to FIT when the platform is stable" approach does not work. It is simply not going to happen, and it is a mistake.

Magic SysRq

Magic SysRq

Every kernel hacker should know about the magic SysRq already, so this post is kind of unnecessary. To enable the magic, make sure CONFIG_MAGIC_SYSRQ is set in your Kernel hacking tab.

I use this feature... a lot. Mostly for setting the loglevel and rebooting the system, but it is also very useful when debugging. So, how do you use it?

As everybody is using GNU Screen (what else) as their TTY terminal, the keyboard combination is: ctrl+A B. And here the magic begins! This combination simply sends a SysRq keycode to the target system.

To get information about available commands, press ctrl+A B h (h as in help):

[669510.910125] SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) thaw-filesystems(J) saK show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount show-blocked-tasks(W) dump-ftrace-buffer(Z)

And here we go! My favorites are:

  • 0-9, set the loglevel
  • b, reboot the system (ingrained...)
  • p, show a register dump (useful if the system hangs)
  • g, switch console to KGDB (I love KGDB)
  • q, show all timers (good when looking for power thieves)

The best part is that Magic SysRq works in almost every situation, even if the system is frozen.

For further details, see the kernel source file Documentation/sysrq.txt.

Take control of your Buffalo Linkstation NAS

Take control of your Buffalo Linkstation NAS

I finally bought a NAS for all of my super-important stuff. It became a Buffalo Linkstation LS200, mostly because of the price ($300 for 4TB). It supports all of the standard protocols such as FTP, SAMBA, AFP and so on.

However, it would be really useful to have some sane protocol like sftp, so that you could use rsync for your backup scripts.

Bring a big coffee mug and let the hacking begin....

I knew that the NAS was based on the ARM architecture and supports a whole set of high-level protocols, so a qualified guess is that there lives a little penguin in the box.

Let's start by downloading the latest firmware from the Buffalo web page. When the firmware is unzipped, we have these files:

marcus@tuxie:~/shared/buffalo\$ ls -al
total 780176
drwxr-xr-x  4 marcus marcus      4096 Jun 13 00:06 .
drwxr-xr-x 18 marcus marcus      4096 Jun 12 23:27 ..
-rw-r--r--  1 marcus marcus 190409979 Jun 13 00:06 hddrootfs.img
drwxr-xr-x  2 marcus marcus      4096 Apr  8 13:25 img
-rw-r--r--  1 marcus marcus  12602325 Apr  1 15:25 initrd.img
-rw-r--r--  1 marcus marcus       656 Apr  1 15:25 linkstation_version.ini
-rw-r--r--  1 marcus marcus       198 Apr  1 15:25 linkstation_version.txt
-rw-r--r--  1 marcus marcus 205568610 Jun 12 23:13 LS200_series_FW_1.44.zip
-rw-r--r--  1 marcus marcus    350104 Apr  1 15:25 LSUpdater.exe
-rw-r--r--  1 marcus marcus       327 Apr  1 15:25 LSUpdater.ini
-rw-r--r--  1 marcus marcus    674181 Apr  1 15:25 u-boot.img
-rw-r--r--  1 marcus marcus   2861933 Apr  1 15:25 uImage.img
-rw-r--r--  1 marcus marcus      4880 Apr  8 13:23 update.html

Ok, u-boot.img, uImage.img, initrd.img and hddrootfs.img tell us that I have a little black penguin cage in front of me. First of all, find out what kind of files these *.img files really are:

marcus@tuxie:~/shared/buffalo$ file ./hddrootfs.img
./hddrootfs.img: Zip archive data, at least v2.0 to extract

Really? It is just a zip file. Let's extract it then:

marcus@tuxie:~/shared/buffalo$ unzip hddrootfs.img
Archive:  hddrootfs.img
[hddrootfs.img] hddrootfs.buffalo.updated password:

Of course, it is protected with a password. I will let my old friend John the Ripper take a look at it (I guess I had great luck; the brute-force attack only took 2.5 hours). The password for the file is: aAhvlM1Yp7_2VSm6BhgkmTOrCN1JyE0C5Q6cB3oBB

marcus@tuxie:~/shared/buffalo$ unzip hddrootfs.img
Archive:  hddrootfs.img
[hddrootfs.img] hddrootfs.buffalo.updated password:
  inflating: hddrootfs.buffalo.updated

Terrific. We got a hddrootfs.buffalo.updated file. What is it, anyway?

marcus@tuxie:~/shared/buffalo$ file hddrootfs.buffalo.updated
hddrootfs.buffalo.updated: gzip compressed data, was "rootfs.tar", from Unix, last modified: Tue Apr  1 08:24:05 2014, max compression

It is just a gzip-compressed tar archive; couldn't be better! Extract it:

marcus@tuxie:~/shared/buffalo$ mkdir rootfs
marcus@tuxie:~/shared/buffalo$ tar -xz --numeric-owner -f hddrootfs.buffalo.updated  -C ./rootfs/
marcus@tuxie:~/shared/buffalo$ ls -1l rootfs/
total 80
drwxr-xr-x  2 root root 4096 Apr  1 08:23 bin
-rwxr-xr-x  1 root root 1140 Feb  3 07:51 chroot.sh
drwxr-xr-x  2 root root 4096 Apr  1 08:23 debugtool
drwxr-xr-x  5 root root 4096 Apr  1 08:23 dev
drwxr-xr-x 33 root root 4096 Jun 12 23:17 etc
drwxr-xr-x  4 root root 4096 Apr  1 08:23 home
drwxr-xr-x  9 root root 4096 Apr  1 08:23 lib
drwxr-xr-x  3 root root 4096 Feb  3 07:51 mnt
drwxr-xr-x  2 root root 4096 Apr  1 07:35 opt
-rwxr-xr-x  1 root root 2741 Feb  3 07:51 prepare.sh
drwxr-xr-x  2 root root 4096 Apr  1 07:35 proc
drwxr-xr-x  3 root root 4096 Apr  1 08:23 root
drwxr-xr-x  3 root root 4096 Apr  1 08:22 run
drwxr-xr-x  2 root root 4096 Apr  1 08:23 sbin
drwxr-xr-x  2 root root 4096 Apr  1 07:35 sys
-rwxr-xr-x  1 root root 3751 Feb  3 07:51 test.sh
drwxrwxrwt  3 root root 4096 Apr  1 08:23 tmp
drwxr-xr-x 11 root root 4096 Apr  1 08:23 usr
drwxr-xr-x  9 root root 4096 Apr  1 08:22 var
drwxrwxrwx  6 root root 4096 Apr  1 07:53 www

Here we go!

Modify the root filesystem

First of all, I would really like to have SSH access to the box, and I found that there is an SSH daemon in here (/usr/bin/sshd), but why is it not activated? Take a look at one of the scripts that seem to be related to SSH:

marcus@tuxie:~/shared/buffalo/rootfs$ head -n 20 etc/init.d/sshd.sh
#!/bin/sh
[ -f /etc/nas_feature ] && . /etc/nas_feature
SSHD_DSA=/etc/ssh_host_dsa_key
SSHD_RSA=/etc/ssh_host_rsa_key
SSHD_KEY=/etc/ssh_host_key
SSHD=`which sshd`
if [ "${SSHD}" = "" -o ! -x ${SSHD} ] ; then
 echo "sshd is not supported on this platform!!!"
fi
if [ "${SUPPORT_SFTP}" = "0" ] ; then
        echo "Not support sftp on this model." > /dev/console
        exit 0
fi
umask 000

What about the second if-statement? SUPPORT_SFTP? And what is the /etc/nas_feature file? It does not exist in the package. Is it auto-generated at boot? Anyway, I remove the second statement; it seems evil. If this starts up the SSH daemon we would also like to be able to log in as root, so uncomment PermitRootLogin in sshd_config:

marcus@tuxie:~/shared/buffalo/rootfs$ sed -i 's/#PermitRootLogin/ PermitRootLogin/' etc/sshd_config

Then copy your public rsa-key to /root/.ssh/authorized_keys. If you don't have a key, generate it with:

marcus@tuxie:~/shared/buffalo/rootfs$ ssh-keygen

Copy the key to target:

marcus@tuxie:~/shared/buffalo/rootfs$ mkdir ./root/.ssh
marcus@tuxie:~/shared/buffalo/rootfs$ cat ~/.ssh/id_rsa.pub > ./root/.ssh/authorized_keys

This will let you log in without giving any password.

Let's try to re-pack the whole thing:

marcus@tuxie:~/shared/buffalo/rootfs$ mv ../hddrootfs.buffalo.updated{,.old}
marcus@tuxie:~/shared/buffalo/rootfs$ tar -czf ../hddrootfs.buffalo.updated *
marcus@tuxie:~/shared/buffalo/rootfs$ cd ..
marcus@tuxie:~/shared/buffalo$ zip -e hddrootfs.img hddrootfs.buffalo.updated

I encrypt the file with the same password as before; I dare not think about what happens if I don't.

Time to update firmware

I execute LSUpdater.exe from a virtual Windows machine and keep my fingers crossed... The update process takes about 8 minutes and is a real pain. Will it brick my NAS..?

After a while the power LED indicates that the NAS is up and running. Wow. Quick! Do a port scan:

marcus@tuxie:~/shared/buffalo$ sudo nmap -sS 10.0.0.4
[sudo] password for marcus:
Starting Nmap 5.21 ( http://nmap.org ) at 2014-06-13 11:59 CEST
Nmap scan report for nas (10.0.0.4)
Host is up (0.00049s latency).
Not shown: 990 closed ports
PORT      STATE SERVICE
21/tcp    open  ftp
22/tcp    open  ssh
80/tcp    open  http
139/tcp   open  netbios-ssn
443/tcp   open  https
445/tcp   open  microsoft-ds
548/tcp   open  afp
873/tcp   open  rsync
8873/tcp  open  unknown
22939/tcp open  unknown
MAC Address: DC:FB:02:EB:06:A8 (Unknown)
Nmap done: 1 IP address (1 host up) scanned in 0.27 seconds

And there it is! The SSH daemon is running on port 22:

marcus@tuxie:~$ ssh admin@10.0.0.4
admin@10.0.0.4's password:
[admin@LS220D6A8 ~]$ ls /
bin/        debugtool/  home/       mnt/        proc/       sbin/       tmp/        www/
boot/       dev/        lib/        opt/        root/       sys/        usr/
chroot.sh*  etc/        lost+found/ prepare.sh* run/        test.sh*    var/
[admin@LS220D6A8 ~]$

It is just beautiful!

Wait, what about the /etc/nas_feature file?

[admin@LS220D6A8 ~]$ cat /etc/nas_feature
DEFAULT_LANG=english
DEFAULT_CODEPAGE=CP437
REGION_CODE=EU
PRODUCT_CAPACITY="040"
PID=0x0000300D
SERIES_NAME="LinkStation"
PRODUCT_SERIES="LS200"
PRODUCT_NAME="LS220D(SANJO)"
SUPPORT_NTFS_WRITE=on
NTFS_DRIVER="tuxera"
SUPPORT_DIRECT_COPY=on
SUPPORT_RAID=on
SUPPORT_RAID_DEGRADE=off
SUPPORT_FAN=on
SUPPORT_AUTOIP=on
SUPPORT_NEW_DISK_AUTO_REBUILD=off
SUPPORT_2STEP_INSPECTION=off
SUPPORT_RESYNC_DELAY=off
SUPPORT_PRINTER_SERVER=on
SUPPORT_ITUNES_SERVER=on
SUPPORT_DLNA_SERVER=on
SUPPORT_NAS_FIREWALL=off
SUPPORT_IPV6=off
SUPPORT_DHCPS=off
SUPPORT_UPNP=off
SUPPORT_SLIDE_POWER_SWITCH=on
SUPPORT_BITTORRENT=on
BITTORRENT_CLIENT="transmission"
SUPPORT_USER_QUOTA=on
SUPPORT_GROUP_QUOTA=on
SUPPORT_ACL=on
SUPPORT_TIME_MACHINE=on
SUPPORT_SLEEP_TIMER=on
SUPPORT_AD_NT_DOMAIN=on
SUPPORT_RAID0=1
SUPPORT_RAID1=1
SUPPORT_RAID5=0
SUPPORT_RAID6=0
SUPPORT_RAID10=0
SUPPORT_RAID50=0
SUPPORT_RAID60=0
SUPPORT_RAID51=0
SUPPORT_RAID61=0
SUPPORT_NORAID=0
SUPPORT_RAID_REBUILD=1
SUPPORT_AUTH_EXTERNAL=1
SUPPORT_SAMBA_DFS=0
SUPPORT_LINKDEREC_ANALOG=0
SUPPORT_LINKDEREC_DIGITAL=0
SUPPORT_WEBAXS=1
SUPPORT_UPS_SERIAL=0
SUPPORT_UPS_USB=0
SUPPORT_UPS_RECOVER=0
SUPPORT_NUT=0
SUPPORT_SYSLOG_FORWARD=0
SUPPORT_SYSLOG_DOWNLOAD=0
SUPPORT_SHUTDOWN_FROMWEB=0
SUPPORT_REBOOT_FROMWEB=1
SUPPORT_IMHERE=0
SUPPORT_POWER_INTERLOCK=1
SUPPORT_SMTP_AUTH=1
NEED_MICONMON=off
ROOTFS_FS=ext3
USERLAND_FS=xfs
NASFEAT_VM_WRITEBACK=default
NASFEAT_VM_EXPIRE=default
MAX_DISK_NUM=2
MAX_USBDISK_NUM=1
MAX_ARRAY_NUM=1
DEV_BOOT=md0
DEV_ROOTFS1=md1
DEV_SWAP1=md2
SDK_VERSION=2
DEVICE_NETWORK_PRIMARY=eth1
DEVICE_NETWORK_SECONDARY=
DEVICE_NETWORK_NUM=1
DEVICE_HDD1_LINK=disk1_6
DEVICE_HDD2_LINK=disk2_6
DEVICE_HDD3_LINK=disk3_6
DEVICE_HDD4_LINK=disk4_6
DEVICE_HDD5_LINK=disk5_6
DEVICE_HDD6_LINK=disk6_6
DEVICE_HDD7_LINK=disk7_6
DEVICE_HDD8_LINK=disk8_6
DEVICE_HDD_BASE_EDP=md100
DEVICE_HDD1_EDP=md101
DEVICE_HDD2_EDP=md102
DEVICE_HDD3_EDP=md103
DEVICE_HDD4_EDP=md104
DEVICE_HDD5_EDP=md105
DEVICE_HDD6_EDP=md106
DEVICE_HDD7_EDP=md107
DEVICE_HDD8_EDP=md108
DEVICE_MD1_REAL=md10
DEVICE_MD2_REAL=md20
DEVICE_MD3_REAL=md30
DEVICE_MD4_REAL=md40
DEVICE_USB1_LINK=usbdisk1_1
DEVICE_USB2_LINK=usbdisk2_1
DEVICE_USB3_LINK=usbdisk1_5
DEVICE_USB4_LINK=usbdisk2_5
MOUNT_GLOBAL=/mnt
MOUNT_LVM_BASE=/mnt/lvm
MOUNT_HDD1=/mnt/disk1
MOUNT_HDD2=/mnt/disk2
MOUNT_HDD3=/mnt/disk3
MOUNT_HDD4=/mnt/disk4
MOUNT_HDD5=/mnt/disk5
MOUNT_HDD6=/mnt/disk6
MOUNT_HDD7=/mnt/disk7
MOUNT_HDD8=/mnt/disk8
MOUNT_ARRAY1=/mnt/array1
MOUNT_ARRAY2=/mnt/array2
MOUNT_ARRAY3=/mnt/array3
MOUNT_ARRAY4=/mnt/array4
MOUNT_USB1=/mnt/usbdisk1
MOUNT_USB2=/mnt/usbdisk2
MOUNT_USB3=/mnt/usbdisk3
MOUNT_USB4=/mnt/usbdisk4
MOUNT_USB5=/mnt/usbdisk5
MOUNT_MC_BASE=/mnt/mediacartridge
SUPPORT_MC_VER=1
SUPPORT_INTERNAL_DISK_APPEND=0
STORAGE_TYPE=HDD
BODY_COLOR=NORMAL
SUPPORT_MICON=0
SUPPORT_LCD=0
SUPPORT_USER_QUOTA_SOFT=0
SUPPORT_GROUP_QUOTA_SOFT=0
SUPPORT_NFS=0
SUPPORT_LVM=0
SUPPORT_OFFLINEFILE=0
SUPPORT_HIDDEN_SHARE=0
SUPPORT_HOT_SWAP=0
SUPPORT_LCD_LED=0
SUPPORT_ALERT=0
SUPPORT_PORT_TRUNKING=0
SUPPORT_REPLICATION=0
SUPPORT_USER_GROUP_CSV=0
SUPPORT_SFTP=0
SUPPORT_SERVICE_MAPPING=0
SUPPORT_SSLKEY_IMPORT=1
SUPPORT_SLEEPTIMER_DATE=0
SUPPORT_TERA_SEARCH=0
SUPPORT_SECURE_BOOT=0
SUPPORT_PACKAGE_UPDATE=0
SUPPORT_HDD_SPINDOWN=0
SUPPORT_DISK_ENCRYPT=0
SUPPORT_FTPS=1
SUPPORT_CLEANUP_ALL_TRASHBOX=0
SUPPORT_WAKEUP_BY_REBOOT=1
SUPPORT_DTCP_IP=0
SUPPORT_MYSQL=0
SUPPORT_APACHE=0
SUPPORT_PHP=0
SUPPORT_UPS_STANDBY=0
SUPPORT_HIDDEN_RAID_MENU=0
SUPPORT_ISCSI=0
SUPPORT_ISCSI_TYPE=
DEFAULT_WORKINGMODE=
MAX_LVM_VOLUME_NUM=0
MAX_ISCSI_VOLUME_NUM=0
INTERNAL_SCSI_TYPE=multi-host
SUPPORT_ELIMINATE_ADLIMIT=
USB_TREE_TYPE=
SUPPORT_WOL=0
WOL_TYPE=
SUPPORT_HARDLINK_BACKUP=0
SUPPORT_SNMP=0
SUPPORT_EDP=1
SUPPORT_POCKETU=0
SUPPORT_MC=0
SUPPORT_FOFB=0
SUPPORT_EDP_PLUS=0
SUPPORT_INIT_SW=1
SUPPORT_SQUEEZEBOX=0
SUPPORT_OL_UPDATE=1
SUPPORT_AMAZONS3=0
SUPPORT_SURVEILLANCE=0
SUPPORT_WAFS=0
SUPPORT_SUGARSYNC=0
SUPPORT_INTERNAL_DISK_REMOVE=1
SUPPORT_SETTING_RECOVERY_USB=0
SUPPORT_PASSWORD_RECOVERY_USB=0
SUPPORT_AV=
SUPPORT_TUNEUP_RAID=on
SUPPORT_FLICKRFS=0
SUPPORT_WOL_INT=1
TUNE=0
SUPPORT_EDP_PLUS=0
SUPPORT_SXUPTP=0
SUPPORT_SQUEEZEBOX=0
SUPPORT_EYEFI=0
SUPPORT_OL_UPDATE=1
SUPPORT_INIT_SW=1
SUPPORT_USB=1
SUPPORT_WAFS=0
SUPPORT_INFO_LED=1
SUPPORT_ALARM_LED=1
POWER_SWITCH_TYPE=none
SUPPORT_FUNC_SW=1
DLNA_SERVER="twonky"
SUPPORT_TRANSCODER=0
SUPPORT_LAYOUT_SWITCH=1
SUPPORT_UTILITY_DOWNLOAD=1
SUPPORT_BT_CLOUD=0
DEFAULT_VALUE_DLNA=1
DEFAULT_VALUE_BT_CLOUD=0
SUPPORT_EXCLUSION_LED_POWER_INFO_ERROR=1
DEFAULT_DLNA_SERVICE=off
SUPPORT_MOBILE_WEBUI=1
SUPPORT_SHUTDOWN_DEPEND_ON_SW=1
SUPPORT_SPARE_DISK=0

It seems that the file is generated. It also has the SUPPORT_SFTP setting that we saw in sshd.sh.

What about the kernel

In the current vanilla kernel, there are device trees that seem to be for the Buffalo Linkstation:

marcus@tuxie:~/marcus/linux/linux$ grep -i buffalo arch/arm/boot/dts/*.dts
arch/arm/boot/dts/kirkwood-lschlv2.dts: model = "Buffalo Linkstation LS-CHLv2";
arch/arm/boot/dts/kirkwood-lschlv2.dts: compatible = "buffalo,lschlv2", "buffalo,lsxl", "marvell,kirkwood-88f6281", "marvell,kirkwood";
arch/arm/boot/dts/kirkwood-lsxhl.dts:   model = "Buffalo Linkstation LS-XHL";
arch/arm/boot/dts/kirkwood-lsxhl.dts:   compatible = "buffalo,lsxhl", "buffalo,lsxl", "marvell,kirkwood-88f6281", "marvell,kirkwood";

The device tree seems to match the current CPU:

[admin@LS220D6A8 ~]$ cat /proc/cpuinfo
Processor : Marvell PJ4Bv7 Processor rev 1 (v7l)
BogoMIPS : 795.44
Features : swp half thumb fastmult vfp edsp vfpv3 vfpv3d16 tls
CPU implementer : 0x56
CPU architecture: 7
CPU variant : 0x1
CPU part : 0x581
CPU revision : 1
Hardware : Marvell Armada-370
Revision : 0000
Serial  : 0000000000000000

Also, the kernel is not tainted, indicating that there are no out-of-tree modules:

[admin@LS220D6A8 ~]$ cat /proc/sys/kernel/tainted
0

It could therefore be possible to compile a custom kernel with support for more USB devices that may be plugged into the NAS.

Other tips

The LSUpdater.exe will refuse to update if the same version of software is already on the target. This means that you cannot upload the same firmware again... unless you change the version!

Together with the updater application (LSUpdater.exe), there is a linkstation_version.ini file that contains information about each of the *.img files. The first thing I tried was to just increase the VERSION by one. This makes LSUpdater.exe move a little further; it stops complaining about the same version and instead complains that this firmware is already on the target. However, I needed to change the timestamp of each binary (I increased the day by one); then it updated the firmware without any problem.

marcus@tuxie:~/shared/buffalo$ cat linkstation_version.ini
[COMMON]
VERSION=1.44-0.36
BOOT=0.20
KERNEL=2014/04/01 14:33:46
INITRD=2014/04/01 14:35:10
ROOTFS=2014/04/01 15:23:41
FILE_BOOT = u-boot.img
FILE_KERNEL = uImage.img
FILE_INITRD = initrd.img
FILE_ROOTFS = hddrootfs.img

[TARGET_INFO1]
PID=0x0000001D
FILE_KERNEL=uImage.img
KERNEL=2014/04/01 14:33:46
FILE_BOOT_APPLY=u-boot.img
BOOT=0.20
BOOT_UP_CMD=""
[TARGET_INFO2]
PID=0x0000300D
FILE_KERNEL=uImage.img
KERNEL=2014/04/01 14:33:46
FILE_BOOT_APPLY=u-boot.img
BOOT=0.20
BOOT_UP_CMD=""
[TARGET_INFO3]
PID=0x0000300E
FILE_KERNEL=uImage.img
KERNEL=2014/04/01 14:33:46
FILE_BOOT_APPLY=u-boot.img
BOOT=0.20
BOOT_UP_CMD=""