
Using the third argument *envp[] of the main function to read environment variables

This page has been machine-translated from the original page.

About the third argument *envp[] of the C main function

This time, I will summarize the third argument that can be used when defining the main function in C.

The other day, while looking at the decompiled output from a certain CTF, I came across a main function that takes three arguments, like int main(int argc, char *argv[], char *envp[]).

This third argument, *envp[], is described as follows in the C standard (under J.5 Common extensions; see the update at the end of this article), and it holds pointers to the environment variables of the execution environment.

In a hosted environment, the main function takes a third argument, char *envp[].

This argument points to a null-terminated array of pointers to char. Each pointer to char points to a string that provides information about the environment in which the program is executed.

The main function you usually see in C takes the following two arguments.

#include <stdio.h>

int main(int argc, char *argv[]) {
    printf("%d\n", argc);
    while(*argv)
    {
        printf("%s\n", *argv++);
    }
    return 0;
}

These are, respectively, the following arguments.

  • argc : Number of command-line arguments, including the program name itself
  • *argv[] : Pointers to the argument strings passed at execution time, terminated by a null pointer

If you actually compile this source code into an executable named test.o and run it, you get the following output.

$ ./test.o arg1 arg2 arg2
4
./test.o
arg1
arg2
arg2

Now, take a look at the following main function, which takes a third argument, *envp[].

#include <stdio.h>

int main(int argc, char *argv[], char *envp[]) {
    while(*envp)
    {
        printf("%s\n", *envp++);
    }
    return 0;
}

When you run this code, every environment variable is printed one per line.

$ ./test.o 
SHELL=/bin/bash
SESSION_MANAGER=local/parrot:@/tmp/.ICE-unix/1393,unix/parrot:/tmp/.ICE-unix/1393
{{ omitted }}
PATH=/home/parrot/.local/bin:/snap/bin
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
UID=1000
QT_SCALE_FACTOR=1
_=./test.o
OLDPWD=/home/parrot

This is equivalent to the output you get by running the env command.

By the way, if you start from an empty environment with env -i and explicitly set only a single environment variable, Test=Test, at execution time, only Test=Test is printed.

$ env -i Test=Test ./test.o 
Test=Test

From this, you can see that the third argument *envp[] gives the program access to the environment variables it was started with.

*envp[] in secure coding

While looking into the third argument *envp[], I found an interesting article.

If the environment is modified in some way, the environment’s memory area may be reallocated, and as a result envp may end up pointing to the wrong location.

Reference: ENV31-C. Do not reference an environment pointer following an operation that may invalidate it

As the JPCERT/CC article above explains, if the environment is modified in some way after the program starts (for example, by adding a variable), the memory area that holds the environment may be reallocated, while *envp[] keeps pointing to the old location.

In other words, if you change the environment and then use the pointer from the third argument *envp[], you may read stale or invalid data.

Based on the above, when a program needs to walk through the environment, the recommendation seems to be to use extern char **environ; on Linux, or _CRTIMP extern char **_environ; on Windows, where those are defined, instead of the third argument *envp[].

Summary

This was the first time I had learned about this feature, so I went back and reread Even Cats Can Understand C Programming, which is personally my favorite introductory C book. However, the section on main function arguments only discussed argc and *argv[], and did not mention the third argument *envp[] at all.

Maybe it is a slightly niche feature that does not show up at the beginner-book level.

Update (July 7, 2021)

In fact, the third environment-variable argument *envp[] is described in J.5 Common extensions of the C standard. It is therefore not strictly part of the C language specification itself but a common extension, so it is not guaranteed to be portable across all implementations.

There are still parts I do not fully understand, but it seems my understanding that this was part of “the C language specification” was incorrect.