Two of the biggest mental shifts I’ve had to make in coming back to C programming have been strings and variable scope/lifetime. This Stack Overflow question is a nice encapsulation of both.
First off: strings aren’t a first-class type in C. They’re just a char array of individual characters, with a NUL character ('\0') as the final element. This might not seem that bad, until you remember that arrays aren’t the simple “lists of stuff” you might be used to in higher-level languages. In C, an array is a literal allocation of a fixed block of memory (“N” bytes for a char array of N elements). Arrays are useful as lists, and as a mechanism for dealing with strings, but whenever you use an array you need to be aware of C’s rules for allocating and releasing memory, and how long a variable “lives” in C.
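Here’s a minimal sketch of what “a char array with a terminator” means in practice. The helpers count_chars and bytes_in_hello are hypothetical names I made up for illustration (the standard library’s strlen already does what count_chars does).

```c
#include <stddef.h>

/* Hypothetical helper: walks the array until it reaches the '\0'
   terminator, which is the only way C knows where a string ends. */
size_t count_chars(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Hypothetical helper: reports how many bytes the array holding
   "hello" occupies. The compiler sizes the array to fit the five
   letters plus the terminator. */
size_t bytes_in_hello(void)
{
    char greeting[] = "hello";
    return sizeof greeting;
}
```

The string is five characters long, but the array behind it is six bytes, and forgetting that extra byte is a classic source of off-by-one bugs.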
If you declare a regular (non-static) array in a function, C will release it automatically when the function ends. This means you shouldn’t return a regularly declared array from a C function. We say shouldn’t because you actually CAN return the array, but what you’re returning is the memory address of the array, and that address now points at memory the function no longer owns. The array’s contents may or may not be there when you try to access them from the calling function, and reading them is undefined behavior.
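A rough sketch of the difference, with made-up function names. The broken version is wrapped in #if 0 so it never compiles into the program; most compilers would warn about it anyway. The second function shows that a local array is perfectly fine as long as nothing outlives the call.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#if 0
/* BROKEN (hypothetical example): buf is released when the function
   returns, so the caller receives a pointer to memory it must not read. */
char *make_greeting_broken(const char *name)
{
    char buf[64];
    snprintf(buf, sizeof buf, "Hello, %s", name);
    return buf; /* dangling pointer */
}
#endif

/* Fine (hypothetical example): the array is only used while the
   function is still running, and a plain number is returned by value. */
size_t greeting_length(const char *name)
{
    char buf[64];
    snprintf(buf, sizeof buf, "Hello, %s", name);
    return strlen(buf);
}
```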
Another option is declaring your array as static in the function. When you do this, you’re telling C that this specific chunk of memory should only be allocated once, and C shouldn’t deallocate it until the program ends. The problem with this approach is that your returned value will always point at this same chunk of memory. This means a long-lived variable might have its value swapped out from under it if the function is invoked multiple times. This has extra implications in multithreaded C programs: with multiple threads calling your function you’ll never know what’s in this value, i.e. your function isn’t thread safe.
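To make the “swapped out from under it” problem concrete, here’s a small sketch. Both label_for and first_label_was_overwritten are hypothetical names invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats into one static buffer that is shared
   by every call for the lifetime of the program. */
const char *label_for(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;
}

/* Returns 1 if the pointer saved from the first call now shows the
   text written by the second call. */
int first_label_was_overwritten(void)
{
    const char *first = label_for(1);
    const char *second = label_for(2);
    return first == second && strcmp(first, "item-2") == 0;
}
```

Both calls hand back the same address, so the caller holding the first result sees its text change without ever touching it. If the value needs to survive a later call, the caller has to copy it out first.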
Finally, you can use the malloc function to manually allocate memory for the array. When you malloc something it sticks around until it’s explicitly freed, which means it’s safe to return. Having your function use malloc also means each time it’s called we get a new chunk of memory (i.e. the problems of static go away). However, with the memory sticking around and being allocated each time the function is called, if the calling programmer doesn’t free that memory later you’ve got a memory leak on your hands, i.e. your program will consume more and more system memory the longer it runs.
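A minimal sketch of the malloc approach, assuming a made-up make_greeting function and a made-up print_greeting caller. The important part is the ownership rule spelled out in the comment: whoever calls the function is responsible for calling free.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a freshly allocated string, or NULL if
   the allocation fails. The caller owns the result and must free() it. */
char *make_greeting(const char *name)
{
    const char *prefix = "Hello, ";
    size_t len = strlen(prefix) + strlen(name) + 1; /* +1 for '\0' */
    char *out = malloc(len);
    if (out == NULL) {
        return NULL;
    }
    snprintf(out, len, "%s%s", prefix, name);
    return out;
}

/* Hypothetical caller: uses the string, then releases it so the
   program doesn't leak memory on every call. */
void print_greeting(const char *name)
{
    char *greeting = make_greeting(name);
    if (greeting != NULL) {
        puts(greeting);
        free(greeting);
    }
}
```

Each call returns its own block of memory, so two results never stomp on each other, but dropping the free call in print_greeting would leak one allocation per greeting.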
None of these situations presents a clear win: the programmer needs to make a judgment call on the best thing for their particular situation. It turns out this is a lot of C programming: making these judgment calls and figuring out the intent of previous programmers when they were making similar decisions. This is probably true of all programming, but C makes it so almost anything you do requires this sort of forethought (when’s the last time you worried about how to return a string in JavaScript?), and failure often means a program that compiles fine, but will corrupt memory and/or segfault randomly.