Win32 crashes with large static structures
08 September 2012
The background to this little incident is a last-minute work mode known as headless chicken, and is a follow-on from the previous incident with burned-out laptop drives. It is amazing how many things go wrong when loads of people are constantly dumping little problems on you, but the focus here is the technicalities of one of Windows's little gotchas.
A newly repaired server needed QA testing following modifications, so I grabbed the latest build of the software and installed it. All seemed fine apart from one small but critical in-house Windows Service, which was odd as it was a module that had not had any work done on it for months. Just to be sure I ran the project rebuild script then copied the binary manually. Same result.
Next stage was to compile the standalone command-line version of the service, and as a quick check I decided to run it on my development system. Crash. Eventually I hooked up GDB (I was using MinGW, which includes a Windows version of GDB), which told me where it was going down. Below is an approximate recreation of what I saw:
    (gdb) r
    Starting program: D:\IPtv\testsrv.exe
    [New Thread 3388.0x1548]

    Program received signal SIGSEGV, Segmentation fault.
    0x0046e816 in _alloca()
    (gdb) bt
    #0  0x0046e816 in _alloca ()
    #1  0x0046d817 in main (argc=1, argv=0x8e1010) at win32.c:645
Of course I initially only registered main(), since it was only around 30 lines long. Spin forward a bit further, and I eventually paid proper attention to which line was actually causing the crash (again, an approximate recreation):
    int main(int argc, char *argv[])
    {
        State_t usrLogins;
The alloca() function is basically a malloc() that allocates on the stack rather than the heap, and eventually it clicked: recently the maximum number of users had been bumped up, and as a result the server state was around 8 megabytes in size. That is well beyond the default stack a Windows executable gets, which is typically only a megabyte or two depending on the linker. Thankfully the solution just needed the implicit alloca() replaced by an explicit malloc().
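For illustration, the change amounts to something like the sketch below. It is not the actual service code: the layout of State_t, the MAX_USERS and RECORD_SIZE constants, and the run_service() function are all made up for the example, sized so the structure comes to roughly 8 megabytes.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real State_t: about 8 MB of login records. */
#define MAX_USERS   8192
#define RECORD_SIZE 1024

typedef struct {
    char record[MAX_USERS][RECORD_SIZE];
} State_t;

/* Hypothetical wrapper for what used to be the top of main().
 * Returns 0 on success, 1 if the allocation fails. */
int run_service(void)
{
    /* Before: `State_t usrLogins;` tried to put all 8 MB on the stack. */
    State_t *usrLogins = malloc(sizeof *usrLogins);
    if (usrLogins == NULL) {
        fprintf(stderr, "could not allocate %lu bytes\n",
                (unsigned long)sizeof *usrLogins);
        return 1;
    }

    /* Touch the first and last byte to show the whole block is usable. */
    usrLogins->record[0][0] = 'a';
    usrLogins->record[MAX_USERS - 1][RECORD_SIZE - 1] = 'z';

    free(usrLogins);
    return 0;
}
```

Every former `usrLogins.field` access becomes `usrLogins->field`, and the failure case is now an ordinary NULL check rather than a segfault in the function prologue. Another option, which I did not try here, would be to leave the code alone and ask the linker for a bigger stack (with MinGW, something along the lines of `-Wl,--stack,16777216`).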