Announcing Logger:
Logging system for performance, large data, & report generation

Logger is a logging system for performance, large data generation, and post-processing. 

Logger was originally developed for large automated testing procedures against realtime hardware devices such as commercial satellites. Its goals were to log various types of events in human readable formats, but in an output that could easily be parsed and searched for later post-processing report generation. 

Logger supports runtime priority thresholds, automatic file segmentation, and compile-time selectable locking for thread-safety.

Two implementations are provided: one in pure C (for maximum portability) and another in C++. There is also an Objective-C macro wrapper that extends the pure-C version to handle Obj-C objects. This library has been tested on Mac OS X, Linux, and Windows.

Yes, there are a ton of logging systems out there. So why look at this one? I like this one because it was designed by extremely experienced test engineers I had the privilege of working under some years back. They had decades of experience under their belts and have worked on just about every type of giant, interesting project imaginable. When I worked under them, I learned their perspective on logging systems, which tends to be a little different from what most software developers I talk to have in mind.

Perhaps the aspect software developers most often overlook is creating an easily parseable and searchable log file. In the run-for-records, we would generate multiple gigabytes of data that we needed to data mine and later generate reports from. So each log entry (event) needed to be clearly identified and have a structure that let us find data very quickly. (In contrast, I think software developers are generally sloppy about the output of their events...just look at any Unix-like system's system log.) We devised a system of providing a primary keyword and a subkeyword for each entry, which the team would define and custom tools would know how to isolate.
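
To make the keyword idea concrete, here is a minimal sketch in C. The field layout, the column widths, and the names `format_entry` and `entry_has_keyword` are assumptions made purely for illustration; they are not Logger's actual output format or API. The timestamp is passed in as an already formatted 23-character string.

```c
#include <stdio.h>
#include <string.h>

#define TS_WIDTH  23  /* assumed width of the timestamp column */
#define KEY_WIDTH 10  /* assumed width of the keyword and subkeyword columns */

/* Hypothetical layout: "<timestamp> <keyword> <subkeyword> P<priority> <message>".
 * Keyword and subkeyword are padded or truncated to KEY_WIDTH so that
 * post-processing tools can slice an entry by column instead of guessing. */
int format_entry(char *out, size_t out_size,
                 const char *timestamp, const char *keyword,
                 const char *subkeyword, int priority, const char *message)
{
    return snprintf(out, out_size, "%s %-*.*s %-*.*s P%d %s",
                    timestamp,
                    KEY_WIDTH, KEY_WIDTH, keyword,
                    KEY_WIDTH, KEY_WIDTH, subkeyword,
                    priority, message);
}

/* Returns 1 if the keyword column of `entry` equals `keyword`, else 0.
 * Only a fixed slice of the line is inspected, which is cheap. */
int entry_has_keyword(const char *entry, const char *keyword)
{
    char padded[KEY_WIDTH + 1];

    if (strlen(entry) < (size_t)(TS_WIDTH + 1 + KEY_WIDTH)) {
        return 0;  /* line too short to contain a keyword column */
    }
    snprintf(padded, sizeof padded, "%-*.*s", KEY_WIDTH, KEY_WIDTH, keyword);
    return memcmp(entry + TS_WIDTH + 1, padded, KEY_WIDTH) == 0;
}
```

A report tool built this way can select, say, every "Network" event by comparing ten bytes at a known offset on each line, without parsing the free-form message at all.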

Timestamps are also a critical part. Knowing when something happened so that timelines can be reconstructed is really important. This was especially important when different subsystems wrote log events concurrently but you couldn't guarantee that the file system or operating system would record these events in the correct order. And sometimes it was convenient for separate systems that communicated with each other to have their own logs, but you would then need to merge them in post-report-processing. So you needed both a timestamp with sufficient precision and a fast way to parse and sort log events by their timestamp. The way we accomplished this was to write the time with its fields in descending order of significance, YYYY:MM:DD_HH:MM:SS.mmm, which guaranteed that simple string or number comparisons would always yield the correct sort order. We also made everything fixed width for easy parsing.
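
Here is a small sketch in C of why that layout sorts correctly. It assumes each log line starts with the 23-character timestamp; the function names are hypothetical and are not part of Logger's API. Because every field is zero-padded and the most significant field comes first, a plain byte-wise comparison of the timestamp column is already a chronological comparison.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TS_LEN 23  /* strlen("YYYY:MM:DD_HH:MM:SS.mmm") */

/* Writes a fixed-width, zero-padded timestamp into `out`.
 * `out_size` should be at least TS_LEN + 1. */
int format_timestamp(char *out, size_t out_size,
                     int year, int month, int day,
                     int hour, int minute, int second, int millis)
{
    return snprintf(out, out_size, "%04d:%02d:%02d_%02d:%02d:%02d.%03d",
                    year, month, day, hour, minute, second, millis);
}

/* qsort comparator over an array of log lines: compares only the
 * leading timestamp column, byte by byte. */
static int compare_entries(const void *a, const void *b)
{
    const char *const *ea = (const char *const *)a;
    const char *const *eb = (const char *const *)b;
    return strncmp(*ea, *eb, TS_LEN);
}

/* Sorts log lines (possibly merged from several logs) chronologically. */
void sort_entries(const char **entries, size_t count)
{
    qsort(entries, count, sizeof *entries, compare_entries);
}
```

One caveat with this sketch: `qsort` is not guaranteed to be stable, so entries that share an identical timestamp may come out in either order. A merge tool that cares about that would need a tie-breaker, such as the source log and line number.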

Because the data generation was so enormous, the logging system also needed to auto-segment files, since we would often hit file system limits. The segmentation needed to work in a way that made it easy for post-processing tools to understand which files were part of a set.
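
One way to sketch auto-segmentation in C is shown below. The naming scheme (`base.0000.log`, `base.0001.log`, ...), the `SegmentedLog` struct, and the size-based rollover rule are all assumptions for illustration and may differ from what Logger really does. The zero-padded index means a plain directory listing sorted by name already presents the set in order.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical naming scheme: "<base>.<4-digit index>.log". */
int segment_filename(char *out, size_t out_size,
                     const char *base_name, unsigned index)
{
    return snprintf(out, out_size, "%s.%04u.log", base_name, index);
}

typedef struct {
    const char *base_name;     /* shared prefix for every file in the set */
    unsigned segment_index;    /* index of the segment currently open */
    size_t bytes_in_segment;   /* bytes written to the current segment */
    size_t max_segment_bytes;  /* roll over before exceeding this size */
    FILE *file;                /* NULL until the first write */
} SegmentedLog;

/* Appends one entry, opening the next segment first if this entry
 * would push the current segment past its size limit.
 * Returns 0 on success, -1 on I/O failure. */
int segmented_log_write(SegmentedLog *log, const char *entry)
{
    size_t len = strlen(entry);
    int segment_full = log->bytes_in_segment > 0 &&
                       log->bytes_in_segment + len > log->max_segment_bytes;

    if (log->file == NULL || segment_full) {
        char name[256];
        if (log->file != NULL) {
            fclose(log->file);
            log->segment_index++;
        }
        segment_filename(name, sizeof name, log->base_name, log->segment_index);
        log->file = fopen(name, "w");
        if (log->file == NULL) {
            return -1;
        }
        log->bytes_in_segment = 0;
    }
    if (fputs(entry, log->file) == EOF) {
        return -1;
    }
    log->bytes_in_segment += len;
    return 0;
}
```

Rolling over only on entry boundaries keeps every event whole within a single file, so a post-processing tool never has to stitch an entry back together across two segments.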

Another big plus was the ability to change priority thresholds dynamically, while the system was running.

Echoing to terminal while a test was running was a nice-to-have feature too.

And of course it needs to be performant.

So I reimplemented Logger for my own purposes, most of which have been much more mundane than commercial satellites or aerospace related. But I've still found it useful nonetheless. Just recently, a new open source project I have some input in asked about logging systems in C++. I've been meaning to release this one for about 8 years (it is about the same age as my just released ALmixer), but never got around to it...until now.

So my implementation tries to keep the spirit of the original Logger implementations I learned from my test engineering guru mentors. It provides the formal log event structure with keywords and timestamp. It has runtime priority filtering/thresholding, auto-segmentation, and optional echoing to stdout or stderr. Multiple instances of Logger can also be created so different modules can do their own thing. And there is an optional compile switch to enable locking for thread-safety.

It also tries to be very performant and avoid dynamic memory allocation when possible. Most of the time, the thing is a direct passthrough to the standard I/O. For both the C and C++ versions, the public API is built around printf style conventions instead of iostreams. This works better for performance because data can be disregarded more easily without actually processing the format strings. It also deals better with issues of localization where format parameters need to be reordered.
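
The following is a rough sketch in C of how a printf-style entry point can throw away filtered events cheaply. `MiniLogger` and `mini_log` are hypothetical names, and the rule that entries with a priority below the threshold are dropped is an assumption; Logger's real API and threshold semantics may differ.

```c
#include <stdarg.h>
#include <stdio.h>

typedef struct {
    FILE *out;      /* destination stream, e.g. a log file or stdout */
    int threshold;  /* assumed rule: priorities below this are dropped */
} MiniLogger;

/* Returns the number of characters written, or 0 if the entry was
 * filtered out by the current threshold. */
int mini_log(MiniLogger *logger, int priority, const char *format, ...)
{
    va_list args;
    int written;

    if (priority < logger->threshold) {
        return 0;  /* rejected before the format string is ever parsed */
    }

    va_start(args, format);
    written = vfprintf(logger->out, format, args);  /* passthrough to stdio */
    va_end(args);
    return written;
}
```

The threshold is an ordinary field, so it can be changed at run time between calls. For a filtered entry the cost is one integer comparison plus argument passing: no formatting work is done and no memory is allocated.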

Anyway, Logger is pretty well documented (fully Doxygen-ized), has a CMake build system, a few test cases/samples, and isn't terribly complex. I've run it on Mac, Linux, FreeBSD, and Windows. And of course, I think the overall design is excellent (since it was developed by hardcore test engineering gurus).

So now the entire world can benefit from this. I have released this under the MIT license.

I have put up a web page for Logger here. You'll find links to the Mercurial repository and Doxygen documentation there.

Also two footnotes. I've pushed a bunch of bug fixes to ALmixer and my SDL_sound/Core Audio repositories since my last post.

And I still have another project or two in the queue for release very soon. Maybe the next week or two. Stay tuned.

Copyright © PlayControl Software, LLC / Eric Wing