Hi David,
I forgot to mention this in my original note and it is a good thought, but yes I have tried to cleanly shutdown Logback. Even when the LogbackContext stop method is invoked prior to JVM exit, the async queue is not flushed completely. When Logback is stopped cleanly, the async worker thread exits the "waiting for events in the queue" while loop and moves on to the "flush remaining elements from the queue" for loop (confirmed by the info message when debug is enabled and via breakpoint). Ultimately, because the thread is a daemon thread - this process is still interrupted by JVM shutdown. There is no guarantee that the async worker will get enough time to process all events.
From a theoretical standpoint, stopping Logback properly does not solve flushing issues with the AsycAppender in a number of cases (it may be fair to argue that it should not attempt to cover all of these cases):
- Queue is relatively full or wrapped appender latency is high - the async worker may not be able to finish processing the queued items between when the LogbackContext.stop method is invoked and when the JVM actually exits
- Implicit JVM exit upon end of last user thread - some applications don't explicitly call System.exit but rather rely on implicit JVM exit. In that case, where would you cleanly shutdown Logback? It can probably be done, but would require an application redesign to avoid reliance on implicit exit or some form of thread tracking thread.
- Shutdown hooks are used to clean up resources - some applications have components or modules that need to clean up after themselves when the JVM shuts down. Typically, shutdown hooks are used for this purpose, however the AsyncAppender's worker thread is unable to process queued events after shutdown has been initiated (while the shutdown hooks are running). This also prevents shutdown hooks from being able to log events
- Signal handling on *nix systems - signals can be sent to the application for a number of reasons, most often to request graceful shutdown or to terminate something that is non-responsive. If the signal corresponds to something that generally means "graceful shutdown" on the host system, normal cleanup routines should execute in the application (typically implemented as shutdown hooks, so maybe this is the same as the previous one).
From a practical standpoint, I am running into the first and the third scenarios. Even shutting down cleanly, the latency and queue depth can cause elements to be missed; additionally, I have shutdown hooks that clean up resources and expect the ability to log events and errors.
My thought is that a modified AsyncAppenderBase implementation could (optionally) install a shutdown hook that continues watching the queue for new events. A configurable "queue idle time" could be used to tell the shutdown hook that it has probably caught all events and can safely shutdown (not foolproof, I'm still thinking about other ways of handling it). Additionally, the hook could have a configurable "max processing time" so that it doesn't keep the JVM alive for an undesirable amount of time (thinking in the context of automation software where shutdown requests typically have a max execution time before they are considered as "failed online").
I hope this helps clarify the scenario I am considering and a proposed solution!
Regards,
Mike Reinhold