Hi Gena,

The method described for a disk-cached AsyncAppender of course makes perfect sense. I think it would make a good alternative to the current AsyncAppender for some scenarios and would be a solid addition to the current Logback appenders.

I agree that AsyncAppender does not handle network failures, but in my opinion that is the responsibility of the target appender since it best knows how to handle different failure cases. For instance, some appenders may choose to retry immediately or may round-robin to an alternate server, etc. 

I agree completely that AsyncAppender has many uses not limited to remote log targets. Generally speaking, the purpose of AsyncAppender is the same for local and remote targets - to reduce the impact of target latency on application performance, whether that latency is network or disk or something else doesn't really matter. 

I feel like we are speaking past each other a little bit in that I am not proposing any of the specific shutdown hooks discussed so far at a theoretical level (without being fully thought through or developed, they are of course flawed). Rather, I am proposing that my solution be implemented as a framework for Logback, contributors to Logback, and/or end users of Logback to build shutdown hooks that fulfill specific needs. Some of the theoretical hooks discussed could be implemented, knowing that they are not general solutions, but rather niche solutions that may be acceptable for some applications. Then, leveraging Logback's community of contributors (including you and me), more general solutions can be developed and included as part of Logback. 

Based on this back and forth discussion, I don't think there is any resistance to your proposed solution as it has the potential to be a reasonably generic shutdown hook that could be applicable to many common use cases. Furthermore, it could be implemented as one of the pluggable classes usable by the ShutdownHookAction. If your implementation is as generic and effective in practice as you have described, it could actually be the default ShutdownHook implementation. If for some reason a user found that your solution was non-optimal for their application, they could implement (or work with the Logback community to implement) an alternate ShutdownHook that does meet their needs.

I hope this helps explain my position adequately - I don't see our two solutions as competing alternatives, but rather as complimentary components of the shutdown cleanup functionality of Logback. My portion provides a framework for creating shutdown hooks that cleanup the Logback context (possibly including some simple implementations) and your portion includes a more complex, but fairly general implementation of that shutdown hook framework. This gives us the best of both worlds - the flexibility to plug different hooks in if needed as well as a solid default implementation that should cover most use cases. This would seem to fit with the methodology of design that Logback generally uses where multiple implementations are possible and generally speaking a few options are included in the library, but a default implementation that covers most scenarios is also included in the library and, unless specified otherwise, is the preferred option.

Once again, thank you for the good discussion!

Regards,

Mike Reinhold


On Wed, Mar 19, 2014 at 5:25 PM, Gena Makhomed <gmm@csdoc.com> wrote:
On 19.03.2014 16:51, Michael Reinhold wrote:

I agree that logging should be reliable and fast, however this is not
always possible with some targets - particular cloud logging services
such as Loggly, Logentries, Papertrail, Splunk Storm, Loggr, etc.

However it is possible to make logging even to remote services
reliable and fast, from application code point of view. Making
process of logging to remote services true asynchronous:

- write all log events to ordinary log file at local file system

- in background (daemon) thread scan log files, check at remote
service which files should be synced and in which offset, after
this first "handshake" - just monitor tail of log file, and all
new events asynchronously read from log file and send to remote
log collection service, even fashionable cloud logging services

In this case - logging events to log file and sending events
to remote logging service are asynchronous, and without event
lost even after fast application restart, even with big queue
of not yet reported to remote service log events in the log.

Algorithm the same as in case of AsyncAppender from logback,
but as temporary storage for queued events used persistent storage
(disk subsystem) instead of java heap - this is only one difference.

Existing AsyncAppender not help in case of temporary network failure,
you will get event lost, - there is not guaranteed log events delivery.

With true asynchronous conversation with remote logging service,
(with queue located at persistent storage service, not in java heap)
all events queued in log file will be send to remote logging service,
without lost in events delivery process. (fast, asynchronous, reliable)


The AsyncAppender was created to prevent target latency from impacting
application performance by buffering the events and processing them
asynchronously.

Yes, and AsyncAppender even help in case of using RollingFileAppender,
allow application to process client request at maximum available speed,
with all logging work with log file executed in separate daemon thread.

If not use AsyncAppender before RollingFileAppender - all logging
work with underlying OutputStream are done in application thread,
and time wasted for logging instead of client request processing.


You describe a number of situations in which the discussed ShutdownHook
implementations don't work. I don't want to argue a bunch of
hypothetical scenarios on the mailing list, these were just descriptions
of how it is possible to extend the ShutdownHooks in Logback. Please
understand, I am not proposing that those are the (or the only)
ShutdownHooks that get implemented - this was merely attempting to
describe how different shutdown hooks could be developed that best meet
different needs. Additionally, it was intended to show how the user
could develop something that meets their specific application needs in
the event that one of the Logback provided ShutdownHooks is not
sufficient. Those examples may have been simplistic, but more complex
options are possible. For instance, an InstrumentationShutdownHook that
adds instrumentation into the Logback API to determine when it is safe
to shutdown. Or a HeapWalkerShutdownHook that investigates the JVM heap
to find Logger references. Or a ThreadHeuristicShutdownHook that
analyzes the threads present in the JVM to determine when the Logback
shutdown hook is the last relevant thread remaining. The point is that
the ShutdownHookAction framework can be used to extend Logback's
built-in shutdown cleanup functionality, either by the Logback team
itself, by contributors to the project, or by the end application developer.

Such approach imply minimally invasive changes to Logback
and forcing users to write own non-trivial ShutdownHooks.

If implement my proposed solution - user not need write any code at all.
User just need to answer to one simple question: application shutdown
means JVM shutdown (for example in case of UNIX service/daemon) or
application shutdown possible without JVM shutdown (OSGi, webapp, etc)

if application shutdown == JVM shutdown user add to logback.xml
one line, for enabling built-in logback shutdown hook,
which after activation turn logback from "maximum speed"
to "maximum safe" mode of work, and in this case
logback will be ready for forthcoming JVM shutdown.
event lost in this case will be minimal, because
after logback shutdown hook complete work -
all events from AsyncAppender queue will be
written to its underlying appender, queue size set to 0,
and AsyncAppender will be switched into transparent sync mode.
also, all appenders switched to immediateFlush = true mode.
in this case each log event from any shutdown hook will be
guaranteed to be written into log file.
without any additional work from user.

if application shutdown possible without JVM shutdown,
user just need to activate proper context listener
for graceful shutdown of logback via LoggerContext.stop

if none on this two methods of graceful shutdown activated -
logback can write warning about misconfiguration to StatusManager.

for user - logback will work very easy,
fast and reliable without any headache.

call LoggerContext.stop only required for case
when application shutdown != JVM shutdown,

and if application shutdown == JVM shutdown
calling LoggerContext.stop is not the best solution,
because it can lead to logging events lost.


For your scenario in particular: you say that you have shutdown hooks
installed by both your code and by third party libraries you use,
correct?

Yes.


Are the Shutdown hooks inserted by your code and by the
libraries deterministic in terms of name or quantity?

No.

If so, you could
whitelist those threads call LoggerContext.stop when they have exited
(which you can confirm during your shutdown hook via either the Thread
API or the ThreadMXBean API). If the shutdown hooks are not
deterministic in quantity or name, you can use the protected
java.lang.ApplicationShutdownHooks
<http://stackoverflow.com/questions/6865408/i-need-to-list-the-hooks-registered-with-java-lang-applicationshutdownhooks>

class during your application startup to determine the hooks that are
installed and find the names of those hooks. Then your shutdown hook can
check the status of those threads and call LoggerContext.stop once they
have finished processing (excluding the Logbcack Shutdown hook of
course). Actually, after thinking this through, it is possible that this
type of a shutdown hook will work for my application as well...

Yes, this is working solution, but it is very uncomfortable,
because I need permanently monitor list of used shutdown hooks.
and permamently update list of whitelisted shutdown hooks,
which must exit before my logback shutdown hook should call
LoggerContext.stop

Better approach - just after starting my own shutdown hook
get list of all application shutdown hooks and wait all rest
shutdown hooks termination before calling LoggerContext.stop

in this case - deadlock can be reached if any other shutdown hook
use same approach and wait for all rest shutdown hooks completion.

for almost always working solution - we need whitelist parameter
for this shutdown hook, allowing to exclude other shutdown hook
with same algorithm from logback shutdown hook wait list.

yes, this work, with the same reliability as my proposal.

Such approach imply minimally invasive changes to Logback,
but need manual deadlock detection and manual deadlock prevention.
this is small probability, but still it is headache and annoyance.
also, ApplicationShutdownHooks - this is implementation details,
on any other JVM your shutdown hook may not work correctly at all.

My approach - deadlock not possible, not need whitelist,
no headache, no annoyances, no reflection, work on any JVM,
but need slightly more code changes of logback.

compare, which variant is easier to implement and to use by users?


Don't misunderstand what I am saying - I think your proposal has its
merits and has the potential to improve the reliability in these
specific scenarios that we are discussing. In fact, I feel that if your
solution does work as proposed it would work very well in conjunction
with my changes to AsyncAppender and my addition of the
ShutdownHookAction.

if implement my proposal - ShutdownHookAction will be only one
for all cases, where application shutdown == JVM shutdown,
in this case we just not need many different shutdown hooks,
because this one (switching logback into "ready to shutdown" mode)
will work very well for all possible use cases.


That being said, your proposal is not without its
challenges. For instance, even after you have called your proposed
prepareToShutdown method you still need to avoid calling
LoggerContext.stop until all other threads/hooks that log events are
finished with their work, otherwise log events may still be dropped
since the loggers are stopped. The same timing & race condition problems
still exist, they have just been moved from one point in code to another.

If application shutdown == JVM shutdown
LoggerContext.stop not need to be called at all.

After JVM shutdown and process exit - operating system
close all open log files without assistance.

Timing & race condition problems exists after all shutdown hooks
finished if any other thread still try to write something in log file.
Such event can be lost, but it also definitely will be lost
if LoggerContext.stop already was called by your shutdown hook.

So - it is no any difference here about timing & race condition.

This, last events lost generally unavoidable at all.


--
Best regards,
 Gena
_______________________________________________
Logback-user mailing list
Logback-user@qos.ch
http://mailman.qos.ch/mailman/listinfo/logback-user