Defect on AsyncAppenderBaseTest in build #298

Hello, I have look a bit more into the defect. The stacktrace is showing some interesting bits of information : http://logback.qos.ch/jenkins/job/logback/ch.qos.logback$logback-core/298/te... JUnit is timing out right when the main thread is in the fail method of the last AssertTrue (L105 of AsyncAppenderBaseTest) I successfully reproduce the stacktrace on my computer at home (in 50% of test runs) for a specific test code outside of logback. But it is not reproductible on my laptop at work. So I still have to work on it to discover the full story. We can probably already assert the following : - Since we did enter the fail function. The assert did fail. Why? I think this is the main question - Is the 2sec timeout of the test was reach? How? There is 1sec dedicated for the join but the test length is only 1ms. By the way, it is not related to the bug, but I don't see any constraints that ensure that the worker thread is block by the BlockingQueue when the main thread try to interrupt it. The worker thread might be in the RUNNING state. Am I wrong ? Maybe it is expected ? Thanks a lot for your feedback ;-) Best regards, Cyril

On 23.04.2013 13:28, cyril.lapinte@free.fr wrote:
Hello,
I have look a bit more into the defect. The stacktrace is showing some interesting bits of information : http://logback.qos.ch/jenkins/job/logback/ch.qos.logback$logback-core/298/te...
JUnit is timing out right when the main thread is in the fail method of the last AssertTrue (L105 of AsyncAppenderBaseTest)
Why do you say that JUnit is timing out? The way I read the stack trace, delayingListAppender.interrupted is false instead of true, and the assertion on line 105 fails. If there were a timeout, junit could throw an exception with the message "java.lang.Exception: test timed out after 2000 milliseconds".
I successfully reproduce the stacktrace on my computer at home (in 50% of test runs) for a specific test code outside of logback. But it is not reproductible on my laptop at work. So I still have to work on it to discover the full story.
We can probably already assert the following : - Since we did enter the fail function. The assert did fail. Why? I think this is the main question
"Why did the assertion fail?" is indeed the question.
- Is the 2sec timeout of the test was reach? How? There is 1sec dedicated for the join but the test length is only 1ms.
I see no indication of a timeout.
By the way, it is not related to the bug, but I don't see any constraints that ensure that the worker thread is block by the BlockingQueue when the main thread try to interrupt it. The worker thread might be in the RUNNING state. Am I wrong ? Maybe it is expected ?
If the Worker thread is invoking aai.appendLoopOnAppenders(), the interruption may be consumed by the sub-appenders, but that is mostly OK, since the the parent.isStarted() condition will be false, and the while loop (line 223) will be exited. The workerShouldStopEvenIfInterruptExceptionConsumedWithinSubappender test assumes that the main thread will finish before DelayingListAppender completes the 100ms wait. If the host machine is slow or very busy, this assumption is sometimes incorrect. Note that the verify(delayingListAppender, 1) call on line 104 returns without failure. This means that delayingListAppender is stopped, contains 1 element, no errors occured and the worker thread flushed the remaining events. Does the above make sense?
Cyril
-- Ceki 65% of statistics are made up on the spot
participants (2)
-
ceki
-
cyril.lapinte@free.fr