[qos-ch/logback-decoder] f9b87f: ongoing work on layout-pattern-to-regex converters...

Branch: refs/heads/master
Home: https://github.com/qos-ch/logback-decoder
Commit: f9b87f6ddd1c3aa4f82e2280937f48bcfb86a785
https://github.com/qos-ch/logback-decoder/commit/f9b87f6ddd1c3aa4f82e2280937...
Author: Tony Trinh <tony19@gmail.com>
Date: 2012-06-18 (Mon, 18 Jun 2012)

Changed paths:
M pom.xml
A src/main/java/ch/qos/logback/decoder/CallerDataRegexConverter.java
A src/main/java/ch/qos/logback/decoder/ClassOfCallerRegexConverter.java
A src/main/java/ch/qos/logback/decoder/ContextNameRegexConverter.java
M src/main/java/ch/qos/logback/decoder/DateRegexConverter.java
A src/main/java/ch/qos/logback/decoder/ExtendedThrowableProxyRegexConverter.java
M src/main/java/ch/qos/logback/decoder/FileOfCallerRegexConverter.java
A src/main/java/ch/qos/logback/decoder/IdentityRegexConverter.java
A src/main/java/ch/qos/logback/decoder/LevelRegexConverter.java
M src/main/java/ch/qos/logback/decoder/LineOfCallerRegexConverter.java
A src/main/java/ch/qos/logback/decoder/LineSeparatorRegexConverter.java
A src/main/java/ch/qos/logback/decoder/LoggerRegexConverter.java
A src/main/java/ch/qos/logback/decoder/MDCRegexConverter.java
A src/main/java/ch/qos/logback/decoder/MarkerRegexConverter.java
A src/main/java/ch/qos/logback/decoder/MessageRegexConverter.java
A src/main/java/ch/qos/logback/decoder/MethodOfCallerRegexConverter.java
A src/main/java/ch/qos/logback/decoder/NopThrowableInformationRegexConverter.java
M src/main/java/ch/qos/logback/decoder/PatternLayoutRegexifier.java
A src/main/java/ch/qos/logback/decoder/PropertyRegexConverter.java
A src/main/java/ch/qos/logback/decoder/RegexPatterns.java
A src/main/java/ch/qos/logback/decoder/RelativeTimeRegexConverter.java
A src/main/java/ch/qos/logback/decoder/ReplaceRegexConverter.java
A src/main/java/ch/qos/logback/decoder/RootCauseFirstThrowableProxyRegexConverter.java
A src/main/java/ch/qos/logback/decoder/ThreadRegexConverter.java
A src/main/java/ch/qos/logback/decoder/ThrowableProxyRegexConverter.java
M src/test/java/ch/qos/logback/decoder/PatternLayoutRegexifierTest.java
A src/test/java/ch/qos/logback/decoder/RegexPatternsTest.java

Log Message:
-----------
ongoing work on layout-pattern-to-regex converters (2)

Hi Tony,

Great to see logback-decoder making progress. I see that given a pattern you are able to produce a regex. I am also glad to see that you are much more savvy at writing regular expressions than I am.

Have you thought about how to capture fields so as to fill in LoggingEvent/AccessEvent fields? At this stage of the code, there is no grouping in these regular expressions, so it is not clear how they could be used to capture field data. Anyway, do you already have an idea how to go further or should we come up with something together?

Cheers,
--
Ceki
http://twitter.com/#!/ceki

On 18.06.2012 21:17, Tony Trinh wrote:
> Branch: refs/heads/master
> Home: https://github.com/qos-ch/logback-decoder
> Commit: f9b87f6ddd1c3aa4f82e2280937f48bcfb86a785
> https://github.com/qos-ch/logback-decoder/commit/f9b87f6ddd1c3aa4f82e2280937...
> Author: Tony Trinh <tony19@gmail.com>
> Date: 2012-06-18 (Mon, 18 Jun 2012)
> Log Message:
> -----------
> ongoing work on layout-pattern-to-regex converters (2)

On Tue, Jun 19, 2012 at 2:37 PM, ceki <ceki@qos.ch> wrote:
> Hi Tony,
>
> Great to see logback-decoder making progress. I see that given a pattern you are able to produce a regex. I am also glad to see that you are much more savvy at writing regular expressions than I am.

I piggy-backed on the PatternLayoutBase class to reuse its converter logic for converting the layout patterns into regular expressions. I don't think this is a very clean way of doing it, but I went with it for now. Originally, I was thinking we would pass the input stream (read from a file) to the converters, allowing them to parse something from the stream and advance the stream position, but I wasn't sure how to get the parsed items back to the caller or how well the regex matching would work if only one regex pattern were given at a time.

> Have you thought about how to capture fields so as to fill in LoggingEvent/AccessEvent fields? At this stage of the code, there is no grouping in these regular expressions, so it is not clear how they could be used to capture field data. Anyway, do you already have an idea how to go further or should we come up with something together?
I was thinking we would use regex capture groups to capture the fields. I just haven't added them to the regex patterns yet, as I need to figure out exactly what fields to look for. Perhaps you have a better way to capture the fields. I don't have a complete design thought out yet, and I'd like to collaborate on that. My initial thoughts were:

1. Determine the logback log-file pattern (e.g., "#logback.class-pattern: %d{HH:mm:ss} %msg%n") by reading it from the file or from a command-line parameter.

2. For each pattern element, convert the pattern to a named regular-expression capture group, where the name is the pattern itself (e.g., "(?<%d{HH:mm:ss}>\\d{2}:\\d{2}:\\d{2}) ((?s).+)(\\n)"). Compile the regular expression into a Pattern object for better performance during iterative matching. NOTE: Named capture groups require Java 7 or a 3rd-party library.

3. Match each line of the file against the regex pattern. Collect all matches, and parse the capture groups into a proxy class for LoggingEvent/AccessEvent. The proxy class is used for serialization annotations (e.g., JsonSerialize [1]).

4. Use the appropriate serializer (based on the format specified on the command line) to process the proxy events, thereby outputting them to a file or stdout.

My logic above relies on effective regular expressions, which I'm still validating in my unit tests. I hope to make better progress, especially with you coming aboard.

[1] http://sghill-dev.blogspot.com/2012/04/how-do-i-write-jackson-json-serialize...
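
Steps 2 and 3 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the logback-decoder repository: the class name NamedGroupSketch and the group names "date" and "msg" are hypothetical. In java.util.regex a group name may contain only ASCII letters and digits and must start with a letter, so the sketch uses short names rather than the raw pattern text such as %d{HH:mm:ss}. It also assumes that each line has already had its line terminator stripped (as BufferedReader.readLine() does), so %n is not matched.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: decode lines written with the layout "%d{HH:mm:ss} %msg%n".
final class NamedGroupSketch {

    // Compiled once and reused for every line (step 2).
    // The group names "date" and "msg" are illustrative assumptions.
    private static final Pattern LINE =
        Pattern.compile("(?<date>\\d{2}:\\d{2}:\\d{2}) (?<msg>.+)");

    /** Returns the captured fields, or an empty map if the line does not match (step 3). */
    static Map<String, String> decode(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = LINE.matcher(line);
        if (m.matches()) {
            fields.put("date", m.group("date"));
            fields.put("msg", m.group("msg"));
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(decode("12:34:56 Server started"));
    }
}
```

In a fuller version, the map would presumably be replaced by the proxy class for LoggingEvent/AccessEvent mentioned in step 3.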

On 19.06.2012 23:42, Tony Trinh wrote:
> On Tue, Jun 19, 2012 at 2:37 PM, ceki <ceki@qos.ch> wrote:
>> Hi Tony,
>
> I piggy-backed on the PatternLayoutBase class to reuse its converter logic for converting the layout patterns into regular expressions. I don't think this is a very clean way of doing it, but I went with it for now.
Using the PatternLayoutBase logic is OK, although piggy-backing on just the parser logic is probably a little better. I would suggest using the code in the start() method of PatternLayoutBase. See lines 80 to 91 of PatternLayoutBase. In essence, the code boils down to:

    Converter<E> head = null;
    Parser<E> p = new Parser<E>(pattern);
    Node t = p.parse();
    head = p.compile(t, someConverterMap);
    ConverterUtil.startConverters(head);

There are differences between the code shown above and what the start() method of PatternLayoutBase does. The start() method sets the context of the converters. However, for logback-decoder the context has to be different from LoggerContext. A ContextBase instance could do. The other difference is that start() invokes post-compile processing. In logback-classic, this just adds the %xEx converter at the end of the pattern if no converter deals with throwables. To simplify the work of the decoder, I think logback proper could be modified so that the pattern printed at the top of files already contains %xEx, so that logback-decoder can skip post-compile processing altogether.

Serialization to JSON or some other format is a relatively easy problem. I propose that we ignore serialization for the moment. The hard part is decoding log lines into event objects.

Please have a look at the FieldCapturer interface before reading on. The link is http://tinyurl.com/fieldCapt

Due to the limitations of the pattern compiler, actual instances of FieldCapturer would need to belong to a type extending Converter. This is a little ugly since field capturers would not be doing any conversions. We can tackle this marginal issue later.

Let us assume for the sake of this discussion that each event fits in a single log line. Under this (obviously inaccurate) assumption, decoding boils down to something like:

    /**
     * Decode a log line as an ILoggingEvent.
     *
     * @param head the first FieldCapturer returned as a result
     *             of pattern parsing (see above)
     * @param inputLine the log line to decode
     *
     * @return the decoded ILoggingEvent
     */
    public ILoggingEvent decode(FieldCapturer head, String inputLine) {

      // ---------- build the pattern string -----------------
      StringBuilder sb = new StringBuilder();
      FieldCapturer fieldCapturer = head;
      while (fieldCapturer != null) {
        String partialRegex = fieldCapturer.getRegexPattern();
        if (fieldCapturer.isCapturing()) {
          sb.append("(").append(partialRegex).append(")");
        } else {
          sb.append(partialRegex);
        }
        fieldCapturer = fieldCapturer.next();
      }

      // -------- do regex matching and capture fields --------
      String regex = sb.toString();
      Pattern pattern = Pattern.compile(regex);
      LoggingEvent event = new LoggingEvent();
      Matcher matcher = pattern.matcher(inputLine);

      if (matcher.matches()) {
        // group 0 is the whole match, so captured fields start at group 1;
        // this assumes no partial regex contains capturing groups of its own
        int i = 1;
        // restart from the head; the loop above advanced fieldCapturer to null
        fieldCapturer = head;
        while (fieldCapturer != null) {
          if (fieldCapturer.isCapturing()) {
            String fieldAsStr = matcher.group(i);
            i++;
            // much of the work is done here
            fieldCapturer.captureField(event, fieldAsStr);
          } else {
            // ignore literal text
          }
          fieldCapturer = fieldCapturer.next();
        }
      } else {
        logger.warn("Could not decode input line [" + inputLine + "]");
      }
      return event;
    }

Much of the work is delegated to the field capturer chain. Obviously, the decoder shown above assumes log events fitting in a single line. I just wanted to share this design before addressing the more complicated problem of multi-line decoding.

Comments most welcome.

--
Ceki
http://twitter.com/#!/ceki
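
To make the capturer-chain idea concrete, here is a small self-contained sketch. It is not the FieldCapturer interface behind the link above: the Capturer class, the chain() helper, and the use of a plain Map in place of LoggingEvent are all hypothetical stand-ins, chosen so that the example runs without logback on the classpath. It follows the same two passes as decode(): first assemble one regex from the chain, then walk the chain again and hand each captured group to its field.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a FieldCapturer chain; all names here are illustrative.
final class CapturerSketch {

    /** One link in the chain: a capturing field, or literal text when field is null. */
    static final class Capturer {
        final String field;
        final String regex;
        Capturer next;

        Capturer(String field, String regex) {
            this.field = field;
            this.regex = regex;
        }

        boolean isCapturing() {
            return field != null;
        }
    }

    /** Links the given capturers in order and returns the head of the chain. */
    static Capturer chain(Capturer... links) {
        for (int i = 0; i + 1 < links.length; i++) {
            links[i].next = links[i + 1];
        }
        return links.length == 0 ? null : links[0];
    }

    /** Decodes one line into field/value pairs; returns an empty map on mismatch. */
    static Map<String, String> decode(Capturer head, String line) {
        // First pass: wrap capturing parts in parentheses, append literals as-is.
        StringBuilder sb = new StringBuilder();
        for (Capturer c = head; c != null; c = c.next) {
            sb.append(c.isCapturing() ? "(" + c.regex + ")" : c.regex);
        }
        Map<String, String> event = new LinkedHashMap<>();
        Matcher m = Pattern.compile(sb.toString()).matcher(line);
        if (m.matches()) {
            // Second pass: group 0 is the whole match, so fields start at group 1.
            int group = 1;
            for (Capturer c = head; c != null; c = c.next) {
                if (c.isCapturing()) {
                    event.put(c.field, m.group(group++));
                }
            }
        }
        return event;
    }

    /** Chain for a layout resembling "%level - %msg" (an assumed example layout). */
    static Capturer levelAndMessage() {
        return chain(
            new Capturer("level", "TRACE|DEBUG|INFO|WARN|ERROR"),
            new Capturer(null, " - "),
            new Capturer("message", ".*"));
    }

    public static void main(String[] args) {
        System.out.println(decode(levelAndMessage(), "WARN - disk almost full"));
    }
}
```

As in decode(), the group counter is only correct while no partial regex introduces capturing groups of its own; a real implementation might use non-capturing groups inside the partial patterns to guarantee that.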
participants (2): ceki, Tony Trinh