A Deeper Dive Into Log Enrichment

Published on
25-02-2024
Author
Product Minting
Category
How to
https://cdn.aisys.pro/stories/a-deeper-dive-into-log-enrichment.jpg

In this article, we will conclude reviewing how logging concerns and code can be separated and decoupled from infrastructure and business code. In the previous article, we reviewed how to use DiagnosticSource and DiagnosticListener to achieve that for infrastructure operations.


Now, we will review how to enrich the logged data with additional context.


Essentially, the concept of enriching logged data revolves around registering additional context, or data, that should go together with the log messages. This context can be anything - from a simple string to a complex object, and it gets produced or known at some moment within the infrastructure or business code, not at the moment when the log messages are written.


So, we cannot simply add another property to log messages, as this code does not produce any logging output, or we want the context to get attached to multiple log messages that may be, or may be not, produced down the execution chain.


This context may be even conditional - such as adding specific data only to error log messages while all other messages do not need it.


We will use Serilog and its enrichment feature, as Serilog made it very flexible and powerful. Other solutions, too, have similar features at different maturity levels, and we will compare Serilog's enrichment with what Microsoft.Extensions.Logging provides out of the box.

An Overview of the Existing Log Enrichers for Serilog

Serilog comes packed with many useful enrichers that can be very handy for some scenarios. You can visit this page - https://github.com/serilog/serilog/wiki/Enrichment - to see the full list of enrichers and their descriptions.


For example, there are LogContext and GlobalLogContext enrichers which allow to push extra data to get logged with the log messages if they get written within a matching scope.


LogContext enricher is very similar to the concept of logging scopes in Microsoft.Extensions.Logging. Essentially, they both push some custom data and provide an IDisposable object that should be disposed of to remove the data from the log context.


That is, as long as the IDisposable object is in scope, the data will be attached to all log messages written within that scope. When it gets disposed, the data will no longer be attached.


Serilog and Microsoft documentation provide these examples:

// For Serilog
log.Information("No contextual properties");

using (LogContext.PushProperty("A", 1))
{
    log.Information("Carries property A = 1");

    using (LogContext.PushProperty("A", 2))
    using (LogContext.PushProperty("B", 1))
    {
        log.Information("Carries A = 2 and B = 1");
    }

    log.Information("Carries property A = 1, again");
}

// For Microsoft.Extensions.Logging scopes
using (logger.BeginScope(new List<KeyValuePair<string, object>> {
    new KeyValuePair<string, object>("TransactionId", transactionId),
})) {
    _logger.LogInformation(MyLogEvents.GetItem, "Getting item {Id}", id);
    todoItem = await _context.TodoItems.FindAsync(id);
    if (todoItem == null) {
        _logger.LogWarning(MyLogEvents.GetItemNotFound,
            "Get({Id}) NOT FOUND", id);
    
        return NotFound();
    }
}

While useful for some scenarios, they are quite limited. The best usage for this type of log enrichment is in middleware or decorator type of implementations, where the scope is well defined, and the data is known at the moment of the scope creation, and we are certain the data has no value outside of the scope.


For example, we can use it to attach a correlation ID to all log messages within a single HTTP request processing scope.


GlobalLogContext enricher is similar to LogContext, but it is global - it pushes the data to all log messages written within an application. However, the real use cases for this type of enrichment are very limited.

Custom Log Enricher Sample for Serilog

As a matter of fact, implementing custom log enrichers for Serilog is very easy - just implement the ILogEventEnricher interface, and register the enricher with the logger configuration. The interface has only one method to implement - Enrich - which accepts the log event and enriches it with the data you want.


Let's review a sample implementation for a custom log enricher.

public sealed class CustomLogEnricher : ILogEventEnricher {
    private readonly IHttpContextAccessor contextAccessor;
    
    public CustomLogEnricher(IHttpContextAccessor contextAccessor) {
        this.contextAccessor = contextAccessor;
    }
    
    public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory) {
        var source = contextAccessor.HttpContext?.RequestServices.GetService<ICustomLogEnricherSource>();
        if (source is null) {
            return;
        }

        var loggedProperties = source.GetCustomProperties(logEvent.Level);
        foreach (var item in loggedProperties) {
            var property = item.ProduceProperty(propertyFactory);
            logEvent.AddOrUpdateProperty(property);
        }
    }
}

As we see, this enricher relies on an ICustomLogEnricherSource to get the custom properties to enrich the log event with. Normally, the log enrichers are long-lived instances and even follow the Singleton pattern - they are registered once at the application startup and live for the whole application lifetime.


So, we need to decouple the enricher from the source of the custom properties, and the source can be a scoped instance with a limited lifetime.

public sealed class CustomLogEnricherSource : ICustomLogEnricherSource {
    private readonly Dictionary<string, CustomLogEventProperty> properties = new();
    
    public ICustomLogEnricherSource With<T>(string property, T? value, LogEventLevel logLevel) where T : class {
        if (value is null) {
            return this;
        }
    
        properties[property] = new CustomLogEventProperty(property, value, logLevel);
    
        return this;
    }
    
    public void Remove(string property) {
        properties.Remove(property);
    }
    
    public IEnumerable<CustomLogEventProperty> GetCustomProperties(LogEventLevel logLevel) {
        foreach (var item in properties.Values) {
            if (item.Level <= logLevel) {
                yield return item;
            }
        }
    }
}

// And CustomLogEventProperty is a simple struct (you can make it a class, too):
public struct CustomLogEventProperty {
    private LogEventProperty? propertyValue;
    
    public CustomLogEventProperty(string property, object value, LogEventLevel level) {
        Name = property;
        Value = value;
        Level = level;
    }
    
    public string Name { get; }
    
    public object Value { get; }
    
    public LogEventLevel Level { get; }
    
    public LogEventProperty ProduceProperty(ILogEventPropertyFactory propertyFactory) {
        propertyValue ??= propertyFactory.CreateProperty(Name, Value, destructureObjects: true);
        return propertyValue;
    }
}

This implementation has a few simple details:

  • The enricher is decoupled from the source of the custom properties for log messages to account for different lifetimes of these components and different responsibilities. If needed be, the enricher can be made by consuming a list of ICustomLogEnricherSource instances, as different application components may produce different custom properties.


  • The source (ICustomLogEnricherSource) can be injected into any component that needs to enrich log messages with custom properties. Ideally, it should be a scoped instance, as it does not provide a convenient way of expiring the custom properties.


  • The enricher and the source implement simple filtering logic based on the log level of a log message. However, the logic may be much more complex and rely on other LogEvent properties, or even on the application state.


  • CustomLogEventProperty caches a produced LogEventProperty instance to avoid creating it multiple times for the same custom property, as it may get attached to multiple log messages.


Another important detail is in this line of code, the destructureObjects parameter:

propertyFactory.CreateProperty(Name, Value, destructureObjects: true);


When we attach complex objects - classes, records, structs, collections, etc. - and this parameter is not set to true, Serilog will call ToString on the object and attach the result to the log message. While records have a ToString implementation that will output all public properties, and we can override ToString for our classes and structs, this is not always the case.


Also, such an output does not do well for structured logging, as it will be a simple string, and we will not be able to search and filter log messages based on the properties of the attached object. So, we set this parameter to true here. The simple types (value types and strings) will do just fine with either value of this parameter.


Another benefit of this implementation is that once a custom property is registered in the source, it remains there for all log messages until it gets removed or the source gets disposed (remember, it should be a scoped instance). Some implementations may need a better control over the lifetime of the custom properties, and this can be achieved in different ways.


For example, by providing specific CustomLogEventProperty implementations or by providing a callback to check if the custom property should be removed.


This expiration logic may be used in GetCustomProperties method to pre-check a custom property and remove it from the source if it has expired.

How This Helps With Decoupling Logging Concerns From Business and Infrastructure Code

Well, ideally we would not want to mix logging concerns with business and infrastructure code. This may be achieved with various decorators, middlewares, and other patterns that allow to 'wrap' the business and infrastructure code with logging-specific code.


However, this may not always be possible, and sometimes, it may be more convenient to inject such an intermediary abstraction as ICustomLogEnricherSource into the business and infrastructure code and let it register the custom data for logging.


This way, the business and infrastructure code does not need to know anything about actual logging, while still mixing it up with some logging-aware pieces of code.


Anyway, this code will be much less coupled and much more testable. We may even introduce something like NullLogEnricherSource for a no-op implementation that will not take up any performance and memory for some very-hot-path scenarios.

Performance Considerations

As Microsoft states,

Logging should be so fast that it isn't worth the performance cost of asynchronous code.


So, whenever we enrich our log messages with additional context, we should be aware of the performance implications. Generally speaking, Serilog log enrichment implementation is very fast, at least it is on par with Microsoft logging scopes, but producing log messages will take more time if we attach more data to them.


The more custom properties we attach, the more complex objects are among them, and the more time and memory it will take to produce a log message. So, a developer should be very thoughtful about what data to attach to log messages, when to attach it, and what its lifetime should be.


Below is a small benchmark results table showing the performance and memory consumption for both Serilog log enrichment and Microsoft logging scopes. Each benchmark method would produce 20 log messages with different ways of enriching them with custom properties.


No custom properties attached:

| Method                               | Mean      | Error    | StdDev   | Gen0   | Allocated |
| SerilogLoggingNoEnrichment           |  74.22 us | 1.194 us | 1.116 us | 1.2207 |  21.25 KB |
| MicrosoftLoggingNoEnrichment         |  72.58 us | 0.733 us | 0.685 us | 1.2207 |  21.25 KB |


Each log message has three strings attached:

| Method                               | Mean      | Error    | StdDev   | Gen0   | Allocated |
| SerilogLoggingWithContext            |  77.47 us | 0.551 us | 0.516 us | 1.7090 |   28.6 KB |
| SerilogLoggingWithEnrichment         |  79.89 us | 1.482 us | 2.028 us | 1.7090 |  29.75 KB |
| MicrosoftLoggingWithEnrichment       |  81.86 us | 1.209 us | 1.131 us | 1.8311 |  31.22 KB |


Each log message has three complex objects (records) attached:

| Method                               | Mean      | Error    | StdDev   | Gen0   | Allocated |
| SerilogLoggingWithObjectContext      | 108.49 us | 1.193 us | 1.058 us | 5.3711 |  88.18 KB |
| SerilogLoggingWithObjectEnrichment   | 106.07 us | 0.489 us | 0.409 us | 5.3711 |  89.33 KB |
| MicrosoftLoggingWithObjectEnrichment |  99.63 us | 1.655 us | 1.468 us | 6.1035 | 100.28 KB |The 

The Serilog*Context methods use LogContext.PushProperty, the Serilog*Enrichment methods use the custom enricher and source implementation given in the article, while Microsoft* methods use logging scopes.


We can see that performance and memory consumption are very similar for most cases, and enriching with complex objects is more costly than enriching with simple types. The only exception is logging with Microsoft implementation when we attach records to log messages.


This is because logging scopes do not destructure complex objects, and use ToString method for serialization when attaching them to log messages. This makes Microsoft implementation slightly faster but consumes more memory.


If we would use classes with the default ToString implementation, the memory consumption would be much smaller, but these custom objects would be logged as fully qualified type names - totally useless.


And in any case, we would not be able to search and filter log messages based on the properties of these objects with Microsoft implementation due to not destructuring them.


So, this is the limitation of Microsoft logging scopes that we should be aware of - no destructuring of complex objects while simple types are logged fine and can be searchable and filterable.


Note: the benchmark source code and code samples for this article can be found in the GitHub repository

Conclusion

Proper log enrichment is a 'quality of life improvement' for the development and maintenance of the code. It allows the ability to attach additional context to log messages that are not known at the moment of the log message creation.


This context can be anything - from a simple string to a complex object, and it gets produced or known at some moment within the infrastructure or business code.


It is a way to separate logging and observability code from infrastructure and business code and to make the code more maintainable, testable, and readable.

References


Discussion (20)

Not yet any reply