
Parsing Expressions in LogQL

Introduction

When working with log data in Grafana Loki, you often need to extract specific fields or values from unstructured log lines. This is where parsing expressions come into play. Parsing expressions are powerful features in LogQL that allow you to transform raw log data into structured data that you can then query, filter, and analyze.

In this guide, we'll explore how parsing expressions work in LogQL, the different types available, and how to use them effectively to extract meaningful information from your logs.

What are Parsing Expressions?

Parsing expressions in LogQL are operations that extract data from log lines and create structured data with labeled fields. They take unstructured text and convert it into a set of key-value pairs that can be referenced in subsequent expressions.

LogQL supports several types of parsing expressions:

  • JSON
  • Logfmt
  • Regexp (regular expressions with named capture groups)
  • Unpack
  • Pattern (template-based extraction)

Let's explore each of these methods.

JSON Parsing

If your logs are already formatted as JSON, LogQL makes it easy to extract and query fields from them.

Basic JSON Parsing

The json parser extracts all fields from a JSON log line.

```logql
{job="myapp"} | json
```

Given a log line like:

```json
{"level":"info","msg":"Request completed","method":"GET","path":"/api/users","status":200,"duration_ms":42,"user_id":"user123"}
```

The json parser would extract all fields: level, msg, method, path, status, duration_ms, and user_id.
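Outside Loki, a quick way to see what the json parser has to work with is to load the same line in Python. This is just a sketch of the key/value extraction; note that Loki itself stores all extracted label values as strings (status would become "200"):

```python
import json

# The sample log line from above.
line = ('{"level":"info","msg":"Request completed","method":"GET",'
        '"path":"/api/users","status":200,"duration_ms":42,"user_id":"user123"}')

# json.loads yields the same key/value pairs the json parser exposes as labels
# (Loki would coerce the values to strings).
fields = json.loads(line)
print(sorted(fields))
```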

Extracting Specific Fields

You can also extract only specific fields using the parameter form:

```logql
{job="myapp"} | json method, status, duration_ms
```

This would only extract the method, status, and duration_ms fields.

Nested JSON Objects

For nested JSON objects, use the label="expression" form, which assigns a label name to a JSON path:

```logql
{job="myapp"} | json request_method="request.method", request_path="request.path", response_status="response.status"
```

Given a log line:

```json
{
  "level": "info",
  "msg": "Request completed",
  "request": {
    "method": "GET",
    "path": "/api/users"
  },
  "response": {
    "status": 200,
    "duration_ms": 42
  },
  "user_id": "user123"
}
```

This extracts the labels request_method, request_path, and response_status. Plain `| json` reaches the same values by flattening nested keys, joining them with underscores, which happens to produce the same label names here.
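The underscore flattening can be sketched in Python. The flatten helper below is hypothetical, written only to mirror the label names Loki produces for nested objects:

```python
import json

# The nested sample line from above.
line = ('{"level":"info","msg":"Request completed",'
        '"request":{"method":"GET","path":"/api/users"},'
        '"response":{"status":200,"duration_ms":42},"user_id":"user123"}')

def flatten(obj, prefix=""):
    """Join nested keys with '_', mirroring how | json names labels."""
    out = {}
    for key, value in obj.items():
        if isinstance(value, dict):
            out.update(flatten(value, prefix + key + "_"))
        else:
            out[prefix + key] = value
    return out

labels = flatten(json.loads(line))
print(labels["request_method"], labels["response_status"])
```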

Logfmt Parsing

Logfmt is a key-value logging format commonly used in applications. LogQL can parse logfmt-formatted logs using the logfmt parser.

Basic Logfmt Parsing

```logql
{job="myapp"} | logfmt
```

Given a log line:

```
level=info msg="Request completed" method=GET path=/api/users status=200 duration_ms=42 user_id=user123
```

This would extract all key-value pairs.
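The extraction can be sketched in a few lines of Python. parse_logfmt below is a hypothetical helper that leans on shlex to honor quoted values; unlike Loki's real parser it does no escape handling:

```python
import shlex

line = ('level=info msg="Request completed" method=GET '
        'path=/api/users status=200 duration_ms=42 user_id=user123')

def parse_logfmt(line):
    """Minimal logfmt split: whitespace-separated key=value pairs,
    with shlex stripping the quotes around quoted values."""
    pairs = {}
    for token in shlex.split(line):
        if "=" in token:
            key, _, value = token.partition("=")
            pairs[key] = value
    return pairs

fields = parse_logfmt(line)
print(fields["msg"], fields["status"])
```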

Extracting Specific Fields

Similar to JSON parsing, you can extract specific fields:

```logql
{job="myapp"} | logfmt method, status, duration_ms
```

Parsing with Regular Expressions

Regular expressions provide a powerful way to extract fields from log lines that follow a specific pattern but aren't in a standard format like JSON or logfmt.

Using the regexp Parser

The basic syntax for regex pattern parsing is:

```logql
{job="myapp"} | regexp `(?P<field1>pattern1)(?P<field2>pattern2)...`
```

For example, to parse an access log:

```logql
{job="nginx"} | regexp `(?P<ip>\S+) - (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" (?P<status>\d+) (?P<bytes>\d+)`
```

Given a log line:

```
192.168.1.1 - user123 [10/Oct/2023:13:55:36 +0000] "GET /api/users HTTP/1.1" 200 1234
```

This would extract ip, user, timestamp, method, path, protocol, status, and bytes fields.

Named Capture Groups

The (?P<name>pattern) syntax defines named capture groups. These names become the labels for your extracted data.
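You can exercise the same pattern in Python, since `(?P<name>...)` named captures behave alike in both engines (Loki uses Go's RE2 syntax; this particular pattern is valid in both, though Python's re module is not RE2):

```python
import re

# The same access-log pattern and sample line as above.
pattern = re.compile(
    r'(?P<ip>\S+) - (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" '
    r'(?P<status>\d+) (?P<bytes>\d+)'
)

line = ('192.168.1.1 - user123 [10/Oct/2023:13:55:36 +0000] '
        '"GET /api/users HTTP/1.1" 200 1234')

# groupdict() returns the named captures as a dict, one entry per label.
fields = pattern.match(line).groupdict()
print(fields["method"], fields["status"])
```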

Unpack Parsing

The unpack parser is the counterpart of Promtail's pack stage, which wraps a log line in a JSON object: the original line is stored under the _entry key, and any embedded labels become sibling keys. unpack takes no arguments:

```logql
{job="myapp"} | unpack
```

With a packed log line:

```json
{"_entry":"Request processed","user":"user123","region":"us-west"}
```

unpack promotes user and region to labels and restores the line content to the value of _entry ("Request processed"). For JSON that is merely string-encoded inside another field, unpack does not apply; instead, re-parse the reformatted line, e.g. `| json | line_format "{{.metadata}}" | json`.
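The unpack semantics — pop _entry as the restored line, keep the remaining keys as labels — can be sketched in Python (a toy model of the behavior, not Loki's implementation):

```python
import json

# A line packed by Promtail's pack stage: the original message lives
# under "_entry"; everything else is an embedded label.
packed = '{"_entry":"Request processed","user":"user123","region":"us-west"}'

def unpack(line):
    """Pop _entry as the new log line; remaining keys become labels."""
    obj = json.loads(line)
    entry = obj.pop("_entry")
    return entry, obj

entry, labels = unpack(packed)
print(entry, labels)
```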

Pattern Parsing

The pattern parser extracts fields using a template in which <name> captures a field and the literal text between captures anchors the match. For simple, consistently delimited lines it is easier to write and cheaper to evaluate than a regular expression.

```logql
{job="myapp"} | pattern "<level> - <message>"
```

Given the log line `info - user logged in`, this extracts level="info" and message="user logged in". (By contrast, line_format does not parse anything: it rewrites the log line from already-extracted labels, as in `| json | line_format "{{.level}}: {{.msg}}"`.)
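The `<name>` capture behavior can be mimicked in Python by compiling a template into a regex with named groups. This is only a sketch of the semantics — Loki's pattern parser is not regex-based:

```python
import re

def compile_pattern(template):
    """Turn a template like '<level> - <message>' into a regex whose
    named groups mimic the pattern parser's captures."""
    parts = re.split(r"<(\w+)>", template)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)            # literal text between captures
        elif i < len(parts) - 2:
            regex += f"(?P<{part}>.*?)"         # inner capture: non-greedy
        else:
            regex += f"(?P<{part}>.*)"          # trailing capture: take the rest
    return re.compile(regex)

fields = compile_pattern("<level> - <message>").match("info - user logged in").groupdict()
print(fields)
```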

Practical Examples

Let's look at some real-world examples of how parsing expressions can be used.

Example 1: Analyzing API Response Times

```logql
avg_over_time(
  {job="api-gateway"}
    | json
    | status >= 200 and status < 300
    | unwrap duration_ms [5m]
)
```

This query:

  1. Extracts JSON fields from API gateway logs
  2. Filters for successful responses (status 2xx)
  3. Unwraps the duration_ms field for analysis
  4. Calculates the average response time over 5-minute windows
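The filter-and-average shape of this query can be sanity-checked in plain Python over a few hypothetical gateway lines (the sample values below are invented for illustration):

```python
import json

# Hypothetical api-gateway log lines.
lines = [
    '{"status":200,"duration_ms":40}',
    '{"status":200,"duration_ms":44}',
    '{"status":500,"duration_ms":900}',
]

# Keep only 2xx responses, then average duration_ms -- the same
# filter + unwrap + avg_over_time shape as the query above.
durations = [
    rec["duration_ms"]
    for rec in map(json.loads, lines)
    if 200 <= rec["status"] < 300
]
avg = sum(durations) / len(durations)
print(avg)
```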

Example 2: Finding Failed Login Attempts

```logql
topk(10,
  sum by (user_id, ip_address) (
    count_over_time(
      {job="auth-service"}
        | json
        | event="login_attempt" and success="false" [15m]
    )
  )
)
```

This query:

  1. Parses JSON from authentication service logs
  2. Filters for failed login attempts
  3. Counts failed attempts per user and IP address over 15-minute windows
  4. Keeps the top 10 (user_id, ip_address) pairs by count to surface potential brute-force attacks
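The same grouping logic, sketched over a few hypothetical auth-service lines; Counter.most_common plays the role of topk here:

```python
import json
from collections import Counter

# Hypothetical auth-service events, shaped like the logs above.
lines = [
    '{"event":"login_attempt","success":"false","user_id":"u1","ip_address":"10.0.0.5"}',
    '{"event":"login_attempt","success":"false","user_id":"u1","ip_address":"10.0.0.5"}',
    '{"event":"login_attempt","success":"true","user_id":"u2","ip_address":"10.0.0.6"}',
]

# Count failures per (user_id, ip_address) pair.
failures = Counter(
    (rec["user_id"], rec["ip_address"])
    for rec in map(json.loads, lines)
    if rec["event"] == "login_attempt" and rec["success"] == "false"
)
print(failures.most_common(10))
```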

Example 3: Analyzing Complex Nested Log Structures

```logql
{job="microservice"}
| json
| request_client_ip=~"192\.168\..*"
| response_headers_contentType="application/json"
```

(Recall that the json parser flattens nested keys with underscores, so request.client.ip becomes the label request_client_ip.)

This query:

  1. Parses JSON from microservice logs
  2. Filters for requests from the 192.168.* internal network
  3. Further filters for responses with JSON content type

Best Practices for Parsing Expressions

  1. Extract Only What You Need: Instead of parsing all fields, extract only the ones relevant to your analysis to improve performance.

  2. Use Labels Wisely: Labels created by parsing expressions contribute to Loki's index; excessive unique label values can impact performance.

  3. Pre-Process Logs When Possible: If you control the log format, structure logs as JSON or logfmt at the source to make parsing easier.

  4. Test Regular Expressions: Complex regex patterns can be resource-intensive. Test them on a small sample of data first.

  5. Use Line Format for Simple Cases: For simple parsing needs, line_format can be easier to work with than regular expressions.

Summary

Parsing expressions are fundamental to getting the most out of Grafana Loki by transforming unstructured log data into structured data you can analyze. We've covered:

  • JSON parsing for structured logs
  • Logfmt parsing for key-value formatted logs
  • Regexp parsing for custom log formats
  • Unpack parsing for logs packed by Promtail
  • Pattern parsing for template-based extraction

By mastering these parsing techniques, you'll be able to extract valuable insights from your log data and build powerful monitoring and alerting systems with Grafana Loki.

Exercises

  1. Parse an Nginx access log to extract the HTTP method, path, status code, and response time.
  2. Create a query to find the top 10 URLs with the highest average response times in the last hour.
  3. Parse logs from a custom application format and visualize error rates over time.
  4. Extract fields from a nested JSON structure and create alerts based on specific field values.
  5. Combine multiple parsing expressions to transform and analyze complex log structures.

