How to tail log entries through the systemd journal using Python

This post describes how to use the Python bindings of systemd in order to follow and parse log entries of a specific service in real time through systemd's journal. This is an alternative way to tail log files which can be very easily customized.

Following and parsing log files in real time is one of the most common practices when it comes to monitoring services for misuse. While doing maintenance on my systems over the last few days, I experimented with journalctl, part of systemd, and realized that, in the case of services that send their log entries to the syslog, it might be a much better idea to parse the log entries as they pass through systemd’s journal instead of tailing the actual log file.

Using the tail and journalctl commands

Log files can be followed with the tail command:

~]# tail --lines 0 --follow /path/to/myservice.log

The equivalent command using journalctl is:

~]# journalctl --lines 0 --follow _SYSTEMD_UNIT=myservice.service

As you can see, the only real difference when journalctl is used is that, instead of parsing the actual log file, we use one or more special journal fields to isolate the syslog entries of the service we are interested in monitoring.

Advantages of using journalctl

Using journalctl instead of the tail command has some advantages which facilitate the whole process:

  1. You do not have to worry about log file rotation. Log file rotation is standard practice in Linux systems and, when parsing the actual log file directly, you have to take it into consideration, otherwise your monitoring script may end up tailing the rotated log file. This is not an issue when tailing log entries through systemd’s journal, because at this stage the recorded log entries have not been sent to the actual log files yet.
  2. It is easier to filter log entries. Log entries in the systemd journal are sets of fields that resemble environment variables (FIELD=value). Each field can be used to isolate the log entries we are interested in. The special journal fields you can use are described in the systemd.journal-fields(7) manual page.
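
To make the second point a bit more concrete, here is a minimal sketch of how a set of journal fields could be turned into journalctl match arguments. The helper name build_journalctl_args is hypothetical and the field values are placeholders; only the FIELD=value match syntax comes from journalctl itself.

```python
def build_journalctl_args(matches, lines=0):
    """Build a journalctl command line that follows matching entries.

    `matches` maps journal field names to the values they must have.
    This helper is an illustration, not part of any library.
    """
    args = ['journalctl', '--lines', str(lines), '--follow']
    for field, value in matches.items():
        # Each match is passed to journalctl as a FIELD=value argument
        args.append('%s=%s' % (field, value))
    return args


print(build_journalctl_args({'_SYSTEMD_UNIT': 'myservice.service'}))
```

The resulting list can be handed to subprocess.Popen as is. Adding more fields to the dictionary narrows the selection further.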

Invoking journalctl directly from the script

One way of following log entries in the systemd journal from within your Python scripts is to invoke journalctl directly using the subprocess module.

While doing research on the web about this, I found a very useful code snippet in a question on Stack Overflow which shows how to use a polling object to follow the output of the tail command in a way that does not block the execution of the script. I had never used select.poll() before, but the code snippet was all I needed in order to understand its usefulness and usage. The following is a modified version of that snippet that polls journalctl for new entries:

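import subprocess
import select

# Follow new journal entries of the service only (no history)
args = ['journalctl', '--lines', '0', '--follow', '_SYSTEMD_UNIT=myservice.service']
f = subprocess.Popen(args, stdout=subprocess.PIPE)

# Register journalctl's stdout with a polling object
p = select.poll()
p.register(f.stdout, select.POLLIN)

while True:
    # Wait up to 100ms for output, so that the loop never blocks for long
    if p.poll(100):
        line = f.stdout.readline()
        if not line:
            # An empty read means that journalctl has exited
            break
        # The pipe yields bytes; decode them before printing
        print(line.decode('utf-8', errors='replace').strip())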
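
If the script needs the individual journal fields rather than the formatted text line, journalctl can be asked for JSON output by adding '--output', 'json' to the arguments, in which case every line it prints is one JSON object. The following is a small sketch of decoding such a line; the function name parse_journal_line is hypothetical and the sample line is hand-written for illustration, not captured from a real journal.

```python
import json


def parse_journal_line(line):
    """Decode one line of `journalctl --output json` output into a dict.

    Returns None for blank lines.
    """
    text = line.decode('utf-8', errors='replace').strip()
    if not text:
        return None
    return json.loads(text)


# Hand-written sample line (an assumption, for illustration only)
sample = b'{"_SYSTEMD_UNIT": "myservice.service", "PRIORITY": "6", "MESSAGE": "started"}\n'
entry = parse_journal_line(sample)
print(entry['MESSAGE'])
```

Inside the polling loop, the bytes returned by f.stdout.readline() would be passed to this function instead of being printed directly.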

Using the Python bindings of systemd

Although calling external commands is quite efficient, it is not my method of choice while writing my scripts. I usually run such scripts as an unprivileged user, so, if the script calls external commands, extra sudo configuration may be required. Most of the time this is not a problem, but I tend to use Python bindings, if they are available, as I find this method more convenient.

So, in this case, instead of invoking journalctl as an external command, it’s a lot more convenient to use the Python bindings of systemd in order to read the systemd journal entries. If you run your script using an unprivileged user, all you need to do in order to gain read access to the systemd’s journal is to make the user a member of the systemd-journal system group.
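
As a quick sanity check before starting the monitoring script, the group membership can be verified from Python with the standard grp and pwd modules. This is only a sketch: the function name user_in_group is hypothetical, and it only inspects the system's user and group databases, not the groups of the running process.

```python
import grp
import pwd


def user_in_group(username, groupname='systemd-journal'):
    """Return True if `username` belongs to `groupname`, else False."""
    try:
        group = grp.getgrnam(groupname)
        user = pwd.getpwnam(username)
    except KeyError:
        # Unknown user or group on this system
        return False
    # A user belongs to a group either as a listed member
    # or through their primary group ID
    return username in group.gr_mem or user.pw_gid == group.gr_gid
```

Note that a newly added group membership normally takes effect only after the user logs in again.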

The following source code snippet makes use of the systemd.journal module:

import datetime
import select
import pprint
from systemd import journal

# Create a systemd.journal.Reader instance
j = journal.Reader()

# Only include entries with priority INFO or more severe
j.log_level(journal.LOG_INFO)

# Only include entries logged on this machine since it last booted
j.this_boot()
j.this_machine()

# Filter log entries; matches on different fields are combined with AND
j.add_match(
    _SYSTEMD_UNIT=u'myservice.service',
    SYSLOG_IDENTIFIER=u'myservice/module',
    _COMM=u'myservicecommand'
)

# Move to the end of the journal
j.seek_tail()

# Important! - Step back one entry so that old journal entries are discarded
j.get_previous()

# Create a poll object for journal entries
p = select.poll()

# Register the journal's file descriptor with the polling object
journal_fd = j.fileno()
poll_event_mask = j.get_events()
p.register(journal_fd, poll_event_mask)

# Wait up to 250ms at a time for new journal entries
while True:
    if p.poll(250):
        # Only react when new entries have been appended
        if j.process() == journal.APPEND:
            for entry in j:
                pprint.pprint(entry)

    # Printed on every pass to show that the loop does not block
    print('waiting ... %s' % datetime.datetime.now())

This code is an enhanced version of the "polling for journal events" example from the official documentation of the systemd Python bindings. It uses the same polling method as the previous code snippet in order to follow log entries as they pass through the systemd journal.

Here are a few important notes:

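  1. After moving the cursor to the end of the journal with j.seek_tail(), we call j.get_previous(). In my tests, if this method is not called, older journal entries are returned together with the first entry that is returned by the polling object. The documentation of the Python bindings did not help much here, but the sd_journal_seek_tail(3) manual page notes that seeking alone does not make any entry the current one and that this has to be done by a subsequent call such as next or previous, which is probably why j.get_previous() is needed. Any suggestions or ideas are appreciated.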
  2. We check the value of j.process() so that only new log entries are retrieved (j.process() == journal.APPEND).
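
The polling loop can also be wrapped in a generator, so that the rest of the script simply iterates over new entries. The following is a rough sketch under a few assumptions: follow and format_entry are hypothetical helper names, reader is expected to behave like a systemd.journal.Reader, and format_entry assumes that the reader delivers __REALTIME_TIMESTAMP as a datetime object, falling back to placeholders for anything that is missing.

```python
import datetime
import select


def follow(reader, append_state, timeout_ms=250):
    """Yield entries from `reader` as they are appended to the journal.

    Assumptions: `reader` offers fileno(), get_events(), process() and
    iteration, like systemd.journal.Reader, and `append_state` is the
    value process() returns for appended entries (journal.APPEND).
    """
    poller = select.poll()
    poller.register(reader.fileno(), reader.get_events())
    while True:
        # Ignore wake-ups that do not report appended entries
        if poller.poll(timeout_ms) and reader.process() == append_state:
            for entry in reader:
                yield entry


def format_entry(entry):
    """Render an entry dict as one line; missing fields get placeholders."""
    stamp = entry.get('__REALTIME_TIMESTAMP')
    if isinstance(stamp, datetime.datetime):
        stamp = stamp.strftime('%Y-%m-%d %H:%M:%S')
    else:
        stamp = '-'
    return '%s %s[%s]: %s' % (
        stamp,
        entry.get('SYSLOG_IDENTIFIER', '?'),
        entry.get('_PID', '?'),
        entry.get('MESSAGE', ''),
    )


# Usage with a reader `j` prepared as in the snippet above:
#     for entry in follow(j, journal.APPEND):
#         print(format_entry(entry))
```

Unlike the original loop, this version never prints a "waiting" message; it just keeps polling until an entry arrives, so any periodic work would have to be done elsewhere.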

Conclusion

Parsing log files is a very common practice in system administration. Polling systemd’s journal for new log entries, either by calling the external journalctl command from within your scripts or by using a systemd.journal.Reader instance, seems to be a more flexible method compared to using the tail command. However, do note that the techniques described in this post apply only to systems that use systemd and only to services whose log messages reach the journal, for example through the syslog. If your service writes directly to its own log file, its entries never pass through the journal, so tailing the file itself, for instance with the tail command, remains the way to go.

How to tail log entries through the systemd journal using Python by George Notaras is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright © 2016 - Some Rights Reserved
