Why so many log files?
deathduck:~/src/clean/mozharness [17:39:57] (default) 608$ ls -l logs total 24 drwxr-xr-x 8 asasaki staff 272B Oct 11 17:18 ./ drwxr-xr-x 15 asasaki staff 510B Oct 11 17:18 ../ -rw-r--r-- 1 asasaki staff 0B Oct 11 17:18 log_critical.log -rw-r--r-- 1 asasaki staff 0B Oct 11 17:18 log_error.log -rw-r--r-- 1 asasaki staff 0B Oct 11 17:18 log_fatal.log -rw-r--r-- 1 asasaki staff 5.1K Oct 11 17:18 log_info.log -rw-r--r-- 1 asasaki staff 3.6K Oct 11 17:18 log_raw.log -rw-r--r-- 1 asasaki staff 0B Oct 11 17:18 log_warning.log
Because each log file is a superset of all levels above it, you can easily glance at the warning or error log file and determine whether the script detected any warnings or errors.
(debug > info > warning > error > critical > fatal, where '>' means is-a-superset-of)
These are all standard logging module levels except for
FATAL, which I added to indicate a critical error that should also exit the script.
What do Mozharness logs look like?
Here's a snippet from running
scripts/configtest.py (without arguments):
17:18:00 INFO - Testing /src/clean/mozharness/scripts/../configs/deb_repos/staging_release_mozilla-release.py. 17:18:00 INFO - Good. 17:18:00 INFO - Testing /src/clean/mozharness/scripts/../configs/test/test.py. 17:18:00 INFO - Good. 17:18:00 INFO - 5 of 5 python config files were good. 17:18:00 INFO - ##### 17:18:00 INFO - ##### ConfigTest summary: 17:18:00 INFO - ##### 17:18:00 INFO - 17 of 17 json config files were good. 17:18:00 INFO - 5 of 5 python config files were good.
The timestamp and log level are prepended to most log files and the console output. By default, we also create a raw log without that information, which is potentially more scrape-friendly.
How does Mozharness detect errors?
The core error detection is through the ShellMixin.run_command
Generic error lists are defined in errors.py. These are ordered lists of substrings and/or regular expressions, which, if matched against a log line, determines its log level. When we hit a match, we exit the loop.
Two very important features I want to add here:
If we set
'context_lines': (10, 3), I want ten lines up and three lines down to also be marked at this log level.
This requires buffering lines before outputting to the log file and/or console.
This allows for more useful contents in the error log. Rather than only highlighting, say,
make: *** [all] Error 2, you can also include some lines of context above and/or below that line.
- explanations, to translate obscure error messages into "this probably means [insert human readable message here]". More here.