Parsing GitLab logs with jq
We recommend using log aggregation and search tools like Kibana and Splunk whenever possible,
but if they are not available you can still quickly parse GitLab logs in JSON format
(the default in GitLab 12.0 and later) using jq. You can also use the fast-stats tool
to extract performance statistics from GitLab logs.
What is JQ?
As noted in its manual, jq is a command-line JSON processor. The following examples
include use cases targeted for parsing GitLab log files.
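For instance, jq can pull a single field out of a JSON document; the sample object below is illustrative rather than a real log line:
echo '{"status": 200, "duration_s": 0.42}' | jq '.status'
This prints 200. The same field-access and select syntax appears throughout the examples below.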
Parsing Logs
The examples listed below address their respective log files by their relative Omnibus paths and default filenames. Find the respective full paths in the GitLab logs sections.
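On an Omnibus installation these relative paths typically live under /var/log/gitlab, so you can either pass full paths or change into that directory first, for example:
cd /var/log/gitlab && jq . gitlab-rails/production_json.log -C | less -R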
General Commands
Pipe colorized jq output into less
jq . <FILE> -C | less -R
Search for a term and pretty-print all matching lines
grep <TERM> <FILE> | jq .
Skip invalid lines of JSON
jq -cR 'fromjson?' file.json | jq <COMMAND>
By default jq errors out when it encounters a line that is not valid JSON.
This skips over all invalid lines and parses the rest.
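For example, to apply the 5XX filter from the next section while skipping malformed lines (the filename and filter here are illustrative):
jq -cR 'fromjson?' production_json.log | jq 'select(.status >= 500)'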
Print a JSON log’s time range
cat log.json | (head -1; tail -1) | jq '.time'
Use zcat if the file has been rotated and compressed:
zcat @400000006026b71d1a7af804.s | (head -1; tail -1) | jq '.time'
zcat some_json.log.25.gz | (head -1; tail -1) | jq '.time'
Get activity for correlation ID across multiple JSON logs in chronological order
grep -hR <correlationID> | jq -c -R 'fromjson?' | jq -C -s 'sort_by(.time)' | less -R
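Because grep -R searches the current directory when no path is given, a typical invocation runs from the Omnibus log directory (the correlation ID is a placeholder):
cd /var/log/gitlab
grep -hR <correlationID> | jq -c -R 'fromjson?' | jq -C -s 'sort_by(.time)' | less -R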
Parsing gitlab-rails/production_json.log and gitlab-rails/api_json.log
Find all requests with a 5XX status code
jq 'select(.status >= 500)' <FILE>
Top 10 slowest requests
jq -s 'sort_by(-.duration_s) | limit(10; .[])' <FILE>
Find and pretty print all requests related to a project
grep <PROJECT_NAME> <FILE> | jq .
Find all requests with a total duration > 5 seconds
jq 'select(.duration_s > 5)' <FILE>
Find all project requests with more than 5 rugged calls
grep <PROJECT_NAME> <FILE> | jq 'select(.rugged_calls > 5)'
Find all requests with a Gitaly duration > 10 seconds
jq 'select(.gitaly_duration_s > 10)' <FILE>
Find all requests with a queue duration > 10 seconds
jq 'select(.queue_duration_s > 10)' <FILE>
Top 10 requests by # of Gitaly calls
jq -s 'map(select(.gitaly_calls != null)) | sort_by(-.gitaly_calls) | limit(10; .[])' <FILE>
Output a specific time range
jq 'select(.time >= "2023-01-10T00:00:00Z" and .time <= "2023-01-10T12:00:00Z")' <FILE>
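Because the time values are ISO 8601 UTC strings, plain string comparison orders them chronologically, and the range filter can be combined with any other condition, for example restricting the window to 5XX responses:
jq 'select(.time >= "2023-01-10T00:00:00Z" and .time <= "2023-01-10T12:00:00Z" and .status >= 500)' <FILE>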
Parsing gitlab-rails/production_json.log
Print the top three controller methods by request volume and their three longest durations
jq -s -r 'group_by(.controller+.action) | sort_by(-length) | limit(3; .[]) | sort_by(-.duration_s) | "CT: \(length)\tMETHOD: \(.[0].controller)#\(.[0].action)\tDURS: \(.[0].duration_s), \(.[1].duration_s), \(.[2].duration_s)"' production_json.log
Example output
CT: 2721 METHOD: SessionsController#new DURS: 844.06, 713.81, 704.66
CT: 2435 METHOD: MetricsController#index DURS: 299.29, 284.01, 158.57
CT: 1328 METHOD: Projects::NotesController#index DURS: 403.99, 386.29, 384.39
Parsing gitlab-rails/api_json.log
Print top three routes with request count and their three longest durations
jq -s -r 'group_by(.route) | sort_by(-length) | limit(3; .[]) | sort_by(-.duration_s) | "CT: \(length)\tROUTE: \(.[0].route)\tDURS: \(.[0].duration_s), \(.[1].duration_s), \(.[2].duration_s)"' api_json.log
Example output
CT: 2472 ROUTE: /api/:version/internal/allowed DURS: 56402.65, 38411.43, 19500.41
CT: 297 ROUTE: /api/:version/projects/:id/repository/tags DURS: 731.39, 685.57, 480.86
CT: 190 ROUTE: /api/:version/projects/:id/repository/commits DURS: 1079.02, 979.68, 958.21
Print top API user agents
jq --raw-output '[.route, .ua] | @tsv' api_json.log | sort | uniq -c | sort -n
Example output:
89 /api/:version/usage_data/increment_unique_users # plus browser details
567 /api/:version/jobs/:id/trace gitlab-runner # plus version details
1234 /api/:version/internal/allowed GitLab-Shell
This sample output looks typical. A custom tool or script might be causing a high load if the output contains many:
- Third-party libraries like python-requests or curl.
- GitLab CLI clients.
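To inspect the requests behind one of these user agents, filter on the ua field directly; the python-requests pattern below is only an example:
jq 'select(.ua != null and (.ua | test("python-requests")))' api_json.log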
You can also use fast-stats top to extract performance statistics.
Parsing gitlab-workhorse/current
Print top Workhorse user agents
jq --raw-output '[.uri, .user_agent] | @tsv' current | sort | uniq -c | sort -n
Example output:
89 /api/graphql # plus browser details
567 /api/v4/internal/allowed GitLab-Shell
1234 /api/v4/jobs/request gitlab-runner # plus version details
Similar to the API user agent data, deviations from this common order might indicate scripts that could be optimized.
The performance impact of runners checking for new jobs can be reduced by increasing
the check_interval setting, for example.
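To gauge how much Workhorse traffic comes from runner polling, you can count the job-request calls per user agent, a sketch based on the fields used above and the endpoint shown in the sample output:
jq --raw-output 'select(.uri == "/api/v4/jobs/request") | .user_agent' current | sort | uniq -c | sort -n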
Parsing gitlab-rails/geo.log
Find most common Geo sync errors
If the geo:status Rake task repeatedly reports that some items never reach 100%,
the following command helps to focus on the most common errors.
jq --raw-output 'select(.severity == "ERROR") | [.project_path, .message] | @tsv' geo.log | sort | uniq -c | sort | tail
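Once a noisy project stands out, you can narrow the output to just its error messages (<PROJECT_PATH> is a placeholder):
jq --raw-output 'select(.severity == "ERROR" and .project_path == "<PROJECT_PATH>") | .message' geo.log | sort | uniq -c | sort -n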
Parsing gitaly/current
Use the following examples to troubleshoot Gitaly.
Find all Gitaly requests sent from web UI
jq 'select(."grpc.meta.client_name" == "gitlab-web")' current
Find all failed Gitaly requests
jq 'select(."grpc.code" != null and ."grpc.code" != "OK")' current
Find all requests that took longer than 30 seconds
jq 'select(."grpc.time_ms" > 30000)' current
Print top ten projects by request volume and their three longest durations
jq --raw-output --slurp '
map(
select(
."grpc.request.glProjectPath" != null
and ."grpc.request.glProjectPath" != ""
and ."grpc.time_ms" != null
)
)
| group_by(."grpc.request.glProjectPath")
| sort_by(-length)
| limit(10; .[])
| sort_by(-."grpc.time_ms")
| [
length,
.[0]."grpc.time_ms",
.[1]."grpc.time_ms",
.[2]."grpc.time_ms",
.[0]."grpc.request.glProjectPath"
]
| @sh' current \
| awk 'BEGIN { printf "%7s %10s %10s %10s\t%s\n", "CT", "MAX DURS", "", "", "PROJECT" }
{ printf "%7u %7u ms, %7u ms, %7u ms\t%s\n", $1, $2, $3, $4, $5 }'
Example output
CT MAX DURS PROJECT
206 4898 ms, 1101 ms, 1032 ms 'groupD/project4'
109 1420 ms, 962 ms, 875 ms 'groupEF/project56'
663 106 ms, 96 ms, 94 ms 'groupABC/project123'
...
Find all projects affected by a fatal Git problem
grep "fatal: " current | \
jq '."grpc.request.glProjectPath"' | \
sort | uniq
Parsing gitlab-shell/gitlab-shell.log
Use these examples to investigate Git calls made over SSH, available in GitLab 12.10 and later.
Find the top 20 calls by project and user:
jq --raw-output --slurp '
map(
select(
.username != null and
.gl_project_path != null
)
)
| group_by(.username+.gl_project_path)
| sort_by(-length)
| limit(20; .[])
| "count: \(length)\tuser: \(.[0].username)\tproject: \(.[0].gl_project_path)" ' \
gitlab-shell.log
Find the top 20 calls by project, user, and command:
jq --raw-output --slurp '
map(
select(
.command != null and
.username != null and
.gl_project_path != null
)
)
| group_by(.username+.gl_project_path+.command)
| sort_by(-length)
| limit(20; .[])
| "count: \(length)\tcommand: \(.[0].command)\tuser: \(.[0].username)\tproject: \(.[0].gl_project_path)" ' \
gitlab-shell.log