1. Best solution: Python
Using bash for a task such as this might be slightly too complex, because it lacks sufficient tools for the purpose. It can certainly be done, but only with a large amount of effort. We therefore need a set of tools that lets us parse the log file in a simpler way. Python offers such a set of tools via the datetime module.
The Python script presented below takes 3 arguments on the command line: a single- or double-quoted beginning timestamp, a single- or double-quoted ending timestamp, and the file to read. The timestamps should follow the 'Mon day HH:MM:SS' format.
#!/usr/bin/env python
import datetime as dt
import sys
import time

def convert_to_seconds(timestring):
    # Syslog timestamps carry no year, so assume the current one.
    year = str(dt.date.today().year)
    dtobj = dt.datetime.strptime(year + ' ' + timestring, '%Y %b %d %H:%M:%S')
    # time.mktime() is portable; strftime('%s') is a platform-specific extension.
    return int(time.mktime(dtobj.timetuple()))

beginning = convert_to_seconds(sys.argv[1])
ending = convert_to_seconds(sys.argv[2])

with open(sys.argv[3]) as log:
    for line in log:
        # The first three fields form the 'Mon day HH:MM:SS' timestamp.
        logstamp = " ".join(line.strip().split()[0:3])
        s_logstamp = convert_to_seconds(logstamp)
        if s_logstamp < beginning:
            continue
        if s_logstamp > ending:
            # Log entries are chronological, so nothing later can match.
            break
        print(line.strip())
        sys.stdout.flush()
Test run on /var/log/syslog:
$ ./read_log_range.py 'Feb 8 13:57:00' 'Feb 8 14:00:00' /var/log/syslog
Feb 8 13:57:59 eagle gnome-session[28631]: (nm-applet:28825): GdkPixbuf-CRITICAL **: gdk_pixbuf_composite: assertion 'dest_x >= 0 && dest_x + dest_width <= dest->width' failed
Feb 8 13:59:55 eagle org.gtk.vfs.Daemon[28480]: ** (process:2259): WARNING **: Couldn't create directory monitor on smb://x-gnome-default-workgroup/. Error: Operation not supported by backend
Feb 8 13:59:59 eagle gnome-session[28631]: (nm-applet:28825): GdkPixbuf-CRITICAL **: gdk_pixbuf_composite: assertion 'dest_x >= 0 && dest_x + dest_width <= dest->width' failed
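To make the parsing step easier to follow, here is a minimal sketch of the same idea as a function that works on a list of lines instead of a file. It compares datetime objects directly rather than converting to seconds. The names parse_stamp and lines_in_range, the fixed year 2017, and the shortened sample entries are assumptions made for illustration; they are not part of the script above.

```python
import datetime as dt

STAMP_FORMAT = '%Y %b %d %H:%M:%S'

def parse_stamp(timestring, year=2017):
    """Parse a 'Mon day HH:MM:SS' stamp. The year is assumed, since syslog omits it."""
    return dt.datetime.strptime('{} {}'.format(year, timestring), STAMP_FORMAT)

def lines_in_range(lines, beginning, ending, year=2017):
    """Return the lines whose timestamp lies between beginning and ending, inclusive."""
    start = parse_stamp(beginning, year)
    end = parse_stamp(ending, year)
    selected = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        stamp = parse_stamp(' '.join(fields[:3]), year)
        if stamp < start:
            continue
        if stamp > end:
            break  # entries are assumed to be in chronological order
        selected.append(line.rstrip('\n'))
    return selected

# Hypothetical, shortened log entries used only for this demonstration.
sample = [
    'Feb 8 13:55:01 eagle example[1]: before the range',
    'Feb 8 13:57:59 eagle example[1]: inside the range',
    'Feb 8 13:59:59 eagle example[1]: inside the range',
    'Feb 8 14:03:10 eagle example[1]: after the range',
]

for entry in lines_in_range(sample, 'Feb 8 13:57:00', 'Feb 8 14:00:00'):
    print(entry)
```

Because split() collapses repeated whitespace, the sketch should also cope with entries where a single-digit day is padded with an extra space.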
2. Bash
Of course, it is possible to do so in bash, with the use of the date and awk utilities for extracting and converting the timestamps. Below is a bash implementation of the same Python script.
#!/usr/bin/env bash
#set -x

# Convert a date string to seconds since the epoch (relies on GNU date -d).
str_to_seconds(){
    date -d"$1" +%s
}

main(){
    local date1=$1
    local date2=$2
    local logfile=$3
    local s_date1=$(str_to_seconds "$date1")
    local s_date2=$(str_to_seconds "$date2")
    local line timestamp s_timestamp

    while IFS= read -r line;
    do
        # The first three fields form the timestamp.
        timestamp=$(awk '{print $1,$2,$3}' <<< "$line")
        s_timestamp=$(str_to_seconds "$timestamp")
        [ "$s_timestamp" -lt "$s_date1" ] && continue
        if [ "$s_timestamp" -ge "$s_date1" ] && [ "$s_timestamp" -le "$s_date2" ]
        then
            printf "%s\n" "$line"
        fi
        [ "$s_timestamp" -gt "$s_date2" ] && break
    done < "$logfile"
}
main "$@"
3. Comparison of the two approaches
Naturally, the bash version takes much longer. The shell isn't made for processing large amounts of data, such as logs. For instance, on my machine with an SSD and a dual-core processor, the shell took a significant amount of time to read a file of almost 13,000 lines:
$ time ./read_log_range.sh 'Feb 8 13:56:00' 'Feb 8 14:00:00' '/var/log/syslog' &> /dev/null
0m39.18s real 0m02.48s user 0m02.68s system
$ wc -l /var/log/syslog
12878 /var/log/syslog
Even several optimizations with if statements didn't help. Compare that with its Python alternative:
$ time ./read_log_range.py 'Feb 8 13:56:00' 'Feb 8 14:00:00' '/var/log/syslog' &> /dev/null
0m00.60s real 0m00.53s user 0m00.07s system
$ wc -l /var/log/syslog
12878 /var/log/syslog
As you can see, Python was about 65 times faster than its bash counterpart.
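The "about 65 times" figure follows directly from the two real (wall-clock) times reported by time above. As a quick check, with variable names chosen only for this illustration:

```python
# Wall-clock ("real") times copied from the two `time` runs above, in seconds.
bash_real = 39.18
python_real = 0.60

# Ratio of the two runtimes: how many times faster the Python run was.
speedup = bash_real / python_real
print('speedup: {:.1f}x'.format(speedup))  # prints "speedup: 65.3x"
```

This is a single measurement on one machine, so the ratio is best read as an order of magnitude rather than a precise benchmark.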
$date_1 not @date_1 in your first loop. Same for the second. Also ${choice,,} isn't correct, just use ${choice} or $choice – Sergiy Kolodyazhnyy Feb 05 '17 at 23:31
Kay, will try now :) – Temple Pate Feb 05 '17 at 23:36
-mtime works - if you want to use absolute datetimes then probably what you want is something like -newermt and -not -newermt – steeldriver Feb 05 '17 at 23:44
find /srv/log/mail tells me you have a folder /srv/log/mail and you want to find all files modified between date_1 and date_2 – Sergiy Kolodyazhnyy Feb 06 '17 at 00:10
Feb 23 13:54 or 13:56, that line won't be found by sed. Like Jacob said, it can be done fairly easily in python. What I want to know is this: where does time stamp appear ? Is this the first couple words on the line ? Like for example, Feb 8 16:12:34 eagle systemd[1]: Started CUPS Scheduler. ? – Sergiy Kolodyazhnyy Feb 08 '17 at 23:20