
I am trying to set up a Bash script that lets me input a date range, confirm the range, and then actually search for that range. But each time I try, it comes up empty for some reason. I was following the logic posted HERE, which is a bit old but hopefully still has some accurate parts to it. I know the code is rough and can be cleaned up, but I'm still rather new at this and any help would be very appreciated.

#!/bin/bash

date_1=''
date_2=''

read -p "Please Enter the Beggining Time. Exp. Aug 1 00:00:01 " date_1;

read -p "Please Enter the Beggining Time. Exp. Aug 1 00:00:01 " date_2;

while :
 do
    read -p "Is this Date correct? @date_1" choice
    case ${choice} in
        y|ye|yes) break;;
        n|no) echo "Try again"; exec $0;;
    esac
done
while :
 do
    read -p "Is this Date correct? @date_2" choice
    case ${choice} in
        y|ye|yes) break;;
        n|no) echo "Try again"; exec $0;;
    esac
done

echo $date_1 , $date_2
find /srv/log/mail -mtime $(date +%s -d"$date_1") -mtime $(date +%s -d"$date_2")
Temple Pate
  • Should be $date_1 not @date_1 in your first loop. Same for the second. Also ${choice,,} isn't correct, just use ${choice} or $choice – Sergiy Kolodyazhnyy Feb 05 '17 at 23:31
  • Sorry, I fixed the @date in the original git post, I copied from my notes :( Kay, will try now :) – Temple Pate Feb 05 '17 at 23:36
  • Yea, if I run the find by itself it seems to be what's broken. The original command I used to test still comes up blank. find /srv/log/mail -mtime $(date +%s -d"Feb 5 00:01:28") -mtime $(date +%s -d"Feb 5 00:01:27") – Temple Pate Feb 05 '17 at 23:42
  • Yes, you seem to have a fundamental misunderstanding of how find's -mtime works - if you want to use absolute datetimes, then probably what you want is something like -newermt and -not -newermt (a sketch of this appears after these comments) – steeldriver Feb 05 '17 at 23:44
  • @steeldriver you're not wrong.. kinda just picked it up from before. I was reading up on it now :) – Temple Pate Feb 05 '17 at 23:46
  • @steeldriver, I was reading, and the only documentation I've found for newermt is via files only. The directory listed above is an actual file, would that matter? – Temple Pate Feb 05 '17 at 23:54
  • Just a question then: are you trying to find files that were modified within a specific range, or are you trying to find timestamps within a specific file? find /srv/log/mail tells me you have a folder /srv/log/mail and you want to find all files modified between date_1 and date_2 – Sergiy Kolodyazhnyy Feb 06 '17 at 00:10
  • I'm trying to find the time stamp of a file. Ah yes, that would explain a lot.. – Temple Pate Feb 06 '17 at 00:19
  • @TemplePate so... you're trying to find all files that have a timestamp within a specific range, right? See, we need to clearly define what you're trying to achieve here, and then we can proceed to actually fixing the script. I have already rewritten your script to search for files in the range from date 1 to date 2, but I need to confirm that that's what you're asking about – Sergiy Kolodyazhnyy Feb 06 '17 at 00:44
  • @Serg (redacted, thought of something else). My goal is to search through mail logs, but I want to make it end user friendly so someone could input a date range, and then search for said email address within that date range. I was just going to add grep into the script. But overall that's my goal.. – Temple Pate Feb 06 '17 at 16:09
  • Would this command work then?
    sed -n '/Feb 23 13:55/,/Feb 23 14:00/p' /var/log/mail.log
    – Temple Pate Feb 08 '17 at 22:54
  • @TemplePate No, that sed command wouldn't really work, because if the time stamp is Feb 23 13:54 or 13:56, that line won't be found by sed. Like Jacob said, it can be done fairly easily in python. What I want to know is this: where does time stamp appear ? Is this the first couple words on the line ? Like for example, Feb 8 16:12:34 eagle systemd[1]: Started CUPS Scheduler. ? – Sergiy Kolodyazhnyy Feb 08 '17 at 23:20
  • You want to do what? Isn't it easier to just find the date and grep the one you want? – Pavlos Theodorou Feb 09 '17 at 07:53
  • @PavlosTheodorou The reason I was working on this, is for the sake of a T1 team that isn't familiar with grep. :) – Temple Pate Feb 09 '17 at 15:30
  • Just a reminder that if my answer solved your problem, please award the bounty. Half of bounties that aren't awarded will be given to the highest voted question, so half of it goes to waste if unawarded – Sergiy Kolodyazhnyy Feb 12 '17 at 22:16
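
Pulling together the fixes suggested in these comments ($date_1 instead of @date_1, and -newermt instead of -mtime for absolute dates), a minimal sketch of what the question's script could look like is shown below. It assumes GNU find and keeps the /srv/log/mail path from the question; note that -newermt matches files by their modification time, not timestamps inside a file, which is exactly the distinction discussed above.

#!/bin/bash

# Ask for the start and end of the range
read -p "Please Enter the Beginning Time. Exp. Aug 1 00:00:01 " date_1
read -p "Please Enter the Ending Time. Exp. Aug 1 00:00:01 " date_2

# Confirm each date; restart the script if either is wrong
for d in "$date_1" "$date_2"; do
    read -p "Is this Date correct? $d " choice
    case ${choice} in
        y|ye|yes) ;;
        *) echo "Try again"; exec "$0";;
    esac
done

# GNU find: files modified after date_1 and not after date_2
find /srv/log/mail -newermt "$date_1" ! -newermt "$date_2"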

1 Answer


1. Best solution: Python

Using bash for a task such as this might be overly complex, because it doesn't have sufficient tools for the purpose. Certainly it can be done, but with a very large amount of effort. Therefore we need a set of tools that lets us parse the log file in a simpler way. Python offers such tools via its datetime module.

The Python script presented below takes 3 arguments on the command line: a single- or double-quoted beginning timestamp, a single- or double-quoted ending timestamp, and the file to read. The timestamps should be in the `Mon day HH:MM:SS` format used at the start of each syslog-style line (for example, `Feb 8 13:57:00`).

#!/usr/bin/env python
import datetime as dt
import sys

def convert_to_seconds(timestring):
    # Log lines carry no year, so assume the current year before parsing
    year = str(dt.date.today().year)
    dtobj = dt.datetime.strptime(year + ' ' + timestring, '%Y %b %d %H:%M:%S')
    # %s (seconds since the epoch) is a platform-specific strftime extension, but works on Linux
    return int(dtobj.strftime('%s'))

beginning = convert_to_seconds(sys.argv[1])
ending = convert_to_seconds(sys.argv[2])

with open(sys.argv[3]) as log:
    for line in log:
        # The timestamp is the first three whitespace-separated fields, e.g. "Feb  8 13:57:59"
        logstamp = " ".join(line.strip().split()[0:3])
        s_logstamp = convert_to_seconds(logstamp)
        if s_logstamp < beginning: continue
        if s_logstamp >= beginning and s_logstamp <= ending:
            print(line.strip())
            sys.stdout.flush()
        if s_logstamp > ending: break

Test run on /var/log/syslog:

$ ./read_log_range.py 'Feb 8 13:57:00'  'Feb 8 14:00:00' /var/log/syslog                              
Feb  8 13:57:59 eagle gnome-session[28631]: (nm-applet:28825): GdkPixbuf-CRITICAL **: gdk_pixbuf_composite: assertion 'dest_x >= 0 && dest_x + dest_width <= dest->width' failed
Feb  8 13:59:55 eagle org.gtk.vfs.Daemon[28480]: ** (process:2259): WARNING **: Couldn't create directory monitor on smb://x-gnome-default-workgroup/. Error: Operation not supported by backend
Feb  8 13:59:59 eagle gnome-session[28631]: (nm-applet:28825): GdkPixbuf-CRITICAL **: gdk_pixbuf_composite: assertion 'dest_x >= 0 && dest_x + dest_width <= dest->width' failed
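
Since the stated goal in the comments is to let a first-tier tech pull mail-log entries for a given address within a date range, the output of the script can simply be piped through grep. The address and log path below are only hypothetical placeholders:

$ ./read_log_range.py 'Feb 8 13:57:00' 'Feb 8 14:00:00' /var/log/mail.log | grep 'user@example.com'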

2. Bash

Of course, it is possible to do the same in bash, with the help of the date and awk utilities for extracting and converting the timestamps. Below is the bash implementation of the same logic as the Python script.

#!/usr/bin/env bash
#set -x

# Convert a "Mon day HH:MM:SS" string to seconds since the epoch
str_to_seconds(){
    date -d"$1" +%s
}

main(){
    local date1=$1
    local date2=$2
    local logfile=$3

    local s_date1=$(str_to_seconds "$date1")
    local s_date2=$(str_to_seconds "$date2")

    while IFS= read -r line;
    do
        # The timestamp is the first three fields of each log line
        timestamp=$(awk '{print $1,$2,$3}' <<< "$line")
        s_timestamp=$(str_to_seconds "$timestamp")
        [ $s_timestamp -lt $s_date1 ] && continue
        if [ $s_timestamp -ge $s_date1 ] && [ $s_timestamp -le $s_date2 ]
        then
            printf "%s\n" "$line"
        fi
        [ $s_timestamp -gt $s_date2 ] && break

    done < "$logfile"
}

main "$@"

3. Comparison of the two approaches

Naturally, the bash version takes much longer. The shell isn't made for processing large amounts of data such as logs; in particular, the loop above forks external date and awk processes for every single line. For instance, on my machine with an SSD and a dual-core processor, the shell took a significant amount of time to read an almost 13,000-line file:

$ time ./read_log_range.sh 'Feb 8 13:56:00'  'Feb 8 14:00:00' '/var/log/syslog' &> /dev/null          
    0m39.18s real     0m02.48s user     0m02.68s system

$ wc -l /var/log/syslog 
12878 /var/log/syslog

Even several optimizations with if statements didn't help. Compare that with its Python alternative:

$ time ./read_log_range.py 'Feb 8 13:56:00'  'Feb 8 14:00:00' '/var/log/syslog' &> /dev/null          
    0m00.60s real     0m00.53s user     0m00.07s system

$ wc -l /var/log/syslog                                                                               
12878 /var/log/syslog

As you can see, the Python version was about 65 times faster than its bash counterpart.

Sergiy Kolodyazhnyy
  • Yeah right, all this to do what? To get a few files per specific date? I would rather find + grep the date I want – Pavlos Theodorou Feb 09 '17 at 07:54
  • @PavlosTheodorou All this to print all lines from time A to time B. And grep is a line-matching tool, so it cannot be very accurate. Also, there's no need for find; OP knows which file to use. Maybe you want to post an answer? I would be interested to see how you do it with grep, if you can – Sergiy Kolodyazhnyy Feb 09 '17 at 07:57
  • I'm very glad you mentioned the cons of bash; it's the only thing I am even slightly familiar with, and the log file we use here is 100k+ lines. (We just switched to SSDs, but apparently that won't help with the bash side..) – Temple Pate Feb 09 '17 at 15:46
  • @TemplePate Indeed, bash is slow and switching to an SSD doesn't help much. Are there any other things you're familiar with? Perhaps Perl or awk? – Sergiy Kolodyazhnyy Feb 09 '17 at 21:28
  • @serg unfortunately not.. I've learned them before, but since a NOC tech doesn't usually get paid to do these things, especially just interfaces for T1 techs, I've forgotten most of it already :/ – Temple Pate Feb 10 '17 at 22:55