
Software Engineering, Architecture, Web Development and beyond…

Analyzing Log Files with awk (gawk), grep

Suppose you want to analyze server logs to find out whether the application server failed at startup, that is, whether any error messages or exceptions were logged after the start script was executed, or whether it started successfully. Or maybe you want to grep for something between two points in time. Here is how you can implement such a script using awk and grep, without needing another programming language like Perl or Python.

Here is what my log file looks like:

INFO   | jvm 1    | 2012/01/04 17:52:17 | INFO: JK: ajp13 listening on /0.0.0.0:8009
INFO   | jvm 1    | 2012/01/04 17:52:17 | 04.01.2012 17:52:17 org.apache.jk.server.JkMain start
INFO   | jvm 1    | 2012/01/04 17:52:17 | INFO: Jk running ID=0 time=0/38  config=null
INFO   | jvm 1    | 2012/01/04 17:52:17 | 04.01.2012 17:52:17 org.apache.catalina.startup.Catalina start
INFO   | jvm 1    | 2012/01/04 17:52:17 | INFO: Server startup in 20703 ms

To compare two dates semantically (I don't mean a lexical comparison here), we need to convert the string expressions into an actual time value, so that we can tell numerically, in seconds since the epoch, whether the first date is later than the second or vice versa. You could just use grep to find occurrences of a string in a text, but in this case we want to narrow the context to a time interval and grep within it.
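The difference between the two kinds of comparison is easiest to see with the second timestamp layout that appears in the log, DD.MM.YYYY. Below is a minimal sketch, not part of the script itself: the helper name `to_epoch` and the second sample date are made up for illustration, and it assumes an awk that provides `mktime()` (gawk does; it is not part of POSIX awk).

```shell
#!/bin/bash
# Prefer gawk; mktime() is an extension and not part of POSIX awk.
AWK=$(command -v gawk || command -v awk)

# to_epoch (hypothetical helper): "DD.MM.YYYY HH:MM:SS" -> seconds since the epoch
to_epoch() {
  "$AWK" -v ts="$1" 'BEGIN {
    # p[1]=day p[2]=month p[3]=year p[4]=hour p[5]=minute p[6]=second
    split(ts, p, "[. :]")
    # %d turns "04" into 4, so mktime() only sees plain numbers
    print mktime(sprintf("%d %d %d %d %d %d", p[3], p[2], p[1], p[4], p[5], p[6]))
  }'
}

a="04.01.2012 17:52:17"   # 4 January 2012 (taken from the log)
b="03.02.2012 09:00:00"   # 3 February 2012 (made up), chronologically later

# Lexical comparison: "03..." sorts before "04...", so b looks older than a.
if [[ "$b" > "$a" ]]; then lexical="b is later"; else lexical="b is earlier"; fi

# Numeric comparison of epoch seconds gives the chronological order.
if (( $(to_epoch "$b") > $(to_epoch "$a") )); then numeric="b is later"; else numeric="b is earlier"; fi

echo "lexical: $lexical"
echo "numeric: $numeric"
```

The string comparison reports "b is earlier" while the numeric one reports "b is later", which is why the timestamps are converted before they are compared.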

#!/bin/bash
# (c) Erhan Bagdemir 2012 GPL 2.0 or later

server_startup="clear"

# usage: grep_interval <log file> <line limit> <"YYYY MM DD HH MM SS"> <pattern>
function grep_interval() {

found=$(tail -n "$2" "$1" | gawk -v d="$3" '{
  # awk splits on whitespace, so the "|" separators count as fields:
  # in the log above the date is $6 and the time is $7
  t=$6" "$7
  regex="(^20[0-9][0-9]).(0[1-9]|1[0-2]).([0-2][0-9]|30|31)[[:space:]](0[0-9]|1[0-9]|2[0-3]).([0-5][0-9]).([0-5][0-9])"
  # skip lines that carry no timestamp
  if (!match(t,regex,arr)) next
  ref=mktime(d)
  for (i = 1; i < 7; i++) {
      sub(/^0/,"",arr[i]);
  }
  time_in_log=mktime(arr[1]" "arr[2]" "arr[3]" "arr[4]" "arr[5]" "arr[6])
  if (time_in_log >= ref) {
      print $0
  }
 }' | grep -E -w -i "$4")

 if [ -n "$found" ]; then
      server_startup=$found
 fi

}
# search for errors from this time on (in the format mktime expects)
remote_start_time=$(date +"%Y %m %d %H %M %S")

# start tomcat server
cd /usr/local/tomcat/bin
./catalina.sh start

# log file
log=catalina.out

# limit the count of lines not to grep the whole file
limit=100

# grep (the arguments are quoted so that the start time stays one argument)
grep_interval "$log" "$limit" "$remote_start_time" "FATAL|ERROR|SERVER\sSTARTUP"

# output
echo "$server_startup"

The variable t in the awk script holds the date and time extracted from the log line, like "2012/01/04 17:52:17". Because awk splits on whitespace by default and the "|" separators count as fields of their own, the date is the 6th field ($6) and the time the 7th ($7) in the log shown above.
With gawk's match() function we put the groups captured by the regular expression regex into an array, arr. ref is the startup time, passed to awk with the -v parameter as in gawk -v d="$3". The sub() call in the loop removes a leading zero from each array element, so that mktime() receives plain numbers. time_in_log is the time from the log converted to seconds since the epoch. With time_in_log >= ref we select the log entries from the reference time (server startup) onwards by comparing numbers rather than just matching two strings. Finally, grep -E -w -i "$4" searches those lines for whole-word, case-insensitive occurrences of the expression "FATAL|ERROR|SERVER\sSTARTUP".
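To see what the final grep stage does on its own, here is a small sketch that feeds it three lines. The first two come from the log above; the third is made up to show the effect of -w. The pattern is the one from the script, written with the POSIX class `[[:space:]]` instead of `\s`, which is an assumption made here for portability across grep implementations.

```shell
#!/bin/bash
pattern='FATAL|ERROR|SERVER[[:space:]]STARTUP'

# -E: extended regex (alternation with |)
# -w: the match must be a whole word, so "errors" does not count as ERROR
# -i: ignore case, so "Server startup" matches SERVER STARTUP
matches=$(printf '%s\n' \
  'INFO   | jvm 1    | 2012/01/04 17:52:17 | INFO: Jk running ID=0 time=0/38  config=null' \
  'INFO   | jvm 1    | 2012/01/04 17:52:17 | INFO: Server startup in 20703 ms' \
  'INFO   | jvm 1    | 2012/01/04 17:52:18 | INFO: checked for errors, none found' \
  | grep -E -w -i "$pattern")

echo "$matches"
```

Only the "Server startup" line is printed. Without -w the third line would be reported as well, because "errors" contains "error".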

With this small shell script you can easily find out whether your server started successfully, without needing any other programming language or tool on the shell.

PS: You will need to adjust the regular expression and the positions of the date/time fields to match your own log format.
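One possible way to make that adjustment less fiddly is to split on the "|" separators instead of on whitespace, so that the whole timestamp arrives as a single field. The sketch below also takes an upper bound, which covers the "between two points in time" case. The helper names `between` and `epoch` are hypothetical, the "before" and "after" sample lines are made up, and the layout is assumed to be the one from the log above.

```shell
#!/bin/bash
# Prefer gawk; mktime() is an extension and not part of POSIX awk.
AWK=$(command -v gawk || command -v awk)

# between (hypothetical helper): print the lines read from stdin whose
# timestamp lies in [from, to]; both bounds use "YYYY/MM/DD HH:MM:SS"
between() {
  "$AWK" -F ' *[|] *' -v from="$1" -v to="$2" '
    # epoch: "YYYY/MM/DD HH:MM:SS" -> seconds since the epoch
    function epoch(ts,   p) {
      split(ts, p, "[/ :]")
      return mktime(sprintf("%d %d %d %d %d %d", p[1], p[2], p[3], p[4], p[5], p[6]))
    }
    BEGIN { lo = epoch(from); hi = epoch(to) }
    {
      t = epoch($3)          # third "|"-separated column holds the timestamp
      if (t >= lo && t <= hi) print
    }'
}

# usage: keep only the entries logged between 17:52:15 and 17:52:30
out=$(printf '%s\n' \
  'INFO   | jvm 1    | 2012/01/04 17:52:10 | INFO: before' \
  'INFO   | jvm 1    | 2012/01/04 17:52:17 | INFO: Server startup in 20703 ms' \
  'INFO   | jvm 1    | 2012/01/04 17:53:00 | INFO: after' \
  | between "2012/01/04 17:52:15" "2012/01/04 17:52:30")

echo "$out"
```

For a different log format, only the field separator, the column number and the character class passed to split() should need to change.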



Writing a First Test Blog from my Smartphone

Here is my test article, written on my iPhone. Thanks to the WordPress app, which makes this possible. I think I now have a plausible reason to buy a new pad device.

erhan

Author


Hello, I'm Erhan Bagdemir and this is my blog. I talk about Java, J2EE, Frameworks, web application development, OOAD and various other topics often related to programming.
