Compare commits

3 Commits

Author SHA1 Message Date
cschantz 9861f117e9 Fix bash syntax error caused by apostrophe in awk comment
The comment "it's too old" contained an apostrophe (single quote) which
broke the bash single-quote enclosure of the awk script, causing:
  "syntax error near unexpected token '}'"

Changed to "too old" to avoid the apostrophe.

In bash, a single-quoted string cannot contain a single quote, even one
escaped with a backslash.
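
To illustrate the quoting problem, here is a minimal sketch (not the script's actual code) of three ways to keep an apostrophe from ending a single-quoted awk program. The variable names and sample strings are made up for illustration.

```shell
#!/usr/bin/env bash
# Broken: the apostrophe in "it's" would end the single-quoted string early.
#   awk '{ print "ok" }  # it's too old'

# Option 1: reword the comment so it has no apostrophe (the fix in this commit).
out1=$(echo "x" | awk '{ print "kept" }  # too old')

# Option 2: close the quote, add an escaped apostrophe, reopen the quote.
out2=$(echo "x" | awk '{ print "it'\''s fine" }')

# Option 3: use the octal escape \047 inside an awk string literal.
out3=$(echo "x" | awk '{ print "it\047s fine" }')

printf '%s\n' "$out1" "$out2" "$out3"
```

Rewording is the least fragile choice for a comment; the other two options matter only when the apostrophe has to appear in awk's output.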

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-31 22:24:55 -05:00
cschantz 08fc2e0745 Fix timestamp comparison to use epoch seconds for accurate filtering
Previous commit used string comparison, which failed across month/year
boundaries (e.g., "01/Jan/2026" < "31/Dec/2025" because the strings are
compared character by character, starting with the day).

Now converts timestamps to epoch seconds for proper numerical comparison:
- Cutoff calculated as epoch seconds (date +%s)
- Apache log timestamps converted from "dd/mmm/yyyy:HH:MM:SS" format
- Format conversion: replace slashes and first colon with spaces
- Numerical comparison ensures correct ordering across all boundaries

Tested with dates spanning year/month changes - works correctly.
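
The difference between the two comparisons can be reproduced outside the script. The sketch below assumes GNU date (on BSD/macOS the conversion would need `date -j -f` instead); `to_epoch` is a hypothetical helper, and `-u` is used only to make the demo independent of the local time zone.

```shell
#!/usr/bin/env bash
# to_epoch (hypothetical helper): "31/Dec/2025:10:30:15" -> epoch seconds.
# Assumes GNU date; -u pins the zone so the demo is deterministic.
to_epoch() {
    local ts=$1
    ts=${ts/:/ }      # replace the first colon with a space
    ts=${ts//\// }    # replace every slash with a space
    date -u -d "$ts" +%s
}

old="31/Dec/2025:23:59:59"
new="01/Jan/2026:00:00:01"

# Lexical comparison gets it wrong: "0..." sorts before "3...".
if [[ "$new" < "$old" ]]; then
    string_says="new is older"
else
    string_says="new is newer"
fi

# Numeric comparison of epoch seconds gets it right.
if (( $(to_epoch "$new") < $(to_epoch "$old") )); then
    epoch_says="new is older"
else
    epoch_says="new is newer"
fi

echo "string comparison: $string_says"
echo "epoch comparison:  $epoch_says"
```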

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-31 22:21:01 -05:00
cschantz ea26efaf0a Fix bot analyzer to filter log entries by timestamp, not just files
Previously, the script filtered log FILES by modification time but read
ALL entries from those files, causing "Last 1 hour" to show entries from
weeks/months ago if they were in recently-modified files.

Now filters individual log entries by parsing their timestamps and
comparing to the selected time range (1 hour, 6 hours, 24 hours, etc.).

Changes:
- Added cutoff timestamp calculation in awk BEGIN block
- Extract timestamp from each Apache log entry
- Skip entries older than cutoff with timestamp comparison
- Works with both GNU date and BSD date for portability
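
The GNU/BSD fallback for the cutoff can be sketched on its own as follows. `cutoff_epoch` is a hypothetical helper name, not a function from the script; it tries the GNU syntax first and falls back to the BSD `-v` adjustment if that fails.

```shell
#!/usr/bin/env bash
# cutoff_epoch (hypothetical helper): epoch seconds for N hours ago.
cutoff_epoch() {
    local hours=$1
    # GNU date understands -d "N hours ago"; BSD date uses -v-NH instead.
    date -d "$hours hours ago" +%s 2>/dev/null \
        || date -v-"${hours}"H +%s 2>/dev/null
}

now=$(date +%s)
cutoff=$(cutoff_epoch 6)
echo "now=$now cutoff=$cutoff diff=$(( now - cutoff ))s"
```

Entries whose epoch timestamp is below `cutoff` would then be treated as outside the selected window.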

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-31 22:15:00 -05:00
+37 -1
@@ -357,7 +357,19 @@ parse_logs() {
 # Parse Apache Combined Log Format with error handling
 # Format: IP - - [timestamp] "METHOD URL PROTOCOL" STATUS SIZE "REFERRER" "USER-AGENT"
-awk -v domain="$domain" '
+awk -v domain="$domain" -v hours_filter="$HOURS_BACK" -v days_filter="$DAYS_BACK" '
+BEGIN {
+    # Calculate cutoff timestamp in epoch seconds for proper comparison
+    if (hours_filter != "") {
+        cmd = "date -d \"" hours_filter " hours ago\" +%s 2>/dev/null || date -v-" hours_filter "H +%s 2>/dev/null"
+        cmd | getline cutoff_epoch
+        close(cmd)
+    } else if (days_filter != "") {
+        cmd = "date -d \"" days_filter " days ago\" +%s 2>/dev/null || date -v-" days_filter "d +%s 2>/dev/null"
+        cmd | getline cutoff_epoch
+        close(cmd)
+    }
+}
 {
     # Skip empty lines and malformed entries
     if (NF < 10 || length($0) < 50) next
@@ -372,6 +384,30 @@ parse_logs() {
         timestamp = "unknown"
     }
+    # Filter by timestamp if time filter is set
+    if ((hours_filter != "" || days_filter != "") && timestamp != "unknown" && cutoff_epoch != "") {
+        # Extract just the date/time part (before timezone)
+        split(timestamp, ts_parts, " ")
+        log_ts = ts_parts[1]
+        # Convert Apache timestamp format for date parsing
+        # From: 31/Dec/2025:10:30:15
+        # To:   31 Dec 2025 10:30:15
+        log_ts_formatted = log_ts
+        sub(/:/, " ", log_ts_formatted)    # Replace first : with space
+        gsub(/\//, " ", log_ts_formatted)  # Replace all / with space
+        # Convert to epoch seconds (GNU date for Linux, BSD date for macOS)
+        cmd = "date -d \"" log_ts_formatted "\" +%s 2>/dev/null || date -j -f \"%d %b %Y %H:%M:%S\" \"" log_ts_formatted "\" +%s 2>/dev/null"
+        cmd | getline log_epoch
+        close(cmd)
+        # Numerical comparison of epoch seconds
+        if (log_epoch != "" && log_epoch < cutoff_epoch) {
+            next  # Skip this entry, too old
+        }
+    }
     # Extract HTTP method, URL, and status
     if (match($0, /"([A-Z]+) ([^ ]+) [^"]*" ([0-9]+) ([0-9-]+)/, req)) {
         http_method = req[1]