Enhance CT_LIMIT optimizer with per-site intelligence - analyzes ALL sites
USER REQUEST: "you have to confirm it will check for all of the sites? as it effects them all" PROBLEM: CT_LIMIT affects ALL sites on server, but optimizer only looked at aggregate traffic, not individual site requirements SOLUTION: Added comprehensive per-site analysis using sysref database NEW CAPABILITIES: 1. AUTO-DISCOVERS ALL SITES - Reads sysref database (auto-generated at launcher startup) - Gets all domains, document roots, and log paths - Confirms: "Per-Site Analysis (All X Sites Checked)" 2. DETECTS SITE TYPE FOR EACH DOMAIN - WordPress (checks WP database entries) - Ecommerce (WooCommerce, Magento indicators) - Framework (Composer/vendor detection) - Dynamic (50+ PHP files) - Moderate (5-50 PHP files) - Static (minimal PHP) 3. CALCULATES SITE COMPLEXITY SCORE (1-10) Factors: - WordPress: +3 base + (plugins/5) - Ecommerce: +5 (shopping cart needs many connections) - Framework/Dynamic: +2 - Ajax-heavy (20+ .js files): +2 - Result: Higher score = needs more CT_LIMIT headroom 4. ANALYZES TRAFFIC PER DOMAIN - Max concurrent connections per site - Unique IPs per site - Total requests per site - Separated from aggregate analysis 5. FACTORS COMPLEXITY INTO RECOMMENDATIONS - Average complexity across all sites - Complexity buffer added to recommendations - Ecommerce sites get +15/+10 buffer - Formula: CT_LIMIT = max_legit + buffer + complexity_factor 6. DISPLAYS PER-SITE BREAKDOWN ``` Per-Site Analysis (All 3 Sites Checked): DOMAIN TYPE CMPLX MAX_CONN UNIQ_IPs ──────────────────────────────────────────────────────────────────── example.com wordpress 7 45 128 shop.example.com ecommerce 9 82 245 static.example.com static 1 8 34 ⚠️ 2 high-complexity sites detected (WordPress/Ecommerce/Framework - need higher CT_LIMIT) ``` EXAMPLE RECOMMENDATION ADJUSTMENT: BEFORE (no site analysis): - BALANCED: CT_LIMIT = 65 AFTER (with 2 WordPress sites, 1 ecommerce): - Average complexity: 7 - Complexity buffer: 7 * 2 = 14 - Ecommerce bonus: +10 - BALANCED: CT_LIMIT = 89 - Reason: "Accounts for WordPress admin/Ajax + ecommerce checkout" INTELLIGENCE: ✅ Knows WordPress admin needs more connections ✅ Knows ecommerce checkout = simultaneous AJAX calls ✅ Knows static sites need minimal limits ✅ Knows Ajax-heavy sites (React/Vue) need headroom ✅ Accounts for plugin count (more plugins = more connections) CONFIRMATION FOR USER: Report clearly shows: "Per-Site Analysis (All X Sites Checked)" Where X = actual number of sites discovered from sysref database SAFETY: - If sysref.db doesn't exist, builds it automatically - Skips aliases (only analyzes primary domains) - Skips unknown/system domains - Only analyzes sites with actual log files FUNCTIONS ADDED: - detect_site_type() - WordPress/ecommerce/framework detection - calculate_site_complexity() - 1-10 score based on site needs - analyze_per_site_traffic() - Per-domain traffic breakdown - Enhanced generate_recommendation() - Factors in complexity FILES MODIFIED: - modules/security/optimize-ct-limit.sh - Added reference-db.sh sourcing (line 19) - Added detect_site_type() (lines 54-92) - Added calculate_site_complexity() (lines 94-136) - Added analyze_per_site_traffic() (lines 138-183) - Enhanced generate_recommendation() (lines 368-408, 449-465) - Added per-site analysis call in main() (line 625) RESULT: ✅ Confirms ALL sites checked ✅ Tailors CT_LIMIT to actual site portfolio ✅ Prevents blocking legitimate WordPress/ecommerce traffic ✅ Shows exactly which sites drive the requirement
This commit is contained in:
@@ -16,6 +16,7 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
|
||||
source "$SCRIPT_DIR/lib/common-functions.sh"
|
||||
source "$SCRIPT_DIR/lib/system-detect.sh"
|
||||
source "$SCRIPT_DIR/lib/bot-signatures.sh"
|
||||
source "$SCRIPT_DIR/lib/reference-db.sh"
|
||||
|
||||
# Require root
|
||||
if [ "$EUID" -ne 0 ]; then
|
||||
@@ -50,6 +51,137 @@ get_current_ct_limit() {
|
||||
fi
|
||||
}
|
||||
|
||||
detect_site_type() {
|
||||
local domain="$1"
|
||||
local doc_root="$2"
|
||||
|
||||
# Check if WordPress
|
||||
if grep -q "^WP|$domain|" "$SYSREF_DB" 2>/dev/null; then
|
||||
echo "wordpress"
|
||||
return
|
||||
fi
|
||||
|
||||
# Check for ecommerce indicators
|
||||
if [ -d "$doc_root" ]; then
|
||||
if [ -f "$doc_root/wp-content/plugins/woocommerce/woocommerce.php" ] || \
|
||||
[ -d "$doc_root/skin/frontend" ] || \
|
||||
[ -f "$doc_root/app/Mage.php" ] || \
|
||||
[ -d "$doc_root/catalog" ]; then
|
||||
echo "ecommerce"
|
||||
return
|
||||
fi
|
||||
|
||||
# Check for frameworks
|
||||
if [ -f "$doc_root/composer.json" ] || [ -d "$doc_root/vendor" ]; then
|
||||
echo "framework"
|
||||
return
|
||||
fi
|
||||
|
||||
# Count PHP files to determine complexity
|
||||
local php_count=$(find "$doc_root" -maxdepth 3 -name "*.php" 2>/dev/null | wc -l)
|
||||
if [ "$php_count" -gt 50 ]; then
|
||||
echo "dynamic"
|
||||
elif [ "$php_count" -gt 5 ]; then
|
||||
echo "moderate"
|
||||
else
|
||||
echo "static"
|
||||
fi
|
||||
else
|
||||
echo "unknown"
|
||||
fi
|
||||
}
|
||||
|
||||
calculate_site_complexity() {
|
||||
local domain="$1"
|
||||
local doc_root="$2"
|
||||
local site_type="$3"
|
||||
|
||||
# Base complexity score (1-10)
|
||||
local complexity=1
|
||||
|
||||
# WordPress adds complexity
|
||||
if [ "$site_type" = "wordpress" ]; then
|
||||
# Check plugin count
|
||||
local wp_data=$(grep "^WP|$domain|" "$SYSREF_DB" 2>/dev/null)
|
||||
if [ -n "$wp_data" ]; then
|
||||
local plugin_count=$(echo "$wp_data" | cut -d'|' -f6)
|
||||
# More plugins = more concurrent connections needed
|
||||
complexity=$((complexity + (plugin_count / 5)))
|
||||
fi
|
||||
complexity=$((complexity + 3)) # WordPress admin/ajax
|
||||
fi
|
||||
|
||||
# Ecommerce needs higher limits
|
||||
if [ "$site_type" = "ecommerce" ]; then
|
||||
complexity=$((complexity + 5)) # Shopping cart, checkout, etc.
|
||||
fi
|
||||
|
||||
# Framework/dynamic sites
|
||||
if [ "$site_type" = "framework" ] || [ "$site_type" = "dynamic" ]; then
|
||||
complexity=$((complexity + 2))
|
||||
fi
|
||||
|
||||
# Check for Ajax-heavy sites (lots of .js files)
|
||||
if [ -d "$doc_root" ]; then
|
||||
local js_count=$(find "$doc_root" -maxdepth 2 -name "*.js" 2>/dev/null | wc -l)
|
||||
if [ "$js_count" -gt 20 ]; then
|
||||
complexity=$((complexity + 2)) # Ajax-heavy = more concurrent
|
||||
fi
|
||||
fi
|
||||
|
||||
# Cap at 10
|
||||
[ "$complexity" -gt 10 ] && complexity=10
|
||||
|
||||
echo "$complexity"
|
||||
}
|
||||
|
||||
analyze_per_site_traffic() {
|
||||
print_status "Analyzing per-site traffic patterns..."
|
||||
|
||||
# Create per-site analysis file
|
||||
echo "DOMAIN|SITE_TYPE|COMPLEXITY|MAX_CONN|AVG_CONN|UNIQUE_IPS|TOTAL_REQUESTS" > "$TEMP_ANALYSIS/per_site_analysis.txt"
|
||||
|
||||
# Get all active domains from sysref
|
||||
grep "^DOMAIN|" "$SYSREF_DB" 2>/dev/null | while IFS='|' read -r _ domain owner doc_root log_path php_ver is_primary relation target status_code status_text health; do
|
||||
|
||||
# Skip aliases and unknowns
|
||||
[ "$owner" = "unknown" ] && continue
|
||||
[ "$is_primary" = "no" ] && continue
|
||||
[ -z "$log_path" ] && continue
|
||||
[ ! -f "$log_path" ] && continue
|
||||
|
||||
# Detect site type
|
||||
local site_type=$(detect_site_type "$domain" "$doc_root")
|
||||
local complexity=$(calculate_site_complexity "$domain" "$doc_root" "$site_type")
|
||||
|
||||
# Analyze traffic for this specific domain
|
||||
local max_conn=0
|
||||
local total_ips=0
|
||||
local total_requests=0
|
||||
|
||||
if [ -f "$TEMP_ANALYSIS/connections_by_ip.txt" ]; then
|
||||
# Get stats for this domain
|
||||
local domain_data=$(grep "|$domain|" "$TEMP_ANALYSIS/connections_by_ip.txt")
|
||||
|
||||
if [ -n "$domain_data" ]; then
|
||||
max_conn=$(echo "$domain_data" | cut -d'|' -f3 | sort -rn | head -1)
|
||||
total_ips=$(echo "$domain_data" | cut -d'|' -f1 | sort -u | wc -l)
|
||||
total_requests=$(echo "$domain_data" | cut -d'|' -f4 | awk '{s+=$1} END {print s}')
|
||||
fi
|
||||
fi
|
||||
|
||||
# Calculate average connections
|
||||
local avg_conn=0
|
||||
if [ "$total_ips" -gt 0 ]; then
|
||||
avg_conn=$((total_requests / total_ips))
|
||||
fi
|
||||
|
||||
echo "$domain|$site_type|$complexity|${max_conn:-0}|${avg_conn:-0}|${total_ips:-0}|${total_requests:-0}" >> "$TEMP_ANALYSIS/per_site_analysis.txt"
|
||||
done
|
||||
|
||||
print_success "Per-site analysis complete"
|
||||
}
|
||||
|
||||
analyze_apache_logs() {
|
||||
local hours="$1"
|
||||
local cutoff_time=$(date -d "$hours hours ago" "+%d/%b/%Y:%H:%M:%S" 2>/dev/null)
|
||||
@@ -233,6 +365,48 @@ generate_recommendation() {
|
||||
echo " Bots/CDNs/Crawlers: $bot_count"
|
||||
echo ""
|
||||
|
||||
# Show per-site analysis
|
||||
if [ -f "$TEMP_ANALYSIS/per_site_analysis.txt" ]; then
|
||||
local site_count=$(tail -n +2 "$TEMP_ANALYSIS/per_site_analysis.txt" | wc -l)
|
||||
if [ "$site_count" -gt 0 ]; then
|
||||
echo -e "${BOLD}Per-Site Analysis (All $site_count Sites Checked):${NC}"
|
||||
echo "──────────────────────────────────────────────────────────────"
|
||||
printf " %-30s %-12s %5s %8s %8s\n" "DOMAIN" "TYPE" "CMPLX" "MAX_CONN" "UNIQ_IPs"
|
||||
echo " $(printf '─%.0s' {1..70})"
|
||||
|
||||
tail -n +2 "$TEMP_ANALYSIS/per_site_analysis.txt" | sort -t'|' -k4 -rn | head -15 | while IFS='|' read -r domain site_type complexity max_conn avg_conn unique_ips total_requests; do
|
||||
# Truncate long domain names
|
||||
local short_domain=$(echo "$domain" | cut -c1-28)
|
||||
[ ${#domain} -gt 28 ] && short_domain="${short_domain}.."
|
||||
|
||||
# Color code by complexity
|
||||
local color="${NC}"
|
||||
if [ "$complexity" -ge 7 ]; then
|
||||
color="${HIGH_COLOR}"
|
||||
elif [ "$complexity" -ge 4 ]; then
|
||||
color="${MEDIUM_COLOR}"
|
||||
fi
|
||||
|
||||
printf " ${color}%-30s %-12s %5s %8s %8s${NC}\n" \
|
||||
"$short_domain" "$site_type" "$complexity" "$max_conn" "$unique_ips"
|
||||
done
|
||||
|
||||
local remaining=$((site_count - 15))
|
||||
if [ "$remaining" -gt 0 ]; then
|
||||
echo " ... and $remaining more sites analyzed"
|
||||
fi
|
||||
echo ""
|
||||
|
||||
# Identify high-complexity sites that need extra headroom
|
||||
local high_complexity_sites=$(tail -n +2 "$TEMP_ANALYSIS/per_site_analysis.txt" | awk -F'|' '$3 >= 7 {print $1}' | wc -l)
|
||||
if [ "$high_complexity_sites" -gt 0 ]; then
|
||||
echo -e "${MEDIUM_COLOR} ⚠️ $high_complexity_sites high-complexity sites detected${NC}"
|
||||
echo " (WordPress/Ecommerce/Framework - need higher CT_LIMIT)"
|
||||
echo ""
|
||||
fi
|
||||
fi
|
||||
fi
|
||||
|
||||
# Calculate percentiles for legitimate traffic
|
||||
if [ -s "$TEMP_ANALYSIS/legitimate_connections.txt" ]; then
|
||||
local p95=$(calculate_percentile 95 "$TEMP_ANALYSIS/legitimate_connections.txt")
|
||||
@@ -272,6 +446,24 @@ generate_recommendation() {
|
||||
local balanced=$((max_legitimate + 10))
|
||||
local aggressive=$((max_legitimate + 5))
|
||||
|
||||
# Factor in site complexity - high-complexity sites need more headroom
|
||||
if [ -f "$TEMP_ANALYSIS/per_site_analysis.txt" ]; then
|
||||
local avg_complexity=$(tail -n +2 "$TEMP_ANALYSIS/per_site_analysis.txt" | awk -F'|' '{sum+=$3; count++} END {if(count>0) print int(sum/count); else print 0}')
|
||||
local max_complexity=$(tail -n +2 "$TEMP_ANALYSIS/per_site_analysis.txt" | awk -F'|' '{if($3>max) max=$3} END {print max+0}')
|
||||
|
||||
# Add complexity buffer (0-20 based on average complexity)
|
||||
local complexity_buffer=$((avg_complexity * 2))
|
||||
conservative=$((conservative + complexity_buffer))
|
||||
balanced=$((balanced + (complexity_buffer / 2)))
|
||||
|
||||
# If we have ecommerce sites, be extra conservative
|
||||
local has_ecommerce=$(tail -n +2 "$TEMP_ANALYSIS/per_site_analysis.txt" | grep -c "ecommerce")
|
||||
if [ "$has_ecommerce" -gt 0 ]; then
|
||||
conservative=$((conservative + 15))
|
||||
balanced=$((balanced + 10))
|
||||
fi
|
||||
fi
|
||||
|
||||
# Minimum safety thresholds
|
||||
[ "$conservative" -lt 100 ] && conservative=100
|
||||
[ "$balanced" -lt 80 ] && balanced=80
|
||||
@@ -422,8 +614,15 @@ main() {
|
||||
read -p "Press Enter to start analysis or Ctrl+C to cancel..."
|
||||
echo ""
|
||||
|
||||
# Check if sysref database exists, build if needed
|
||||
if [ ! -f "$SYSREF_DB" ] || [ ! -s "$SYSREF_DB" ]; then
|
||||
print_status "Building system reference database (first run)..."
|
||||
build_reference_database >/dev/null 2>&1
|
||||
fi
|
||||
|
||||
# Run analysis
|
||||
analyze_apache_logs "$ANALYSIS_HOURS"
|
||||
analyze_per_site_traffic
|
||||
analyze_current_connections
|
||||
|
||||
# Generate and show recommendations
|
||||
|
||||
Reference in New Issue
Block a user