# SESSION INTELLIGENCE - Cross-Module Data Sharing

## Overview

The Server Toolkit now implements **Session Intelligence**, which lets modules reference data collected by other modules during the current troubleshooting session. It is optimized for the **download → diagnose → troubleshoot → delete** workflow.

## Use Case

Because the toolkit is meant to be temporary (not permanently installed), it does not track historical trends. Instead, it enables **cross-module intelligence** so modules can make smarter recommendations based on what's happening RIGHT NOW.

## Example Scenarios

### Scenario 1: Bot Attack During System Load
```bash
# User runs System Health Check first
# Discovers: CPU at 95%, Memory at 92%, HIGH LOAD

# User then runs Bot Analyzer
# Bot analyzer checks: db_is_system_under_load
# Result: "High bot traffic detected, but system is already under load.
#          Performance issues may be partially due to system resources,
#          not just bots. Recommend addressing system load first."
```

### Scenario 2: Slow MySQL During Network Issues
```bash
# User runs System Health Check
# Discovers: TCP retransmission at 15%, HIGH network issues

# User then runs MySQL Query Analyzer
# MySQL analyzer checks: db_has_network_issues
# Result: "Slow queries detected, but network is experiencing high
#          retransmission rates. Some query timeouts may be network-
#          related rather than database performance."
```

### Scenario 3: Bot Attack + SSH Brute Force
```bash
# User runs System Health Check
# Discovers: 5,000 failed SSH attempts today

# User then runs Bot Analyzer
# Bot analyzer checks: db_is_under_attack
# Result: "Bot traffic detected AND system is under active SSH attack.
#          Recommend immediate firewall hardening and cPHulk enablement."
```

## Architecture

### Data Storage: Reference Database (`.sysref`)

The health check saves the current session's metrics to the `[HEALTH_BASELINE]` section:

**System Resources:**
- MEMORY_TOTAL_MB, MEMORY_USED_PERCENT
- CPU_LOAD_1MIN, CPU_CORES
- DISK_USED_PERCENT, IOWAIT_PERCENT

**Services:**
- HTTPD_STATUS, MYSQL_STATUS
- FIREWALL_STATUS, EMAIL_QUEUE_SIZE
- ZOMBIE_PROCESSES

**Network Status:**
- NETWORK_INTERFACE, NETWORK_MTU
- NETWORK_RX_ERRORS, NETWORK_TX_ERRORS
- NETWORK_RX_DROPPED, NETWORK_TX_DROPPED
- TCP_RETRANS_PERCENT

**Hardware Status:**
- DISK_SMART_STATUS
- HARDWARE_ERRORS

**Security Status:**
- SSH_FAILED_ATTEMPTS_TOTAL
- SSH_ATTACKS_TODAY
- CPHULK_STATUS

**Issue Counts:**
- CRITICAL_ISSUES, HIGH_ISSUES
- MEDIUM_ISSUES, LOW_ISSUES

### Helper Functions (`lib/reference-db.sh`)

#### Query Individual Metrics
```bash
value=$(db_get_health_metric "MEMORY_USED_PERCENT")
echo "Memory: $value%"
```

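To make the lookup concrete, here is a minimal sketch of how a single-metric query could work. It is not the code in `lib/reference-db.sh`: the `HEALTH|KEY|VALUE` row layout is an assumption inferred from the "HEALTH| lines" wording in this document, and `sketch_get_health_metric` and `SKETCH_DB` are hypothetical names used only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; the real db_get_health_metric lives in lib/reference-db.sh.
# ASSUMPTION: baseline rows are stored as "HEALTH|KEY|VALUE".

# Use a throwaway file so the sketch never touches a real .sysref.
SKETCH_DB=$(mktemp)
trap 'rm -f "$SKETCH_DB"' EXIT

# Sample data so the sketch runs on its own.
cat > "$SKETCH_DB" <<'EOF'
[HEALTH_BASELINE]
HEALTH|MEMORY_USED_PERCENT|92
HEALTH|CPU_LOAD_1MIN|7.60
HEALTH|CPU_CORES|8
EOF

# Print the value stored for a metric; return 1 if the metric is absent.
sketch_get_health_metric() {
    local key="$1"
    awk -F'|' -v k="$key" '
        $1 == "HEALTH" && $2 == k { print $3; found = 1; exit }
        END { exit !found }
    ' "$SKETCH_DB"
}

value=$(sketch_get_health_metric "MEMORY_USED_PERCENT") && echo "Memory: ${value}%"
sketch_get_health_metric "NO_SUCH_METRIC" || echo "metric not recorded"
```

Returning a non-zero status for a missing key lets callers tell "never recorded" apart from an empty value, which matters when the health check has not been run yet.
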
#### Intelligence Functions

**Check System Load:**
```bash
if db_is_system_under_load; then
    echo "System under heavy load (CPU > 80% or Memory > 90%)"
    # Adjust recommendations
fi
```

**Check Network Issues:**
```bash
if db_has_network_issues; then
    echo "Network problems detected (retrans > 5% or errors > 100)"
    # Consider network factors in analysis
fi
```

**Check Security Status:**
```bash
if db_is_under_attack; then
    echo "Active attacks detected (> 100 SSH failures today)"
    # Correlate with security findings
fi
```

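The thresholds quoted above can be expressed as a small predicate. The sketch below is one possible reading, not the toolkit's implementation: it assumes "CPU > 80%" means the 1-minute load average as a percentage of the core count (the baseline stores `CPU_LOAD_1MIN` and `CPU_CORES`, not a CPU percentage), and `sketch_is_under_load` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical sketch, not the code behind db_is_system_under_load.
# ASSUMPTION: CPU percent = (1-minute load average / core count) * 100.

# Usage: sketch_is_under_load <load_1min> <cores> <memory_used_percent>
# Returns 0 (true) when CPU > 80% or memory > 90%, otherwise 1.
sketch_is_under_load() {
    local load="$1" cores="$2" mem_pct="$3"
    awk -v l="$load" -v c="$cores" -v m="$mem_pct" 'BEGIN {
        if (c <= 0) exit 1            # no usable core count: report "not under load"
        cpu_pct = (l / c) * 100
        exit !(cpu_pct > 80 || m > 90)
    }'
}

sketch_is_under_load 7.60 8 60 && echo "under load (CPU)"
sketch_is_under_load 1.20 8 92 && echo "under load (memory)"
sketch_is_under_load 1.20 8 60 || echo "not under load"
```

Using `awk` for the comparison avoids bash's integer-only arithmetic, since load averages are fractional.
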
#### Get All Metrics
```bash
db_get_all_health  # Returns all HEALTH| lines
```

## Implementation in Modules

### Pattern 1: Contextual Recommendations

```bash
# In any module, after sourcing reference-db.sh

# Check system context
if db_is_system_under_load; then
    echo "NOTE: System is currently under heavy load."
    echo "      Some issues may be resource-related."
fi

if db_has_network_issues; then
    echo "NOTE: Network experiencing high retransmission rates."
    echo "      Connection issues may be network-related."
fi

if db_is_under_attack; then
    echo "WARNING: System under active SSH attack."
    echo "         Security hardening recommended."
fi
```

### Pattern 2: Adjusted Thresholds

```bash
# MySQL slow query analyzer

# Normal threshold: 5 seconds
SLOW_THRESHOLD=5

# But if system is under load, adjust threshold
if db_is_system_under_load; then
    SLOW_THRESHOLD=10
    echo "System under load - using relaxed slow query threshold"
fi
```

### Pattern 3: Root Cause Analysis

```bash
# Website performance analyzer

if db_has_network_issues; then
    echo "Website slow, AND network has issues."
    echo "Root cause may be network, not website code."
    echo "Recommendation: Fix network first, then re-test."
fi
```

## Testing

Run the test script to verify cross-module intelligence:

```bash
# First, generate session data
./launcher.sh
# Choose option 1: System Health Check

# Then test intelligence
./tools/test-cross-module-intelligence.sh
```

The expected output shows:
- All health metrics populated
- Intelligence functions working
- System status correctly identified

## Best Practices

### DO:
- ✅ Run System Health Check **FIRST** in a troubleshooting session
- ✅ Use intelligence functions to provide context-aware recommendations
- ✅ Correlate findings across modules
- ✅ Adjust thresholds based on system state

### DON'T:
- ❌ Rely on this data for historical trend analysis (it's session-only)
- ❌ Assume data exists (always check that a metric is populated)
- ❌ Make critical decisions solely on this data
- ❌ Store this long-term (it gets cleaned up)

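The "always check that a metric is populated" rule can be wrapped in a small guard. The sketch below is illustrative only: `stub_get_health_metric` stands in for `db_get_health_metric` so the block runs on its own, and it assumes the real helper prints nothing when a metric was never recorded. `report_metric` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Stand-in for db_get_health_metric so the sketch is self-contained.
# ASSUMPTION: the real helper prints nothing for a metric that was never recorded.
stub_get_health_metric() {
    case "$1" in
        TCP_RETRANS_PERCENT) echo "15" ;;
        *)                   echo ""   ;;
    esac
}

# Print "label: value<unit>", or a hint when the metric is missing or non-numeric.
report_metric() {
    local label="$1" key="$2" unit="${3:-}" value
    value=$(stub_get_health_metric "$key")
    if [[ "$value" =~ ^[0-9]+([.][0-9]+)?$ ]]; then
        echo "$label: ${value}${unit}"
    else
        echo "$label: not recorded (run System Health Check first)"
    fi
}

report_metric "TCP retransmission" TCP_RETRANS_PERCENT "%"
report_metric "Memory used" MEMORY_USED_PERCENT "%"
```

Validating that the value is numeric before printing it keeps a module from emitting lines such as "Memory: %" when no health check has run in the current session.
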
## Example: Enhanced Bot Analyzer (Future)

```bash
# modules/security/bot-analyzer.sh

source "$SCRIPT_DIR/lib/reference-db.sh"

# After analysis, provide context

if db_has_network_issues; then
    echo ""
    print_warning "Network Issues Detected"
    echo "System experiencing:"
    echo " • TCP Retransmission: $(db_get_health_metric 'TCP_RETRANS_PERCENT')%"
    echo " • Network errors: $(db_get_health_metric 'NETWORK_RX_ERRORS')"
    echo ""
    echo "Bot traffic may be compounded by network problems."
    echo "Recommendation: Address network issues first (see System Health Check)"
fi

if db_is_system_under_load; then
    echo ""
    print_warning "System Under Heavy Load"
    echo "Current state:"
    echo " • CPU Load: $(db_get_health_metric 'CPU_LOAD_1MIN')"
    echo " • Memory: $(db_get_health_metric 'MEMORY_USED_PERCENT')%"
    echo ""
    echo "High bot traffic + system load = performance degradation."
    echo "Recommendation: Block bots AND investigate resource usage."
fi
```

## Files Modified

1. **modules/diagnostics/system-health-check.sh**
   - Enhanced `save_health_baseline()` function
   - Now saves network, hardware, and security metrics
   - Lines: 1660-1758

2. **lib/reference-db.sh**
   - Added `db_get_health_metric()` - query individual metrics
   - Added `db_is_system_under_load()` - check if CPU/memory high
   - Added `db_has_network_issues()` - check for network problems
   - Added `db_is_under_attack()` - check for active attacks
   - Added `db_get_all_health()` - get all health data
   - Lines: 446-497

3. **tools/test-cross-module-intelligence.sh** (NEW)
   - Test script demonstrating cross-module queries
   - Shows how to use intelligence functions

## Data Lifetime

- **Created:** When System Health Check runs
- **Stored:** In the `.sysref` file (memory + disk)
- **Expires:** After 1 hour OR when cleanup/reset runs
- **Removed:** When the toolkit is deleted

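A module that wants to respect the one-hour lifetime could check the age of the baseline before trusting it. The sketch below is an assumption-laden illustration: it judges freshness by the file's modification time, which may not be how the toolkit tracks expiry, and `sketch_baseline_is_fresh` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical sketch, not the toolkit's expiry logic.
# ASSUMPTION: freshness is judged by file mtime with a 60-minute limit.

# Return 0 if the file exists and was modified less than 60 minutes ago.
sketch_baseline_is_fresh() {
    local file="$1"
    [[ -f "$file" ]] || return 1
    [[ -n "$(find "$file" -mmin -60 2>/dev/null)" ]]
}

# Demo with throwaway files: one new, one back-dated to the year 2000.
fresh=$(mktemp)
stale=$(mktemp)
touch -t 200001010000 "$stale"

sketch_baseline_is_fresh "$fresh" && echo "fresh: use session data"
sketch_baseline_is_fresh "$stale" || echo "stale: re-run System Health Check"

rm -f "$fresh" "$stale"
```
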
## Future Enhancements

Potential modules that could benefit:

1. **WordPress Health Check**
   - Check if slow WP sites correlate with network/load issues

2. **Backup Analyzer**
   - Check if backup failures correlate with disk/load issues

3. **Email Troubleshooter**
   - Check if email issues correlate with network/disk problems

4. **Resource Monitor**
   - Compare current metrics vs health check baseline

## Summary

Session Intelligence transforms the toolkit from **isolated modules** into an **integrated diagnostic platform**. Each module can now make smarter, context-aware recommendations based on the complete picture of what's happening on the server RIGHT NOW.

No historical data needed. No complex trending. Just smart, session-aware troubleshooting.