Initial commit: Server Management Toolkit v2.0
- Complete security menu restructure (3-mode: Analysis/Actions/Live)
- Intelligent cPHulk enablement with CSF whitelist import
- Live network security monitoring dashboard
- Multi-source threat detection and classification
- 50+ organized security tools across 4-level menu hierarchy
- System health diagnostics with cPanel/WHM integration
- Reference database for cross-module intelligence sharing
@@ -0,0 +1,283 @@
# SESSION INTELLIGENCE - Cross-Module Data Sharing

## Overview

The Server Toolkit now implements **Session Intelligence** - allowing modules to reference data collected by other modules during the current troubleshooting session. This is optimized for the **download → diagnose → troubleshoot → delete** workflow.

## Use Case

Since the toolkit is meant to be temporary (not permanently installed), we don't track historical trends. Instead, we enable **cross-module intelligence** so modules can make smarter recommendations based on what's happening RIGHT NOW.

## Example Scenarios

### Scenario 1: Bot Attack During System Load
```bash
# User runs System Health Check first
# Discovers: CPU at 95%, Memory at 92%, HIGH LOAD

# User then runs Bot Analyzer
# Bot analyzer checks: db_is_system_under_load
# Result: "High bot traffic detected, but system is already under load.
#          Performance issues may be partially due to system resources,
#          not just bots. Recommend addressing system load first."
```

### Scenario 2: Slow MySQL During Network Issues
```bash
# User runs System Health Check
# Discovers: TCP retransmission at 15%, HIGH network issues

# User then runs MySQL Query Analyzer
# MySQL analyzer checks: db_has_network_issues
# Result: "Slow queries detected, but network is experiencing high
#          retransmission rates. Some query timeouts may be network-
#          related rather than database performance."
```

### Scenario 3: Bot Attack + SSH Brute Force
```bash
# User runs System Health Check
# Discovers: 5,000 failed SSH attempts today

# User then runs Bot Analyzer
# Bot analyzer checks: db_is_under_attack
# Result: "Bot traffic detected AND system is under active SSH attack.
#          Recommend immediate firewall hardening and cPHulk enablement."
```

## Architecture

### Data Storage: Reference Database (`.sysref`)

The health check saves the current session's metrics to the `[HEALTH_BASELINE]` section:

**System Resources:**
- MEMORY_TOTAL_MB, MEMORY_USED_PERCENT
- CPU_LOAD_1MIN, CPU_CORES
- DISK_USED_PERCENT, IOWAIT_PERCENT

**Services:**
- HTTPD_STATUS, MYSQL_STATUS
- FIREWALL_STATUS, EMAIL_QUEUE_SIZE
- ZOMBIE_PROCESSES

**Network Status:**
- NETWORK_INTERFACE, NETWORK_MTU
- NETWORK_RX_ERRORS, NETWORK_TX_ERRORS
- NETWORK_RX_DROPPED, NETWORK_TX_DROPPED
- TCP_RETRANS_PERCENT

**Hardware Status:**
- DISK_SMART_STATUS
- HARDWARE_ERRORS

**Security Status:**
- SSH_FAILED_ATTEMPTS_TOTAL
- SSH_ATTACKS_TODAY
- CPHULK_STATUS

**Issue Counts:**
- CRITICAL_ISSUES, HIGH_ISSUES
- MEDIUM_ISSUES, LOW_ISSUES
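
As a rough illustration of how metrics like these could be written and read back, here is a minimal sketch. It assumes one `HEALTH|<KEY>|<VALUE>` record per line, which is a guess based on this document's mention of `HEALTH|` lines. The real `.sysref` layout may differ, and the `SYSREF_FILE` variable and the `sketch_*` function names are hypothetical, not taken from `lib/reference-db.sh`.

```bash
#!/usr/bin/env bash
# ASSUMPTION: one "HEALTH|<KEY>|<VALUE>" record per line.
# The actual .sysref format is not shown in this document.

SYSREF_FILE="${SYSREF_FILE:-.sysref}"

# Write (or overwrite) a single metric record.
sketch_set_health_metric() {
    local key="$1" value="$2" tmp
    tmp="$(mktemp)" || return 1
    # Keep every existing record except an older one for the same key.
    if [ -f "$SYSREF_FILE" ]; then
        grep -v "^HEALTH|${key}|" "$SYSREF_FILE" > "$tmp"
    fi
    printf 'HEALTH|%s|%s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$SYSREF_FILE"
}

# Print the value stored for a key; return 1 if it was never recorded.
sketch_get_health_metric() {
    local key="$1"
    [ -f "$SYSREF_FILE" ] || return 1
    awk -F'|' -v k="$key" '
        $1 == "HEALTH" && $2 == k { print $3; found = 1 }
        END { exit !found }
    ' "$SYSREF_FILE"
}
```

Returning a non-zero status for a missing key lets callers distinguish "metric is zero" from "health check has not run yet".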

### Helper Functions (`lib/reference-db.sh`)

#### Query Individual Metrics
```bash
value=$(db_get_health_metric "MEMORY_USED_PERCENT")
echo "Memory: $value%"
```

#### Intelligence Functions

**Check System Load:**
```bash
if db_is_system_under_load; then
    echo "System under heavy load (CPU > 80% or Memory > 90%)"
    # Adjust recommendations
fi
```

**Check Network Issues:**
```bash
if db_has_network_issues; then
    echo "Network problems detected (retrans > 5% or errors > 100)"
    # Consider network factors in analysis
fi
```

**Check Security Status:**
```bash
if db_is_under_attack; then
    echo "Active attacks detected (> 100 SSH failures today)"
    # Correlate with security findings
fi
```
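
The thresholds quoted in the three checks above can be expressed as small predicates. The sketch below is not the code in `lib/reference-db.sh`; it only illustrates the stated limits. It makes two assumptions: CPU percentage is approximated as 1-minute load divided by core count, and "errors > 100" means either the RX or the TX error counter exceeds 100. The `sketch_*` names and their argument order are hypothetical.

```bash
#!/usr/bin/env bash

# True (exit 0) when load per core exceeds 80% or memory use exceeds 90%.
# ASSUMPTION: CPU% is approximated as (1-minute load / cores) * 100.
sketch_is_under_load() {
    local load_1min="$1" cores="$2" mem_percent="$3"
    awk -v l="$load_1min" -v c="$cores" -v m="$mem_percent" 'BEGIN {
        l += 0; c += 0; m += 0          # force numeric; empty becomes 0
        if (c <= 0) c = 1               # guard against a missing core count
        cpu = (l / c) * 100
        exit !(cpu > 80 || m > 90)
    }'
}

# True when retransmission exceeds 5% or either error counter exceeds 100.
sketch_has_network_issues() {
    local retrans_percent="$1" rx_errors="$2" tx_errors="$3"
    awk -v r="$retrans_percent" -v rx="$rx_errors" -v tx="$tx_errors" 'BEGIN {
        r += 0; rx += 0; tx += 0
        exit !(r > 5 || rx > 100 || tx > 100)
    }'
}

# True when more than 100 SSH failures were recorded today.
sketch_is_under_attack() {
    local ssh_attacks_today="${1:-0}"
    [ "$ssh_attacks_today" -gt 100 ] 2>/dev/null
}
```

Using `awk` for the comparisons keeps fractional values such as a load of `7.6` or a retransmission rate of `0.5` working, which plain `[ -gt ]` cannot handle.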

#### Get All Metrics
```bash
db_get_all_health  # Returns all HEALTH| lines
```

## Implementation in Modules

### Pattern 1: Contextual Recommendations

```bash
# In any module, after sourcing reference-db.sh

# Check system context
if db_is_system_under_load; then
    echo "NOTE: System is currently under heavy load."
    echo "      Some issues may be resource-related."
fi

if db_has_network_issues; then
    echo "NOTE: Network experiencing high retransmission rates."
    echo "      Connection issues may be network-related."
fi

if db_is_under_attack; then
    echo "WARNING: System under active SSH attack."
    echo "         Security hardening recommended."
fi
```

### Pattern 2: Adjusted Thresholds

```bash
# MySQL slow query analyzer

# Normal threshold: 5 seconds
SLOW_THRESHOLD=5

# But if system is under load, adjust threshold
if db_is_system_under_load; then
    SLOW_THRESHOLD=10
    echo "System under load - using relaxed slow query threshold"
fi
```

### Pattern 3: Root Cause Analysis

```bash
# Website performance analyzer

if db_has_network_issues; then
    echo "Website slow, AND network has issues."
    echo "Root cause may be network, not website code."
    echo "Recommendation: Fix network first, then re-test."
fi
```

## Testing

Run the test script to verify cross-module intelligence:

```bash
# First, generate session data
./launcher.sh
# Choose option 1: System Health Check

# Then test intelligence
./tools/test-cross-module-intelligence.sh
```

Expected output shows:
- All health metrics populated
- Intelligence functions working
- System status correctly identified

## Best Practices

### DO:
✅ Run System Health Check **FIRST** in a troubleshooting session
✅ Use intelligence functions to provide context-aware recommendations
✅ Correlate findings across modules
✅ Adjust thresholds based on system state

### DON'T:
❌ Rely on this data for historical trend analysis (it's session-only)
❌ Assume data exists (always check that a metric is populated)
❌ Make critical decisions solely on this data
❌ Store this long-term (it gets cleaned up)
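
One way to honour the "always check" rule is a small wrapper that validates a metric before use. This is a sketch only: `metric_or_default` is a hypothetical helper, and the stand-in `db_get_health_metric` defined here exists purely so the example runs outside the toolkit. It also assumes the real function prints an empty string when a metric has not been recorded, which this document does not confirm.

```bash
#!/usr/bin/env bash

# Stand-in so the sketch runs outside the toolkit. Inside the toolkit,
# source lib/reference-db.sh instead. It returns canned sample data only.
if ! declare -F db_get_health_metric > /dev/null; then
    db_get_health_metric() {
        case "$1" in
            MEMORY_USED_PERCENT) echo "92" ;;
            *) echo "" ;;   # ASSUMPTION: unrecorded metrics print nothing
        esac
    }
fi

# Hypothetical helper: print the metric if it is a non-negative number.
# Otherwise print the fallback and return 1 so callers know data was missing.
metric_or_default() {
    local key="$1" fallback="$2" value
    value="$(db_get_health_metric "$key" 2>/dev/null)"
    if [[ "$value" =~ ^[0-9]+([.][0-9]+)?$ ]]; then
        echo "$value"
    else
        echo "$fallback"
        return 1
    fi
}

if mem="$(metric_or_default MEMORY_USED_PERCENT 0)"; then
    echo "Memory: ${mem}%"
else
    echo "Memory: not recorded yet (run System Health Check first)"
fi
```

The numeric check also protects later arithmetic comparisons from status strings or other unexpected values.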

## Example: Enhanced Bot Analyzer (Future)

```bash
# modules/security/bot-analyzer.sh

source "$SCRIPT_DIR/lib/reference-db.sh"

# After analysis, provide context

if db_has_network_issues; then
    echo ""
    print_warning "Network Issues Detected"
    echo "System experiencing:"
    echo "  • TCP Retransmission: $(db_get_health_metric 'TCP_RETRANS_PERCENT')%"
    echo "  • Network errors: $(db_get_health_metric 'NETWORK_RX_ERRORS')"
    echo ""
    echo "Bot traffic may be compounded by network problems."
    echo "Recommendation: Address network issues first (see System Health Check)"
fi

if db_is_system_under_load; then
    echo ""
    print_warning "System Under Heavy Load"
    echo "Current state:"
    echo "  • CPU Load: $(db_get_health_metric 'CPU_LOAD_1MIN')"
    echo "  • Memory: $(db_get_health_metric 'MEMORY_USED_PERCENT')%"
    echo ""
    echo "High bot traffic + system load = performance degradation."
    echo "Recommendation: Block bots AND investigate resource usage."
fi
```

## Files Modified

1. **modules/diagnostics/system-health-check.sh**
   - Enhanced `save_health_baseline()` function
   - Now saves network, hardware, and security metrics
   - Lines: 1660-1758

2. **lib/reference-db.sh**
   - Added `db_get_health_metric()` - query individual metrics
   - Added `db_is_system_under_load()` - check if CPU/memory high
   - Added `db_has_network_issues()` - check for network problems
   - Added `db_is_under_attack()` - check for active attacks
   - Added `db_get_all_health()` - get all health data
   - Lines: 446-497

3. **tools/test-cross-module-intelligence.sh** (NEW)
   - Test script demonstrating cross-module queries
   - Shows how to use intelligence functions

## Data Lifetime

- **Created:** When System Health Check runs
- **Stored:** In `.sysref` file (memory + disk)
- **Expires:** After 1 hour OR when cleanup/reset runs
- **Removed:** When toolkit is deleted
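
The one-hour expiry could be checked with the file's modification time. The following is a minimal sketch, not the toolkit's actual cleanup logic. The function name `sketch_session_expired` is hypothetical, and it relies on `find -mmin`, which is available in GNU and BSD `find` but is not part of POSIX.

```bash
#!/usr/bin/env bash

# True (exit 0) if the file is missing or was last modified more than
# max_age_min minutes ago (60 by default, matching the 1-hour expiry).
sketch_session_expired() {
    local file="$1" max_age_min="${2:-60}"
    [ -f "$file" ] || return 0
    # find prints the path only when the mtime is older than the limit.
    [ -n "$(find "$file" -mmin +"$max_age_min" 2>/dev/null)" ]
}

# Usage: warn before trusting stale session data.
if sketch_session_expired ".sysref"; then
    echo "Session data missing or stale - run System Health Check again."
fi
```

Treating a missing file as expired means callers need only one check before falling back to "no session data".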

## Future Enhancements

Potential modules that could benefit:

1. **WordPress Health Check**
   - Check if slow WP sites correlate with network/load issues

2. **Backup Analyzer**
   - Check if backup failures correlate with disk/load issues

3. **Email Troubleshooter**
   - Check if email issues correlate with network/disk problems

4. **Resource Monitor**
   - Compare current metrics vs health check baseline

## Summary

Session Intelligence transforms the toolkit from **isolated modules** into an **integrated diagnostic platform**. Each module can now make smarter, context-aware recommendations based on the complete picture of what's happening on the server RIGHT NOW.

No historical data needed. No complex trending. Just smart, session-aware troubleshooting.