Add network-ping command for fast ICMP connectivity testing #75

@inureyes

Problem

Currently, the bssh ping command tests SSH connectivity by establishing a full SSH connection, authenticating, and executing a command (echo 'pong'). While this is useful for verifying SSH service availability, it's relatively slow (2-5 seconds per host due to SSH handshake and authentication overhead).

Users often want to quickly check basic network connectivity across all cluster nodes without the SSH overhead, similar to the traditional ping command that uses ICMP packets and responds in milliseconds.

Proposed Solution

Add a new network-ping command that performs fast ICMP ping tests across all cluster nodes to verify basic network connectivity.

Implementation Details

CLI Changes

Add a new subcommand to src/cli.rs:

#[derive(Debug, Subcommand)]
pub enum Commands {
    // ... existing commands ...
    
    /// Test network connectivity using ICMP ping (fast)
    #[command(visible_alias = "nping")]
    NetworkPing {
        /// Number of ping packets to send (default: 4)
        #[arg(short = 'c', long, default_value = "4")]
        count: u32,
        
        /// Timeout for each ping in seconds (default: 2)
        #[arg(short = 't', long, default_value = "2")]
        timeout: u64,
    },
}

Core Implementation

Create src/commands/network_ping.rs:

  • Use a Rust ICMP ping library (see options below)
  • Perform parallel ping tests across all nodes
  • Display results with latency statistics (min/avg/max/stddev)
  • Show packet loss percentage
  • Color-coded output (green for good latency, yellow for medium, red for high/loss)
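The statistics bullets above can be sketched with the standard library alone. This is a minimal sketch, not existing bssh code: the `PingStats` name and the `from_latencies(latencies, lost, count)` shape are taken from the skeleton later in this issue, while the field layout and the choice of population standard deviation are assumptions.

```rust
use std::time::Duration;

/// Per-host ping summary. The field layout is illustrative, not an existing bssh type.
#[derive(Debug, Clone, PartialEq)]
pub struct PingStats {
    pub sent: u32,
    pub lost: u32,
    /// (min, avg, max, stddev) in milliseconds; `None` when every packet was lost.
    pub rtt_ms: Option<(f64, f64, f64, f64)>,
}

impl PingStats {
    pub fn from_latencies(latencies: Vec<Duration>, lost: u32, sent: u32) -> Self {
        if latencies.is_empty() {
            return Self { sent, lost, rtt_ms: None };
        }
        // Convert to milliseconds once so all statistics share a unit.
        let ms: Vec<f64> = latencies.iter().map(|d| d.as_secs_f64() * 1000.0).collect();
        let n = ms.len() as f64;
        let min = ms.iter().copied().fold(f64::INFINITY, f64::min);
        let max = ms.iter().copied().fold(f64::NEG_INFINITY, f64::max);
        let avg = ms.iter().sum::<f64>() / n;
        // Population standard deviation (divide by n, not n - 1); an assumption.
        let variance = ms.iter().map(|x| (x - avg).powi(2)).sum::<f64>() / n;
        Self { sent, lost, rtt_ms: Some((min, avg, max, variance.sqrt())) }
    }

    /// Packet loss as a percentage of packets sent.
    pub fn loss_percent(&self) -> f64 {
        if self.sent == 0 {
            0.0
        } else {
            100.0 * self.lost as f64 / self.sent as f64
        }
    }
}
```

Keeping `rtt_ms` as an `Option` lets the output layer print "Host unreachable" instead of meaningless zero latencies when no reply arrived.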

Suggested Libraries

Option 1: surge-ping (Recommended)

  • Pure Rust, async-friendly with tokio
  • Cross-platform (Linux, macOS, Windows)
  • No external system libraries required (Rust crate dependencies only)
  • Supports both privileged and unprivileged ICMP

[dependencies]
surge-ping = "0.8"

Option 2: fastping-rs

  • Simple API, battle-tested
  • Requires raw sockets (may need elevated privileges)

Option 3: pnet (lower-level)

  • Full network protocol suite
  • More complex but very flexible
  • Requires privileged access

Example Implementation Skeleton

use anyhow::Result;
use surge_ping::{Client, Config, PingIdentifier, PingSequence};
use std::net::IpAddr;
use std::time::Duration;
use tokio::time::timeout;
// `Node` is assumed to be bssh's existing node type (with a `host` field); import it from its module.

pub async fn network_ping_nodes(
    nodes: Vec<Node>,
    count: u32,
    ping_timeout: u64,
    max_parallel: usize,
) -> Result<()> {
    // Create ICMP client (Config::default() selects ICMPv4; IPv6 targets need a client built with ICMP::V6)
    let client = Client::new(&Config::default())?;
    
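    // Spawn one task per node. tokio::spawn starts each task immediately, so
    // max_parallel is not enforced here; the elided logic below must cap
    // concurrency (for example by gating tasks on a tokio::sync::Semaphore).
    let tasks: Vec<_> = nodes.iter().map(|node| {
        let client = client.clone();
        let host = node.host.clone();
        tokio::spawn(async move {
            ping_host(&client, &host, count, ping_timeout).await
        })
    }).collect();
    
    // Await the tasks, honoring the concurrency limit
    // ... parallel execution logic ...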
    
    // Display results with statistics
    // ... formatting and output ...
    
    Ok(())
}

async fn ping_host(
    client: &Client,
    host: &str,
    count: u32,
    timeout_secs: u64,
) -> Result<PingStats> {
    // Accepts IP literals only; hostnames must be resolved to an IpAddr before this point
    let addr: IpAddr = host.parse()?;
    let mut pinger = client.pinger(addr, PingIdentifier(rand::random())).await;
    
    let mut latencies = Vec::new();
    let mut lost = 0;
    
    for seq in 0..count {
        match timeout(
            Duration::from_secs(timeout_secs),
            pinger.ping(PingSequence(seq as u16), &[])
        ).await {
            Ok(Ok((_, duration))) => latencies.push(duration),
            _ => lost += 1,
        }
    }
    
    Ok(PingStats::from_latencies(latencies, lost, count))
}
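The skeleton leaves the concurrency limit elided. In the async code this would typically be a `tokio::sync::Semaphore` or a buffered stream; the sketch below deliberately swaps in plain `std` scoped threads so that it runs without extra crates. It only illustrates the bounded-worker idea behind `max_parallel`. The `run_bounded` name and signature are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Runs `job` over every item with at most `max_parallel` worker threads and
/// returns the results in input order. Hypothetical helper for illustration.
pub fn run_bounded<T, R, F>(items: &[T], max_parallel: usize, job: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    // Never start more workers than there are items, and always at least one.
    let workers = max_parallel.max(1).min(items.len().max(1));
    let next = AtomicUsize::new(0);
    let results: Mutex<Vec<Option<R>>> =
        Mutex::new((0..items.len()).map(|_| None).collect());

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                // Claim the next unprocessed index; stop once all are taken.
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= items.len() {
                    break;
                }
                let out = job(&items[i]);
                results.lock().unwrap()[i] = Some(out);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|r| r.expect("every index is processed exactly once"))
        .collect()
}
```

Because each worker pulls the next index only after finishing its current job, a slow or unreachable host delays one worker rather than the whole batch.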

Output Format

▶ Network Ping Test Results (4 nodes)

  ● 10.100.64.101  4/4 packets  min/avg/max = 0.5/1.2/2.1 ms
  ● 10.100.64.102  4/4 packets  min/avg/max = 0.8/1.5/2.3 ms
  ● 10.100.64.103  3/4 packets  min/avg/max = 1.2/2.1/3.0 ms  (25% loss)
  ● 10.100.64.104  0/4 packets  - Host unreachable

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Summary: 3 reachable, 1 unreachable (75% success rate)
Average latency: 1.6 ms
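The summary lines can be derived from the per-host rows. The following is a rough sketch under two assumptions that the issue does not state explicitly: a host counts as reachable if at least one reply arrived, and the average latency is the mean of the per-host averages over reachable hosts. `HostResult` and `summarize` are hypothetical names.

```rust
/// One row of the results table. Hypothetical type for illustration only.
pub struct HostResult {
    pub host: String,
    pub sent: u32,
    pub received: u32,
    /// Average round-trip time in milliseconds; `None` if nothing came back.
    pub avg_ms: Option<f64>,
}

/// Builds the two summary lines printed under the separator.
pub fn summarize(results: &[HostResult]) -> String {
    let total = results.len();
    // Assumption: at least one reply means the host is reachable.
    let reachable = results.iter().filter(|r| r.received > 0).count();
    let rate = if total == 0 {
        0.0
    } else {
        100.0 * reachable as f64 / total as f64
    };

    // Mean of the per-host averages, over hosts that answered at all.
    let avgs: Vec<f64> = results.iter().filter_map(|r| r.avg_ms).collect();
    let latency = if avgs.is_empty() {
        "n/a".to_string()
    } else {
        format!("{:.1} ms", avgs.iter().sum::<f64>() / avgs.len() as f64)
    };

    format!(
        "Summary: {} reachable, {} unreachable ({:.0}% success rate)\nAverage latency: {}",
        reachable,
        total - reachable,
        rate,
        latency
    )
}
```

Fed the four sample hosts listed above (averages 1.2, 1.5 and 2.1 ms plus one unreachable host), this reports 3 of 4 hosts reachable and an average latency of 1.6 ms.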

Usage Examples

# Basic network ping to all cluster nodes
bssh -C production network-ping

# Custom packet count and timeout
bssh -C production network-ping -c 10 -t 1

# Quick alias
bssh -C production nping

# With specific hosts
bssh -H "host1,host2,host3" network-ping
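The `-H "host1,host2,host3"` example passes hostnames, whereas the implementation skeleton parses each host directly as an `IpAddr`, which only accepts IP literals. One possible way to bridge that gap is a small resolution step, sketched here with the standard library. The `resolve_host` name is hypothetical, and the lookup is blocking, so async code would likely prefer `tokio::net::lookup_host` or run this inside `spawn_blocking`.

```rust
use std::io;
use std::net::{IpAddr, ToSocketAddrs};

/// Resolves `host` to an IP address, accepting an IP literal or a hostname.
/// Hypothetical helper; returns the first address the resolver yields.
pub fn resolve_host(host: &str) -> io::Result<IpAddr> {
    // Fast path: IP literals need no lookup.
    if let Ok(ip) = host.parse::<IpAddr>() {
        return Ok(ip);
    }
    // `ToSocketAddrs` requires a port; 0 is a placeholder that is discarded.
    let mut addrs = (host, 0u16).to_socket_addrs()?;
    addrs.next().map(|sa| sa.ip()).ok_or_else(|| {
        io::Error::new(
            io::ErrorKind::NotFound,
            format!("no address found for {host}"),
        )
    })
}
```

Taking the first returned address is a simplification; a real implementation might prefer IPv4 or try each address in turn.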

Comparison: ping vs network-ping

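| Command      | Purpose                   | Speed           | Tests              |
|--------------|---------------------------|-----------------|--------------------|
| ping         | SSH connectivity test     | 2-5s per node   | SSH service + auth |
| network-ping | Network connectivity test | <100ms per node | ICMP reachability  |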

Files to Modify/Create

src/
├── cli.rs                       # Add NetworkPing subcommand
├── commands/
│   ├── mod.rs                  # Export network_ping module
│   └── network_ping.rs         # New: ICMP ping implementation
└── main.rs                      # Route NetworkPing command

Cargo.toml                       # Add surge-ping dependency
README.md                        # Document network-ping command

Dependencies to Add

[dependencies]
surge-ping = "0.8"               # ICMP ping library

Security Considerations

ICMP Raw Sockets:

  • On Linux: May require CAP_NET_RAW capability or root privileges
  • On macOS: Generally works without special permissions
  • On Windows: Requires administrator privileges

Solutions:

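  1. Use surge-ping with unprivileged mode (SOCK_DGRAM) when possible; on Linux this only works if the process's group ID falls within the net.ipv4.ping_group_range sysctl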
  2. Document privilege requirements in README
  3. Fall back to TCP ping if ICMP is unavailable
  4. Provide clear error messages if permissions are insufficient
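For the last point, a clear message could be produced by inspecting the error kind when socket creation fails. This is a minimal sketch assuming the failure surfaces as a `std::io::Error`; `describe_socket_error` and the message wording are hypothetical.

```rust
use std::io;

/// Turns a socket-creation failure into a user-facing hint.
/// Hypothetical helper; the wording is only a suggestion.
pub fn describe_socket_error(err: &io::Error) -> String {
    match err.kind() {
        // Both EPERM and EACCES map to PermissionDenied in Rust's std.
        io::ErrorKind::PermissionDenied => String::from(
            "permission denied opening ICMP socket: run with elevated privileges, \
             grant CAP_NET_RAW, or allow unprivileged ICMP via net.ipv4.ping_group_range",
        ),
        _ => format!("failed to open ICMP socket: {err}"),
    }
}
```

The same check is a natural trigger for the TCP fallback listed in point 3.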

Testing Plan

  1. Unit Tests:

    • Ping parsing and statistics calculation
    • Timeout handling
    • Packet loss detection
  2. Integration Tests:

    • Single host ping
    • Multi-host parallel ping
    • Timeout scenarios
    • Unreachable hosts
  3. Manual Testing:

    • Test on Linux (various distros)
    • Test on macOS
    • Test with different privilege levels
    • Compare with system ping command

Alternative Implementations

If ICMP proves problematic, consider:

  1. TCP Ping: Connect to SSH port without authentication
  2. HTTP Ping: If nodes have web services
  3. Hybrid: Try ICMP first, fall back to TCP
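The TCP ping alternative needs only a timed connect to the SSH port, with no handshake or authentication. Below is a hedged sketch using the standard library's blocking `TcpStream::connect_timeout`; the `tcp_ping` name is hypothetical, and an async implementation would more likely wrap `tokio::net::TcpStream::connect` in a timeout.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::{Duration, Instant};

/// Returns the time taken to open a TCP connection to `addr`, or `None` if
/// the connection is refused, fails, or exceeds `timeout`.
/// Hypothetical helper for illustration.
pub fn tcp_ping(addr: SocketAddr, timeout: Duration) -> Option<Duration> {
    let start = Instant::now();
    // The stream is dropped immediately: no SSH handshake or authentication.
    TcpStream::connect_timeout(&addr, timeout)
        .ok()
        .map(|_stream| start.elapsed())
}
```

For the hybrid option, the caller would try ICMP first and call something like `tcp_ping` on port 22 only when the ICMP socket cannot be opened. Note that this sketch reports a refused connection as a failure even though a refusal shows the host itself is up; whether that counts as reachable is a design decision left open here.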

Related Issues

  • Complements existing ping command (SSH connectivity test)
  • Part of broader cluster management features

Priority

Low - A nice-to-have feature for quick network checks; the existing ping command already provides SSH connectivity testing.
