## Cross-cutting Concerns

### Auto-Reconnection
```mermaid
stateDiagram-v2
    [*] --> Connected
    Connected --> KeepaliveFail: SSH keepalive timeout<br/>(every 15s)
    Connected --> TransportError: Network / Xray error
    KeepaliveFail --> Cleanup: Close local listeners<br/>+ SSH connection
    TransportError --> Cleanup
    Cleanup --> Wait_2s: Attempt 1-8
    Cleanup --> Wait_4s: Attempt 9
    Cleanup --> Wait_8s: Attempt 10
    Cleanup --> Wait_16s: Attempt 11
    Cleanup --> Wait_30s: Attempt 12+
    Wait_2s --> Reconnecting
    Wait_4s --> Reconnecting
    Wait_8s --> Reconnecting
    Wait_16s --> Reconnecting
    Wait_30s --> Reconnecting
    Reconnecting --> Connected: Success<br/>(resets backoff)
    Reconnecting --> Cleanup: Failure
```

Both the forward tunnel (client) and the reverse tunnel (server) implement exponential-backoff reconnection:

- Backoff: 2s -> 4s -> 8s -> 16s -> 30s (max)
- Keepalive: an SSH keepalive is sent every 15 seconds; on failure it triggers a reconnect
- TCP keepalive: a 30-second TCP keepalive is set on all connections
- Forward tunnel cleanup: on keepalive failure, all local listeners are closed first (unblocking `Accept` loops), then the SSH connection is closed, triggering the reconnect loop
### Dynamic User Management

The SSH server re-reads `authorized_keys` on every authentication attempt. This means:

- `tw create user` takes effect immediately -- no need to restart `tw serve`
- Revoking a user (removing their key from `authorized_keys`) takes effect on the next connection attempt
- Each key entry can have independent `permitopen` restrictions
### Permitopen Enforcement

When the SSH server authenticates a client, it parses `permitopen` options from the matching `authorized_keys` entry and stores them in `gossh.Permissions.Extensions["permitopen"]`. On every `direct-tcpip` channel request, the server checks the target `host:port` against the permitted list. If no `permitopen` options are set (e.g., for the server's own key), all destinations are allowed.
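The per-channel check reduces to a small lookup; `permitted` is a hypothetical helper name for the logic described above:

```go
package main

import (
	"fmt"
	"strings"
)

// permitted reports whether target ("host:port") matches one of the
// permitopen values stored at authentication time. An empty list means the
// key carried no permitopen options, so every destination is allowed (as
// with the server's own key). A minimal sketch of the described behavior.
func permitted(permitopen []string, target string) bool {
	if len(permitopen) == 0 {
		return true
	}
	for _, allowed := range permitopen {
		if strings.EqualFold(allowed, target) {
			return true
		}
	}
	return false
}

func main() {
	rules := []string{"10.0.0.5:22", "10.0.0.5:443"}
	fmt.Println(permitted(rules, "10.0.0.5:22")) // allowed
	fmt.Println(permitted(rules, "10.0.0.9:22")) // denied
	fmt.Println(permitted(nil, "anywhere:1234")) // no rules: allowed
}
```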
### Relay Config Updates

```mermaid
sequenceDiagram
    participant S as Server (tw)
    participant TX as Temp Xray (configurable port)
    participant R as Relay (SSH)
    participant XR as Relay Xray
    S ->> TX: Start temporary Xray instance
    S ->> R: SSH through temp tunnel
    S ->> R: sudo cat /usr/local/etc/xray/config.json
    R -->> S: JSON config
    S ->> S: Parse JSON, check duplicate UUID, add client
    S ->> R: sudo tee config.json (updated)
    S ->> XR: gRPC AlterInbound / AddUserOperation
    alt gRPC fails
        S ->> R: sudo systemctl restart xray
    end
    S ->> TX: Stop temporary Xray
    Note over TX: Tunnel destroyed (expected)
```

`tw create user` updates the relay's Xray config remotely:

- Starts a temporary Xray instance (dokodemo-door on `server.temp_xray_port + 1`, default 59001, separate from `tw serve`)
- SSHs into the relay through the temporary tunnel using the server's SSH key
- Reads `/usr/local/etc/xray/config.json` via `sudo cat`
- Parses the JSON, checks for a duplicate UUID, and adds the new client entry
- Writes the updated config via `sudo tee /usr/local/etc/xray/config.json`
- Hot-adds the UUID via the Xray gRPC API (`AlterInbound`/`AddUserOperation`); falls back to `systemctl restart xray` if the API call fails
### Transport Protocol
Xray VLESS + XHTTP over TLS:
- VLESS: Lightweight proxy protocol with UUID-based authentication
- XHTTP: HTTP-based transport that splits data into standard HTTP requests/responses
- TLS: Terminated by Caddy on the relay; SNI matches the relay domain
- Result: Traffic is indistinguishable from normal HTTPS browsing to firewalls and DPI
### Config Change Detection

```mermaid
flowchart LR
    subgraph "On Startup"
        A[Read config.yaml] --> B["SHA-256 hash<br/>(FileHash)"]
        B --> C["Save as cfgHash"]
    end
    subgraph "Dashboard Poll (every 3s)"
        D["/api/status"] --> E["FileHash()"]
        E --> F{cfgHash ==<br/>current hash?}
        F -->|Yes| G["No change"]
        F -->|No| H["Show banner:<br/>'Config changed.<br/>Restart to apply.'"]
    end
    subgraph "On Restart"
        I[User clicks Restart] --> J[Reload config from disk]
        J --> K[Apply new log level]
        K --> L["Save new cfgHash"]
    end
    style C fill:#1565C0,color:#fff
    style H fill:#E65100,color:#fff
    style L fill:#00897B,color:#fff
```

`config.FileHash()` computes a SHA-256 digest of the raw config file on disk. Unlike `Config.Hash()` (which marshals the parsed struct), `FileHash()` captures all changes, including unknown fields, comments, and formatting differences.
On server or client start, the current file hash is saved as `cfgHash` in the respective manager struct. The `ConfigChanged()` method on `Ops` compares the current file hash against the startup hash:

```go
func (o *Ops) ConfigChanged() bool {
	currentHash := config.FileHash()
	// Compare against the server's cfgHash if running...
	// Compare against the client's cfgHash if running...
}
```
The dashboard polls `ConfigChanged()` every 3 seconds via the `/api/status` endpoint. When a change is detected, the UI shows a banner:

> "Configuration has changed. Restart/Reconnect to apply."

**Changes take effect on restart**

Config changes are never applied live to a running server or client. The user must explicitly restart (server) or reconnect (client) to pick up the new configuration. On restart, the config is reloaded from disk and the new log level is applied via `logging.SetLevel()`.
### Mode Enforcement

The `mode` field in `config.yaml` can be `"server"`, `"client"`, or empty. When set:

- CLI: `requireMode()` in `root.go` checks the configured mode before executing a command. Server-only commands (e.g., `tw serve`, `tw create user`) return an error in client mode, and vice versa.
- Dashboard: the `pageData.Mode` field is passed to all templates. Navigation links and page content adapt -- server-only pages (relay management, user management) are hidden when mode is `"client"`.
```go
func requireMode(expected string) error {
	cfg, err := config.Load()
	if err != nil {
		return err
	}
	if cfg.Mode != "" && cfg.Mode != expected {
		return fmt.Errorf("this is a %s command, but tw is configured in %s mode",
			expected, cfg.Mode)
	}
	return nil
}
```
### User Online Tracking

```mermaid
flowchart TD
    REQ["/api/users/online"] --> CACHE{Cache valid?<br/>TTL 20s}
    CACHE -->|Yes| RET[Return cached result]
    CACHE -->|No| QUERY["GetOnlineUsers()<br/><i>gRPC via server tunnel → relay :10085</i>"]
    QUERY --> PRIMARY["QueryStats 'online'<br/><i>user>>>UUID>>>online</i>"]
    PRIMARY --> HAS{Stats<br/>available?}
    HAS -->|Yes| MAP[Map UUID → online]
    HAS -->|No| FALLBACK["QueryStats 'user>>>'<br/><i>Reset: true, check traffic</i>"]
    FALLBACK --> TRAFFIC{Non-zero<br/>uplink/downlink?}
    TRAFFIC -->|Yes| ONLINE[Mark UUID online]
    TRAFFIC -->|No| OFFLINE[Mark UUID offline]
    ONLINE --> MAP
    OFFLINE --> MAP
    MAP --> SAVE["Cache result<br/>(20s TTL)"]
    SAVE --> RET
    style REQ fill:#1565C0,color:#fff
    style RET fill:#00897B,color:#fff
    style FALLBACK fill:#E65100,color:#fff
```

The server tracks which client users are currently connected by polling the relay's Xray Stats API:
- Stats query: `GetOnlineUsers()` connects to the relay's Xray gRPC API (port `10085`) via the server's already-running Xray tunnel (using `sshThroughServerTunnel()`, which avoids creating a temporary Xray instance).
- Primary method: queries `QueryStats` with pattern `"online"`, looking for `user>>>{UUID}>>>online` stats entries (Xray's `statsUserOnline` feature).
- Fallback: if no online stats are available, falls back to traffic-based detection: queries the `user>>>` pattern with `Reset_: true`, and any UUID with non-zero `traffic>>>uplink` or `traffic>>>downlink` since the last poll is considered online. The server's own UUID is excluded.
- Caching: results are cached with a 20-second TTL (`onlinePoll` timestamp). Concurrent refresh attempts return the stale cache instead of blocking.
- Relay setup: `EnsureRelayStats()` runs at server startup, patching the relay's Xray config to add `stats`, `StatsService`, and `policy` (both system-level and user-level stats) if missing. If patching occurs, Xray is restarted on the relay.
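The primary detection path boils down to parsing stat names of the form `user>>>{UUID}>>>online`; `onlineFromStats` is a hypothetical helper sketching that step (the real code receives these entries over gRPC from the relay):

```go
package main

import (
	"fmt"
	"strings"
)

// onlineFromStats interprets QueryStats results named
// "user>>>{UUID}>>>online"; a non-zero value marks the UUID online.
// Entries that do not match the three-part user/online pattern (e.g.
// traffic counters) are ignored by this path.
func onlineFromStats(stats map[string]int64) map[string]bool {
	online := map[string]bool{}
	for name, value := range stats {
		parts := strings.Split(name, ">>>")
		if len(parts) == 3 && parts[0] == "user" && parts[2] == "online" {
			online[parts[1]] = value > 0
		}
	}
	return online
}

func main() {
	stats := map[string]int64{
		"user>>>aaaa-1111>>>online": 1,
		"user>>>bbbb-2222>>>online": 0,
		"traffic>>>uplink":          4096, // ignored by the primary path
	}
	fmt.Println(onlineFromStats(stats))
}
```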
The dashboard's users page calls `/api/users/online`, which returns the online map. Online users are shown with a badge in the UI.

**Relay compatibility**

The `statsUserOnline` feature requires Xray v26.2.6+. Older relays fall back to traffic-based detection, which has lower granularity (a user appears online only while actively transferring data).
## Dashboard Architecture

The dashboard is a server-rendered web application using Go templates and vanilla JavaScript. All assets (templates, CSS, JS, xterm.js vendor files) are embedded in the binary via `go:embed`.
### Log Streaming

The `teeHandler` wraps the existing `slog.Handler` to duplicate every log record into a `logBuffer` ring buffer (capacity: 500 entries). This architecture preserves the original handler chain -- including the dynamic `slog.LevelVar` for runtime log level changes -- while feeding the dashboard.
```mermaid
flowchart LR
    SLOG["slog.Default()"] --> TEE["teeHandler"]
    TEE --> INNER["slog.TextHandler<br/>→ stderr"]
    TEE --> BUF["logBuffer<br/><i>ring, 500 entries</i>"]
    BUF --> S1["Subscriber 1<br/><i>chan cap 64</i>"]
    BUF --> S2["Subscriber 2<br/><i>chan cap 64</i>"]
    BUF --> SN["Subscriber N<br/><i>chan cap 64</i>"]
    S1 --> SSE1["SSE /api/logs"]
    S2 --> SSE2["SSE /api/logs"]
    SN --> SSEN["SSE /api/logs"]
    style SLOG fill:#1565C0,color:#fff
    style TEE fill:#1976D2,color:#fff
    style BUF fill:#7E57C2,color:#fff
    style SSE1 fill:#00897B,color:#fff
    style SSE2 fill:#00897B,color:#fff
    style SSEN fill:#00897B,color:#fff
```

Each SSE subscriber gets a buffered channel (capacity 64). Slow subscribers have events dropped rather than blocking the log pipeline.
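The drop-instead-of-block behavior comes from a non-blocking channel send. The types and names below (`ringLog`, `publish`) are illustrative, not the real identifiers:

```go
package main

import "fmt"

// ringLog keeps the last max entries and fans records out to subscribers
// without ever blocking the logging path.
type ringLog struct {
	entries []string
	max     int
	subs    []chan string
}

func (r *ringLog) publish(line string) {
	r.entries = append(r.entries, line)
	if len(r.entries) > r.max {
		r.entries = r.entries[1:] // evict the oldest entry
	}
	for _, ch := range r.subs {
		select {
		case ch <- line: // fast subscriber: deliver
		default: // slow subscriber: drop rather than block slog
		}
	}
}

func main() {
	r := &ringLog{max: 3}
	sub := make(chan string, 2) // tiny buffer to demonstrate dropping
	r.subs = append(r.subs, sub)
	for i := 1; i <= 5; i++ {
		r.publish(fmt.Sprintf("record %d", i))
	}
	fmt.Println("ring holds:", r.entries)    // last 3 records
	fmt.Println("subscriber got:", len(sub)) // 2 delivered, 3 dropped
}
```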
### Progress Events (SSE)

Long-running operations (relay provisioning, server start, user creation) report progress via `ProgressFunc` callbacks. The SSE hub (`sseHub`) manages sessions:
```mermaid
sequenceDiagram
    participant B as Browser
    participant API as REST API
    participant HUB as SSE Hub
    participant OPS as Ops Layer
    B ->> API: POST /api/server/start
    API ->> HUB: sseHub.create()
    HUB -->> API: sessionID + ProgressFunc
    API -->> B: {"session_id": "abc123"}
    B ->> HUB: EventSource /api/events/abc123
    OPS ->> OPS: Start operation...
    OPS ->> HUB: ProgressFunc(step 1/5, "Loading config")
    HUB -->> B: SSE event: {step:1, total:5, msg:"Loading config"}
    OPS ->> HUB: ProgressFunc(step 2/5, "Starting SSH")
    HUB -->> B: SSE event: {step:2, total:5, msg:"Starting SSH"}
    OPS ->> HUB: ProgressFunc(step 5/5, "completed")
    HUB -->> B: SSE event: {step:5, total:5, status:"completed"}
    HUB ->> HUB: Close session
```

- The dashboard initiates an operation via a REST API call (e.g., `POST /api/server/start`)
- The handler creates an SSE session with `sseHub.create()`, which returns a session ID and a `ProgressFunc`
- The session ID is returned to the browser in the JSON response
- The browser opens an `EventSource` connection to `/api/events/{sessionID}`
- Progress events flow: `ops` method -> `ProgressFunc` -> `sseSession.ch` -> SSE stream -> browser
- Terminal events (`status: "completed"` with `step == total`, or `status: "failed"`) close the session
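The event payloads in the diagram can be modeled as a small struct. The field names here are assumptions based on the diagram, not the exact wire format:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// progressEvent mirrors the SSE payload shape shown in the diagram:
// {step, total, msg, status}.
type progressEvent struct {
	Step   int    `json:"step"`
	Total  int    `json:"total"`
	Msg    string `json:"msg,omitempty"`
	Status string `json:"status,omitempty"`
}

// marshalProgress renders one event the way the hub would before writing it
// to the session channel; step == total marks the terminal event that
// closes the session.
func marshalProgress(step, total int, msg string) string {
	ev := progressEvent{Step: step, Total: total, Msg: msg}
	if step == total {
		ev.Status = "completed"
	}
	b, _ := json.Marshal(ev)
	return string(b)
}

func main() {
	// SSE frames are written as "data: <json>" lines on /api/events/{id}.
	fmt.Println("data:", marshalProgress(1, 5, "Loading config"))
	fmt.Println("data:", marshalProgress(5, 5, ""))
}
```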
### WebSocket SSH Terminal
```mermaid
sequenceDiagram
    participant XT as xterm.js (Browser)
    participant WS as WebSocket Handler
    participant SSH as SSH Session
    participant R as Relay (PTY)
    XT ->> WS: WebSocket upgrade
    WS ->> SSH: ops.RelaySSH() via Xray tunnel
    SSH ->> R: Request PTY (xterm-256color, 80x24)
    R -->> SSH: PTY allocated
    par stdout goroutine
        R -->> SSH: terminal output
        SSH -->> WS: binary message
        WS -->> XT: render in xterm.js
    and stdin goroutine
        XT ->> WS: binary (keystrokes)
        WS ->> SSH: stdin
        SSH ->> R: input
    end
    XT ->> WS: text: {"type":"resize","cols":120,"rows":40}
    WS ->> SSH: WindowChange(120, 40)
```

The `/api/relay/ssh` endpoint provides a browser-based SSH terminal to the relay:

- The browser opens a WebSocket connection
- The server upgrades the connection and establishes an SSH session to the relay via the Xray tunnel (`ops.RelaySSH()`)
- A PTY is requested (`xterm-256color`, 80x24)
- Two goroutines bridge the streams:
    - SSH stdout -> WebSocket: binary messages carry terminal output
    - WebSocket -> SSH stdin: binary messages carry keyboard input; text messages carry JSON control frames (`{"type":"resize","cols":120,"rows":40}`)
- The browser renders the terminal using xterm.js with the fit addon for automatic resizing
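The text-message branch of the handler can be sketched as below. The type and helper names are assumptions; note that `Session.WindowChange` in `golang.org/x/crypto/ssh` takes height (rows) before width (cols):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// controlFrame models the JSON text messages sent alongside binary
// keystroke data, e.g. {"type":"resize","cols":120,"rows":40}.
type controlFrame struct {
	Type string `json:"type"`
	Cols int    `json:"cols"`
	Rows int    `json:"rows"`
}

// parseResize returns (rows, cols, true) for a resize frame; other frame
// types are ignored by this sketch.
func parseResize(msg []byte) (rows, cols int, ok bool) {
	var f controlFrame
	if err := json.Unmarshal(msg, &f); err != nil || f.Type != "resize" {
		return 0, 0, false
	}
	return f.Rows, f.Cols, true
}

func main() {
	rows, cols, ok := parseResize([]byte(`{"type":"resize","cols":120,"rows":40}`))
	fmt.Println(rows, cols, ok) // 40 120 true
	// The handler would then call session.WindowChange(rows, cols).
}
```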
## Build-time Version Injection

The application version is stored in a single package-level variable.

This variable is overridden at build time via Go's `-ldflags -X` mechanism. The Makefile auto-detects the version from `git describe --tags --always --dirty`, while the GitHub Actions release workflow injects the exact tag name (e.g. `v1.2.3`).
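The pattern typically looks like this; the package path and variable name below are assumptions, not the actual tw source:

```go
package main

import "fmt"

// version is overridden at link time; "dev" is the fallback for a plain
// `go build`. With the assumed names, the Makefile would run something like:
//
//	go build -ldflags "-X main.version=$(git describe --tags --always --dirty)"
var version = "dev"

func main() {
	fmt.Println("tw version:", version)
}
```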
The version is consumed in two places:

- CLI: Cobra's `Version` field on the root command, enabling `tw --version`
- Dashboard API: the `/api/status` endpoint includes the `version` field in its JSON response
## Xray Version Pinning

The Xray version installed on the relay is controlled by a single constant. This constant is injected into both:

- `cloud-init.yaml.tmpl`: used during `tw create relay-server` to provision new relays
- `install-script.sh.tmpl`: used for manual relay setup via `tw dashboard`
Both templates pass `--version {{ .XrayVersion }}` to the official Xray install script. This ensures the relay runs exactly the same Xray version as the in-process `xray-core` dependency in `go.mod`, preventing protocol incompatibilities.
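A sketch of the interpolation both templates perform, using Go's `text/template`. The constant's value is taken from the pinned `xray-core` version in the dependency table; the install-script line itself is illustrative:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// XrayVersion mirrors the pinned constant described above; v26.2.6 matches
// the in-process xray-core dependency.
const XrayVersion = "v26.2.6"

// renderInstallLine shows how both templates pass the pinned version to the
// official install script via {{ .XrayVersion }}.
func renderInstallLine() string {
	t := template.Must(template.New("install").Parse(
		`bash install-release.sh --version {{ .XrayVersion }}`))
	var b strings.Builder
	t.Execute(&b, struct{ XrayVersion string }{XrayVersion})
	return b.String()
}

func main() {
	fmt.Println(renderInstallLine())
}
```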
**Version sync**

When upgrading `github.com/xtls/xray-core` in `go.mod`, the `XrayVersion` constant in `generate.go` must be updated to match. Existing relays will continue running the old version until reprovisioned or manually updated.
## Key Dependencies

| Dependency | Version | Purpose |
|---|---|---|
| `github.com/xtls/xray-core` | v26.2.6 | In-process VLESS + XHTTP transport |
| `golang.org/x/crypto/ssh` | v0.31.0 | Embedded SSH server + client tunnels |
| `github.com/spf13/cobra` | v1.8.1 | CLI framework |
| `github.com/google/uuid` | v1.6.0 | UUID generation for Xray clients |
| `github.com/gorilla/websocket` | v1.5.3 | WebSocket for dashboard SSH terminal |
| `google.golang.org/grpc` | v1.69.2 | gRPC API server + relay stats queries |
| `gopkg.in/yaml.v3` | v3.0.1 | Configuration file handling |