4 changes: 3 additions & 1 deletion architecture/compute-runtimes.md
@@ -33,7 +33,8 @@ when a sandbox create request asks for GPU resources.
| Docker | Local development with Docker available. | Container plus nested sandbox namespace. | Uses host networking so loopback gateway endpoints work from the supervisor. |
| Podman | Rootless or single-machine deployments. | Container plus nested sandbox namespace. | Uses the Podman REST API, OCI image volumes, and CDI GPU devices when available. |
| Kubernetes | Cluster deployment through Helm. | Pod plus nested sandbox namespace. | Uses Kubernetes API objects, service accounts, secrets, PVC-backed workspace storage, and GPU resources. |
-| VM | Experimental microVM isolation. | Per-sandbox libkrun VM. | Gateway spawns `openshell-driver-vm` as a subprocess over a private, state-local Unix socket. The VM driver boots a cached bootstrap `rootfs.ext4`, prepares requested OCI images inside a bootstrap VM with `umoci`, attaches the prepared image disk read-only, and gives each sandbox a writable `overlay.ext4` for merged-root changes and runtime material. The driver persists each accepted launch request beside the overlay and restarts those VMs on driver startup without recreating the overlay. |
+| VM | Experimental microVM isolation. | Per-sandbox libkrun VM. | Managed endpoint-backed driver. The gateway spawns `openshell-driver-vm`, waits for its Unix socket, and then consumes it through the same remote `compute_driver.proto` path used by unmanaged endpoint drivers. The VM driver boots a cached bootstrap `rootfs.ext4`, prepares requested OCI images inside a bootstrap VM with `umoci`, attaches the prepared image disk read-only, and gives each sandbox a writable `overlay.ext4` for merged-root changes and runtime material. The driver persists each accepted launch request beside the overlay and restarts those VMs on driver startup without recreating the overlay. |
+| Extension | Out-of-tree drivers operated alongside the gateway. | Whatever boundary the driver implements. | Selected by a non-reserved custom `compute_drivers = ["<name>"]` entry with `[openshell.drivers.<name>].socket_path`, or at launch time by pairing `--drivers <name>` with `--compute-driver-socket=<path>`. Reserved built-in names such as `vm`, `docker`, `podman`, and `kubernetes` cannot be used as unmanaged socket endpoints. The gateway connects to a UDS the operator already provisioned, runs `GetCapabilities`, logs the advertised `driver_name`, and dispatches all sandbox lifecycle calls through `compute_driver.proto`. The driver process and socket lifecycle are operator-owned; the gateway does not spawn, supervise, or remove unmanaged extension drivers. The trust boundary is the socket's filesystem permissions: the operator must ensure only the gateway uid can read/write it. |

Per-sandbox CPU and memory values currently enter the driver layer through
template resource limits. Docker and Podman apply them as runtime limits.
@@ -84,6 +85,7 @@ The supervisor must be available inside each sandbox workload:
| Podman | Read-only OCI image volume containing the supervisor binary. |
| Kubernetes | Sandbox pod image or pod template configuration. |
| VM | Embedded in the guest rootfs bundle. |
+| Extension | Defined by the out-of-tree driver. |

Driver-controlled environment variables must override sandbox image or template
values for sandbox ID, sandbox name, gateway endpoint, relay socket path, TLS
70 changes: 64 additions & 6 deletions crates/openshell-core/src/config.rs
@@ -4,6 +4,7 @@
//! Configuration management for `OpenShell` components.

use serde::{Deserialize, Serialize};
+use std::collections::BTreeMap;
use std::fmt;
#[cfg(unix)]
use std::io::{Read, Write};
@@ -69,6 +70,27 @@ impl ComputeDriverKind {
}
}

+/// Normalize a configured compute driver name.
+///
+/// Built-in driver names and custom remote driver names share the same
+/// selection namespace. The normalized value is lowercase ASCII and may contain
+/// letters, digits, `-`, and `_`.
+pub fn normalize_compute_driver_name(value: &str) -> Result<String, String> {
+    let value = value.trim();
+    if value.is_empty() {
+        return Err("compute driver name cannot be empty".to_string());
+    }
+    if !value
+        .bytes()
+        .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'-' | b'_'))
+    {
+        return Err(format!(
+            "invalid compute driver name '{value}'. use ASCII letters, digits, '-' or '_'"
+        ));
+    }
+    Ok(value.to_ascii_lowercase())
+}
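As a reading aid, here is a minimal standalone sketch of how the normalization rule behaves on a comma-separated `--drivers` value. The first function is copied from the hunk above so the sketch compiles without `openshell-core`. `normalize_driver_list` is a hypothetical helper, not part of this PR, that mimics clap splitting on `,` before calling the value parser on each entry.

```rust
/// Copied from the hunk above so this sketch is self-contained.
pub fn normalize_compute_driver_name(value: &str) -> Result<String, String> {
    let value = value.trim();
    if value.is_empty() {
        return Err("compute driver name cannot be empty".to_string());
    }
    if !value
        .bytes()
        .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'-' | b'_'))
    {
        return Err(format!(
            "invalid compute driver name '{value}'. use ASCII letters, digits, '-' or '_'"
        ));
    }
    Ok(value.to_ascii_lowercase())
}

/// Hypothetical helper (an assumption, not in the PR): split on ',' and
/// normalize each entry, stopping at the first invalid name.
pub fn normalize_driver_list(raw: &str) -> Result<Vec<String>, String> {
    raw.split(',').map(normalize_compute_driver_name).collect()
}
```

Under this sketch, surrounding whitespace and case are forgiven, while empty entries, path separators, and non-ASCII bytes are rejected.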

impl fmt::Display for ComputeDriverKind {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.write_str(self.as_str())
@@ -358,7 +380,14 @@ pub struct Config {
/// The config shape allows multiple drivers so the gateway can evolve
/// toward multi-backend routing. Current releases require exactly one
/// configured driver.
-    pub compute_drivers: Vec<ComputeDriverKind>,
+    pub compute_drivers: Vec<String>,
+
+    /// Operator-provided endpoints for named remote compute drivers.
+    ///
+    /// This is populated by CLI/env inputs such as `--compute-driver-socket`.
+    /// TOML-authored endpoints live under `[openshell.drivers.<name>]` and are
+    /// resolved by the gateway config loader.
+    pub compute_driver_endpoints: BTreeMap<String, PathBuf>,

/// TTL for SSH session tokens, in seconds. 0 disables expiry.
pub ssh_session_ttl_secs: u64,
@@ -559,6 +588,7 @@ impl Config {
gateway_jwt: None,
database_url: String::new(),
compute_drivers: vec![],
+            compute_driver_endpoints: BTreeMap::new(),
ssh_session_ttl_secs: default_ssh_session_ttl_secs(),
grpc_rate_limit_requests: None,
grpc_rate_limit_window_secs: None,
@@ -614,11 +644,27 @@ impl Config {

/// Create a new configuration with the configured compute drivers.
#[must_use]
-    pub fn with_compute_drivers<I>(mut self, drivers: I) -> Self
+    pub fn with_compute_drivers<I, D>(mut self, drivers: I) -> Self
     where
-        I: IntoIterator<Item = ComputeDriverKind>,
+        I: IntoIterator<Item = D>,
+        D: ToString,
     {
-        self.compute_drivers = drivers.into_iter().collect();
+        self.compute_drivers = drivers
+            .into_iter()
+            .map(|driver| driver.to_string())
+            .collect();
         self
     }
+
+    /// Register a Unix domain socket endpoint for a named remote driver.
+    #[must_use]
+    pub fn with_compute_driver_endpoint(
+        mut self,
+        name: impl Into<String>,
+        socket: impl Into<PathBuf>,
+    ) -> Self {
+        self.compute_driver_endpoints
+            .insert(name.into(), socket.into());
+        self
+    }
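To illustrate how the two builders compose, the following sketch uses a trimmed-down stand-in for `Config` that models only the two fields touched by this hunk. The struct shape, the `Default` derive, and the `config_for_extension` helper are assumptions made for the example. The real `Config` has many more fields and is not constructed this way.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Stand-in for the real `Config`: only the fields from this hunk.
#[derive(Debug, Default)]
pub struct Config {
    pub compute_drivers: Vec<String>,
    pub compute_driver_endpoints: BTreeMap<String, PathBuf>,
}

impl Config {
    /// Accepts anything that implements `ToString`, so both enum values
    /// and plain strings can be passed.
    #[must_use]
    pub fn with_compute_drivers<I, D>(mut self, drivers: I) -> Self
    where
        I: IntoIterator<Item = D>,
        D: ToString,
    {
        self.compute_drivers = drivers.into_iter().map(|d| d.to_string()).collect();
        self
    }

    /// Records the socket path under the driver name, replacing any
    /// earlier entry for the same name.
    #[must_use]
    pub fn with_compute_driver_endpoint(
        mut self,
        name: impl Into<String>,
        socket: impl Into<PathBuf>,
    ) -> Self {
        self.compute_driver_endpoints.insert(name.into(), socket.into());
        self
    }
}

/// Hypothetical convenience (not in the PR): select one custom driver and
/// register its socket in a single call.
pub fn config_for_extension(name: &str, socket: &str) -> Config {
    Config::default()
        .with_compute_drivers([name])
        .with_compute_driver_endpoint(name, socket)
}
```

Neither builder normalizes or validates the name. In this PR that appears to happen earlier, on the CLI path, so callers that build a `Config` directly would presumably need to pass an already-normalized name.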

@@ -766,8 +812,8 @@ mod tests {
use super::is_reachable_unix_socket;
use super::{
ComputeDriverKind, Config, DEFAULT_SERVICE_ROUTING_DOMAIN, GatewayJwtConfig, detect_driver,
-        docker_host_unix_socket_path, is_unix_socket, podman_socket_candidates_from_env,
-        podman_socket_responds,
+        docker_host_unix_socket_path, is_unix_socket, normalize_compute_driver_name,
+        podman_socket_candidates_from_env, podman_socket_responds,
};
#[cfg(unix)]
use std::io::{Read as _, Write as _};
@@ -803,6 +849,18 @@ mod tests {
assert!(err.contains("unsupported compute driver 'firecracker'"));
}

+    #[test]
+    fn compute_driver_name_normalization_accepts_builtin_and_custom_names() {
+        assert_eq!(normalize_compute_driver_name(" VM ").unwrap(), "vm");
+        assert_eq!(
+            normalize_compute_driver_name("Kyma_GPU-1").unwrap(),
+            "kyma_gpu-1"
+        );
+
+        let err = normalize_compute_driver_name("kyma/gpu").unwrap_err();
+        assert!(err.contains("invalid compute driver name"));
+    }
+
#[test]
fn config_defaults_to_loopback_bind_address() {
let expected: SocketAddr = "127.0.0.1:17670".parse().expect("valid address");
188 changes: 184 additions & 4 deletions crates/openshell-server/src/cli.rs
@@ -109,7 +109,17 @@ struct RunArgs {
value_delimiter = ',',
value_parser = parse_compute_driver
)]
-    drivers: Vec<ComputeDriverKind>,
+    drivers: Vec<String>,
+
+    /// Path to a Unix domain socket served by a remote compute driver
+    /// implementing `compute_driver.proto`.
+    ///
+    /// When set, the socket is associated with the single driver name supplied
+    /// by `--drivers` or `OPENSHELL_DRIVERS`. Reserved built-in driver names
+    /// such as Docker, Podman, Kubernetes, and VM do not accept socket
+    /// endpoints.
+    #[arg(long, env = "OPENSHELL_COMPUTE_DRIVER_SOCKET")]
+    compute_driver_socket: Option<PathBuf>,

/// Disable TLS entirely — listen on plaintext HTTP.
/// Use this when the gateway sits behind a reverse proxy or tunnel
@@ -235,6 +245,7 @@ async fn run_from_args(mut args: RunArgs, matches: ArgMatches) -> Result<()> {
if let Some(file) = file.as_ref() {
merge_file_into_args(&mut args, &file.openshell.gateway, &matches);
}
+    normalize_compute_driver_socket_args(&mut args, &matches)?;

let local_tls = apply_runtime_defaults(&mut args)?;
let local_jwt = defaults::complete_local_jwt_config()?;
@@ -371,6 +382,13 @@ async fn run_from_args(mut args: RunArgs, matches: ArgMatches) -> Result<()> {
args.grpc_rate_limit_requests,
args.grpc_rate_limit_window_seconds,
)?;
+    if let Some(socket) = args.compute_driver_socket.clone() {
+        let driver = args
+            .drivers
+            .first()
+            .expect("normalize_compute_driver_socket_args sets a driver for socket endpoints");
+        config = config.with_compute_driver_endpoint(driver.clone(), socket);
+    }

if let Some(ttl) = file
.as_ref()
@@ -457,8 +475,8 @@ async fn run_from_args(mut args: RunArgs, matches: ArgMatches) -> Result<()> {
.into_diagnostic()
}

-fn parse_compute_driver(value: &str) -> std::result::Result<ComputeDriverKind, String> {
-    value.parse()
+fn parse_compute_driver(value: &str) -> std::result::Result<String, String> {
+    openshell_core::config::normalize_compute_driver_name(value)
}

fn resolve_config_path(args: &RunArgs) -> Result<Option<PathBuf>> {
@@ -657,10 +675,52 @@ fn validate_grpc_rate_limit_args(requests: Option<u64>, window_seconds: Option<u
Ok(())
}

+fn normalize_compute_driver_socket_args(args: &mut RunArgs, matches: &ArgMatches) -> Result<()> {
+    let Some(socket) = args.compute_driver_socket.as_ref() else {
+        return Ok(());
+    };
+    if socket.as_os_str().is_empty() {
+        return Err(miette::miette!(
+            "--compute-driver-socket must not be an empty path"
+        ));
+    }
+    if arg_defaulted(matches, "drivers") {
+        return Err(miette::miette!(
+            "--compute-driver-socket requires --drivers <name> or OPENSHELL_DRIVERS=<name> to select a non-reserved compute driver name"
+        ));
+    }
+
+    match args.drivers.as_slice() {
+        [driver] => {
+            let driver = openshell_core::config::normalize_compute_driver_name(driver)
+                .map_err(|err| miette::miette!("{err}"))?;
+            if matches!(
+                driver.parse::<ComputeDriverKind>().ok(),
+                Some(
+                    ComputeDriverKind::Docker
+                        | ComputeDriverKind::Podman
+                        | ComputeDriverKind::Kubernetes
+                        | ComputeDriverKind::Vm
+                )
+            ) {
+                return Err(miette::miette!(
+                    "--compute-driver-socket cannot be combined with reserved built-in compute driver '{driver}'"
+                ));
+            }
+            args.drivers[0] = driver;
+            Ok(())
+        }
+        drivers => Err(miette::miette!(
+            "--compute-driver-socket requires exactly one compute driver name, got: {}",
+            drivers.join(",")
+        )),
+    }
+}
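The order of checks in the function above can be restated without clap or miette. The sketch below is a simplified, hypothetical version. `check_socket_args` and the `drivers_explicit` flag (standing in for `!arg_defaulted(matches, "drivers")`) are names invented for this example. The reserved list is taken from the four variants matched above, the error strings are shortened, and the character-set validation done by `normalize_compute_driver_name` is left out for brevity.

```rust
use std::path::Path;

/// Built-in names that may not be paired with an operator-provided socket.
const RESERVED: [&str; 4] = ["docker", "podman", "kubernetes", "vm"];

/// Hypothetical restatement of the validation order:
/// 1. no socket means nothing to check,
/// 2. the socket path must be non-empty,
/// 3. the driver name must have been given explicitly,
/// 4. exactly one, non-reserved driver name is allowed.
/// Returns the lowercased driver name when a socket is accepted.
pub fn check_socket_args(
    drivers: &[String],
    drivers_explicit: bool,
    socket: Option<&Path>,
) -> Result<Option<String>, String> {
    let Some(socket) = socket else {
        return Ok(None);
    };
    if socket.as_os_str().is_empty() {
        return Err("--compute-driver-socket must not be an empty path".into());
    }
    if !drivers_explicit {
        return Err("--compute-driver-socket requires --drivers <name>".into());
    }
    match drivers {
        [driver] => {
            // Simplification: trim and lowercase only, no charset check.
            let name = driver.trim().to_ascii_lowercase();
            if RESERVED.contains(&name.as_str()) {
                return Err(format!(
                    "cannot be combined with reserved built-in compute driver '{name}'"
                ));
            }
            Ok(Some(name))
        }
        other => Err(format!(
            "requires exactly one compute driver name, got: {}",
            other.join(",")
        )),
    }
}
```

Checking for an explicit driver before looking at the list matters because a defaulted `--drivers` value could otherwise pair the socket with an auto-detected built-in driver.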

 fn effective_single_driver(args: &RunArgs) -> Option<ComputeDriverKind> {
     match args.drivers.as_slice() {
         [] => openshell_core::config::detect_driver(),
-        [driver] => Some(*driver),
+        [driver] => driver.parse().ok(),
         _ => None,
     }
 }
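One consequence of switching to `driver.parse().ok()` is that a custom driver name yields `None` here, which the new `compute_driver_socket_flag_uses_explicit_driver_name` test asserts. The sketch below reproduces that behavior with a stand-in enum. The four variants come from the `matches!` arm in this PR, but the `FromStr` body and the `detected` parameter (replacing the call to `detect_driver()`) are assumptions made so the example is self-contained.

```rust
use std::str::FromStr;

/// Stand-in for the real enum; assumed to have exactly these variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ComputeDriverKind {
    Docker,
    Podman,
    Kubernetes,
    Vm,
}

impl FromStr for ComputeDriverKind {
    type Err = String;

    // Assumed parsing rule: case-insensitive match on the built-in names.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "docker" => Ok(Self::Docker),
            "podman" => Ok(Self::Podman),
            "kubernetes" => Ok(Self::Kubernetes),
            "vm" => Ok(Self::Vm),
            other => Err(format!("unsupported compute driver '{other}'")),
        }
    }
}

/// Same match shape as the hunk above; `detected` replaces the call to
/// `openshell_core::config::detect_driver()`.
pub fn effective_single_driver(
    drivers: &[String],
    detected: Option<ComputeDriverKind>,
) -> Option<ComputeDriverKind> {
    match drivers {
        [] => detected,
        [driver] => driver.parse().ok(),
        _ => None,
    }
}
```

In this sketch a single custom name never falls back to the detected driver, so code that treats `None` as "not a built-in single driver" handles extension drivers and multi-driver configurations the same way.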
@@ -1561,6 +1621,126 @@ ssh_session_ttl_secs = 1234
assert!(!super::is_singleplayer_driver(&multi));
}

+    #[test]
+    fn compute_driver_socket_flag_uses_explicit_driver_name() {
+        let _lock = ENV_LOCK
+            .lock()
+            .unwrap_or_else(std::sync::PoisonError::into_inner);
+        let _g1 = EnvVarGuard::remove("OPENSHELL_COMPUTE_DRIVER_SOCKET");
+        let _g2 = EnvVarGuard::remove("OPENSHELL_DRIVERS");
+
+        let (mut args, matches) = parse_with_args(&[
+            "openshell-gateway",
+            "--db-url",
+            "sqlite::memory:",
+            "--drivers",
+            "Kyma",
+            "--compute-driver-socket",
+            "/run/openshell/kyma.sock",
+        ]);
+        super::normalize_compute_driver_socket_args(&mut args, &matches).unwrap();
+        assert_eq!(
+            args.compute_driver_socket.as_deref(),
+            Some(std::path::Path::new("/run/openshell/kyma.sock"))
+        );
+        assert_eq!(args.drivers, ["kyma"]);
+        assert!(super::effective_single_driver(&args).is_none());
+    }
+
+    #[test]
+    fn compute_driver_socket_requires_explicit_driver_name() {
+        let _lock = ENV_LOCK
+            .lock()
+            .unwrap_or_else(std::sync::PoisonError::into_inner);
+        let _g1 = EnvVarGuard::remove("OPENSHELL_COMPUTE_DRIVER_SOCKET");
+        let _g2 = EnvVarGuard::remove("OPENSHELL_DRIVERS");
+
+        let (mut args, matches) = parse_with_args(&[
+            "openshell-gateway",
+            "--db-url",
+            "sqlite::memory:",
+            "--compute-driver-socket",
+            "/run/openshell/kyma.sock",
+        ]);
+        let err = super::normalize_compute_driver_socket_args(&mut args, &matches).unwrap_err();
+
+        assert!(
+            err.to_string().contains("requires --drivers <name>"),
+            "unexpected error: {err}"
+        );
+    }
+
+    #[test]
+    fn compute_driver_socket_rejects_reserved_builtin_drivers() {
+        let _lock = ENV_LOCK
+            .lock()
+            .unwrap_or_else(std::sync::PoisonError::into_inner);
+        let _g1 = EnvVarGuard::remove("OPENSHELL_COMPUTE_DRIVER_SOCKET");
+        let _g2 = EnvVarGuard::remove("OPENSHELL_DRIVERS");
+
+        let (mut args, matches) = parse_with_args(&[
+            "openshell-gateway",
+            "--db-url",
+            "sqlite::memory:",
+            "--drivers",
+            "docker",
+            "--compute-driver-socket",
+            "/run/openshell/extension.sock",
+        ]);
+        let err = super::normalize_compute_driver_socket_args(&mut args, &matches).unwrap_err();
+        assert!(
+            err.to_string()
+                .contains("cannot be combined with reserved built-in compute driver 'docker'"),
+            "unexpected error: {err}"
+        );
+    }
+
+    #[test]
+    fn compute_driver_socket_rejects_vm_endpoint() {
+        let _lock = ENV_LOCK
+            .lock()
+            .unwrap_or_else(std::sync::PoisonError::into_inner);
+        let _g1 = EnvVarGuard::remove("OPENSHELL_COMPUTE_DRIVER_SOCKET");
+        let _g2 = EnvVarGuard::remove("OPENSHELL_DRIVERS");
+
+        let (mut args, matches) = parse_with_args(&[
+            "openshell-gateway",
+            "--db-url",
+            "sqlite::memory:",
+            "--drivers",
+            "vm",
+            "--compute-driver-socket",
+            "/run/openshell/vm.sock",
+        ]);
+        let err = super::normalize_compute_driver_socket_args(&mut args, &matches).unwrap_err();
+        assert!(
+            err.to_string()
+                .contains("cannot be combined with reserved built-in compute driver 'vm'"),
+            "unexpected error: {err}"
+        );
+    }
+
+    #[test]
+    fn compute_driver_socket_reads_from_env_var() {
+        let _lock = ENV_LOCK
+            .lock()
+            .unwrap_or_else(std::sync::PoisonError::into_inner);
+        let _g1 = EnvVarGuard::set(
+            "OPENSHELL_COMPUTE_DRIVER_SOCKET",
+            "/var/run/openshell/kyma.sock",
+        );
+        let _g2 = EnvVarGuard::set("OPENSHELL_DRIVERS", "kyma");
+
+        let (mut args, matches) =
+            parse_with_args(&["openshell-gateway", "--db-url", "sqlite::memory:"]);
+        super::normalize_compute_driver_socket_args(&mut args, &matches).unwrap();
+        assert_eq!(
+            args.compute_driver_socket.as_deref(),
+            Some(std::path::Path::new("/var/run/openshell/kyma.sock"))
+        );
+        assert_eq!(args.drivers, ["kyma"]);
+    }
+
#[test]
fn file_populates_service_routing_fields() {
let _lock = ENV_LOCK