
DO-NOT-MERGE: RFC: NVMe-oF orchestrator coexistence — nvme-discoverd, ownership registry, and exclusion list#3442

Draft
martin-belanger wants to merge 12 commits into linux-nvme:master from martin-belanger:discoverd-rfcs


martin-belanger commented Jun 11, 2026

This PR is for review only and will not be merged. It contains four RFC documents proposing a coordinated set of features for NVMe-oF orchestrator coexistence in nvme-cli 3.0. Please use inline comments to provide feedback on specific sections.

Tip: Markdown files are shown as raw text by default in the diff view. Click the "Display the rich diff" button (the document icon at the top-right of each file) to render them as formatted text — much easier to read.


Background

As NVMe-oF deployments grow, hosts increasingly run multiple tools that manage NVMe-oF connections: nvme-stas, dracut/initramfs scripts, and soon nvme-discoverd. Without coordination, these orchestrators can conflict — one may disconnect a controller that another is actively managing, or connect to a controller that another has deliberately excluded.

This RFC set proposes two libnvme building blocks to solve the coexistence problem (ownership registry and exclusion list), and a new daemon orchestrator (nvme-discoverd) that puts them to use.


The four RFCs

rfc-nvme-orchestrator-coexistence.md — Start here

The top-level document. Frames the two distinct conflict scenarios (accidental disconnect, accidental connect), introduces the two prevention mechanisms (ownership registry and exclusion list), defines a three-tier orchestrator hierarchy (raw commands → manual orchestrators → daemon orchestrators), and shows how nvme-discoverd and nvme-stas naturally partition work without requiring IPC coupling between them.

rfc-nvme-registry.md — Ownership registry

A lightweight, cooperative registry under /run/nvme/registry/ that lets orchestrators declare ownership of connected controllers. nvme disconnect-all consults the registry before acting so it never disconnects a controller managed by a running daemon. The registry is a libnvme building block: a C API (libnvmf_registry_*), a Python binding, and a nvme registry CLI command family. An implementation PR already exists: #3425.
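The cooperative check described above can be sketched in a few lines of Python. This is an illustration only, not the libnvme implementation: the on-disk layout (one file per controller, containing an `owner=<name>` line) and the helper names `read_owner` and `disconnect_candidates` are assumptions made for the example; the actual format is defined in rfc-nvme-registry.md.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def read_owner(registry_dir: Path, ctrl: str) -> Optional[str]:
    """Return the declared owner of controller `ctrl`, or None if unowned.

    Assumed (hypothetical) layout: <registry_dir>/<ctrl> holds "owner=<name>".
    """
    try:
        text = (registry_dir / ctrl).read_text()
    except FileNotFoundError:
        return None  # no registry entry: the connection is unowned
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "owner":
            return value.strip() or None
    return None


def disconnect_candidates(
    registry_dir: Path,
    connected: Iterable[str],
    requester: Optional[str] = None,
) -> List[str]:
    """Controllers a cooperative disconnect-all may touch.

    Unowned controllers are fair game; owned ones are skipped unless
    they belong to `requester`.
    """
    result = []
    for ctrl in connected:
        owner = read_owner(registry_dir, ctrl)
        if owner is None or owner == requester:
            result.append(ctrl)
    return result
```

With entries for `nvme0` (owner `discoverd`) and `nvme1` (owner `nbft`) and no entry for `nvme2`, a plain call returns only `nvme2`. Nothing in the sketch prevents a tool from ignoring the registry, which is the sense in which enforcement is cooperative.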

rfc-nvme-exclusion.md — Exclusion list

A human-administered exclusion list at /etc/nvme/exclusions/ that prevents orchestrators from auto-connecting to controllers the administrator wants excluded. Its design center is auto-discovered controllers (mDNS, CDC DLP) where there is no configuration entry to remove. Managed via nvme exclusion CRUD commands. Enforcement is cooperative — each orchestrator reads the list and skips matching controllers; libnvme does not enforce it.
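A rough Python sketch of how an orchestrator might match a candidate controller against one exclusion entry. It follows three points stated in the commit notes of this PR: entries are semicolon-separated key=value strings, a caller parameter of None skips the comparison rather than blocking a match, and addresses are compared after inet_pton/inet_ntop normalization (FC WWNs case-insensitively). The key names (`transport`, `traddr`, `trsvcid`) and function names are assumptions for illustration, not the libnvme API, and the host-iface scoping rule is not modelled.

```python
import socket
from typing import Dict, Mapping, Optional


def parse_entry(entry: str) -> Dict[str, str]:
    """Parse 'key=value;key=value' into a dict; blank segments are ignored."""
    fields: Dict[str, str] = {}
    for segment in entry.split(";"):
        segment = segment.strip()
        if not segment:
            continue
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"malformed segment: {segment!r}")
        fields[key.strip()] = value.strip()
    return fields


def normalize_addr(addr: str) -> str:
    """Canonicalize IP literals via inet_pton/inet_ntop; lowercase anything else."""
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            return socket.inet_ntop(family, socket.inet_pton(family, addr))
        except OSError:
            continue  # not a valid literal for this address family
    return addr.lower()  # e.g. an FC WWN: case-insensitive string equality


def entry_matches(entry: str, candidate: Mapping[str, Optional[str]]) -> bool:
    """True if every key in the entry agrees with the candidate.

    A candidate value of None (or a missing key) skips that comparison.
    """
    for key, wanted in parse_entry(entry).items():
        actual = candidate.get(key)
        if actual is None:
            continue
        if key == "traddr":
            if normalize_addr(wanted) != normalize_addr(actual):
                return False
        elif wanted != actual:
            return False
    return True
```

A plain `strcmp`-style comparison would treat `FE80:0:0:0:0:0:0:1` and `fe80::1` as different controllers; normalizing both sides first avoids that class of missed match.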

rfc-nvme-discoverd.md — nvme-discoverd

A new daemon proposed for nvme-cli 3.0. It manages NVMe-oF connections for statically configured controllers, NBFT boot controllers, FC-discovered controllers, and (in a future release) mDNS-discovered controllers. Key design choices:

  • Connect-only by design — never issues disconnects; eliminates the entire connect/disconnect ordering complexity (CtrlTerminator problem). TP8010 fabric zoning is out of scope; that belongs to nvme-stas.
  • One systemd transient unit per controller — nvme-discoverd never calls libnvmf_connect() directly; blocking /dev/nvme-fabrics writes happen in child processes managed by systemd. Achieves Daniel's no-threading goal.
  • Registers ownership via --owner discoverd on each generated unit; NBFT reconnects use --owner nbft to preserve the lifetime invariant set at boot.
  • Respects the exclusion list before each connect and reconnect; NBFT controllers bypass the check unconditionally.
  • Coexists with nvme-stas through natural discovery partitioning: nvme-stas owns mDNS/TCP, nvme-discoverd owns NBFT/FC/RDMA and manual config. The registry-check rule in nvme-stas eliminates bounce loops without any IPC coupling.
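To make the owner substitution concrete, here is a small Python sketch that builds the command line a per-controller unit might run. It assumes the generated unit invokes `nvme connect`; `--owner` is the option proposed in this RFC set, and the remaining flags are existing `nvme connect` options. The function name `connect_argv` and its parameters are hypothetical.

```python
from typing import List, Optional


def connect_argv(
    transport: str,
    traddr: str,
    nqn: str,
    trsvcid: Optional[str] = None,
    host_iface: Optional[str] = None,
    from_nbft: bool = False,
) -> List[str]:
    """Build the argv for one controller's connect command.

    NBFT-sourced controllers keep owner=nbft so the lifetime invariant
    set at boot survives reconnects; everything else is owned by discoverd.
    """
    owner = "nbft" if from_nbft else "discoverd"
    argv = [
        "nvme", "connect",
        f"--transport={transport}",
        f"--traddr={traddr}",
        f"--nqn={nqn}",
        f"--owner={owner}",  # proposed option, not in released nvme-cli
    ]
    if trsvcid is not None:
        argv.append(f"--trsvcid={trsvcid}")
    if host_iface is not None:
        argv.append(f"--host-iface={host_iface}")
    return argv
```

Because the blocking write to /dev/nvme-fabrics happens in the process that runs this command, the daemon itself only has to ask systemd to start the unit and wait for the job result.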

What feedback is sought

  • Architecture: Does the three-tier orchestrator model match how you see these tools being used in practice?
  • Ownership registry: Any concerns with the directory layout, API shape, or cooperative-only enforcement model?
  • Exclusion list: File format, use cases, nvme exclusion command design?
  • nvme-discoverd: Transient unit design, state file layout, startup reconciliation logic, retry policy, NBFT handling?
  • Coexistence rules: The owner=nbft lifetime invariant, NBFT exclusion bypass, unowned-connections-are-fair-game semantics — do these cover the cases you care about?

Related

Martin Belanger added 2 commits June 8, 2026 14:13
RFCs submitted for review only. Not intended to be merged.

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
Martin Belanger added 6 commits June 11, 2026 18:17
Registry:
- Show only the guarded udev rule in §4; remove the naive unguarded
  version that preceded it — a skimming reader would copy the wrong one
- Aperiodic audit is now bidirectional: re-assert ownership for live
  controllers with missing entries, not only remove stale ones
- Document the residual TOCTOU race (dangerous direction) explicitly

Exclusion list:
- Fix Python match semantics: None/NULL caller params skip the
  comparison (same rule as C); they do not block a match
- Clarify host-iface= scope: only matches interface-pinned connections;
  manual controller= entries without host-iface= are not matched
- Add address normalization note (inet_pton/inet_ntop, not strcmp)
- connect-all --nbft is exempt from the exclusion list

Orchestrator coexistence:
- Scope "no D-Bus signal" to between orchestrators
- "No special-case logic required" scoped to disconnect logic in nvme-stas
- connect-all --nbft exemption noted in Tier 2 summary
- Add versioning note: these behaviors co-ship in nvme-cli 3.0 /
  nvme-stas 3.0; earlier pairings do not provide the guarantees

nvme-discoverd:
- Add RuntimeDirectoryPreserve=yes; without it systemd removes
  /run/nvme/discoverd on every stop/crash, destroying devid files
  needed by active ExecStop= lines and state files for crash recovery
- Add registry ownership check before adopting pre-existing connections:
  skip any controller owned by neither discoverd nor nbft
- Correct varlink -> D-Bus throughout: StartTransientUnit, StopUnit,
  RestartUnit, and JobRemoved are all org.freedesktop.systemd1.Manager
- Scope "never issues disconnects" to steady-state operation
- Clarify referral DC fizzle-out is intentional; add open questions:
  stale-cache aging (#2), shutdown vs. mounts (#3), NBFT root (#4)
- Minor: "three lines" -> "four lines" for --devid-file specifier;
  GPL-2.0-only (not -or-later); inline unit comments moved to their
  own lines (systemd unit syntax has no end-of-line comments)

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
Add §3.9 explaining the varlink/D-Bus situation for the systemd
interface. Not depending on D-Bus was a design wish — systemd is
migrating to varlink and new code should follow. However, varlink
is not yet sufficient: io.systemd.Unit.StartTransient was only added
in systemd v260 (March 2026), and StopUnit, RestartUnit, and
ResetFailedUnit have no varlink equivalent at all.

Since discoverd already depends on libsystemd for sd_event, using
sd-bus (systemd's own D-Bus implementation, part of libsystemd)
adds no new library dependency. The D-Bus usage is scoped to the
systemd interface only; discoverd's own client-facing socket (§3.7)
remains varlink-only.

Add open question #5 to track the future migration once the varlink
unit lifecycle API is complete.

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
Content fixes, verified against the kernel, systemd, libnvme, and
nvme-stas sources where applicable:

- io.systemd.Unit.StartTransient is in systemd v261 (still unreleased),
  not v260 as previously stated; v261 still lacks StopUnit, RestartUnit,
  and ResetFailedUnit, strengthening the D-Bus rationale (§3.9)
- a unique-NQN DC gets the kernel's 5 s NVME_DEFAULT_KATO, not a dead
  session; --keep-alive-tmo=30 normalizes the DC kato regardless of NQN
- FC WWN traddr comparison is case-insensitive string equality; nothing
  strips colon separators
- TID hash field order now matches nvme-stas staslib/trid.py exactly
- atomic write protocol documented as mkstemp() with random suffix,
  matching the implementation
- disconnect-all confirmation: prompt on TTY for both --force and
  --owner; non-interactive invocations proceed without prompting

Design additions from review:

- transient units carry Before=nvme-discoverd.service so discoverd
  stops before shutdown-time disconnects, preventing reconnect attempts
  (and FC kickstart re-issue) against a shutting-down system
- NBFT entries can be Discovery Controllers, not just IOCs; the
  --owner nbft substitution applies to both unit types
- host-traddr= exclusion entries provide interface exclusion for RDMA
  and FC, where host-iface does not apply
- registry §4.4 covers both initramfs boot paths: NBFT and the FC
  kickstart (unowned until discoverd adopts them at startup)
- the recycled-devid stale-entry edge case requires the udev rule race
  and a libnvme bypass to coincide; normally the rule has already
  cleaned up

Readability: split large sections into topical subsections (registry
§1/§4, exclusion §3/§7, discoverd §3.2, coexistence §5) and break up
long paragraphs throughout. No content changes from restructuring.

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Assisted-by: Claude:claude-fable-5 [Claude Code]
Add §11 "Future Release: DC Retention Policy" documenting how
discovery-derived configuration is retained after a discovery source
becomes unavailable.

Introduce `discovery-retention-time` (replacing the mDNS-only
`zeroconf-stale-timeout`) as a unified parameter covering all dynamic
discovery sources: mDNS, referral DCs, and FC kickstart DCs. Statically
configured and NBFT-derived DCs are retried indefinitely — they represent
explicit intent and are not subject to the retention timer.

Introduce `fc-kickstart-interval-minutes` for periodic FC fabric probing,
motivated by the equipment replacement scenario where a new DC may have a
different address/NQN and can only be discovered via an active kickstart.

Update §4 and §7.1 to note that FC kickstart has no TID cache in the
initial release, but the retention policy will require one. Update §5
retry policy to reflect the static vs. dynamic DC distinction. Align
§11.1 timer-start semantics with §11.2 (timer starts on source
disappearance regardless of connection state).

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
Periodic FC kickstart is opt-in, not on by default. nvme-discoverd ships
on all Linux systems — enabling periodic fabric probing by default on
laptops and desktops with no FC infrastructure would be wasteful and
surprising.

Use 0 to disable (consistent with systemd's convention for interval
parameters, e.g. WatchdogSec=0). 0 was previously invalid; infinity was
the disable value. Any value >= 1 is a valid interval in minutes.

Fix the consistency issue introduced by the default change: soften "not
acceptable in a production environment" to "may not be acceptable in an
FC production environment", and reframe the orthogonality paragraph to
read naturally with the default-is-0 framing.

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
The entry-ID hash was unnecessary complexity: exclusion list management
is human-only, so a sequential number scoped to a single interactive
`nvme exclusion remove` invocation suffices, and `libnvmf_exclusion_remove()`
now matches by exact entry-string content instead. The Python bindings
no longer expose exclusion_add/remove/create/delete at all — providing a
programmatic management API would contradict the human-administered
design; only the read-only exclusion_match/lists/entries() remain.

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Assisted-by: Claude:claude-sonnet-4-6 [Claude Code]
igaw added the rfc label (For tracking discussions new features etc.) Jun 16, 2026
igaw marked this pull request as draft June 16, 2026 18:54
Martin Belanger added 3 commits June 16, 2026 22:05
The RFC referenced the legacy NVMe-oF autoconnect components only in
passing (§1, §7.1), with no single place describing what nvme-cli
installs today, when each piece fires, and how nvme-discoverd subsumes
it. Add §12 with a full inventory and a clear split between components
that establish connections (replaced) and those doing orthogonal tuning,
key provisioning, or interface naming (kept).

Document the NBFT late-connect path in particular: nvmf-connect-nbft.service
has no [Install] section and is driven only by a NetworkManager dispatcher
script, so it never fires under systemd-networkd or other managers.
discoverd's retry loop covers the same case manager-agnostically, making
the late-connect a latency optimization rather than a correctness
requirement.

Renumber Open Questions to §13 and Glossary to §14.

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Expand the RFC with reviewer-driven clarifications:

- §6.1: explain why discoverd uses its own discoverd.conf rather than
  reusing config.json (json-c dependency; JSON has no comments) or
  discovery.conf (DC-only; cannot express IOCs or global toggles).
- §12.1/§12.2: nvmf-connect.target was only a collective handle for the
  daemon-less design; the discoverd daemon is that coordinator, so no
  target equivalent is needed.
- §12.2: scope the autoconnect-rule replacement to the running system.
  Initramfs (Phase 1) connect is dracut's job (74nvmf, formerly 95nvmf),
  which has never used 70-nvmf-autoconnect.rules. Note that
  StartTransientUnit works over systemd's private socket without the
  D-Bus daemon, so discoverd is kept out of Phase 1 by design, not
  impossibility. Recommend removing the dead 70-nvmf-autoconnect.conf
  snippet from nvme-cli.
- §12.5: emphasize systemd-networkd as the common NetworkManager
  alternative that the legacy NBFT dispatcher path does not cover.

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Reconcile the exclusion RFC with what was actually built (review item L13):

- §6.1: libnvmf_exclusion_match() takes a struct libnvmf_tid * rather than
  seven positional string arguments — fewer call-site mistakes.  Document the
  new libnvmf_exclusion_entry_valid() helper used to pre-validate hand-edited
  files without filesystem side effects.
- §6.3: mark the proposed key_value_list_parse()/_free() utilities as
  intentionally not provided; libnvmf_tid_parse() already covers parsing a
  semicolon-separated key=value string, and the matcher parses entries
  internally.
- §5: show the actual --name/--entry option form instead of positional
  <name>/<entry> arguments.

Signed-off-by: Martin Belanger <martin.belanger@dell.com>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

tbzatek left a comment (Contributor)

Well written design documents, I appreciate the level of detail!

Question about the legacy nvme connect-all --nbft: it is still used by dracut, so I guess it stays, yet discoverd reimplements some of its logic. It would be nice to unify the codebase to avoid duplication. My concern is the interpretation of various NBFT flags and processes... although certain best practices were added to the Boot Spec, there are still gaps. There has always been disparity between the various UEFI implementations and the OS, even though we generally tend to avoid adding quirks for firmware bugs to the nvme-cli NBFT code.

Comment thread RFCs/rfc-nvme-orchestrator-coexistence.md Outdated

**Tools that bypass libnvme or use NULL owner** produce unowned connections — no registry entry is written, and `disconnect-all` treats them as freely disconnectable. The most common sources are:

- **UDisks**: a D-Bus daemon that provides block-device management to desktop environments; it calls libblockdev (`bd_nvme_connect`), which calls libnvme internally, but neither participates in registry ownership.

Contributor

The plan is to always supply owner information, e.g. 'udisks', with a possibility to override this via an optional argument to any arbitrary string.

Since UDisks is a rather high-level layer, we're able to respect exclusions actively and refuse a connection unless forced. At this point only simple connect and disconnect commands are provided, which fall into Tier 1. However there were plans to provide simple discovery incl. mDNS support browser functionality in the future, leaning towards Tier 2 (as there will likely always be an external consumer to trigger any action first).

Author

> The plan is to always supply owner information, e.g. 'udisks', ...

Thanks — I'll update the UDisks entry accordingly: supplies owner='udisks' (overridable to an arbitrary string), actively respects exclusions and refuses unless forced, Tier 1 today. I'll add this to the RFC.

> However there were plans to provide simple discovery incl. mDNS support browser functionality in the future...

On the future mDNS direction, though, I'd like to clarify before we pencil in a Tier 2 / mDNS-browser role for UDisks — because mDNS/DNS-SD discovery of Discovery Controllers is a genuine can of worms, and a third independent browser would compound it. We're already wrestling with how to coordinate mDNS browsing between just two orchestrators, nvme-discoverd and nvme-stas: having both browse at once is a misconfiguration we have to actively detect and arbitrate (the "zeroconf-conflict" problem, still unsolved in the general case). If UDisks adds its own mDNS browser, that's a third layer independently discovering the same DCs and potentially connecting, with no shared owner of "who browses mDNS on this host." Could you say more about what UDisks intends — a full mDNS DC-discovery browser, or something narrower (e.g. surfacing controllers another component already discovered)? If it's the former, I think we need a cross-orchestrator story for mDNS-browser ownership before multiple layers start doing it independently — two is already hard, and three concurrent browsers on one host would be painful to reason about. I'd rather flag it now than debug three browsers fighting later.

Contributor

Generally speaking, any new party can come to the game anytime.

The plan with UDisks is to provide underlying services for gvfs. It might actually be gvfs simply adding another protocol to its existing avahi-based network browser. UDisks would then provide high-level discovery and connect services, guarded by polkit rules. This is all on demand and it's tiered: the base gvfsd-network browser would just report machines on the network exposing _nvme-disc._tcp, and discovery is only performed upon user intention, i.e. discovery is not supposed to be done proactively. The target audience is end users, with the expectation that neither nvme-discoverd nor nvme-stas is going to be configured for zeroconf on such systems. Neither gvfs nor UDisks can use discoverd, as that's systemd-only; this will need to be a separate reimplementation, reusing as many general parts from libnvme as possible.

- An NBFT boot-path controller carries `owner=nbft` — nvme-stas never disconnects it.
- Since nvme-stas never disconnects a controller owned by another orchestrator, there is nothing for nvme-discoverd to reconnect — a bounce loop cannot occur.

**NBFT controllers are also immune to nvme-discoverd's exclusion list.** nvme-discoverd reconnects NBFT-sourced controllers unconditionally, regardless of any matching exclusion list entry. An exclusion entry targeting a boot device is a misconfiguration; nvme-discoverd logs a warning and reconnects. See `rfc-nvme-registry.md` §4.4 for the `owner=nbft` semantics.

Contributor

We need this for NBFT testing... although the design is well suited for real-world scenarios, we'd like to have the option to manually disconnect a particular NBFT controller and prevent it from being reconnected.

Further thoughts:

  • explicitly stop nvme-discoverd for such tests?
  • what about a multipath scenario where a particular network interface is intentionally going down: perform disconnect first and then tear down the interface to avoid the kernel host nvme driver reconnection mechanism?

Author

Good — there's a clean answer to all three.

Manually disconnect a particular NBFT controller, and keep it disconnected. There's a single command for exactly this: nvme disconnect --exclude <device>. It derives an exclusion entry from the controller's sysfs attributes (transport, traddr, trsvcid, subsystem NQN — plus host-iface when the connection is pinned to one), writes that entry to the exclusion list first, and only then disconnects. Writing the exclusion before the disconnect is deliberate: it guarantees the entry is in place before the device-removal event ever reaches discoverd, so discoverd gets no window to reconnect. That's the surgical, race-free option — it suppresses just that one controller while discoverd keeps managing everything else.

This works because we're relaxing the previous "NBFT bypasses exclusion" rule: discoverd now honours the exclusion list for NBFT controllers too. (owner=nbft still protects the controller from other orchestrators — only discoverd's own reconnect yields to your explicit exclusion. And since excluding a boot device is a foot-cannon, the nvme exclusion add path warns when an entry would match an owner=nbft controller, skippable with --force.) If you instead want to disconnect without excluding — i.e. let discoverd bring it straight back — plain nvme disconnect <device> is the no-checks single-device escape hatch with no exclusion side effect.
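The write-then-disconnect ordering can be sketched as follows in Python. This is an illustration under stated assumptions, not the nvme-cli implementation: the entry key names, the one-entry-per-line file format, and all function names are hypothetical. The atomic write mirrors the mkstemp-with-random-suffix protocol mentioned in the commit notes, using a temporary file in the same directory followed by a rename.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Dict


def derive_entry(attrs: Dict[str, str]) -> str:
    """Build an exclusion entry from controller attributes.

    host-iface is included only when the connection is pinned to one.
    """
    keys = ("transport", "traddr", "trsvcid", "nqn", "host-iface")
    return ";".join(f"{k}={attrs[k]}" for k in keys if attrs.get(k))


def append_entry_atomically(list_file: Path, entry: str) -> None:
    """Rewrite the list with the new entry via temp file + rename."""
    existing = list_file.read_text() if list_file.exists() else ""
    fd, tmp = tempfile.mkstemp(dir=list_file.parent, prefix=list_file.name + ".")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(existing)
            if existing and not existing.endswith("\n"):
                f.write("\n")
            f.write(entry + "\n")
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, list_file)  # readers see the old or the new file, never a partial one
    except BaseException:
        os.unlink(tmp)  # replace did not happen, so the temp file still exists
        raise


def disconnect_with_exclusion(
    list_file: Path, attrs: Dict[str, str], disconnect: Callable[[], None]
) -> str:
    """Record the exclusion first, and only then disconnect."""
    entry = derive_entry(attrs)
    append_entry_atomically(list_file, entry)
    disconnect()  # the removal event fires after the entry is already visible
    return entry
```

Reversing the two calls in `disconnect_with_exclusion` would reopen the window in which the daemon sees the device removal before the entry exists and reconnects.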

Stopping discoverd for tests? You can, and for a quick "stop managing everything" it's fine. But note it's the blunt instrument: discoverd is the single monitor for every connection on every interface, so stopping it suspends management of all paths, not just the controller under test. If you want to isolate one controller while the rest stay managed, the exclusion list above is the better lever.

The multipath interface-down case. Here I'm less sure there's a problem to solve, so let me lay out what actually happens and you can tell me if you're after something more. When a local interface is brought down under a live controller, the kernel doesn't special-case it — it enters its normal reconnect loop, which is slow (one attempt per reconnect-delay, default 10 s) and, when the connection is pinned with --host-iface, fails fast each time. It doesn't flood anything, and multipath has already failed I/O over to the live paths; when the interface returns, the path reconnects. discoverd, being connect-only, keeps that controller in its desired set and reconnects on return — the intended behaviour. So "disconnect first, then ifdown" is a reasonable manual hygiene step to avoid the (harmless, slow) reconnect churn, but it isn't something discoverd needs to orchestrate. If your goal is to permanently retire a path, that's a desired-set/config change (discoverd should stop wanting it) — and again the exclusion list is the tool. If you're seeing an actual failure beyond the cosmetic reconnect churn, tell me what it is and I'll dig in; otherwise I'd rather not build orderly single-path drain into a connect-only daemon without a concrete problem that needs it.

Contributor

> discoverd now honours the exclusion list for NBFT controllers too.

That's great to hear, thanks.

> The multipath interface-down case.

This was mentioned as just another test case we've been performing to test the link-up reconnection hooks, i.e. avoiding the kernel connection recovery mechanism and starting from a clean slate. This is obviously supposed to be handled by discoverd now, so we'll align our tests accordingly. There are some more extreme use cases, like in-place upgrades with intermediary userspace that is performing dark magic... unimportant for now, I'd say.

Comment thread RFCs/rfc-nvme-discoverd.md Outdated
| `nvmf-connect-nbft.service` | systemd oneshot | On demand, started by the NM dispatcher on NBFT-interface up | **Replaced** — discoverd adopts/reconnects NBFT controllers from its NBFT cache (§7.1); see §12.5 |
| `80-nvmf-connect-nbft.sh` | NM dispatcher | NetworkManager interface-up for `nbft*`/HFI connections | **Replaced** — discoverd's retry loop, manager-agnostic (§12.5) |
| `70-nvmf-autoconnect.conf` | dracut conf | Build-time `install_items+=` snippet that copies `70-nvmf-autoconnect.rules` into the initramfs | **Remove outright** (pre-existing cruft). dracut has never used `70-nvmf-autoconnect.rules` and ships its own initramfs mechanism, so this snippet only copies an inert rule into the initrd. Safe to delete from nvme-cli independently of discoverd; early-boot connect is dracut's job (§7.1, §12.2) |
| `65-persistent-net-nbft.rules` | udev rule | Naming of `nbft*` interfaces | **Kept** — interface naming, not connect logic |

Contributor

Not sure whether this ever served its purpose... @mwilck ?

Author

I suspect you're right that it never really served a purpose, and the reason fits the rest of the picture. This rule does one thing — pin nbft* interface names so udev doesn't rename them — and the only thing that ever consumed that name was the NM dispatcher (80-nvmf-connect-nbft.sh), which matches nbft* to trigger the late-NBFT connect — i.e. the very path you just described as never really working. So with no working consumer, the naming rule has nothing to serve. I currently list it as "Kept" (orthogonal naming), but I'm inclined to move it to "Remove" alongside the NM-dispatcher path: if the late-NBFT machinery goes, this goes with it. I'll defer the definitive history to @mwilck — he'd know if anything else ever relied on the nbft* name — and tie the final disposition to the broader question of whether discoverd carries NBFT handling at all.

Updates following peer reviews

Signed-off-by: Martin Belanger <martin.belanger@dell.com>