Overview
The scheduler is preemptive and pluggable; the default policy is virtual-time fair share. It manages thread lifecycle (create, ready, run, block, sleep, exit), drives context switches from the timer interrupt, exposes blocking sync primitives (Mutex, ConditionVariable, Monitor) and a non-blocking SpinLock, and feeds the GC's stack-scanning phase through a global thread registry.
The design splits policy from mechanism. Mechanism (context switching, the timer-tick entry, the thread registry, the GC bridge) is fixed and shared. Policy (what to pick next, what to do on a tick, how to balance CPUs) is a single interface, IScheduler, that any algorithm can implement. The default algorithm is Stride, a virtual-time fair-share scheduler. It is one of many possible plug-ins. See Plugging in a scheduling algorithm for examples.
┌──────────────────────────────────────────────────────────────────┐
│ Mechanism (fixed) │
│ ───────────────── │
│ - SchedulerManager: lifecycle dispatch, thread registry │
│ - PerCpuState, Thread: extensible TCB and per-CPU state │
│ - IRQ stub + asm: register save and restore │
│ - Mutex, CV, Monitor: park and unpark via SchedulerManager │
│ - GC stack-scan bridge │
└──────────────────────────┬───────────────────────────────────────┘
│ IScheduler interface
┌──────────────────────────┴───────────────────────────────────────┐
│ Policy (pluggable) │
│ ─────────────────── │
│ - StrideScheduler (default) │
│ - RoundRobin, MLFQ, EDF, FIFO, ... (see examples below) │
└──────────────────────────────────────────────────────────────────┘
The architecture-specific assembly (Interrupts.s on x64, ContextSwitch.s on ARM64) handles the actual register save and restore. It is the only part of the system that knows about register layouts.
Layers
| Layer | Responsibility |
|---|---|
| `IScheduler` | Pluggable policy interface: lifecycle hooks, `PickNext`, `OnTick`, load balancing |
| `SchedulerManager` | Mechanism: thread registry, lifecycle dispatch, timer-tick handler, GC bridge |
| `SchedulerExtensible` | Base class with a single per-instance slot used by the active scheduler to attach its bookkeeping to `Thread` and `PerCpuState` |
| `Thread` | Thread Control Block: identity, state, stack layout, TLAB |
| `ThreadContext` (x64, ARM64) | Saved-register layout that mirrors the IRQ stub |
| `PerCpuState` | Per-CPU state: `CurrentThread`, `IdleThread`, lock, scheduler's per-CPU slot |
| Sync primitives (`SpinLock`, `Mutex`, `ConditionVariable`, `Monitor`) | Park and unpark on top of `BlockThread` and `ReadyThread` |
| `Stride/*` | The default `IScheduler` implementation |
| `Bridge/Import` and `Bridge/Export` | Native trampolines and the stable initial-RIP entry stub |
| `Runtime/Thread.cs` | Runtime exports (`RhYield`, stack bounds, thread-static storage) |
The complete file map is at the bottom under Source files.
Thread model
A Thread is a managed class that lives on the GC heap. It carries:
- Identity and state: a globally unique `Id`, the assigned `CpuId`, a `ThreadState` (Created, Ready, Running, Blocked, Sleeping, Dead), and a `ThreadFlags` bitfield (`KernelThread`, `IdleThread`, `Pinned`, `Managed`).
- Stack: `StackBase`, `StackSize`, `StackPointer`. The saved `ThreadContext` lives at the bottom of the stack; `StackPointer` points into it.
- Accounting: `CreatedAt`, `LastScheduledAt`, `TotalRuntime`, `WakeupTime`.
- GC: a thread-local bump allocator (`AllocContext`).
- Scheduler slot: `SchedulerData`, an `object?` reserved for the active scheduler. Stride stores `StrideThreadData` here; another algorithm would store something different (see examples).
Two flags affect policy:
- `Pinned`: the thread cannot migrate. Both `SelectCpu` and `Balance` skip it.
- `Managed`: the entry parameter is a `GCHandle<System.Threading.Thread>`; the entry trampoline forwards into managed `Thread.StartThread` instead of decoding the parameter as a free `Action`.
Per-thread stack
Thread.InitializeStack allocates one contiguous chunk and lays the saved ThreadContext at the bottom (low address). The usable call/locals stack grows downward from StackBase + StackSize toward the context.
one stack (StackSize bytes)

StackBase                               StackBase + StackSize
▼                                                           ▼
┌──────────────────────────┬────────────────────────────────┐
│ ThreadContext            │ usable stack                   │
│ (saved registers,        │ (grows downward from top)      │
│  IRQ frame)              │                                │
└──────────────────────────┴────────────────────────────────┘
▲
│
StackPointer (saved RSP; the IRQ stub restores from here)
The context address is rounded up so the SIMD save area is naturally aligned. The initial saved SP inside the new context is set so the first call inside the new thread lands on a 16-byte-aligned RSP after the prologue's push rbp.
The saved-context layout itself differs by architecture: x64 saves XMM, GPRs, and an iretq frame; ARM64 saves NEON, X0–X30, and Sp/Elr/Spsr. The scheduler does not look inside the context. Only the IRQ stub indexes into it.
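The two alignment rules above can be sketched as arithmetic. This is an illustrative helper, not the kernel's code: the names `AlignUp`, `ContextAddress`, and `InitialSp` are assumptions, but the constraints they encode (a 16-byte-aligned SIMD save area, and an x64 RSP that is 16-byte aligned after the prologue's `push rbp`, i.e. `RSP % 16 == 8` at the call's first instruction) match the text.

```csharp
// Sketch of the stack-layout alignment math, assuming illustrative names.
static class StackLayoutSketch
{
    const ulong SimdAlignment = 16; // XMM/NEON save area must be naturally aligned

    public static ulong AlignUp(ulong value, ulong alignment)
        => (value + alignment - 1) & ~(alignment - 1);

    // The saved ThreadContext sits at the (rounded-up) bottom of the chunk.
    public static ulong ContextAddress(ulong stackBase)
        => AlignUp(stackBase, SimdAlignment);

    // x64 ABI: RSP % 16 == 8 at a function's first instruction, so that
    // after the prologue's `push rbp` RSP is 16-byte aligned again.
    public static ulong InitialSp(ulong stackBase, ulong stackSize)
    {
        ulong top = (stackBase + stackSize) & ~0xFUL; // 16-byte align the top
        return top - 8;                               // leave RSP ≡ 8 (mod 16)
    }
}
```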
Global thread registry
SchedulerManager owns a flat fixed-size array (Thread?[]) allocated once at boot. Every live thread, regardless of state, occupies one slot until it dies. The GC reads the array directly during the mark phase, with no interface dispatch (which could allocate during a collection).
SchedulerManager._allThreads (one slot per live thread)
═══════════════════════════════════════════════════════════════════════
[0] -> Thread { Flags=IdleThread, State=Running } (idle)
[1] -> Thread { Flags=Managed, State=Ready }
[2] -> Thread { Flags=Managed, State=Blocked } (waiting on Mutex)
[3] -> Thread { Flags=None, State=Sleeping } (WakeupTime set)
[4] -> null (free slot)
...
Blocked, sleeping, and dead threads are not in any scheduler-owned run queue, but they remain in _allThreads. This is what keeps the GC correct: stack roots on parked threads must remain reachable.
Lifecycle
```mermaid
stateDiagram-v2
    [*] --> Created: CreateThread
    Created --> Running: ScheduleFromInterrupt picks new thread
    Running --> Ready: preempted
    Ready --> Running: ScheduleFromInterrupt
    Running --> Blocked: BlockThread (Mutex / CV.Wait)
    Running --> Sleeping: Sleep(timeoutMs)
    Blocked --> Ready: ReadyThread (Mutex.Release / CV.Signal)
    Sleeping --> Ready: CheckSleepingThreads (WakeupTime <= now)
    Sleeping --> Ready: ReadyThread (signaled)
    Running --> Dead: ExitThread
    Dead --> [*]
```
SchedulerManager owns every state transition. Each transition does two things: update Thread.State under interrupt-disabled scope, and notify the active IScheduler via the matching hook (OnThreadCreate, OnThreadReady, OnThreadBlocked, OnThreadYield, OnThreadExit). The scheduler uses these hooks to update its own data structures (run queue, priority bookkeeping, etc.).
The first run of a freshly created thread is a special case. Instead of resuming a saved frame with iretq, the IRQ exit path loads the configured RSP and jumps directly to the configured RIP. That RIP is always a single trampoline, ThreadNative.EntryPointStub, which forwards to the manager's invoke routine. The invoke routine then either runs an Action delegate or calls into managed System.Threading.Thread.StartThread, depending on the Managed flag. After this first entry, every thread (kernel or managed) has the same shape.
On exit, the manager flushes the TLAB back to the GC, calls OnThreadExit on the scheduler, and clears the registry slot. The thread's accumulated runtime is rolled into a global counter so the GetBusyCpuTimeNs metric stays monotonic across thread death.
Preemption flow
```mermaid
sequenceDiagram
    participant ASM as IRQ stub (asm)
    participant IM as InterruptManager
    participant SM as SchedulerManager
    participant SC as IScheduler (policy)
    Note over ASM: Timer IRQ, interrupts disabled
    ASM->>ASM: Save registers + IRQ frame at RSP
    ASM->>IM: __managed__irq(rdi = saved-context*)
    IM->>SM: OnTimerInterrupt(cpuId, currentRsp, elapsedNs)
    SM->>SM: CheckSleepingThreads (wake any expired Sleeping threads)
    SM->>SC: OnTick(state, current, elapsedNs)
    SC-->>SM: needsReschedule (bool)
    alt needsReschedule
        SM->>SC: PickNext(state)
        SC-->>SM: next (or null, fall back to IdleThread)
        SM->>SM: prev.StackPointer = currentRsp, prev.State = Ready
        SM->>SC: OnThreadYield(state, prev)
        SM->>ASM: stage target RSP + new-thread flag
    end
    SM-->>ASM: return
    Note over ASM: Check staged target RSP
    alt switch requested
        ASM->>ASM: load target RSP, restore registers
        alt new thread
            ASM->>ASM: jmp to entry RIP (skip iretq)
        else
            ASM->>ASM: iretq (resume saved frame)
        end
    else
        ASM->>ASM: iretq (resume current thread)
    end
```
Two notes on this flow:
- The policy never touches registers. The IRQ stub captures the outgoing RSP and hands it to the manager; the manager hands a target RSP back. Everything in between (`OnTick`, `PickNext`, `OnThreadYield`) is plain managed C# operating on `Thread` and `PerCpuState`.
- The new-thread tail is what bootstraps a fresh thread. Because a `Created` thread has no saved IRQ frame to `iretq` into, the IRQ exit path is told (via a flag staged from C#) to load the saved RSP and jump to the saved RIP instead. After that first entry, the thread is indistinguishable from any resumed thread.
Voluntary (non-IRQ) switches go through SchedulerManager.Schedule and ContextSwitch.Switch. They use the same target-RSP mechanism but are not the hot path; preemption dominates.
Default algorithm: Stride
Stride scheduling is virtual-time fair-share. Each thread has a weight (`Tickets`) and a stride (`Stride1 / Tickets`, for a large constant `Stride1`). Each scheduling round, the chosen thread's `Pass` advances by its stride. The run queue stays sorted by `Pass`, so the lowest-pass thread always runs next. Higher tickets mean a smaller stride, so `Pass` advances more slowly and the thread gets more total CPU.
The per-CPU run queue is a single list sorted ascending by Pass:
StrideCpuData.RunQueue (one per CPU)
═══════════════════════════════════════════════════════════════════════
Pass = 1042 Pass = 1130 Pass = 1320
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Thread A │ │ Thread B │ │ Thread C │
│ Stride=10485│ │ Stride=10485│ │ Stride=5242 │
└─────────────┘ └─────────────┘ └─────────────┘
▲ ▲
│ │
PickNext() pops here Tail (highest Pass);
also the migration victim
picked by Balance()
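The proportional-share property falls out of the pick-lowest-`Pass` rule directly. A minimal self-contained simulation of just that rule (the `StrideSim` class and its list-based queue are illustrative, not the kernel's `StrideScheduler`):

```csharp
// Stride scheduling in miniature: Stride = Stride1 / Tickets; each round the
// lowest-Pass entry runs and its Pass advances by its Stride. A thread with
// twice the tickets ends up with roughly twice the runs.
using System;
using System.Collections.Generic;
using System.Linq;

static class StrideSim
{
    const long Stride1 = 1 << 20;

    class Entry
    {
        public string Name = "";
        public long Tickets, Pass, Runs;
        public long Stride => Stride1 / Tickets;
    }

    public static Dictionary<string, long> Run(int rounds, params (string name, long tickets)[] threads)
    {
        var queue = threads.Select(t => new Entry { Name = t.name, Tickets = t.tickets }).ToList();
        for (int i = 0; i < rounds; i++)
        {
            var next = queue.OrderBy(e => e.Pass).First(); // PickNext: lowest Pass
            next.Pass += next.Stride;                      // advance virtual time
            next.Runs++;
        }
        return queue.ToDictionary(e => e.Name, e => e.Runs);
    }
}
```

Running `StrideSim.Run(300, ("A", 100), ("B", 200))` gives B about twice as many runs as A, which is the fair-share guarantee the diagram's `Pass`/`Stride` numbers encode.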
What Stride does in each IScheduler hook:
| Hook | Behavior |
|---|---|
| `OnThreadCreate` | Allocate `StrideThreadData` with default tickets, set `Pass = 0` |
| `OnThreadReady` | Choose between an interactive boost (`Pass = GlobalPass - Stride/2`) or a CFS-style starvation cap, then insert into the queue sorted by `Pass` |
| `OnThreadBlocked` | Remove from the run queue, save `Remain = Pass - GlobalPass` to restore on wakeup |
| `OnTick` | Advance the current thread's `Pass` and the CPU's `GlobalPass`; signal preempt if the head of the queue now has a lower `Pass` or the quantum has elapsed |
| `OnThreadYield` | Re-insert the yielding thread, but clamp `Pass` upward to `GlobalPass` so a long-blocked thread cannot perpetually outrank others |
| `PickNext` | Pop the head of the run queue (lowest `Pass`); return null if empty (the manager runs the idle thread) |
| `SelectCpu` | Honor `Pinned`; otherwise prefer any CPU under 80% of the current CPU's load |
| `Balance` | Idle CPUs steal the tail thread (highest `Pass`, least hot) from the busiest peer |
| `SetPriority` | Re-ticket without losing relative position by scaling `Remain` by the stride ratio |
Two refinements are reused by other algorithms:
- Interactive boost: a thread that blocks frequently relative to its runtime is treated as I/O-bound and gets a head-start `Pass` on wakeup. The boost decays after a few ms.
- Starvation cap: a long-blocked thread cannot wake with a `Pass` so far behind `GlobalPass` that it monopolizes the CPU. The cap is `Pass = max(GlobalPass + Remain, GlobalPass - 2 * Stride1)`.
Plugging in a scheduling algorithm
Any algorithm that implements IScheduler is a drop-in replacement. The mechanism layer never assumes virtual time, run queues, or priorities. Switching the active scheduler is a single call:
```csharp
SchedulerManager.SetScheduler(new MyScheduler());
```
The manager calls ShutdownCpu on the outgoing scheduler and InitializeCpu on the incoming one for every CPU.
The algorithm decides what data to attach to threads and CPUs by storing whatever it wants in Thread.SchedulerData and PerCpuState.SchedulerData (both inherited from SchedulerExtensible). Below are sketches of how a few classic algorithms map onto this interface.
Round-Robin
A FIFO queue with fixed-quantum preemption.
| Hook | Behavior |
|---|---|
| `PerCpuState.SchedulerData` | A `Queue<Thread>` |
| `Thread.SchedulerData` | None, or a remaining-quantum counter |
| `OnThreadReady` | Enqueue at the tail |
| `OnThreadBlocked` | No-op (the thread is not in the queue while it runs) |
| `OnTick` | Decrement the remaining quantum; preempt when it hits zero |
| `OnThreadYield` | Re-enqueue at the tail, reset the quantum |
| `PickNext` | Dequeue the head |
| `SelectCpu` | Round-robin or shortest-queue |
| `Balance` | Steal half the queue from the busiest CPU |
| `SetPriority` / `GetPriority` | No-op / constant |
Round-Robin does not need an interactive boost or starvation cap. The FIFO order already bounds latency at quantum * queue_depth.
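The table rows above map almost one-to-one onto code. A hypothetical round-robin policy, with `SimThread` standing in for the kernel's `Thread` (the real `IScheduler` signatures take `PerCpuState` and differ in detail):

```csharp
// Round-robin policy sketch: a FIFO queue plus a fixed quantum, shaped like
// the hook table above. SimThread and the quantum value are illustrative.
using System.Collections.Generic;

class SimThread { public int Id; public int QuantumLeft; }

class RoundRobinPolicy
{
    const int Quantum = 4;                       // ticks per turn (illustrative)
    readonly Queue<SimThread> _runQueue = new(); // would live in PerCpuState.SchedulerData

    public void OnThreadReady(SimThread t) { t.QuantumLeft = Quantum; _runQueue.Enqueue(t); }

    public void OnThreadBlocked(SimThread t) { /* no-op: a running thread is not queued */ }

    // Returns true to request preemption, like IScheduler.OnTick.
    public bool OnTick(SimThread current) => --current.QuantumLeft <= 0;

    public void OnThreadYield(SimThread t) { t.QuantumLeft = Quantum; _runQueue.Enqueue(t); }

    // Null means "run the idle thread".
    public SimThread? PickNext() => _runQueue.Count > 0 ? _runQueue.Dequeue() : null;
}
```

Note that `OnThreadBlocked` really is a no-op: the running thread was dequeued by `PickNext`, so blocking only has to *not* re-enqueue it.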
Multi-Level Feedback Queue (MLFQ)
Several priority queues. Threads demote when they use a full quantum and promote when they block early.
| Hook | Behavior |
|---|---|
| `PerCpuState.SchedulerData` | An array of `Queue<Thread>`, one per priority level |
| `Thread.SchedulerData` | Current level + quantum-used counter |
| `OnThreadReady` | Enqueue at the thread's current level |
| `OnThreadBlocked` | Promote one level (the thread blocked, treat it as interactive) |
| `OnTick` | Charge time to the running thread; if it consumed a full quantum at this level, demote one level on the next yield |
| `PickNext` | Scan levels top-down for a non-empty queue |
| `Balance` | Periodically reset all threads to top priority (anti-starvation) |
MLFQ does not track virtual time. Its bookkeeping is just integer levels.
Real-time scheduling (RTOS)
The framework is suitable for building a hard- or soft-real-time kernel. The policy/mechanism split lets you swap in a real-time algorithm without touching context switching or sync primitives. The three classic families all map onto IScheduler.
Fixed-priority preemptive (FPP). Every thread has a static priority. The highest-priority runnable thread always runs. Ties are broken by FIFO at each level. This is the default policy in most RTOSes (FreeRTOS, Zephyr, ThreadX).
| Hook | Behavior |
|---|---|
| `PerCpuState.SchedulerData` | An array of `Queue<Thread>` indexed by priority |
| `Thread.SchedulerData` | Static priority |
| `OnThreadReady` | Enqueue at the tail of the thread's priority level |
| `OnTick` | Preempt if a higher-priority queue is non-empty (no quantum, pure priority) |
| `PickNext` | Scan top-down for the first non-empty level, dequeue the head |
| `SetPriority` | Move between levels |
| `Balance` | Usually disabled; RT threads are pinned (see below) |
Rate Monotonic (RM). Periodic tasks. Priority is statically assigned by frequency: shorter period gets higher priority. Optimal among static-priority schedulers when periods equal deadlines and the system is under the Liu/Layland bound.
| Hook | Behavior |
|---|---|
| `Thread.SchedulerData` | Period, derived static priority |
| `OnThreadCreate` | Assign priority from `1 / Period`; reject if total utilization exceeds the schedulability bound |
| Otherwise | Same as FPP; RM is just FPP with a specific priority-assignment rule |
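The schedulability bound referenced in `OnThreadCreate` is the Liu/Layland bound: n periodic tasks are RM-schedulable if total utilization U = Σ(WCETᵢ / Periodᵢ) ≤ n·(2^(1/n) − 1), which tends to ~69% as n grows. A sketch of the admission check (the `RmAdmission` helper is illustrative):

```csharp
// Liu/Layland admission test for Rate Monotonic scheduling.
using System;
using System.Linq;

static class RmAdmission
{
    // Bound(1) = 1.0, Bound(2) ≈ 0.828, Bound(n) → ln 2 ≈ 0.693 as n → ∞.
    public static double Bound(int n) => n * (Math.Pow(2.0, 1.0 / n) - 1.0);

    // Each task is (worst-case execution time, period), in the same time unit.
    public static bool Admit(params (double wcet, double period)[] tasks)
    {
        double utilization = tasks.Sum(t => t.wcet / t.period);
        return utilization <= Bound(tasks.Length);
    }
}
```

A task set can fail this sufficient test and still be schedulable; an exact response-time analysis would admit more, at the cost of a more expensive `OnThreadCreate`.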
Earliest-Deadline-First (EDF). Dynamic priority based on absolute deadline. Theoretically optimal on a single CPU (achieves 100% utilization vs. RM's ~69%) but harder to analyze under overload.
| Hook | Behavior |
|---|---|
| `PerCpuState.SchedulerData` | A min-heap keyed on absolute deadline |
| `Thread.SchedulerData` | `Period`, `RelativeDeadline`, `AbsoluteDeadline`, WCET |
| `OnThreadReady` | Recompute `AbsoluteDeadline = now + RelativeDeadline`, insert into the heap |
| `OnTick` | Preempt if the heap root's deadline is earlier than the current thread's |
| `PickNext` | Pop the heap root |
| `SelectCpu` | Partitioned EDF: pin tasks per-CPU so per-CPU utilization stays under 1.0 |
Hard-real-time requirements
Any of these algorithms is only as deterministic as the surrounding mechanism. For a hard-RT build, four things matter:
- Bounded interrupt latency. The IRQ stub captures RSP, calls into managed code, and returns. There is no unbounded loop on the path. Anything you add to `OnTick` or `PickNext` becomes part of worst-case latency, so RT algorithms keep these O(log n) or better. Stride's linear `InsertByPass` would not be acceptable; a heap or per-priority FIFO is.
- Priority inheritance in `Mutex`. The current `Mutex` is FIFO-wake. This is fine for fair-share, but a low-priority holder of a mutex contended by a high-priority waiter causes priority inversion. A real-time mutex must temporarily boost the holder to the highest waiter's priority via `SetPriority`, then restore on release. The hooks for this exist; the policy does not.
- `Pinned` for IRQ-affine threads. RT threads typically must not migrate (caches, deadline analysis, and partitioned EDF all assume affinity). The `Pinned` flag is the contract; `SelectCpu` and `Balance` already honor it.
- Tickless or event-driven timer. The current timer is periodic. A hard-RT build replaces it with a one-shot timer programmed to the next deadline (or next quantum, whichever is sooner). This is a HAL change, not a scheduler change; `OnTimerInterrupt` does not care whether ticks are periodic.
The mechanism already provides a deterministic context switch (no allocation on the hot path), per-thread Pinned, the SetPriority hook, sleep with a wakeup deadline (Sleep(timeoutMs) is what a periodic task needs), and stack-bounded thread state (no dynamic stack growth to perturb timing).
What an RTOS build still has to add: the priority-inheritance protocol on Mutex, a tickless timer driver, and a worst-case-execution-time (WCET) budget on OnTick and PickNext of the chosen scheduler.
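The missing priority-inheritance protocol is small in shape. A self-contained sketch of the boost-and-restore rule (the `PiThread`/`PiMutex` types are stand-ins; a real implementation would park waiters via `BlockThread` and route the boost through the scheduler's `SetPriority` hook):

```csharp
// Priority inheritance in miniature: on contention, the holder inherits the
// highest waiter's priority; on release it drops back to its base priority.
using System.Collections.Generic;

class PiThread { public int Id; public int BasePriority; public int EffectivePriority; }

class PiMutex
{
    readonly List<PiThread> _waiters = new();
    public PiThread? Owner { get; private set; }

    public void Acquire(PiThread t)
    {
        if (Owner is null) { Owner = t; return; }
        _waiters.Add(t);
        // Inheritance step (stands in for SetPriority on the holder):
        if (t.EffectivePriority > Owner.EffectivePriority)
            Owner.EffectivePriority = t.EffectivePriority;
        // A real mutex would now park the caller via BlockThread.
    }

    public void Release(PiThread t)
    {
        t.EffectivePriority = t.BasePriority; // drop the inherited boost
        Owner = _waiters.Count > 0 ? _waiters[0] : null;
        if (Owner is not null) _waiters.RemoveAt(0);
    }
}
```

This is the basic priority-inheritance protocol; priority-ceiling variants bound the number of inversions further but need per-mutex ceiling configuration.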
FIFO (cooperative)
A debugging policy: one global queue, no preemption. Threads run until they block or yield.
| Hook | Behavior |
|---|---|
| `OnTick` | Always return false (never preempt) |
| `OnThreadYield` | Move to the tail |
| `PickNext` | Pop the head |
Useful when chasing a race that disappears under preemption.
Common patterns
The same shape recurs across all of these:
- Define what to store per thread and per CPU. Allocate it in `OnThreadCreate` and `InitializeCpu`, retrieve it with `GetSchedulerData<T>()`.
- Decide the queue shape (FIFO, sorted list, heap, multi-level). The mechanism layer does not care; it only calls `PickNext`.
- Decide the preemption trigger (quantum exhaustion, queue-head priority, deadline, never). That logic lives in `OnTick`'s return value.
- Decide the migration policy: which CPU to start on (`SelectCpu`) and how to rebalance (`Balance`). Honor the `Pinned` flag.
- Hook into block/wake so threads leave and re-enter the runnable set correctly. Anything that needs to survive a sleep (saved progress, queue position, remaining quantum) is what `OnThreadBlocked` should stash.
A scheduler that does not need a hook can leave it as a no-op. The interface is wide so the mechanism can drive any algorithm without per-algorithm special cases. Not every hook is mandatory.
Synchronization primitives
Synchronization sits on top of BlockThread, ReadyThread, and Sleep, so it is policy-agnostic.
- `SpinLock`: pure CAS, no scheduler interaction. Used internally by `Mutex` and `ConditionVariable` to protect their wait queues.
- `Mutex`: recursive blocking lock. Owner ID, recursion depth, and a wait queue. On contention, the caller is parked via `BlockThread`. On release, the manager wakes the head of the queue. A bounded spin precedes parking so brief contention does not take the slow path.
- `ConditionVariable`: signal/wait with mutex hand-off. `Wait` releases the mutex, parks, and re-acquires on wakeup. `WaitTimeout` parks with a wakeup deadline (`Sleep`) so the thread also resumes on expiry.
- `Monitor`: Java-style composite of `Mutex` and `ConditionVariable`.
All three blocking primitives delegate the actual state transition to the manager, which wraps it in a disable-interrupts scope so the timer interrupt cannot observe a half-finished park.
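The bounded-spin-then-park pattern in `Mutex` can be isolated as a few lines. This sketch uses a plain `Interlocked` CAS; the `park` callback stands in for the manager's `BlockThread`, and `SpinThenPark` is an illustrative name, not the kernel's:

```csharp
// Bounded spin before parking: try the CAS a few times so brief contention
// stays on the fast path; only fall through to the scheduler when spinning fails.
using System;
using System.Threading;

class SpinThenPark
{
    int _state;               // 0 = free, 1 = held
    public int ParkCount;     // how often we took the slow path

    public bool TryAcquire(int spinLimit, Action park)
    {
        for (int i = 0; i < spinLimit; i++)
            if (Interlocked.CompareExchange(ref _state, 1, 0) == 0)
                return true;  // fast path: won the CAS while spinning
        ParkCount++;
        park();               // slow path: would be SchedulerManager.BlockThread
        return false;
    }

    public void Release() => Volatile.Write(ref _state, 0);
}
```

A real mutex would loop back to re-contend after waking, and would hold the internal `SpinLock` while enqueuing itself so release cannot race the park.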
GC integration
The scheduler is the GC's source of truth for stack roots. Two invariants matter:
- Every registered thread is scanned, not just the running one. Stack locals on a thread blocked on a mutex are still reachable and must be marked. Earlier versions only scanned `CurrentThread` and lost objects whose only reference was on a parked stack.
- Mark-phase iteration goes directly through the array, not via any interface. Interface dispatch can allocate, and the mark phase cannot allocate. This is why the registry is a flat array, not a `List<Thread>` exposed through `IEnumerable`.
For each non-Dead thread, the mark phase scans the saved ThreadContext (every saved register is a root candidate) and conservatively scans stack memory between the saved SP and StackBase + StackSize. The currently running thread is scanned via the live SP captured by the IRQ stub, not the cached StackPointer.
A defensive check rejects any candidate MethodTable* outside kernel higher-half (>= 0xFFFF800000000000). Without it, a stack-resident integer that lands in heap range can crash a downstream type-info read.
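That defensive check reduces to a single range predicate. A sketch with an illustrative helper name (`IsPlausibleMethodTable` is not the kernel's identifier):

```csharp
// Conservative-scan filter: any candidate MethodTable* below the kernel
// higher-half base is stack noise (e.g. an integer that looks like a pointer)
// and must not be dereferenced for type info.
static class HigherHalf
{
    const ulong KernelBase = 0xFFFF_8000_0000_0000;

    public static bool IsPlausibleMethodTable(ulong candidate) => candidate >= KernelBase;
}
```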
Runtime and managed-thread bridge
The scheduler exposes a thin runtime surface so the .NET runtime and BCL see it as a real threading implementation:
- Runtime exports (`RhYield`, `RhGetCurrentThreadStackBounds`, `RhGetThreadStaticStorage`, `RhSetThreadExitCallback`) live in `Runtime/Thread.cs`. They delegate to `SchedulerManager`, or fall back to single-threaded behavior when the scheduler feature switch is off.
- `ThreadPlug` redirects `System.Threading.Thread.CreateThread`, so `new Thread(action).Start()` in a kernel project allocates a kernel `Thread` flagged `Managed`, sets the entry point to `EntryPointStub`, and registers it with the manager.
- `ContextSwitchNative` is the C# side of the four native symbols that stage the IRQ exit path (`set_context_switch_sp`, `get_context_switch_sp`, `set_context_switch_new_thread`, `get_sp`). All four are `[SuppressGCTransition]` because they are pure register operations.
Feature switch
The scheduler is gated by CosmosEnableScheduler in the kernel .csproj, surfaced as CosmosFeatures.SchedulerEnabled. Two checkpoints honor it:
- `SchedulerManager.IsEnabled`: every public manager entry point that mutates state checks it.
- `Runtime/Thread.cs`: runtime exports fall back to single-threaded behavior when off.
A separate runtime flag SchedulerManager.Enabled is set after Initialize, SetScheduler, and idle-thread wiring complete. The timer interrupt path returns early until it flips. This avoids racing the very first context switch against a half-built scheduler state.
Source files
References
The scheduler design draws on three primary sources:
- Stride Scheduling: Deterministic Proportional-Share Resource Management, Waldspurger and Weihl (MIT/LCS/TM-528). PDF. The virtual-time fair-share algorithm in `StrideScheduler` (pass, stride, tickets, sorted-by-`Pass` run queue) comes from this paper.
- Ekiben: a pluggable scheduler API. arXiv:2306.15076. The shape of `IScheduler` (lifecycle hooks, `PickNext`, `OnTick`, per-CPU state slot, policy/mechanism split) is modeled on Ekiben's `EkibenScheduler` trait. `IScheduler.cs` notes this inline.
- Multithreading in .NET at the CLR Level: What Really Happens Under the Hood. codetodeploy on Medium. Background on how the CLR models threads. Used while wiring `RhYield`, thread-static storage, and the `ThreadPlug` for `System.Threading.Thread.CreateThread`.