Northrop’s Model 437 Vanguard Advances to Flight Testing with Partner Software for Collaborative Autonomy and Mission Learning
Northrop Grumman’s Model 437 Vanguard flew again on September 20 after months of modification for the Beacon autonomy program. Company engineers installed the Prism autonomy stack, integrated new interfaces and prepared the aircraft for envelope expansion sorties at Mojave before partner software goes aboard.
The aircraft is a Scaled Composites design sized around a single Pratt & Whitney 535-class engine. Earlier technical descriptions place both its wingspan and its length at about forty-one feet, with internal volume and wiring provisions for avionics payloads suited to autonomy research. Flight time and payload room decide how many autonomy agents, sensors and recorders the team can fly on one mission.
Partner participation now stands at six. Autonodyne, Applied Intuition’s EpiSci unit, Red 6, Merlin Labs, Soar Technology and Shield AI plan to load mission software as discrete modules once Prism finishes its initial flight trials. Executives described functions ranging from augmented reality training and traffic and formation logic to human-machine teaming, mission behaviors, collaborative patrol procedures and post-flight learning.
Program managers set a near-term sequence that keeps a safety pilot in the seat while Prism proves basic control and failsafes. The company plans five additional crewed flights before allowing Prism to fly the jet end-to-end later this year, provided the test points clear. Defense officials said early sorties retain manual override authority and build from simple profiles to more complex ones.
Beacon Model 437 Vanguard Returns to Flight on Sept 20, 2025
The September 20 sortie restarted flight test after a long modification period. Scaled Composites reported hydraulic system updates, integration of Beacon subsystems and cockpit work to link pilot controls with the autonomy stack. That hangar integration period let teams run power-on checks and ground-based autonomy trials before returning to the air.
Envelope expansion flights now proceed in a standard pattern. Crews first validate gear operation, basic handling, stability margins and data collection. They then widen speed, altitude and maneuver boundaries step by step. Test cards set altitudes and blocks for each task, with recovery profiles specified in case Prism or any hosted module hangs.
The 437’s endurance and range allow multi-module missions in which fuel pressure, temperature or electrical limits do not become the main constraint after an hour. The internal bay and hard points accept instrumented loads if a mission module needs extra compute or a specialized sensor. That flexibility gives Beacon room to test agents that demand high data rates or steady power without a separate chase aircraft.
The aircraft’s return to flight aligned with the late-September Air, Space and Cyber event near Washington. The public step coincided with Northrop’s outreach to software firms so teams could plan lab integration and airworthiness reviews together.
Prism Autonomy Open Architecture and Ground Integration
Prism sits as the host autonomy stack. It handles navigation, flight control hooks, vehicle health and safety-of-flight logic so third-party modules can focus on mission behaviors. Program leads describe interfaces that allow “plug-in” agents to receive state data and publish guidance without owning the aircraft’s core flight loops. That separation reduces the chance that a mission module destabilizes the platform when it misbehaves.
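The separation described above can be pictured with a minimal sketch. It is not Prism's actual interface, which has not been published. Every name here (VehicleState, Guidance, MissionModule, Host, HoldHeading) is hypothetical and chosen only for illustration. The sketch assumes the host hands each module a read-only state snapshot, collects proposed guidance, and skips any module that raises an error rather than letting the fault reach the flight loop.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class VehicleState:
    """Read-only snapshot handed to each module (fields are hypothetical)."""
    time_s: float
    altitude_ft: float
    heading_deg: float
    airspeed_kt: float


@dataclass(frozen=True)
class Guidance:
    """Targets a module requests; the host decides whether to act on them."""
    source: str
    heading_deg: float
    altitude_ft: float


class MissionModule(Protocol):
    """Shape a plug-in is assumed to have in this sketch."""
    name: str

    def step(self, state: VehicleState) -> Optional[Guidance]: ...


class HoldHeading:
    """Toy module: asks for a fixed heading at the current altitude."""

    def __init__(self, name: str, heading_deg: float):
        self.name = name
        self.heading_deg = heading_deg

    def step(self, state: VehicleState) -> Optional[Guidance]:
        return Guidance(self.name, self.heading_deg, state.altitude_ft)


class Host:
    """Stand-in for a host stack: owns the loop and isolates module faults."""

    def __init__(self, modules):
        self.modules = list(modules)

    def collect(self, state: VehicleState):
        proposals = []
        for module in self.modules:
            try:
                guidance = module.step(state)
            except Exception:
                continue  # a misbehaving module is skipped, not propagated
            if guidance is not None:
                proposals.append(guidance)
        return proposals
```

In this arrangement a module never touches control surfaces. It only returns a proposal, and the host remains free to ignore it.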
Open interfaces extend to the ground. Beacon pairs the aircraft with mission servers, radios and analysis tools that record time-aligned telemetry and module outputs. Teams then replay that data across lab benches to reproduce issues and re-tune agents before the next flight. According to industry sources, the ground stack also supports preflight plan loading and post-flight comparison of agent choices against baseline tactics.
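As a rough illustration of what "time-aligned" replay can mean in practice, the sketch below pairs each module output with the most recent telemetry sample at or before its timestamp. The record layouts and the function name align are assumptions made for this example. The article does not describe Beacon's actual log format.

```python
from bisect import bisect_right


def align(telemetry, outputs):
    """Pair each module output with the latest telemetry at or before it.

    telemetry: list of (time_s, state) tuples, sorted by time_s.
    outputs:   list of (time_s, module_name, value) tuples, in any order.
    Returns a time-sorted list of (time_s, module_name, value, state);
    state is None when no telemetry precedes the output.
    """
    times = [t for t, _ in telemetry]
    rows = []
    for t, module, value in sorted(outputs, key=lambda row: row[0]):
        i = bisect_right(times, t) - 1  # index of last sample with time <= t
        state = telemetry[i][1] if i >= 0 else None
        rows.append((t, module, value, state))
    return rows
```

A merged timeline of this kind lets an engineer ask what the aircraft was doing at the moment a given agent made a given choice, which is the question a debrief usually starts from.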
Northrop’s approach answers a persistent hurdle. Many autonomy demonstrations prove a single behavior in isolation but fail when teams try to run multiple behaviors at once. Beacon aims to let two or three partner agents operate on the same sortie without stepping on each other’s timing or data feeds. Aviation reporting on Beacon emphasizes this division of responsibility as the practical route to concurrency.
Control authority stays simple. The safety pilot can take the jet back at any moment. Test directors can pull a dedicated circuit breaker or command Prism to idle and revert to manual flight. Those procedures mirror other autonomy testbeds and satisfy airworthiness reviewers.
Partner Modules: Red 6, Autonodyne, Applied Intuition, Merlin, SoarTech, Shield AI
Red 6 provides an outdoor augmented reality system that inserts virtual aircraft into the pilot’s view and into the host aircraft’s sensor timelines. The pilot sees a virtual formation or an intercept target. The host radar, EW suite and navigation stack receive synthetic returns consistent with that view. This arrangement lets testers build dense air pictures without live range traffic. Company executives said users can define terrain, threats, friendly scripts and spectral conditions to stress the agents.
Autonodyne’s initial role centers on coordinating multi-aircraft maneuvers. Formation logic weighs timing, geometry and sensor coverage rather than chasing a single best path for one vehicle. The aim is a tactically coherent plan for the entire package with hand-offs that keep sensors and weapons in favorable positions while maintaining de-confliction.
Applied Intuition’s EpiSci unit focuses on human-machine teaming. The team plans to capture how pilots interact with autonomy agents over many reps, then use those observations to refine when the agent proposes versus when it executes. During conference briefings, the company emphasized time in the seat with human pilots as the gating variable. Beacon’s long-endurance sorties and concurrency allow those reps to happen faster.
Merlin Labs brings a package of mission autonomy behaviors that ingest aircraft state, sensor inputs and synthetic data feeds, then publish decisions about route changes, threat reactions or timing gates for on-station tasks. Engineers will compare agent outputs with ground truth at debrief, then push updates. Merlin’s managers described expectations of “highly effective” decisions under stress, but results will be scored against recorded timelines.
Soar Technology supplies collaborative autonomy tuned to Combat Air Patrol procedures. A CAP demands that multiple aircraft reach station, divide sectors, keep sensors oriented and cycle fuel while preserving coverage. The company’s module gives Beacon a testable script with known measures. The CAP use case also fits mixed live-virtual traffic, where Red 6 can populate lanes with threats while SoarTech coordinates radar and timeline discipline.
Shield AI contributes learning across missions. The software supports pre-mission planning, in-mission execution and post-mission analysis under one umbrella, with the goal of modifying tactics and parameters as the data set grows. Under Beacon, this loop will run against real flight logs rather than sim-only tracks. Teams can run an agent, analyze outcomes and then fly the update the same week if airspace and safety boards agree.
According to industry sources, flying several partner modules at once offers an operational advantage. A single sortie can test how formation control interacts with augmented reality traffic while a learning agent tries new replans. Conflicts between modules then appear quickly, with logs pinpointing which interface or timing edge caused the fault.
Flight Test Schedule
The company’s public count called for five more crewed flights after September 20. Engineers intend to enable Prism control once those flights close the required test points. The plan sets a first phase where Prism runs but a human flies, followed by a phase where Prism commands the aircraft with the safety pilot ready. Only after those steps will Northrop start flying with partner modules in the loop. Executives said the goal is flights with several partner agents active at the same time.
Interface stability determines the pace. Prism owns navigation, flight logic and the health monitor. Mission modules publish intents through defined endpoints. If a module faults, Prism must reject out-of-bounds commands and keep the aircraft stable.
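The rejection step can be sketched as a simple gate between a module and the flight logic. This is an illustrative assumption, not Prism's published behavior. The types Envelope and Command, the function gate, and the specific limits checked are all hypothetical. The idea is that a command outside the approved envelope is dropped and the last accepted command stays in force.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Limits approved for the current test point (hypothetical fields)."""
    min_alt_ft: float
    max_alt_ft: float
    max_bank_deg: float


@dataclass(frozen=True)
class Command:
    source: str
    altitude_ft: float
    bank_deg: float


def gate(cmd, env, last_good):
    """Return (command_to_use, accepted).

    An in-bounds command passes through. Anything else, including NaN
    values (which fail every comparison), falls back to last_good.
    """
    in_bounds = (
        env.min_alt_ft <= cmd.altitude_ft <= env.max_alt_ft
        and abs(cmd.bank_deg) <= env.max_bank_deg
    )
    return (cmd, True) if in_bounds else (last_good, False)
```

Keeping the check in the host rather than in each module means a faulty module cannot vouch for its own output.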
The airframe’s performance profile gives flight tests more room than typical surrogates. Earlier engineering notes indicate enough range and endurance to support multi-hour scripts with diverse modules and ample recorder storage. The jet’s layout also permits on-the-ground swaps of compute payloads between sorties.
Program staff said they accept concurrency risk as part of the test design. One sortie may carry Red 6 traffic to stress sensor timelines, Autonodyne formation logic to enforce coverage geometry and Shield AI planning to re-prioritize waypoints as threats change. The same flight can carry SoarTech’s CAP behaviors as the overarching script. Test control assigns priority in advance so command arbitration remains deterministic.
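One simple way to make arbitration deterministic is a fixed priority list agreed before takeoff. The sketch below assumes that scheme. The function name arbitrate and the module labels in the usage example are hypothetical, and the program has not said how its arbitration is actually implemented.

```python
def arbitrate(proposals, priority):
    """Pick the command from the highest-priority module that offered one.

    proposals: dict mapping module name to a command, or to None if the
               module has nothing to propose this cycle.
    priority:  list of module names, highest priority first, fixed
               before the flight.
    Returns (module_name, command), or (None, None) if nobody proposed.
    """
    for name in priority:
        command = proposals.get(name)
        if command is not None:
            return name, command
    return None, None


# Usage with made-up module labels: the order is set on the ground.
priority = ["cap_script", "formation", "replanner"]
winner = arbitrate(
    {"replanner": "new waypoint", "formation": "hold trail", "cap_script": None},
    priority,
)
```

Because the outcome depends only on the proposals and the preset order, the same inputs replayed on a lab bench produce the same winner as they did in flight.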
Conference coverage placed these remarks in the context of broader Air Force interest in autonomy surrogates. Beacon offers an optionally crewed aircraft with CCA-like performance and dedicated integration support.
According to defense sources, safety cases and airspace coordination remain the gating items for any move to fully uncrewed sorties. Early flights use restricted corridors, chase support when required and predefined abort gates.
Applied Intuition’s acquisition of EpiSci earlier this year folded EpiSci’s air autonomy work into a larger toolchain for simulation, data labeling and test management. That move shortens the road from software bench tests to air trials.
Northrop declined to publish granular timelines for partner module air trials beyond the five-flight sequence. The practical constraints are lab readiness and airworthiness approval of each module. Teams with mature agents and documented interfaces will load earlier.
The program’s visible progress across September aligns with public statements at the Air, Space and Cyber event and with factory updates from Mojave. The schedule now depends more on interface stability and mission-module readiness than on airframe availability.
REFERENCE SOURCES
https://www.janes.com/osint-insights/defence-news/c4isr/afa-2025-northrop-grumman-aims-to-spur-development-of-ai-autonomy-through-beacon-test-ecosystem
https://breakingdefense.com/2025/09/northrop-grummans-partners-tout-ai-software-from-ar-training-to-mission-planning/
https://www.twz.com/air/scaled-composites-model-437-vanguard-jet-is-now-flying-as-an-ai-testbed
https://aviationweek.com/defense/aircraft-propulsion/northrop-grumman-set-fly-autonomy-testbed
https://breakingdefense.com/2025/09/northrop-grummans-ai-testbed-will-fly-for-the-first-time-this-fall-company-says/
https://www.ainonline.com/aviation-news/futureflight/2025-07-31/northrop-grumman-and-merlin-partner-autonomous-flight
The post Northrop’s Model 437 Vanguard Advances to Flight Testing with Partner Software for Collaborative Autonomy and Mission Learning appeared first on DEFENSE-AEROSPACE.