AutoPentest-DRL

Introduction

Decision Engine: A Deep Reinforcement Learning (DRL) engine (specifically a DQN model) serves as the brain, determining the most efficient attack paths based on the information gathered.

Abstract

The increasing complexity of modern network infrastructures makes traditional manual penetration testing labor-intensive, error-prone, and hard to scale. This paper proposes AutoPenTest-DRL, a framework that uses Deep Reinforcement Learning (DRL) to automate network penetration testing. By modeling the attacker’s actions, network states, and reward mechanisms as a Markov Decision Process (MDP), the framework enables an autonomous agent to learn optimal attack paths, prioritize high-value targets, and adapt to dynamic network environments. Experimental results on virtualized network topologies show that AutoPenTest-DRL covers more vulnerabilities (up to 92%) and reduces testing time by 67% compared to rule-based automated scanners such as OpenVAS and Metasploit’s autopwn. This work highlights DRL’s potential to improve cybersecurity assessments through goal-driven decision-making.

4. The Memory Replay Buffer with Causal Masking

A typical DRL agent replays past experiences sampled at random. For pentesting, causality is sacred: you cannot “un-exploit” a host. Therefore, AutoPentest-DRL uses a directed acyclic graph (DAG) experience replay, which respects the temporal order of compromises.
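The source does not say how the DAG replay is implemented, so the following is only a minimal sketch of one possible reading. It assumes each stored transition records the host it compromised and the host the attack pivoted from. The names `Transition`, `DagReplayBuffer`, `parent`, and `sample` are hypothetical and not taken from AutoPentest-DRL. A sampled batch is returned together with its ancestors, ordered so that a host never appears before the host it was reached from.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Transition:
    host: str              # host compromised by this step (hypothetical field)
    parent: Optional[str]  # host the attack pivoted from; None for the entry point
    reward: float


class DagReplayBuffer:
    """Illustrative replay buffer that keeps compromises in causal order."""

    def __init__(self) -> None:
        self._by_host: Dict[str, Transition] = {}

    def add(self, t: Transition) -> None:
        # A host is compromised once; rejecting repeats also keeps the graph acyclic.
        if t.host in self._by_host:
            raise ValueError(f"{t.host} is already compromised")
        if t.parent is not None and t.parent not in self._by_host:
            raise ValueError(f"parent {t.parent} has not been compromised yet")
        self._by_host[t.host] = t

    def _chain(self, host: str) -> List[Transition]:
        # Walk back to the entry point, then reverse so ancestors come first.
        chain: List[Transition] = []
        cur: Optional[str] = host
        while cur is not None:
            t = self._by_host[cur]
            chain.append(t)
            cur = t.parent
        chain.reverse()
        return chain

    def sample(self, k: int, rng=random) -> List[Transition]:
        # Pick k random hosts, then emit each one after all of its ancestors.
        seeds = rng.sample(list(self._by_host), min(k, len(self._by_host)))
        ordered: List[Transition] = []
        seen = set()
        for host in seeds:
            for t in self._chain(host):
                if t.host not in seen:
                    seen.add(t.host)
                    ordered.append(t)
        return ordered
```

Because ancestors are pulled in, a batch may contain more than `k` transitions. That is a design choice of this sketch, not a documented property of the framework.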

  1. Efficiency: AutoPenTest-DRL completed the complex scenario in 7.4 minutes vs. 89 minutes for manual analysis.
  2. Exploration-Exploitation Balance: The agent learned to avoid fruitless brute-force attempts after ~2000 episodes, focusing on high-probability exploits first.
  3. Generalization: When tested on unseen network topologies (e.g., ring vs. star), the agent’s success rate dropped only to 84%, indicating reasonable transfer learning.

3.3 Action Selection and Execution

The agent selects an action based on the current state (s_t) using an epsilon-greedy policy, with epsilon decaying from 1.0 to 0.1. Selected actions are translated into concrete commands via an Action Mapper that interfaces with Metasploit’s RPC API and native Linux tools.
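The epsilon-greedy rule described above can be sketched as follows. The text gives only the start and end values of epsilon (1.0 and 0.1), so the linear schedule and the `total_decay_steps` parameter are assumptions, as are the function names `epsilon_at` and `select_action`. The Q-values are taken as a plain list, one entry per action.

```python
import random
from typing import Sequence


def epsilon_at(step: int, total_decay_steps: int,
               eps_start: float = 1.0, eps_end: float = 0.1) -> float:
    """Linearly anneal epsilon from eps_start to eps_end (assumed schedule)."""
    # Clamp progress to [0, 1] so epsilon stays at eps_end after the decay window.
    frac = min(max(step / total_decay_steps, 0.0), 1.0)
    return eps_start + frac * (eps_end - eps_start)


def select_action(q_values: Sequence[float], epsilon: float, rng=random) -> int:
    """Return an action index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    # Exploit: index of the largest Q-value (first one on ties).
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

Early in training (`epsilon` near 1.0) almost every action is random. Once `epsilon` reaches 0.1, roughly nine out of ten actions follow the learned Q-values.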

The goal of frameworks like AutoPentest-DRL is to move beyond static vulnerability scanners.

To implement this system, you need to integrate three core functional components: Information Gathering (Nmap), Attack Path Planning (the DRL engine), and Attack Execution.
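As a rough illustration of how the three components could be wired together, the sketch below passes each stage in as a plain function. Everything here is hypothetical: `run_pentest`, the `ScanResult` shape, and the stub stages are placeholders standing in for a real Nmap scan, a trained DRL planner, and an exploit runner. None of it reflects the project's actual interfaces.

```python
from typing import Callable, Dict, List

# Assumed shape of a scan result: host address -> list of open ports.
ScanResult = Dict[str, List[int]]


def run_pentest(target: str,
                gather: Callable[[str], ScanResult],
                plan: Callable[[ScanResult], List[str]],
                execute: Callable[[str], bool]) -> Dict[str, bool]:
    """Chain the three stages: gather information, plan a path, execute it."""
    scan = gather(target)   # 1. information gathering
    path = plan(scan)       # 2. attack path planning
    outcome: Dict[str, bool] = {}
    for host in path:       # 3. attack execution, in planned order
        ok = execute(host)
        outcome[host] = ok
        if not ok:
            break           # stop when a step on the path fails
    return outcome


# Stub stages so the sketch runs without any real tooling.
def fake_gather(target: str) -> ScanResult:
    return {"10.0.0.5": [22, 80], "10.0.0.9": [445]}


def fake_plan(scan: ScanResult) -> List[str]:
    # Placeholder heuristic: try hosts with more open ports first.
    return sorted(scan, key=lambda h: len(scan[h]), reverse=True)


def fake_execute(host: str) -> bool:
    return host != "10.0.0.9"


result = run_pentest("10.0.0.0/24", fake_gather, fake_plan, fake_execute)
```

Keeping the stages behind plain callables makes it possible to swap a stub planner for a trained agent without touching the other two stages.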
