Efficient Policy Gradient Reinforcement Learning in High-Noise Long Horizon Settings
dc.contributor.author | Aitchison, Matthew
dc.date.accessioned | 2025-01-12T02:46:26Z
dc.date.available | 2025-01-12T02:46:26Z
dc.date.issued | 2025
dc.description.abstract | This thesis explores the enhancement of Policy Gradient (PG) methods in reinforcement learning (RL), focusing on their application in real-world scenarios. It addresses challenges in efficient evaluation, noise reduction, and long-horizon discounting in RL. Key contributions include the Atari-5 dataset, which reduces evaluation time in the Arcade Learning Environment; the Dual Network Architecture (DNA) algorithm, which improves Proximal Policy Optimization's (PPO) performance on vision-based tasks; and the TVL algorithm, which learns over long horizons without discounting and demonstrates potential in high-noise environments. This research advances the understanding and application of PG methods, highlighting their practical implications for complex decision-making and robotics.
dc.identifier.uri | https://hdl.handle.net/1885/733731566
dc.language.iso | en_AU
dc.title | Efficient Policy Gradient Reinforcement Learning in High-Noise Long Horizon Settings
dc.type | Thesis (PhD)
local.contributor.affiliation | ANU College of Engineering, Computing and Cybernetics, The Australian National University
local.contributor.supervisor | Kyburz, Penelope
local.identifier.doi | 10.25911/8T8S-TY70
local.identifier.proquest | Yes
local.identifier.researcherID |
local.mintdoi | mint
local.thesisANUonly.author | 73287e0d-1680-4851-a32c-342a515bde96
local.thesisANUonly.key | 7173dbae-0992-e48b-c7e3-4f6763a7d87f
local.thesisANUonly.title | 000000022450_TC_1
Downloads
Original bundle
- Name: Aitchison_PhD_Thesis_2025.pdf
- Size: 10.45 MB
- Format: Adobe Portable Document Format
- Description: Thesis Material