Learning good actions to take

Cover that ‘optimality’ isn’t usually needed in practice
Cover that simulating actions usually differs from simulating other parts of a system, because the available data is usually poor
Cover Bayesian optimisation and CMA-ES
Cover how to chunk up action domains into separate sub-domains
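
Possible illustration for the last two notes: a minimal sketch of searching an action domain one sub-domain at a time. The search routine is a deliberately simplified evolution strategy, not CMA-ES itself (there is no covariance or step-size adaptation, only a fixed decay). The names toy_cost, simple_es and optimise_by_subdomain, the target values, and all parameter defaults are assumptions made for this example. It also assumes the sub-domains interact only weakly, which is when chunking is likely to work well.

```python
import random


def toy_cost(action):
    """Hypothetical cost of an action vector (lower is better)."""
    targets = [1.0, -2.0, 0.5, 3.0]  # made-up values for illustration
    return sum((a - t) ** 2 for a, t in zip(action, targets))


def simple_es(score, start, sigma=0.5, population=20, elite=5,
              generations=40, rng=None):
    """Simplified evolution strategy (a stand-in, NOT full CMA-ES).

    Samples candidates around a mean with one shared step size, keeps the
    best few, and moves the mean to their average.
    """
    rng = rng or random.Random(0)
    mean = list(start)
    best, best_score = list(start), score(start)
    for _ in range(generations):
        candidates = [[m + rng.gauss(0.0, sigma) for m in mean]
                      for _ in range(population)]
        candidates.sort(key=score)
        top = candidates[:elite]
        mean = [sum(c[i] for c in top) / elite for i in range(len(mean))]
        top_score = score(candidates[0])
        if top_score < best_score:
            best, best_score = list(candidates[0]), top_score
        sigma *= 0.95  # crude fixed decay; CMA-ES adapts this instead
    return best, best_score


def optimise_by_subdomain(score, start, subdomains, rounds=3, seed=0):
    """Optimise each sub-domain (a list of indices) in turn, others held fixed."""
    rng = random.Random(seed)
    action = list(start)
    for _ in range(rounds):
        for indices in subdomains:
            def sub_score(values, indices=indices):
                # Evaluate the full action with only this chunk replaced.
                trial = list(action)
                for i, v in zip(indices, values):
                    trial[i] = v
                return score(trial)

            sub_start = [action[i] for i in indices]
            best, _ = simple_es(sub_score, sub_start, rng=rng)
            for i, v in zip(indices, best):
                action[i] = v
    return action


if __name__ == "__main__":
    start = [0.0, 0.0, 0.0, 0.0]
    result = optimise_by_subdomain(toy_cost, start, [[0, 1], [2, 3]])
    print("cost before:", toy_cost(start))
    print("cost after: ", toy_cost(result))
```

Because each search keeps its starting point as the fallback best, the result is never worse than the starting action under the given cost. Whether it is good enough in practice depends on how trustworthy that cost is.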