TOWARDS SPECIALIZED REINFORCEMENT LEARNING FROM DIVERSE DATA