Unsupervised Detection and Synthesis of Speech and Environmental Sounds Using Generative Networks