I see differences between the three. The first means he mimicked only what he actually heard people say live, in person. The second means he mimicked anything he heard: birds, car noises, dogs barking, etc. The third means he mimicked anything people had said, whether on TV, in movies, or on the internet.