The paper End-to-End Speaker-Dependent Voice Activity Detection presents a method that uses an LSTM to implement speaker-dependent voice activity detection.
An LSTM can usually handle sequences of about 200 steps effectively. But what should you do if you need to process a longer sequence, for example one of length 2000?
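One common workaround, sketched here as an assumption rather than a method taken from the source, is to split the long sequence into chunks of about 200 steps, run the LSTM on each chunk, and then combine the per-chunk outputs. The helper name `split_into_chunks` and the zero padding value are hypothetical choices made for illustration only.

```python
from typing import List, Sequence


def split_into_chunks(
    seq: Sequence[float], chunk_len: int = 200, pad_value: float = 0.0
) -> List[List[float]]:
    """Split seq into consecutive chunks of chunk_len, padding the last chunk."""
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    chunks = []
    for start in range(0, len(seq), chunk_len):
        chunk = list(seq[start:start + chunk_len])
        # Pad the final chunk so every chunk has the same length.
        chunk.extend([pad_value] * (chunk_len - len(chunk)))
        chunks.append(chunk)
    return chunks


# A sequence of length 2000 becomes 10 chunks of 200 steps each.
long_sequence = [float(i) for i in range(2000)]
chunks = split_into_chunks(long_sequence, chunk_len=200)
print(len(chunks), len(chunks[0]))  # 10 200
```

Each chunk could then be fed to the LSTM separately, with the chunk-level outputs pooled or passed to a second LSTM. Which combination step works best depends on the task.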
DC-Bi-LSTM is a TensorFlow implementation of the paper Densely Connected Bidirectional LSTM with Applications to Sentence Classification.