On the acoustic side, we use a score fusion of three strong models: recurrent nets with maxout activations, very deep convolutional nets with 3x3 kernels, and bidirectional long-short term memory nets which operate on bottleneck features.
Source: wiktionary