Visualquestionansweringisfundamentallycompositionalinnature—aquestionlikewhereisthedog?sharessubstructurewithquestionslikewhatcoloristhedog?andwhereisthecat?Thispaperseekstosimultaneouslyexploittherepresentationalcapacityofdeepnetworksandthecompositionallinguisticstructureofquestions.W