In the digital age, game-based learning platforms like Minecraft offer immersive and interactive environments where students engage in spontaneous communication, usually in English. However, the level of vocabulary that learners need for listening in these environments remains underexplored. This study investigates the Common European Framework of Reference for Languages (CEFR) level of the vocabulary used in a Minecraft gameplay session, focusing specifically on listening comprehension. Two players took part in the session, and a 20-minute audio recording of their interaction was analyzed. According to the Text Inspector tool, the gameplay interaction was at the B1+ (43%) CEFR level for listening. Analysis of word-type statistics shows that the players used mostly A1 (42.74%) and A2 (20.09%) words, along with some B1 (11.54%) and B2 (5.13%) words, to negotiate decisions, make suggestions, express disagreement, and give opinions. The use of modal verbs such as “should” and “can” indicates that learners were aware of politeness strategies. However, 18.38% of the word types were unlisted words, such as colloquialisms and game-specific terminology. Playing Minecraft may help develop spoken language, but additional scaffolding is needed to move learners beyond intermediate proficiency.