Research Updates

Fall 2024: Collaboration with University of Arizona

I'm excited to announce my upcoming three-month stay as a visiting scholar in the Second Language Acquisition & Teaching (SLAT) program at the University of Arizona.

During my stay, I will be focusing on the following activities:

  1. Collaborating with Dr. Reinhardt and the Games in L2 Learning and Teaching (GL2TL) Research Group to refine my theoretical framework and research methods.
  2. Improving data instruments for analyzing fluency, intelligibility, and comprehensibility in online audio data collected from gamers.
  3. Developing and piloting qualitative data collection methods, such as game observation techniques.
  4. Exploring specific game design elements and their potential impact on language learning outcomes.

This opportunity will provide valuable feedback, unique collaboration, and access to new research contexts, significantly contributing to the development of my project.

Summer 2024: Ongoing Instrument Development

Our current research focuses on developing two tools for language learning research: a conversational agent for L2 English dialogue data collection, and a specialized dataset, or corpus, of L2 English usage in online gaming contexts.

Conversational Agent Development

Our research project involves developing a conversational agent to facilitate L2 English dialogue data collection. Using a libre LLM and the Elixir programming language, we aim to create an agent capable of conducting brief, natural conversations with participants. A key focus is optimizing response times so that the exchange closely simulates human interaction. Although time constraints limit how thoroughly we can validate the tool, we anticipate that it will yield valuable data and increase participant engagement. The project is being developed in collaboration with a computer science student at the University of London. While not without limitations, this approach offers a practical method for gathering online dialogue data in support of our research objectives.

L2 English Dataset from Online Gamers

We plan to create and analyze a corpus of L2 English from online gaming contexts, collecting 30,000-50,000 words from recorded gaming sessions of non-native English speakers. Using the Natural Language Toolkit (NLTK), we will examine vocabulary usage, grammatical structures, and gaming-specific language across different gaming genres. This research will contribute to our understanding of L2 English usage in computer-mediated communication contexts, potentially informing both theory and pedagogy.
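As a rough illustration of the kind of vocabulary analysis described above, the following sketch computes token counts, type counts, and a type-token ratio per gaming genre. It uses only the Python standard library, with a simple regular expression standing in for NLTK's tokenizers. The sample transcripts, genre labels, and function names are hypothetical and are not taken from the project data.

```python
import re
from collections import Counter

# Hypothetical transcript snippets keyed by genre; not real project data.
SAMPLE_TRANSCRIPTS = {
    "fps": "push left push left I am reloading cover me",
    "mmo": "I need a healer for the raid, can you heal me?",
}

# Simple stand-in tokenizer: runs of letters and apostrophes.
# In the planned pipeline an NLTK tokenizer would take its place.
TOKEN_PATTERN = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def vocabulary_profile(text, top_n=3):
    """Return token count, type count, type-token ratio, and top words."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    # Guard against empty transcripts to avoid dividing by zero.
    ttr = len(counts) / len(tokens) if tokens else 0.0
    return {
        "tokens": len(tokens),
        "types": len(counts),
        "ttr": round(ttr, 3),
        "top": counts.most_common(top_n),
    }


if __name__ == "__main__":
    for genre, text in SAMPLE_TRANSCRIPTS.items():
        print(genre, vocabulary_profile(text))
```

The type-token ratio is sensitive to text length, so comparisons across genres would likely need samples of similar size or a length-corrected measure.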

Early-Mid 2024: Survey Development and Pilot Studies

The first quarter of 2024 was dedicated to developing and testing the research instruments. We conducted several pilot studies to test our survey framework, refine our methodologies, and gain initial insights into participant preferences and technical requirements.

Custom Survey Framework

This project involved developing custom programs using JavaScript, PHP, HTML5, and CSS3 to enhance website functionality and implement a survey framework. This approach overcame the limitations of existing libre survey software, particularly in terms of design customization and functionality. The resulting program offers improved control over survey layout, navigation, and data storage. Notably, it supports the collection and storage of audio data, a feature uncommon in standard survey software. For future development, we are considering rewriting the system in Go to reduce runtime errors, improve data accessibility and interpretability, and establish a foundation for future survey instruments.

Pilot results:

1. Topic preferences and task order (n=60)
  Preferred task order:
    1. Survey
    2. Native language task
    3. Three L2 English tasks
  Preferred order for L2 tasks:
    1. Monologue task
    2. Story re-telling task with video
    3. Dialogue
  Topic preferences for monologue tasks:
    • L2 English monologue: Hobbies
    • Native language monologue: Hometown
2. Participation preferences (n=205)
  • ~75% prefer computer-human voice interaction in a second language
  • ~66% more likely to participate in online, self-paced studies without researcher interaction
3. Audio quality and participation challenges (n=15)
  • Tested survey framework and audio data collection
  • Revealed challenges in recruiting participants for dialogue tasks
4. Microphone and recording quality (n=15)
  • Tested microphone quality across various devices
  • Confirmed good overall audio quality for fluency studies
  • Identified potential issues with environmental noise

Software and Tools List

To support our research goals, we are continuously building our technical skills, focusing on tools and technologies that will enable more sophisticated data analysis and improve our research methodologies. We are also committed to using libre software to help protect participant privacy and security.

Planned software and tools:

  • Praat: For effective speech data analysis and interpretation
  • Python libraries:
    • NLTK (Natural Language Toolkit): For language processing, tokenization, and sentiment analysis
    • Pandas
    • NumPy
    • Parselmouth