Slide Overview
Nakamura Satoshi Laboratory, Department of Frontier Media Science, School of Interdisciplinary Mathematical Sciences, Meiji University
CollabTech 2025: Do You See What I See? Vocal Cues to Visual Acuity Discrepancies in VR-Based Stargazing. Sora Iida, Satoshi Nakamura (Meiji University)
Contributions • Focus on conversation as a key aspect of collaborative stargazing. • Build a VR environment that allows shared observation with controlled visual-acuity differences. • Observe trends such as more "Eh?" utterances under the Unequal vision condition, suggesting directions for future audio-based support systems. 1
Background • Stargazing: a collaborative and conversational activity • It is difficult to align on exactly which star is being discussed. We think differences between observers (e.g., visual acuity, knowledge gaps) also influence how well they align their recognition of stars. "Huh? The brightest one? They all look bright to me... How far to the right do you mean?" "Look, see that brightest star? There's another one just to the right of it. Can't you see it?" 2
Existing Visual Support Existing support for aligning what observers see • AR systems[1], virtual AR apps[2]: enable sharing the same view, but their bright screens disturb dark adaptation • Laser pointers: help indicate stars, but are often restricted or impractical outdoors We need another way that does not rely on bright visual tools. [1] Fifth Star Labs: Sky Guide, https://www.fifthstarlabs.com/ (accessed 2025-10-22) [2] Zhang, J., Sung, Y.T., Hou, H.T., Chang, K.E.: The development and evaluation of an augmented reality-based armillary sphere for astronomical observation instruction. Computers & Education 73, 178–188 (2014) 3
Our Research Goal Build an auditory-based conversational support system • Using AI, estimate which stars the observers are misaligned about, based on their conversation. • Provide advice to help them align their recognition of stars and constellations. (Diagram: our system estimates the misalignment between Speaker A and Speaker B and offers advice such as "Let's try explaining using the nearby bright star as a reference." and "Let's try describing the star I see from my side too!") 4
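The support system on this slide is a goal rather than an implemented artifact. Purely as an illustration of the envisioned pipeline, the minimal Python sketch below scores a window of utterances using the vocal cues examined later in the talk (questions, "Eh?" occurrences, response gaps) and emits advice when the score is high; every class, function, weight, and threshold here is hypothetical.

```python
# Hypothetical sketch of the envisioned pipeline (the system itself is future work).
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str   # e.g., "A" or "B"
    text: str
    start: float   # seconds from session start
    end: float     # seconds from session start

def estimate_misalignment(window):
    """Return a rough misalignment score for a window of utterances.

    Heuristic stand-in: more questions, more "Eh?" reactions, and longer
    gaps between speakers are treated as signs that the two observers may
    not be looking at the same star. Weights are arbitrary placeholders.
    """
    questions = sum(u.text.rstrip().endswith("?") for u in window)
    eh_count = sum(("eh?" in u.text.lower()) or ("え?" in u.text) for u in window)
    gaps = [b.start - a.end for a, b in zip(window, window[1:]) if b.speaker != a.speaker]
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return questions + 2 * eh_count + mean_gap

def advise(score, threshold=5.0):
    """Suggest re-anchoring on a shared landmark when misalignment seems high."""
    if score > threshold:
        return "Let's try explaining using the nearby bright star as a reference."
    return None
```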
Research Question To realize our system, it is necessary to investigate whether misalignments can be estimated from conversations during stargazing. How do visual acuity differences between partners influence conversational behavior during a shared stargazing task? • Conduct conversational task experiments in a virtual environment under the Equal and Unequal conditions. 5
Experiment: System Detail • Create original skies and constellations • Place about 500 stars • Place landmarks (moons, planets, mountains) 6
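The slide lists what the original sky contains but not how it was generated. As a minimal sketch, assuming stars are placed at uniform random directions on the celestial sphere with random naked-eye magnitudes (both assumptions, not the authors' method), a field of about 500 stars could be produced like this:

```python
# Minimal sketch: generate an original star field of ~500 stars.
# Uniform placement on the sphere and the magnitude range are assumptions.
import math
import random

def random_star_field(n_stars=500, seed=0):
    rng = random.Random(seed)
    stars = []
    for _ in range(n_stars):
        # Uniform random direction on the unit (celestial) sphere.
        z = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        r = math.sqrt(1.0 - z * z)
        direction = (r * math.cos(phi), r * math.sin(phi), z)
        # Apparent magnitude: smaller is brighter (roughly the naked-eye range).
        magnitude = rng.uniform(0.0, 6.0)
        stars.append({"dir": direction, "mag": magnitude})
    return stars
```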
Experiment: System Create three levels of visual acuity (Good, Normal, Poor). This allowed pairs to share the same sky but perceive it differently. (Figure: example views at the Good, Normal, and Poor levels) 7
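The slides do not say how the Good, Normal, and Poor levels were implemented. Below is a minimal sketch, assuming acuity is simulated by a per-level limiting magnitude (hiding stars dimmer than a cutoff) and reusing the star dictionaries from the previous sketch; a blur or brightness attenuation would be an equally plausible mechanism, and the cutoff values are invented.

```python
# Assumed per-level limiting magnitudes; the actual values/mechanism are not stated.
LIMITING_MAGNITUDE = {"Good": 6.0, "Normal": 4.5, "Poor": 3.0}

def visible_stars(stars, acuity):
    """Return only the stars bright enough to be seen at the given acuity level."""
    limit = LIMITING_MAGNITUDE[acuity]
    return [s for s in stars if s["mag"] <= limit]

# Both partners share the same sky but perceive different subsets, e.g.:
# field = random_star_field()
# describer_view = visible_stars(field, "Good")
# identifier_view = visible_stars(field, "Poor")
```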
Experiment: Design Task: VR-based constellation search Simulate a scenario where one partner knows the constellation and teaches the other. Describer • Knows the constellation's shape and position. • Explains the constellation verbally to the Identifier within 600 seconds. Identifier • Does not know the constellation's shape or position. • Listens to the Describer's explanation to understand the constellation. • Takes a brief understanding test after the task. (Figure: experiment scene) 8
Experiment: Design Combinations of visual acuity conditions Participants • 8 students from the same lab (5 male, 3 female), aged 20-23 • Visual acuity of 0.7 or higher 9
Conversational Features 1. Question utterances: classify utterances into 9 categories (Explanation, Backchannel, Response, Question, Confirmation, Gaze instruction, Request, Incomplete). 2. Number of "Eh?": count of "Eh?" utterances, produced when the listener fails to understand the speaker's utterance. 3. Response latency (s): the speech interval between the Describer and the Identifier. Explore how visual-acuity differences affected conversational behavior. 10
Conversational Features 1. Question utterances: classify utterances into 9 categories (Explanation, Backchannel, Response, Question, Confirmation, Gaze instruction, Request, Incomplete). 2. Number of "Eh?": count of "Eh?" utterances, produced when the listener fails to understand the speaker's utterance. In Japanese, we say "Eh? (え?)" when we do not understand what the other person said, so we treat "Eh?" as an indicator of conversational difficulty. 3. Response latency (s): the speech interval between the Describer and the Identifier (measured using pyannote.audio). 11
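The slide names pyannote.audio for measuring the speech interval between the Describer and the Identifier. The sketch below shows one way this could be done with the library's pretrained speaker-diarization pipeline; the model name, the token handling, and the exact latency definition (gap between the end of one speaker's turn and the start of the other speaker's next turn) are assumptions, not details from the slides.

```python
# Sketch: estimate response latency from a session recording with pyannote.audio.
from pyannote.audio import Pipeline

# Assumed pretrained model; requires a Hugging Face access token.
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="YOUR_HF_TOKEN",
)
diarization = pipeline("session_audio.wav")  # hypothetical recording of one session

# Flatten the diarization into (start, end, speaker) turns, sorted by start time.
turns = sorted(
    (segment.start, segment.end, speaker)
    for segment, _, speaker in diarization.itertracks(yield_label=True)
)

# Response latency: gap between the end of one speaker's turn and the start of
# the other speaker's next turn (overlapping speech gives negative gaps; skipped).
latencies = [
    next_start - prev_end
    for (_, prev_end, prev_spk), (next_start, _, next_spk) in zip(turns, turns[1:])
    if next_spk != prev_spk and next_start >= prev_end
]
mean_latency = sum(latencies) / len(latencies) if latencies else 0.0
print(f"Mean response latency: {mean_latency:.2f} s")
```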
Results Objective indicators of conversation for each condition • Question utterances were 1.6% higher in the Unequal condition. • The number of "Eh?" utterances was 21 higher in the Unequal condition. • Response latency was 0.1 seconds longer in the Unequal condition. However, there were no significant differences in any of the metrics. 12
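The slide reports no significant differences but does not name the statistical test. Purely as an illustration of how such a per-condition comparison could be run, the sketch below applies a Mann-Whitney U test via SciPy; both the choice of test and the per-pair counts are placeholders, not the study's method or data.

```python
# Illustrative comparison of "Eh?" counts between conditions (placeholder data,
# and an assumed test; the slides do not state which test was used).
from scipy.stats import mannwhitneyu

eh_counts_equal = [3, 5, 2, 4]     # placeholder counts per pair, Equal condition
eh_counts_unequal = [6, 9, 4, 7]   # placeholder counts per pair, Unequal condition

stat, p_value = mannwhitneyu(eh_counts_equal, eh_counts_unequal, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p_value:.3f}")
```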
Results Objective Measures of Recognition Misalignment Time to recognition: the time it took for participants to reach a shared understanding • Equal condition: 345 s • Unequal condition: 365 s The Unequal condition required a longer time to reach recognition agreement. 13
Discussion Detecting Misalignment from Vocal Cues • Conversational metrics tended to change under visual-acuity differences: more Question utterances, more "Eh?" utterances, longer Response latency. • These vocal cues could serve as indicators of recognition misalignment. Limitations • Analysis of conversational features by a single coder raises concerns about reliability. • The presence of prominent landmarks in the environment may have influenced conversation. • Participants were a small, convenience sample from one laboratory, limiting generalizability. 14
Conclusion Differences in visual acuity tend to create "conversational friction." Future work • Verify whether the observed stellar shift (viewing direction angle) can be estimated from conversation metrics such as Question utterances, "Eh?" utterances, and Response latency. • Explore possible designs for future support systems based on these findings. 15