Example of spoken English exhibiting dense spectral energy across a wide frequency
range, reflecting typical characteristics of natural speech. — astro-ph.IM
We present an exploratory framework to test whether noise-like input can induce structured responses in language models.
Instead of assuming that extraterrestrial signals must be decoded, we evaluate whether such inputs can trigger linguistic behavior in generative systems. This shifts the focus from decoding the signal to treating structured output as a sign of underlying regularity in the input.
We tested GPT-2 small, a 117M-parameter model trained on English text, on four types of acoustic input: human speech, humpback whale vocalizations, Phylloscopus trochilus birdsong, and algorithmically generated white noise. All inputs were treated as noise-like, without any assumed symbolic encoding. To assess reactivity, we defined a composite score called Semantic Induction Potential (SIP), which combines entropy, syntactic coherence, compression gain, and a repetition penalty.
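As a rough illustration of how a composite score of this kind might be assembled, the Python sketch below combines the four named components for a piece of generated text. The abstract does not give the SIP formula, so everything here is an assumption: the function names, the whitespace tokenization, the use of zlib for compression gain, the bigram-based repetition measure, and the equal weights are all hypothetical. Syntactic coherence is passed in as a precomputed value in [0, 1], since the abstract does not say how it is measured.

```python
import math
import zlib
from collections import Counter


def token_entropy(tokens):
    """Shannon entropy (bits per token) of the empirical token distribution."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compression_gain(text):
    """Fraction of bytes saved by zlib; 0.0 when compression does not help."""
    raw = text.encode("utf-8")
    if not raw:
        return 0.0
    return max(0.0, 1.0 - len(zlib.compress(raw)) / len(raw))


def repetition_penalty(tokens, n=2):
    """Fraction of n-grams that duplicate another n-gram (0.0 = no repeats)."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)


def sip_score(text, syntax_coherence, weights=(0.25, 0.25, 0.25, 0.25)):
    """Hypothetical SIP-style composite; weights and form are assumptions.

    syntax_coherence is a caller-supplied value in [0, 1] (for example,
    from a parser-based check); it is not computed here.
    """
    tokens = text.split()  # assumption: simple whitespace tokenization
    vocab_size = len(set(tokens))
    # Normalize entropy to [0, 1] by its maximum for the observed vocabulary.
    if vocab_size > 1:
        entropy_norm = token_entropy(tokens) / math.log2(vocab_size)
    else:
        entropy_norm = 0.0
    w_h, w_s, w_c, w_r = weights
    return (
        w_h * entropy_norm
        + w_s * syntax_coherence
        + w_c * compression_gain(text)
        - w_r * repetition_penalty(tokens)
    )
```

In this sketch the repetition term is subtracted, so highly repetitive output is scored lower even when it compresses well. A call might look like `sip_score(generated_text, syntax_coherence=0.6)`, where `generated_text` is the model output for one acoustic input.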
Whale and bird vocalizations yielded higher SIP scores than white noise, while human speech triggered only moderate responses. This suggests that language models may detect latent structure even in data without conventional semantics.
We propose that this approach could complement traditional SETI methods, especially in cases where communicative intent is unknown. Generative reactivity may offer a different way to identify data worth closer attention.
Po-Chieh Yu
Comments: submitted to the International Journal of Astrobiology
Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Computation and Language (cs.CL)
Cite as: arXiv:2506.02730 [astro-ph.IM] (or arXiv:2506.02730v1 [astro-ph.IM] for this version)
https://doi.org/10.48550/arXiv.2506.02730
Submission history
From: Po-Chieh Yu
[v1] Tue, 3 Jun 2025 10:46:57 UTC (202 KB)
https://arxiv.org/abs/2506.02730
Astrobiology, SETI,