Novel strings of letters (i.e., pseudowords) lack established meaning(s), yet they may still evoke systematic, distributional signals that influence human behavior. Here, we tested whether distributional determinants of word memorability generalize to these novel strings. To do so, we leveraged a word-embedding model that was able to represent in a vector space not only attested words but also unmapped strings as bags of character n-grams. A ridge model trained on item-level word memorability norms learned a linear mapping from 300-dimensional embeddings to recognition memorability and achieved strong out-of-fold performance. We then applied this model zero-shot to predict memorability for 2,100 phonotactically legal pseudowords, whose baseline predictability was captured by orthographic and frequency features. Adding the zero-shot distributional score significantly improved the baseline model. These findings show that distributional representations derived from subword statistics carry mnemonic information that is not reducible to orthographic familiarity, and that novel strings are interpreted within a shared representational space learned from language experience. More broadly, they support the view that memorability is an intrinsic attribute predictable from representational information, even in the absence of learned meanings.
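The pipeline summarized above (subword embeddings, a ridge mapping evaluated out-of-fold, zero-shot transfer to pseudowords, and a nested comparison against a baseline) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `embed` function is a toy stand-in for a pretrained subword embedding model, all data are synthetic, and the n-gram range, `alpha`, fold count, and baseline features are hypothetical placeholders.

```python
import zlib
import numpy as np

DIM = 300  # embedding dimensionality stated in the abstract
rng = np.random.default_rng(0)
_cache = {}

def char_ngrams(s, n_min=3, n_max=5):
    """Character n-grams of a boundary-padded string (n-gram range is an assumption)."""
    p = f"<{s}>"
    return [p[i:i + n] for n in range(n_min, n_max + 1) for i in range(len(p) - n + 1)]

def embed(s):
    """Toy stand-in for a pretrained model: mean of fixed pseudo-random n-gram vectors."""
    vecs = []
    for g in char_ngrams(s):
        if g not in _cache:
            _cache[g] = np.random.default_rng(zlib.crc32(g.encode())).standard_normal(DIM)
        vecs.append(_cache[g])
    return np.mean(vecs, axis=0)

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc = X - mu_x
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - mu_y))
    return w, mu_y - mu_x @ w

def out_of_fold(X, y, k=5, alpha=1.0):
    """Predict each item from a model trained on the other folds."""
    idx = np.random.default_rng(1).permutation(len(y))
    preds = np.empty(len(y))
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        w, b = ridge_fit(X[train], y[train], alpha)
        preds[fold] = X[fold] @ w + b
    return preds

def ols_r2(F, y):
    """R^2 of an ordinary least-squares fit of y on features F plus an intercept."""
    A = np.column_stack([np.ones(len(y)), F])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def random_string():
    return "".join(rng.choice(list("abcdefghijklmnopqrstuvwxyz"), size=rng.integers(4, 9)))

# Synthetic "words" and "pseudowords"; memorability is simulated, not real norms.
words = [random_string() for _ in range(500)]
pseudo = [random_string() for _ in range(200)]
X_words = np.stack([embed(s) for s in words])
X_pseudo = np.stack([embed(s) for s in pseudo])
w_true = rng.standard_normal(DIM)
y_words = X_words @ w_true + 0.1 * rng.standard_normal(len(words))
y_pseudo = X_pseudo @ w_true + 0.1 * rng.standard_normal(len(pseudo))

# Step 1: out-of-fold check of the embedding-to-memorability mapping on words.
oof = out_of_fold(X_words, y_words)
oof_r = np.corrcoef(oof, y_words)[0, 1]

# Step 2: fit on all words, then score pseudowords without any refitting (zero-shot).
w, b = ridge_fit(X_words, y_words)
zero_shot = X_pseudo @ w + b

# Step 3: nested comparison; the baseline features here are hypothetical placeholders.
baseline = np.column_stack([[len(s) for s in pseudo], rng.standard_normal(len(pseudo))])
r2_base = ols_r2(baseline, y_pseudo)
r2_full = ols_r2(np.column_stack([baseline, zero_shot]), y_pseudo)
print(f"out-of-fold r = {oof_r:.3f}; R^2 baseline = {r2_base:.3f}, with score = {r2_full:.3f}")
```

Because the full model nests the baseline, its in-sample R² cannot be lower; whether the gain is statistically reliable would require a significance test (for example a likelihood-ratio or F test), which this sketch omits.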

Zero-shot pseudowords memorability via representational content analysis / Gatti, D.; Günther, F. - In: PSYCHONOMIC BULLETIN & REVIEW. - ISSN 1069-9384. - 33:4(2026). [10.3758/s13423-026-02875-x]

Zero-shot pseudowords memorability via representational content analysis

Gatti, D.; Günther, F.
2026-01-01


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11381/3053893