In recent years, interest has grown among both academics and practitioners in automatically analyzing the textual part of companies’ financial reports to extract information relevant to future outcomes. In particular, tracking textual changes across companies’ reports can have a large and significant impact on stock prices. This impact occurs with a lag, implying that investors only gradually realize the implications of the news hinted at by document changes. However, the length of these documents, as well as their complexity in terms of structure and language, has been increasing dramatically, making this process more and more difficult to perform. In this paper, we analyze how to address this complexity by learning arbitrary-dimensional vector representations for US corporate filings (10-Ks) from 1998 to 2018, exploiting and comparing different neural network embedding techniques that account for word semantics through vector proximity. We also compare their ability to capture changes associated with future risk-adjusted abnormal returns against other approaches more commonly used in the literature. Finally, we propose a novel investment strategy, the Semantic Similarity Portfolio (SSP), that exploits these neural network embeddings. We show that firms that do not change their 10-Ks in a semantically important way from the previous year tend to have large and statistically significant future risk-adjusted abnormal returns. We also document an amplifying effect when we incorporate a momentum-related criterion, requiring that the selected companies also have positive previous-year returns. Specifically, a portfolio that buys “non-changers” under this strategy earns up to 10% in yearly risk-adjusted abnormal returns (alpha).
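The selection rule described above can be sketched as follows. This is a minimal illustration, not the paper’s actual implementation: the function names, the fixed similarity threshold, and the input dictionaries are all hypothetical (the paper may instead rank firms by similarity), and the document embeddings are assumed to be precomputed vectors from some embedding model.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two document embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def select_non_changers(emb_prev, emb_curr, returns_prev, threshold=0.95):
    """Pick tickers whose consecutive 10-K embeddings are highly similar
    (semantic "non-changers") AND whose previous-year return is positive
    (the momentum-related criterion). `threshold` is an illustrative
    cutoff, not a value from the paper."""
    selected = []
    for ticker, prev_vec in emb_prev.items():
        curr_vec = emb_curr.get(ticker)
        if curr_vec is None:  # firm did not file in both years
            continue
        sim = cosine_similarity(prev_vec, curr_vec)
        if sim >= threshold and returns_prev.get(ticker, 0.0) > 0.0:
            selected.append(ticker)
    return selected
```

For example, a firm whose two consecutive filings embed to nearly identical vectors and whose stock rose last year would enter the portfolio; a firm with a semantically changed filing, or a negative prior-year return, would not.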