Comparing frame membership, WordNet-based similarity and distributional similarity
Abstract
Frame co-membership is a relation between lexical units that occur within the same lexico-syntactic environment and represent the same cognitive structure (i.e., frame). This informal relation is valuable for the construction of predicate-modeled language resources, the automatic induction of lexical units, and semantic role labeling. However, establishing it requires extensive human effort, which slows the progress of FrameNet (FN) and hinders the construction of similar databases. The current study first addresses the challenge of converting frame membership into a numerical similarity relation. This conversion facilitates comparison between frame membership, WordNet-based similarity and distributional similarity. The study then identifies the measures that are most statistically compatible with frame membership. The proposed measure, degree of frame co-membership (DFCM), is based entirely on the FN database. It embraces the unique features of Frame Semantics and does not draw on any frame-external data. Accordingly, it preserves the individual approach of the theory and its distinctive criteria for word grouping. Although DFCM does not reflect the lexical or numerical relations between words in WordNet (WN) or distributional semantics, it is compatible with similarity scores obtained from the WN database and through distributional tools. The results may have considerable implications for enriching FrameNet’s lexicon without jeopardizing the precision of the database or relying solely on the manual effort of lexicographers.