Abstract: |
This paper introduces a Natural Language Processing approach to identifying ``true''
green patents from official supporting documents. We start our training on
about 12.4 million patents that had been classified as green in previous
literature. We then train a simple neural network to enlarge a baseline
dictionary through vector representations of expressions related to
environmental technologies. After testing, we find that ``true'' green patents
represent about 20\% of all patents classified as green in previous
literature. We document heterogeneity across technological classes, and then find
that ``true'' green patents receive about 1\% fewer citations from subsequent inventions. In
the second part of the paper, we test the relationship between patenting and a
range of firm-level financial accounts in the European Union. After
controlling for reverse causality, we show that holding at least one ``true''
green patent raises sales, market shares, and productivity. If we restrict the
analysis to high-novelty ``true'' green patents, we find that they also yield
higher profits. Our findings underscore the importance of using text analyses
to obtain finer-grained patent classifications that are useful for policymaking
in different domains. |