EMNLP 2021
Automating Claim Construction in Patent Applications: The CMUmine Dataset
Abstract
Intellectual Property (IP) in the form of issued patents is a critical and highly desirable element of innovation in high tech. In this position paper, we explore the possibility of automating the legal task of Claim Construction in patent applications via Natural Language Processing (NLP) and Machine Learning (ML). To this end, we first create a large dataset known as CMUmine™ and then demonstrate that, using NLP and ML techniques, Claim Construction in patent applications, a crucial legal task currently performed by IP attorneys, can be automated. To the best of our knowledge, this is the first public patent application dataset. Our results look very promising for automating the patent application process.
Topics
Machine Learning > Application Areas > Data Augmentation
Natural Language Processing > Applications > Information Extraction
Natural Language Processing > Applications > Text Classification
Computer Science > Applications > Document Analysis
Natural Language Processing > Applications > Text Processing
Machine Learning > Learning Types > Machine Learning