2022
COLING
Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows
Abstract
We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn the result of a cooking action for each object in a recipe text. The dataset consists of object state changes and the workflow of the recipe text. Each state change is represented as an image pair, while the workflow is represented as a recipe flow graph. We developed a web interface to reduce human annotation costs. The dataset allows us to try various applications, including multimodal information retrieval.
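As a rough illustration of the two components named in the abstract, the sketch below pairs a small flow graph with before/after image references. It is a minimal sketch under stated assumptions, not the dataset's actual schema: the class names, field names, file paths, and the edge label "target" are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StateChange:
    """Hypothetical record: one object's appearance before and after an action."""
    object_name: str
    before_image: str  # assumed: path to the image taken before the action
    after_image: str   # assumed: path to the image taken after the action


@dataclass
class RecipeFlow:
    """Hypothetical flow graph: text spans as nodes, labeled directed edges."""
    nodes: dict = field(default_factory=dict)          # node id -> text span
    edges: list = field(default_factory=list)          # (source, destination, label)
    state_changes: dict = field(default_factory=dict)  # node id -> StateChange

    def add_node(self, node_id, text):
        self.nodes[node_id] = text

    def add_edge(self, src, dst, label):
        # Only connect nodes that were registered first.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((src, dst, label))

    def successors(self, node_id):
        """Return the ids of nodes reachable by one outgoing edge."""
        return [dst for src, dst, _ in self.edges if src == node_id]


def build_example():
    """Build a toy flow for the instruction 'cut the tomato' (invented data)."""
    flow = RecipeFlow()
    flow.add_node(0, "tomato")
    flow.add_node(1, "cut")
    flow.add_edge(0, 1, "target")  # illustrative label, not the dataset's tag set
    flow.state_changes[0] = StateChange(
        object_name="tomato",
        before_image="images/tomato_before.jpg",  # made-up path
        after_image="images/tomato_after.jpg",    # made-up path
    )
    return flow
```

Keeping the image pair keyed by node id is one possible design choice: it lets a retrieval application move from a text span in the graph to its visual state change, or back, with a single lookup.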
Topics
Artificial Intelligence > Core AI > Multimodal Learning
Computer Vision > Analysis > Scene Understanding
Computer Vision > Generation > Image Captioning
Natural Language Processing > Applications > Information Retrieval
Natural Language Processing > Resources & Methods > Multilingual NLP
Machine Learning > Learning Types > Multi-Task Learning
Computer Vision > Core AI > Multimodal Learning
Machine Learning > Learning Types > Multi-Modal Learning