We present Intentable, a mixed-initiative caption authoring system that allows the author to steer an automatic caption generation process so that the captions reflect their intents. We first derive a grammar for specifying these intents in the form of a caption recipe, and then build a neural network that generates caption sentences from a given recipe. Our quantitative evaluation revealed that our intent-based generation system not only allows the author to engage in the generation process but also produces more fluent captions than previous end-to-end approaches that operate without user intervention. Finally, we demonstrate the versatility of our system through use cases such as context adaptation, unit conversion, and sentence reordering.