The theory of image schemas was introduced as a missing link between embodied experience and mental representation. It proposes a relatively small number of conceptual building blocks based on spatio-temporal relationships, called "image schemas", upon which reasoning and different forms of communication can be built. Although image schemas are usually described as spatio-temporal relationships, the temporal dimension is often omitted. Identifying and formally discussing image schemas in their static sense is complicated enough, but it is conceptually impossible to discuss the phenomenon of image schemas while ignoring the dynamics of temporal change. For instance, the image schema CONTAINMENT is proposed to be learned from the movement of objects into and out of containers rather than from the static inside-border-outside relationship presented in cognitive linguistics research: an infant must understand movement in and out before it can understand concepts such as enclosure and containment. Image schemas have attracted increasing interest in artificial intelligence research, as they offer a cognitively inspired bridge to computational concept comprehension and concept invention. One assumption is that integrating image schemas will enable artificial intelligence and language comprehension tools to better 'understand' abstract language, conceptual metaphors, and analogies. However, formal renderings of image schemas have so far been largely restricted to describing them as purely static relationships. A more accurate formal description requires more attention to the temporal dimension. This abstract highlights the importance of time and change for image schemas, as these constitute some of the most important aspects of these conceptual building blocks.
The theory of image schemas is therefore naturally and closely linked to the fields of multi-modal and qualitative modelling, which we intend to explore further in our work, with particular attention to the cognitive adequacy of the chosen formalisms. Formalising image schemas qualitatively may then employ, for example, temporal logics, trajectory calculi, and a variety of spatial calculi. Here, the appropriate combination of these formal methods is essential for capturing the full multi-modality of the respective image schema.
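To make the contrast between a static and a dynamic reading of CONTAINMENT concrete, the following is a minimal sketch and not a formalisation proposed in this work. It assumes a one-dimensional container given by an interval and a discretely sampled trajectory; the names `Relation`, `classify`, and `events` are hypothetical and chosen only for illustration. The static view assigns each position a qualitative relation (outside, border, inside), while the dynamic view reads movement into and out of the container off the sequence of such relations.

```python
from enum import Enum
from typing import List, Sequence


class Relation(Enum):
    """Qualitative spatial relations of an object to a container (illustrative)."""
    OUTSIDE = "outside"
    BORDER = "border"
    INSIDE = "inside"


def classify(position: float, lo: float, hi: float) -> Relation:
    """Static view: relation of a point to the 1-D container [lo, hi]."""
    if position < lo or position > hi:
        return Relation.OUTSIDE
    if position == lo or position == hi:
        return Relation.BORDER
    return Relation.INSIDE


def events(trajectory: Sequence[float], lo: float, hi: float) -> List[str]:
    """Dynamic view: report GOING_IN / GOING_OUT events along a trajectory.

    An event is recorded when the object changes from OUTSIDE to INSIDE or
    vice versa; samples that lie exactly on the border are skipped over.
    """
    found: List[str] = []
    last_settled = None  # most recent relation that was not BORDER
    for position in trajectory:
        relation = classify(position, lo, hi)
        if relation is Relation.BORDER:
            continue
        if last_settled is Relation.OUTSIDE and relation is Relation.INSIDE:
            found.append("GOING_IN")
        elif last_settled is Relation.INSIDE and relation is Relation.OUTSIDE:
            found.append("GOING_OUT")
        last_settled = relation
    return found


# An object that enters the container [0, 1] and then leaves it again.
print(events([-1.0, 0.0, 0.5, 1.0, 2.0], 0.0, 1.0))
```

In this toy setting, no single sample of the trajectory carries the information that something went in or out; the events only emerge from the ordering of qualitative states over time, which is the kind of information a combination of spatial and temporal formalisms would need to express.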