Thing 13

AI video generation

Last reviewed: March 2026 · 30–45 minutes

You've now generated images from text, created synthetic speech, edited audio by editing text, and composed entire songs from a short description. In each case, the results were impressive enough to be immediately useful, even on free tiers. Video is the final creative modality, and it's the one where the gap between "that's amazing" and "that's not quite right" is most fascinating to explore.

AI video generation has made extraordinary progress. You can type a sentence describing a scene and, within a minute or two, watch a short video clip that didn't exist before, complete with camera movement, lighting, and convincing physics. A year ago, AI-generated video was mostly a novelty: people shared clips of melting faces and impossible physics as curiosities. Today, the best tools produce clips that could pass for footage shot on a real camera, at least for a few seconds.

But video is also where AI's current limitations are most visible. Hands still do strange things. Objects appear and disappear between frames. A person walking might suddenly have three legs, or the camera might do something physically impossible. The clips are short (typically five to ten seconds) and the longer you ask the AI to maintain a scene, the more likely things are to drift into the uncanny. This is a snapshot of where the technology sits right now: useful for some things, not yet reliable enough for others.

Understanding this matters beyond just video. Video generation is the most computationally demanding creative AI task, which is why it's the most expensive, the most limited on free tiers, and the most obviously imperfect. It's also the fastest-moving. The tools you'll use today are noticeably better than what was available six months ago, and what's available six months from now will likely make today's outputs look rough. Learning to evaluate AI video critically is a skill that transfers to every other AI modality you've explored so far.


How AI video generation works

[Image: an abstract illustration representing AI video generation, with visual elements suggesting motion, frames, and digital creation]
AI video generation can produce short clips from text descriptions or still images, but maintaining consistency across frames remains a core challenge.

The basic process will feel familiar from image generation. You write a text prompt describing a scene, and the AI generates a short video clip. Some tools also accept a still image as a starting point, animating it into motion. This image-to-video approach often produces more predictable results because the AI has a visual reference to work from rather than building everything from text alone.

Under the surface, video generation models are trained on enormous datasets of video footage. They've learned patterns: how water flows, how fabric moves in wind, how a camera pans across a landscape. When you give them a prompt, they draw on those learned patterns to generate a sequence of frames that, ideally, look like a coherent piece of video.

"Ideally" is doing a lot of work in that sentence, though. Video is far more complex than a still image. An image generator produces one consistent frame. A video generator produces dozens of frames per second, and each one has to be consistent with the ones before and after it. Every object needs to move in a physically plausible way. Lighting needs to stay consistent. If there's a person in the scene, their face, body, and clothing need to look the same in every frame. This temporal consistency (maintaining coherence over time) is the central challenge of video generation, and it's why video AI lags behind image AI in reliability.

This is also why the clips are short. Most free-tier generations produce five to ten seconds of video. That's enough to showcase a mood, demonstrate a concept, or create a social media clip, but it's a long way from generating a full scene or a narrative sequence. Some paid tiers allow extensions up to several minutes, but quality tends to degrade with longer generation, and the costs climb steeply.


The current tool landscape

Video generation is the most expensive AI creative category, which means free tiers are more limited than what you've encountered with image or music tools. That said, several platforms offer enough free access to give you a proper feel for the technology.


Ethics, copyright, and deepfakes

The ethical questions around AI video are the same ones you've encountered with images, voice, and music, but amplified.

Copyright and training data. Like image and music generators, video models are trained on existing video content. The legal status of this training is still being worked out, and the same tensions between AI companies and content creators apply here.

Deepfakes. Video generation makes the creation of convincing fake footage significantly easier. While the tools discussed here have safeguards against generating realistic depictions of identifiable individuals, the underlying capability exists and is a serious concern. Being able to recognise AI-generated video, and knowing that it exists, is an important part of media literacy in 2026.

Provenance and transparency. Many video generation platforms now embed metadata in their outputs to identify them as AI-created. If you share AI-generated video, being transparent about its origin is both good practice and, increasingly, expected.

Environmental cost. Video generation requires significantly more computing power than image generation, which translates to higher energy consumption. This isn't a reason to avoid using the tools, but it's worth being aware that generating dozens of video clips has a real environmental footprint; another reason to be thoughtful rather than indiscriminate about what you generate.


Resources to explore

Pika

Free tier with limited monthly credits. Image-to-video on free tier; text-to-video on paid tiers. Web-based, no download required. The most accessible starting point for this activity.

Kling AI

Free tier with 66 daily credits. Impressive realism and longer video capability. Web-based.

Runway

Limited one-time free credits (125). Industry standard for professional AI video. Web-based.

Luma Dream Machine

Free credits available. Strong physics simulation and high-fidelity output.

AI video generation comparison (PXZ AI)

A regularly updated comparison of the major platforms, useful for understanding the current state of the field.


Activity: your first AI videos

30–45 minutes · Pika (free tier) + optionally a second tool

You're going to generate AI video clips using at least two different approaches, evaluate the results critically, and reflect on where this technology is useful today and where it falls short. Think of this as the video version of the image generation exercise from Thing 9: you're learning by making, comparing, and evaluating.

  1. Set up your account. Create a free account at pika.art. If you'd like to try a second tool for comparison (recommended if you have the patience for two sign-ups), create an account at klingai.com as well.
  2. Start with image-to-video. Find or create a still image to use as your starting point. You could use one of the AI-generated images you created in Thing 9, a photograph you've taken yourself (of a landscape, a pet, a still life; avoid photos of other people), or any image you have the right to use. Upload it to Pika and add a short motion prompt describing how you'd like the scene to come alive.
  3. Try text-to-video (if available). If your chosen tool supports text-to-video on its free tier, try generating a clip entirely from a text description. Write two different prompts: one naturalistic, one creative or abstract. If text-to-video isn't available on your free tier, try two more image-to-video generations with different source images and more ambitious motion prompts.
  4. Evaluate what you've made. Watch each clip several times and assess the results: does the motion match your prompt, do objects and people stay consistent from frame to frame, is the physics plausible, and are there visible glitches or artefacts?
  5. Compare tools (optional but recommended). If you signed up for a second tool, run the same prompt or upload the same image and compare the results.
Privacy reminder: use personal images, personal examples, or fictional scenarios. Never use actual work materials, client content, or confidential images.

Your output

A document containing:

  • The prompts (and source images, if used) for each video you generated
  • The video clips themselves (downloaded or screen-recorded), or screenshots from the clips if downloading isn't available on your tier
  • Your evaluation of each clip, covering at least the points listed in the evaluation criteria above
  • A short reflection (a few paragraphs) on your overall impression: what surprised you, what disappointed you, where you can see AI video being useful in its current state, and what would need to improve before you'd use it for something that mattered

Why this matters

Video generation rounds out your tour of AI's creative capabilities. You've now experienced AI working across text, images, speech, audio, music, and video, and the pattern is consistent: AI produces impressive results quickly, but the quality varies and human judgement is essential for evaluating the output.

More importantly, video is where the pace of improvement is fastest. The limitations you notice today are likely to be significantly reduced within months. By forming a clear, honest assessment of where things stand now, you're creating a baseline you can measure future progress against. That habit of critical evaluation is one of the most useful things you can take from this programme.


Claim your Open Badge

Submit your prompts, generated video clips (or screenshots), evaluation notes, and written reflection as evidence for your Thing 13 badge via cred.scot.



What's next

You've now explored AI's creative toolkit across every major modality: text, images, speech, audio, music, and video. In Thing 14, we're changing gear entirely. Instead of exploring new tools on your computer, you're going to pick up your phone and discover the AI features already built into it. Whether you use an iPhone or Android, your phone is almost certainly doing more with AI than you realise, from writing tools and photo editing to real-time translation and visual search. Thing 14 is about finding the AI that's already in your pocket.