AI-powered storyboard

Before OpenAI released their groundbreaking “Sora”, there were already many people building or using AI tools to facilitate their production process. People in this industry seem more inclined to embrace the new technology than to reject it, which contradicts the overall pessimistic sentiment I perceive in our class. Since I could not draw a clear line between AI lowering the barriers to entry for filmmaking and people depending on AI to outsource labor and degrade originality, I decided to look at how influencers and popular channels approach it.

The video below, “Don’t know how to draw? We created our own AI-powered storyboard”, is from Mediastorm, the most influential media channel on Bilibili (China’s equivalent of YouTube). In it, they explain the pain points of traditional storyboard artists, how their team began exploring AI-generated storyboards back in 2022, and how they have already incorporated a beta version of their latest storyboard workflow into software they developed themselves. Tim (the speaker in the video) explains how they tried different machine-learning models and setups to make the storyboards more accurate and coherent. Specifically, they used a) generative adversarial networks (GANs) for image input and correction, b) MMPose to generate 3D models for blocking and shot design, and c) Stable Diffusion, a text-to-image deep learning model released in 2022 and based on diffusion techniques.

Video by Mediastorm, China’s largest self-media channel focused on film and media production; you can turn on auto-translated subtitles to watch it.
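
To make the pose-extraction step concrete, here is a minimal sketch (my own illustration, not Mediastorm’s code) of pulling a character’s pose out of a recorded frame with MMPose’s high-level inferencer. The model alias and image path are assumptions, and the exact shape of the output varies a bit between MMPose releases.

```python
# Hedged sketch: human pose extraction with MMPose's high-level API (MMPose 1.x).
from mmpose.apis import MMPoseInferencer

# 'human' is a documented alias for a ready-made human pose model;
# the frame path is a placeholder for one frame of an actor's pre-recorded movement.
inferencer = MMPoseInferencer('human')

# The inferencer returns a generator, so whole videos can be processed frame by frame.
result = next(inferencer('actor_frame_0001.jpg'))

# 'predictions' holds the detected people and their keypoints for this frame;
# these keypoints are what later drive the 3D human model.
print(result['predictions'])
```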

The purpose of the 3D modeling step is to control the camera position precisely, especially the viewing angle and the distance to the subject. Many shooting angles are difficult to achieve in reality or with a plain text-to-image workflow. By extracting the action data of the front view into a 3D model, the user can see a virtual model in the software; after adjusting that model, the composition and perspective of the final AI-generated image give the director of photography a much more accurate reference.
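
To illustrate why that kind of control matters, the virtual camera in a 3D scene can be driven directly by the two parameters mentioned above, viewing angle and subject distance. The sketch below is my own toy example (the function name and values are made up), placing a camera on a sphere around the subject given an azimuth, an elevation, and a distance:

```python
# Toy example: parameterize a virtual camera by angle and distance around a subject.
import math

def camera_position(subject_xyz, distance, azimuth_deg, elevation_deg):
    """Place the camera on a sphere centered on the subject.

    azimuth_deg:   horizontal angle around the subject (0 = straight in front)
    elevation_deg: vertical angle (negative = low angle looking up)
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = subject_xyz[0] + distance * math.cos(el) * math.sin(az)
    y = subject_xyz[1] + distance * math.sin(el)
    z = subject_xyz[2] + distance * math.cos(el) * math.cos(az)
    return (x, y, z)

# A low-angle shot from 2 m away, 30 degrees off to the side of a 1.5 m-tall subject.
print(camera_position((0.0, 1.5, 0.0), distance=2.0, azimuth_deg=30, elevation_deg=-15))
```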

The GAN, on the other hand, is used in this project to correct the input image so that the characters’ skeletons can be extracted; this happens during the 3D-model generation stage. Finally, the background and characters in the finished frame are generated by Stable Diffusion.
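
The combination described here, a pose or skeleton steering the composition while a text prompt sets the style, closely resembles pose-conditioned Stable Diffusion generation. Below is a minimal sketch using the open-source diffusers library with an OpenPose ControlNet; the model IDs, file names, and prompt are illustrative assumptions, not the team’s actual setup.

```python
# Hedged sketch: pose-conditioned Stable Diffusion with diffusers + ControlNet.
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

# A rendered pose/skeleton image, e.g. exported from the 3D blocking step (placeholder path).
pose_image = load_image("rendered_pose_front_view.png")

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The pose image fixes the composition; the prompt supplies style and scene content.
frame = pipe(
    prompt="storyboard sketch, rough pencil lineart, two characters arguing in a kitchen",
    image=pose_image,
    num_inference_steps=30,
).images[0]
frame.save("storyboard_frame_01.png")
```

Reusing the same prompt and seed while swapping only the pose image is one simple way to keep a character’s look consistent from shot to shot.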

Basically, the pipeline looks like this:

  • Pre-recorded character movement → 3D human model
  • Upload the 3D human model → software
  • Make adjustments within the software
  • Write a prompt/description of the image’s style (a toy sketch of this step follows the list)
  • Generate the storyboard
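
As a toy illustration of the prompt step (my own sketch, not the software’s interface), keeping one fixed style description and varying only the shot description is a simple way to keep consecutive frames visually coherent:

```python
# Toy example: a fixed style string shared by every storyboard frame.
STYLE = "storyboard sketch, rough pencil lineart, monochrome, cinematic framing"

def build_prompt(shot_description: str, style: str = STYLE) -> str:
    """Combine a per-shot description with the shared style description."""
    return f"{shot_description}, {style}"

shots = [
    "wide shot, hero walks into an empty warehouse",
    "low-angle close-up, hero looks up at the skylight",
]
for shot in shots:
    print(build_prompt(shot))
```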

Mediastorm did in fact use their self-developed AI-powered storyboard tool to make a promo video for their merchandise, and I have to admit it looks decent. At the end of the video, they also mention that they are looking into editing, where they envision combining AI with cloud-based footage to streamline post-production.

