Bad AAAAAApple!! - Unmodded Pseudo Video Renderer

Started by KC, February 22, 2026, 11:25:53 PM


KC

I put together a pseudo video player using textboxes and VED's internal scripting, which allows video to be played in an unmodded copy of VVVVVV. Naturally, I used it to render the titular Bad Apple!! in VVVVVV. Here's a demo of it:


You can download and play the level yourself and verify that it does, in fact, render in-game; link at the end. But for anyone curious how this works, allow me to go into excruciating detail :)



General Idea:

A frame of video is rendered using textboxes in the following way:

text(transparent,0,0,1)
-your frame here-
backgroundtext
textboxtimer(1)
speak
delay(1)

First we create a textbox, any color is fine but transparent is especially useful for reasons we'll get into later. We use backgroundtext so that player input isn't required to remove it and advance through the script.

We then use textboxtimer(1) to specify that this textbox should be destroyed after 1 frame. This is incredibly important as there is a limit to how many textboxes can exist in memory at the same time (19 iirc), so we need to constantly destroy them to make room for new frames.

Finally, we call speak to queue the textbox for display and delay(1) for the queued textboxes to be displayed, after which they're promptly deleted. Then we repeat for the next frame, and so on.
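
If you're generating these scripts with a program, the six-line pattern above can be produced by a small helper. This is only a sketch in Python, not the actual generator used for the level; `textbox_lines` and `frame_script` are made-up names, and it assumes the last argument of text() is the number of text lines that follow.

```python
def textbox_lines(rows, x=0, y=0, color="transparent"):
    """Return the script lines that queue one textbox holding `rows`."""
    return [
        f"text({color},{x},{y},{len(rows)})",  # last argument: number of text lines that follow
        *rows,
        "backgroundtext",   # no player input needed to dismiss the textbox
        "textboxtimer(1)",  # destroy the textbox after 1 frame
        "speak",            # queue the textbox for display
    ]


def frame_script(rows, x=0, y=0):
    """One video frame: a single textbox, then a one-frame delay."""
    return textbox_lines(rows, x, y) + ["delay(1)"]


print("\n".join(frame_script(["█▀█", "▀▀▀"])))
```

Concatenating the output of `frame_script` for each frame in order gives the body of a video script.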

The frame image itself is essentially ASCII art. The resolution is determined by the number of characters in the textbox.

Window Resolution:

There is no real limit to the number of characters per line, though at values beyond 35 the textbox begins to extend leftwards offscreen and you lose the ability to move the textbox further right to center it, so the practical limit is 35.

The maximum number of lines for a single textbox is 11; any larger value n is truncated to n mod 11. You can align multiple textboxes atop each other to extend the image vertically, up to a total height of 25 lines, at which point you cannot move the textboxes any lower.

The maximum centered resolution is thus 35x25 characters using multiple textboxes. This is one reason why transparent textboxes are so useful; the lack of borders allows you to perfectly align the characters to seamlessly extend the image vertically.

To display multiple textboxes on the same frame, simply wait on calling delay(1) until all the textboxes for the frame are queued for display, like so:

text(transparent,0,0,1)
-first image section-
backgroundtext
textboxtimer(1)
speak
text(transparent,0,8,1)
-second image section-
backgroundtext
textboxtimer(1)
speak
delay(1)
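
A rough Python sketch of how a tall frame could be split into stacked textboxes automatically. `split_frame` is a hypothetical helper, and the 8-pixels-per-row spacing is an assumption inferred from the example above, where the second one-line textbox sits at y=8.

```python
MAX_LINES = 11  # per-textbox line limit described above
TILE_PX = 8     # assumed: y is in pixels and each text row is 8 px tall


def split_frame(rows, x=0):
    """Split a tall frame into stacked textboxes and emit one frame of script."""
    script = []
    for start in range(0, len(rows), MAX_LINES):
        chunk = rows[start:start + MAX_LINES]
        script.append(f"text(transparent,{x},{start * TILE_PX},{len(chunk)})")
        script.extend(chunk)
        script += ["backgroundtext", "textboxtimer(1)", "speak"]
    script.append("delay(1)")  # a single delay once every textbox is queued
    return script


# A full-height frame of 25 rows becomes textboxes of 11, 11 and 3 lines.
frame = ["█" * 35 for _ in range(25)]
print([line for line in split_frame(frame) if line.startswith("text(")])
```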

Framerate:

A script frame is roughly 1/30th of a second, and we can display an image every frame, so in theory the video should have a maximum framerate of 30fps. In practice, it takes slightly longer than 1/30th of a second to render each image (I'm not certain of the exact reason), so if you're trying to synchronize the video to an audio source you'll need to occasionally skip frames (images) to keep up.

For my purposes, I found that I had to skip every 46th frame except every 460th frame to synchronize it to 30fps. This worked for a 3:39 length video, but I can't guarantee it would work for longer.
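
The skip rule can be written as a simple filter. This is a sketch under two assumptions: frames are counted from 1, and "except every 460th" means frames 460, 920, and so on are kept. `is_skipped` and `kept_frames` are illustrative names.

```python
def is_skipped(n):
    """True if frame n (counting from 1) is dropped under the pattern above."""
    return n % 46 == 0 and n % 460 != 0


def kept_frames(total):
    """Frame numbers from 1..total that are actually rendered."""
    return [n for n in range(1, total + 1) if not is_skipped(n)]


# In every block of 460 source frames, 9 are skipped and 451 are shown.
print(len(kept_frames(460)))  # 451
```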

Character Encoding:

I'm not certain what character encoding VVVVVV textboxes use. They can render UTF-8 characters but also additional characters beyond that, so you got me. Most characters aren't that useful for video, though, as the majority have their rightmost column and bottommost row transparent, which makes an undesirable grid pattern when they're lined up. The most useful characters are the following block characters:

' ','▖','▗','▘','▝','▚','▞','▀','▄','▐','▌','▛','▜','▟','▙','█'

Using these characters, you can break each character tile into 4 sub-tiles (a 2x2 grid) and draw each sub-tile individually. This doubles your resolution in each dimension, from 35x25px to 70x50px.
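
As a sketch of the sub-tile idea in Python: each 2x2 group of on/off sub-tiles is packed into a 4-bit index that picks one of the 16 block characters. The bit order, the `QUADRANTS` table layout, and the convention that 1 means a solid sub-tile are choices made for this example, not anything VVVVVV requires.

```python
# Index bits: 8 = upper-left, 4 = upper-right, 2 = lower-left, 1 = lower-right.
QUADRANTS = " ▗▖▄▝▐▞▟▘▚▌▙▀▜▛█"


def to_block_rows(pixels):
    """Pack a grid of 0/1 sub-tiles (even width and height) into block characters."""
    rows = []
    for y in range(0, len(pixels), 2):
        top, bottom = pixels[y], pixels[y + 1]
        row = ""
        for x in range(0, len(top), 2):
            index = (top[x] << 3) | (top[x + 1] << 2) | (bottom[x] << 1) | bottom[x + 1]
            row += QUADRANTS[index]
        rows.append(row)
    return rows


print(to_block_rows([[1, 0, 1, 1],
                     [0, 1, 1, 1]]))  # ['▚█']
```

A 70x50 grid of sub-tiles fed through `to_block_rows` yields 25 rows of 35 characters.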

A character tile in VVVVVV is an 8 pixel square for 64 pixels in total, so the maximum theoretical resolution would be 280x200px. That would require 2^64 (~18.4 quintillion) unique characters to render each permutation of pixels, so not realistic.
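
The numbers above are easy to check. In this small Python sketch, `resolution` and `characters_needed` are just illustrative names; it also shows why 16 block characters are exactly enough for 2x2 sub-tiles.

```python
COLS, ROWS = 35, 25  # textbox grid, in characters
TILE = 8             # a character tile is 8x8 pixels


def resolution(sub):
    """Pixel grid you get when each tile is split into sub x sub sub-tiles."""
    return COLS * sub, ROWS * sub


def characters_needed(sub):
    """Distinct glyphs needed to cover every on/off pattern of sub x sub sub-tiles."""
    return 2 ** (sub * sub)


print(resolution(2), characters_needed(2))        # (70, 50) 16
print(resolution(TILE), characters_needed(TILE))  # (280, 200) 18446744073709551616
```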

Another useful group of characters are the gradients (they look better in-game):

'░','▒','▓'

Unfortunately, the gradients are stuck at the 35x25px resolution, but we can take advantage of transparency again to kinda break them into sub-tiles. The trick is to create your frame with only gradients first, then create it with only block characters for the higher resolution, then draw both. The gradients will only be visible under transparent sub-tiles, allowing you to partially cover them with solid sub-tiles. This is very useful in making gradients flow naturally into block characters. You can see a visual example of it at around 1:38-2:06 in the demo video above, and a code example below:

text(transparent,0,0,1)
▒  // gradient layer, drawn first (any of ░ ▒ ▓)
backgroundtext
textboxtimer(1)
speak
text(transparent,0,0,1)
▄  // solid sub-tiles cover the bottom half of the gradient
backgroundtext
textboxtimer(1)
speak
delay(1)

Script Creation and Other Details:

To actually render a video with this method, you'll need to parse the frames of your video and generate a text representation of the frame. I wrote up a (frankly spaghetti) script for this in Java which I might improve and extend a bit to make it a proper tool if people have interest in it. A VED script longer than 10k lines starts to get quite laggy, so I create a new script file whenever I reach that length and jump to it with a customiftrinkets(0, *).
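
Here is one way the 10k-line splitting could look in Python. It's a sketch with assumptions: `split_into_scripts` is a hypothetical helper, scripts are named "0", "1", "2", and so on, and the last one hands off to a script called `end`.

```python
MAX_SCRIPT_LINES = 10_000  # scripts longer than this start to lag


def split_into_scripts(frames, end_script="end"):
    """Group per-frame line lists into scripts named "0", "1", ... (assumed naming).

    Each script ends by jumping to the next one; the last jumps to `end_script`.
    """
    groups, current = [], []
    for frame in frames:
        # The +1 leaves room for the trailing jump line.
        if current and len(current) + len(frame) + 1 > MAX_SCRIPT_LINES:
            groups.append(current)
            current = []
        current.extend(frame)
    if current:
        groups.append(current)

    scripts = {}
    for i, lines in enumerate(groups):
        target = str(i + 1) if i + 1 < len(groups) else end_script
        # 0 trinkets is always satisfied, so this acts as an unconditional jump.
        scripts[str(i)] = lines + [f"customiftrinkets(0,{target})"]
    return scripts
```

Frames are never split across two scripts here, so every script stays at or under the limit as long as a single frame fits.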

To make it easier to modify the video contents, I use internal scripts named begin and end to handle any setup and breakdown of the video player, then load into begin via a regular load script entry.

The setup in begin is:

hideplayer()
gotoroom(2,0)
play(1)
customiftrinkets(0,0)

This makes the player invisible and moves them to a solid black room (the space the player occupies is painted black in manual mode with a non-solid black tile) so only the video is visible, then starts the audio for the video and jumps to the first video script. Control is taken away from the player automatically by the textbox display.

The breakdown in end is:

stopmusic()
gotoposition(168,168,0)
gotoroom(0,0)
showplayer()
hascontrol()

This ends the audio, returns the player to a desired room and location, makes the player visible again, and returns control to the player.

Also, you might think about using speakactive instead of speak to get rid of the textboxtimer(1) lines while still deleting the textboxes. This was my first approach, and it seems to work until you hit the textbox limit, as speakactive doesn't actually delete the textboxes until you've delayed for roughly 5 frames, resulting in a choppy, inconsistent framerate.



Download Info:

Note: This level was made using functionality available in VVVVVV v2.4.3, and it likely won't work on versions earlier than v2.4. Please make sure you're on the correct version for this level to run properly.

The .zip file contains the level and an asset folder with a custom music track. Simply unzip the contents into your ...\VVVVVV\levels directory and you should be good to go.

The .zip is just barely too large to upload directly, so you can instead find it at this Google Drive:
https://drive.google.com/drive/u/2/folders/1yhXNEzlJHBDstjPeY0nFHr80gtcYhTcN

Ally 🌠

This is really cool!

I actually do a similar thing in my recent level, Vermilion Blows Up A Sun:


Of course, this is 2.5, so you getting a similar thing working in 2.4 is fun!

If you want some ideas for improvements:

  • You don't actually need textboxtimer, as you can just endtextfast before showing the next frame. As both of these are only one line, it doesn't matter which you use. Not using speak_active was the right call, either way.
  • VVVVVV's fonts can be up to 256x256 (maybe 255?) pixels, so you can display a full 320x240 screen with only two characters per textbox. This is what I did in my 2.5 version.
  • In 2.5 (currently unreleased), you can use position(absolute) to make the textbox ignore the screen padding. In 2.4, you could also abuse setroomname instead of using textboxes if you don't care about seeing "through" the video to the tiles below.
  • VVVVVV runs at ~29.412 FPS, so it'll drift against a 30 FPS video. This is because the frame timing uses 34 ms per frame (1000/34 ≈ 29.412)... Also, due to the nature of delay-based main loops, it will drift differently per system... It's a little unfortunate. (Also, the 30+ FPS option does timing differently as well, so that'll also drift...)
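
To put rough numbers on the drift point, here is a quick Python calculation. It assumes the 1/34 figure means a fixed 34 ms per game frame, and it compares that against the skip pattern from the first post (9 of every 460 frames dropped). Treat it as a back-of-the-envelope check, since real timing varies per system.

```python
FRAME_MS = 34               # assumed fixed timestep per game frame, in milliseconds
game_fps = 1000 / FRAME_MS  # about 29.412
video_fps = 30

# Fraction of source frames that can be shown in real time.
keep_ratio = game_fps / video_fps

# The skip pattern from the first post keeps 451 of every 460 frames.
pattern_ratio = 451 / 460

# Leftover drift, in frames, over a 3:39 (219 second) video.
drift_frames = 219 * video_fps * (pattern_ratio - keep_ratio)

print(round(game_fps, 3), round(keep_ratio, 4), round(pattern_ratio, 4))
print(round(drift_frames, 2))
```

Under these assumptions the two ratios agree to about four decimal places, which would explain why that skip pattern held up over the full video.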

Back in the 2.3 days, I also did this:

which uses gravity lines instead to render Bad Apple!!. It was quite suboptimal, though...

Either way! It's really fun to see something like this, and for a first post, you seem to know your way around the game pretty well! It's been rather dormant around here lately, so may I suggest our Discord server? https://vsix.dev/discord