This is an early-access version with a model size of only 3B.
Instructions: Upload an image of a person, and that person will replace the person in the video. Text input: Describe the intended result. For example, if the image shows LeBron James and the video shows a woman dancing, write "Basketball star James is dancing." Avoid complex videos; single-person videos work best because masking is applied.