
FaceFusion & Vast.ai

Anh-Thi Dinh
Computer Vision | API & Services | Generative AI | Python | Tools
FaceFusion is an excellent tool for swapping faces, colorizing, or enhancing the quality of photos and videos. It's completely free and can be run on your own machine (requires sufficient computing power).
☝
The settings and notes in this guide are based on FaceFusion v3.4.1.

References

  • FaceFusion documentation:
    • The official GitHub repository.
    • The official documentation.
    • The error codes reference. Note that when errors occur, nothing appears in the terminal. To see the error code, run echo "Exit code: $?" immediately after.
  • Vast.ai documentation:
    • Volumes - Vast.ai Documentation
  • My custom versions: facefusion-3.4.1 and facefusion-3.3.2. These versions disable filters and add several new features.
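Because a failed run prints nothing, the exit-code check mentioned above is the only signal you get. A minimal sketch, using false as a stand-in for a FaceFusion command that fails silently:

```shell
# `false` stands in for a FaceFusion run that fails silently;
# replace it with your real `python facefusion.py ...` command.
code=0
false || code=$?
echo "Exit code: $code"   # 0 = success; any other value is an error code
```

Then look the value up in the error codes reference.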

Understanding Key Parameters

  • Face detection parameters
  • Face swapper parameters
  • Face enhancer parameters
  • Face masker parameters

Tips for Using FaceFusion

  • Avoid using reference photos with glasses when swapping faces.
  • Use 4-6 reference photos with different angles and expressions for best results.
  • Instead of splitting videos for testing, use the "Trim Frame" option (processes only the selected segment) and "Preview Frame" (displays the generated result at each frame).
  • Models are stored in .assets/models/ (inside the FaceFusion folder).
  • Temporary files (uploaded references and targets) are stored in /tmp/facefusion/ on Linux systems.
  • An output fps of 35 is sufficient; higher values aren't necessary.
  • To preserve emotional expressions, use the expression_restorer processor.
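The trim idea also works outside the UI; a sketch of the corresponding facefusion.ini entries, assuming the v3.4.1 key names trim_frame_start / trim_frame_end (the frame numbers are placeholder values):

```ini
; process only this frame range of the target video (placeholder values)
trim_frame_start = 500
trim_frame_end = 800
```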

My Favorite Configurations

Modify facefusion.ini. Only the settings that need values in that file are listed below.
Note: The best configuration may vary depending on your machine's power. These settings are optimized for systems with powerful GPUs.

Run FaceFusion CLI

With this custom facefusion.ini (some parameters prefilled), we can run FaceFusion from the command line, without its Gradio UI, using the headless-run command below.

Using FaceFusion with Vast.ai

Based on excellent work by Singulainthony, this section adapts his approach with a few of my own notes.
I've modified the code with my own improvements; the GitHub repository for version 3.4.1 is available here. In this custom version, I completely disabled the filters and added --share-gradio (useful on remote servers like Vast.ai to access the app via a public URL, but unnecessary locally).
🚨
Remember to stop or remove unused instances to avoid significant charges!

Setting up

  1. Use this template (for version 3.3.2, no customization) or my template for version 3.4.1 (with customization).
  2. Open the instance application, create a new terminal, and run the following commands (you can review the gist file to see what happens behind the scenes):
  3. After the installation completes, close the current terminal, create a new one, and run:
  4. You should see output similar to:
⚠️ The free Gradio URL remains active for only 7 days. Read this official documentation to learn more about this public URL and how to create your own custom URL.

Connect to an instance via SSH

When working with a remote machine like Vast.ai, there are several drawbacks:
  • The connection can be extremely slow through the Gradio tunnel.
  • We can't monitor the upload process, and all uploaded files are only stored temporarily.
For these reasons, it's more efficient to upload files directly from your local machine to the Vast.ai instance and then run the FaceFusion CLI.
First, we need to connect to the instance via SSH.
  1. Check your local machine's public key:
  2. On the Vast dashboard, go to Keys and paste your public key. This key will be added to all new instances you create, but it won't work with your current instance.
  3. If the previous step doesn't work, manually copy your public key to the current instance's authorized_keys file:
  4. SSH to the instance with ssh -p <PORT_INSTANCE> root@<INSTANCE_IP>. Check the instance's IP and port on the "Instances" page by clicking the key icon in the bottom bar of the instance.
  5. To upload a file from your local machine to the instance:

Tips for Working with Vast.ai

  • Performance
    • With an RTX 5090, two processors enabled, and the configuration described in the previous section, processing speed reaches about 5 frames/s for HD videos, or up to 20 frames/s for medium-resolution videos.
    • With an RTX 4080 using the same settings, processing speed is about 2.5 frames/s.
    • With a local Mac M4 chip, processing speed is about 1.5 frames/s.
  • For heavy files, avoid using the download button in the FaceFusion UI, as it downloads via the Gradio public URL, which is very slow. Instead, open Jupyter, navigate to the folder containing the output, and download from there; it's much faster!
  • To empty the trash when you've removed files:
  • FaceFusion's cache in Linux is located at /tmp/facefusion/. Each project has its own folder; use rm -rf folder_name to remove specific folders.
  • Generated videos are typically very large (GBs). Use the following command to reduce file size while maintaining quality:

Useful video editing tips

  • On Mac, use QuickTime Player (QP for short).
  • Quickly split a clip at the current frame (cmd+Y) as many times as needed, then press Delete to remove any unsuccessful segments.
  • You can also use QP's built-in trim feature (cmd+T).
  • To merge different video files, use QP → Edit → Add clip to the end. QP produces smaller files than ffmpeg or iMovie for this task.
  • Use the Left/Right arrow keys to move forward or backward frame by frame.
  • Avoid changing the playback speed in QuickTime Player while editing videos, as this will result in poor frame quality.

Useful Commands for Working with Videos

☝
All commands in this section require ffmpeg to be installed. If you encounter a "no matches found" error (zsh), first run setopt NULL_GLOB.
  • Split a video into segments of 200 seconds each
  • Often, FaceFusion cannot process videos that have been modified by certain applications (like QuickTime Player). You can use the command below to "fix" these videos. This process also significantly reduces file size; for example, a 1GB .mov file can be compressed to a 50MB .mp4 at the same resolution.
    • If you want an alias for quick use, add the function below to your .bashrc or .zshrc:
  • Convert a .m3u8 (streaming video) to .mp4
  • Resize all images in the current folder to max 768px (replacing the originals)
  • Compress all images in the current folder
  • Compress and resize a video to 480p
    • Compress and resize all videos in the current folder to 480p
 
The facefusion.ini settings for "My Favorite Configurations":

```ini
output_path = /your/path/

face_detector_model = yolo_face
face_detector_size = 640x640
face_detector_angles = 0 90 180 270

face_selector_mode = one
face_selector_order = large-small

face_mask_types = box occlusion

temp_frame_format = jpeg

output_video_fps = 35

processors = face_swapper face_enhancer expression_restorer

face_enhancer_model = gfpgan_1.4

face_swapper_model = inswapper_128_fp16
face_swapper_pixel_boost = 768x768

download_providers = github huggingface

execution_providers = cuda
```
The headless-run command for "Run FaceFusion CLI":

```shell
python facefusion.py headless-run \
  --source-paths /path/to/your/source1.jpg /path/to/your/source2.jpg \
  --target-path /path/to/your/target_video.mp4 \
  --output-path /workspace/ddd/ \
  --face-detector-model retinaface \
  --face-detector-size 640x640 \
  --face-detector-angles 0 90 180 270 \
  --face-selector-mode one \
  --face-selector-order large-small \
  --face-selector-age-start 40 \
  --face-selector-age-end 70 \
  --face-selector-gender male \
  --face-selector-race asian \
  --face-mask-types box occlusion \
  --face-mask-blur 0.3 \
  --temp-frame-format jpeg \
  --processors face_swapper face_enhancer expression_restorer \
  --face-enhancer-model gfpgan_1.4 \
  --face-swapper-model inswapper_128_fp16 \
  --face-swapper-pixel-boost 768x768 \
  --execution-providers cuda \
  --download-providers github huggingface
```
The installation script for "Setting up":

```shell
curl -s https://gist.githubusercontent.com/dinhanhthi/160a4a3e9c6f54867e3fb6385de0d8b6/raw/fdf94d2251879b3371963ce02c296b602cc10dff/setup_vast.sh | bash
```

Then start FaceFusion:

```shell
cd /workspace/facefusion-3.4.1/ && conda activate facefusion && python facefusion.py run --share-gradio
```

Expected output:

```text
* Running on local URL:  http://127.0.0.1:7860
* Running on public URL: https://123c5c24e61e678bb1.gradio.live
```
The SSH commands for "Connect to an instance via SSH"; first, check or create your local key:

```shell
# Check if you have an SSH key
ls ~/.ssh/id_*.pub

# If not
ssh-keygen -t ed25519

# Get your public key
cat ~/.ssh/id_ed25519.pub
# Copy this
```

Manually add the key to the current instance:

```shell
# In the current instance, open a terminal
mkdir -p ~/.ssh
echo "YOUR_PUBLIC_KEY_HERE" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
chmod 700 ~/.ssh
```

Upload a file from your local machine to the instance:

```shell
scp -P <PORT_INSTANCE> /path/to/your_video_file.mp4 root@<INSTANCE_IP>:/root/
```
Empty the trash on the instance:

```shell
rm -rf ~/.local/share/Trash/files/*
rm -rf ~/.local/share/Trash/info/*
```

Re-encode a generated video to reduce its size:

```shell
ffmpeg -i "file_name.mp4" -c:v libx264 -c:a aac "file_name_fixed.mp4"
```
Split a video into segments:

```shell
ffmpeg -i input.mp4 -c copy -map 0 -segment_time 200 -f segment output_%03d.mp4
```

"Fix" a video so FaceFusion can process it:

```shell
ffmpeg -i "input.mov" -c:v libx264 -c:a aac "fixed_input.mp4"
```

The alias function for .bashrc or .zshrc:

```shell
fix_video() {
    if [[ -z "$1" ]]; then
        echo "Usage: fix_video <filename.ext>"
        return 1
    fi

    local input="$1"
    local basename="${input%.*}"

    ffmpeg -i "$input" -c:v libx264 -c:a aac "fixed_${basename}.mp4"
}
```

Then use fix_video file_name.ext.
Convert .m3u8 to .mp4:

```shell
m3u8_to_mp4() {
  ffmpeg -i "$1" -c copy -bsf:a aac_adtstoasc "${1%.m3u8}.mp4"
}
```

Use as m3u8_to_mp4 file_name.m3u8.

Resize all images in the current folder to max 768px:

```shell
mogrify -resize '768x768>' *.{jpg,jpeg,png}
```

Compress all images in the current folder:

```shell
for file in *.jpg *.jpeg *.png; do
  ffmpeg -i "$file" -q:v 10 -y "${file%.*}_temp.${file##*.}" && mv "${file%.*}_temp.${file##*.}" "$file"
done
```

Compress and resize a video to 480p:

```shell
ffmpeg -i file_name.mp4 -vf scale=-2:480 -c:v libx264 -preset medium -crf 23 -c:a aac -b:a 128k -movflags +faststart file_name_compressed.mp4
```

Batch version for all videos in the current folder:

```shell
for file in *.{mp4,mov,avi,mkv,m4v}; do
  [ -f "$file" ] || continue
  ffmpeg -i "$file" -vf "scale=-2:480" -c:v libx264 -crf 28 -preset slow -c:a aac -b:a 96k "480p_${file%.*}.mp4"
done
```