Build a file conversion API using FFmpeg, Docker, and Node.js

Alexander Leon
5 min read · Nov 10, 2018

Hey guys, Alex here. Thought I’d share how I went about building a file conversion API. In this example we’ll convert a webm file to mp3. With this approach, you should be able to convert any video or audio file you want provided it’s listed here: https://ffmpeg.org/ffmpeg-formats.html

Let’s look at the tools we’ll be using.

FFmpeg

If you’ve researched encoding/decoding tools, chances are you’ve come across FFmpeg. After many many years of bug fixes, this framework has become a sort of gold standard in the field. You can convert all sorts of video and audio files, and it’s what we’re going to use today.
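To get a feel for what FFmpeg does before we wrap it in an API: if you have it installed locally, the conversion we're about to automate is a one-liner (input.webm and output.mp3 are just placeholder file names):

ffmpeg -i input.webm output.mp3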

Docker

I like to think of Docker as a slimmed-down virtual machine. The proper term for what it builds is a container. We can install FFmpeg inside the container, and anything running in that container will be able to access FFmpeg’s functionality. In this example, that will be a simple web server making FFmpeg requests.

Node.js

We need Node to create our web server and coordinate our different libraries. If you’d rather use another language or runtime, have at it, though the most up-to-date FFmpeg wrapper I could find was made for Node.

With that said, let’s hop to it.

Let’s go to our terminal, create a fresh directory, and set up a node project:

mkdir file-conversion-api && cd file-conversion-api && npm init

Let’s also create the files we’ll need while we’re at it:

touch Dockerfile server.js setup-ffmpeg.sh

Which file shall we start with?

How about the bash file? Inside setup-ffmpeg.sh add the following:

#!/usr/bin/env bash
# Add the deb-multimedia repository, which carries FFmpeg and its codec libraries
echo 'deb http://www.deb-multimedia.org jessie main non-free' >> /etc/apt/sources.list
echo 'deb-src http://www.deb-multimedia.org jessie main non-free' >> /etc/apt/sources.list
apt-get update
# Trust the repository's keyring, remove any stock ffmpeg, then install FFmpeg plus the encoders we need
apt-get install -y --force-yes deb-multimedia-keyring
apt-get remove -y --force-yes ffmpeg
apt-get install -y --force-yes build-essential libmp3lame-dev libvorbis-dev libtheora-dev libspeex-dev yasm pkg-config libfaac-dev libopenjpeg-dev libx264-dev libav-tools ffmpeg

Here we install FFmpeg as well as some libraries we’ll need, like the LAME MP3 encoder.

We’re going to need to execute this file in just a moment, so let’s not forget to make it executable:

chmod +x setup-ffmpeg.sh

Alrighty. That wasn’t too bad, right? Let’s set up our Dockerfile next. In Dockerfile add the following:

FROM node:carbon
VOLUME ["/root"]
ADD setup-ffmpeg.sh /root
RUN /root/setup-ffmpeg.sh
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies
# A wildcard is used to ensure both package.json AND package-lock.json are copied
# where available (npm@5+)
COPY package*.json ./
RUN npm install
# If you are building your code for production
# RUN npm install --only=production
# Bundle app source
COPY . .
EXPOSE 8080
CMD [ "npm", "start" ]

This file we don’t need to make executable, but it works quite similarly to our bash file: we’re simply declaring a sequence of tasks to execute. Rather than executing it ourselves, though, Docker will execute it for us, and in doing so will generate a container with FFmpeg installed and our (soon-to-be-made) server running.

Saving the best for last

Well, sort of. The server configuration is also the least obvious step from an infrastructure perspective. Letting users submit a file from their machine and spitting out the converted file requires one approach. Batch processing a set of files requires another. Processing a file first and then submitting it for conversion requires yet another. With that in mind, I’m going to show you a bare-bones API configuration, and then I’ll share how I’ve changed it a bit to be usable in a production setting.

Roll up your sleeves

You know what, I’ll just give you the code in one go. Add the following to server.js:

'use strict';

var express = require('express');
var cors = require('cors');
var ffmpeg = require('fluent-ffmpeg');

var PORT = 8080;
var HOST = '0.0.0.0';

const app = express();
app.use(cors());

app.get('/', (req, res) => {
  res.contentType('audio/mp3');
  res.attachment('myfile.mp3');
  var pathToAudio = 'https://dl.dropbox.com/s/pc7qp4wrf46t9op/test-clip.webm?dl=0';
  ffmpeg(pathToAudio)
    .toFormat('mp3')
    .on('end', function(err) {
      console.log('done!');
    })
    .on('error', function(err) {
      console.log('an error happened: ' + err.message);
    })
    .pipe(res, {end: true});
});

app.listen(PORT, HOST);
console.log(`Running on http://${HOST}:${PORT}`);

Install the packages:

npm i --save express cors fluent-ffmpeg

Welp, let’s take a look at what’s going on.

  1. We’ve declared our node packages and port/host up top.
  2. We instantiated our express server.
  3. We added the basic CORS setup. By the way, if you deploy this as-is, anyone will be able to access it. If that’s not what you want, make sure to whitelist the origins you want to allow (see the sketch after this list). More here: https://github.com/expressjs/cors
  4. We set up a route at ‘/’ that reads and processes a file.
  5. We turned on our server.
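
Here’s a minimal sketch of that whitelisting, assuming your front end lives at https://example.com (a placeholder, not a URL from this project):

var corsOptions = {
  origin: ['https://example.com'] // only requests from these origins are allowed
};
app.use(cors(corsOptions));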

As is, when you hit the server at ‘/’, the server will encode the file listed at pathToAudio to mp3 and pipe out the result. We set the content type to audio/mp3 and the content disposition to attachment so the file immediately downloads to the user’s machine. Maybe that’s useful to you, maybe not, but it’s what we have for now.
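
If you instead want users to upload a file from their own machine (the first scenario I mentioned earlier), one common approach, though not part of the setup above, is to accept the upload with a middleware like multer and hand the temporary file path to fluent-ffmpeg. A rough sketch, where the /convert route and the file field name are my own assumptions:

var multer = require('multer');
var upload = multer({ dest: 'uploads/' }); // uploaded files land in ./uploads

app.post('/convert', upload.single('file'), (req, res) => {
  res.contentType('audio/mp3');
  res.attachment('converted.mp3');
  ffmpeg(req.file.path) // temp path of the uploaded file
    .toFormat('mp3')
    .on('error', function(err) {
      console.log('an error happened: ' + err.message);
    })
    .pipe(res, {end: true});
});

You’d also need to npm i --save multer for that to work.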

Let’s try it out. In package.json I added the following to make my life a little easier:

"scripts": {
"build": "docker build -t alien35/node-web-app .",
"startd": "docker run -p 49160:8080 -d alien35/node-web-app",
"open": "open http://localhost:49160",
"go": "npm run build && npm run startd && docker ps"
},

The build script requests that docker build our Dockerfile and name it alien35/node-web-app.

startd here stands for start docker. Not sure what else to say about it. Well, take note of the 49160:8080 bit. Inside the container, the Express server listens on port 8080; Docker maps that to port 49160 on our machine. If you try to open localhost:8080, it will fail because our computer doesn’t have direct access to that process. We do, however, have access to port 49160, where our Docker container is listening.
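
Once the container is running, you can sanity-check the endpoint from the terminal; the -o flag just saves the converted response to a local file:

curl -o myfile.mp3 http://localhost:49160/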

open is sort of useless to me since my default browser is Brave which isn’t the best for development.

go is the script I run the most. It builds the image, starts the container, and then lists out which Docker containers are currently running. For debugging, I can then run docker logs <CONTAINER ID LISTED BY DOCKER PS> or even docker kill <CONTAINER ID>.

Recommendations for production use

I don’t know your use case, so it’s hard to say too much. A recent use case of mine, though, consisted of a user editing an audio file and then exporting the result to mp3. In that case, I uploaded the blob to Firebase, submitted my request to the web server, set pathToAudio to the Firebase URL, and then deleted the file once the mp3 was returned to the user.
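
In case it helps, here’s roughly what that tweak looks like in the route. This is only a sketch: the url query parameter is my own convention, and the Firebase URL in the comment is a placeholder:

app.get('/', (req, res) => {
  // e.g. /?url=<the Firebase download URL for the uploaded blob>
  var pathToAudio = req.query.url;
  if (!pathToAudio) {
    return res.status(400).send('Missing url query parameter');
  }
  res.contentType('audio/mp3');
  res.attachment('myfile.mp3');
  ffmpeg(pathToAudio)
    .toFormat('mp3')
    .on('error', function(err) {
      console.log('an error happened: ' + err.message);
    })
    .pipe(res, {end: true});
});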

Unrelated PSA: Looking for a new high paying software development job? Send me your resume to alexleondeveloper@gmail.com and I’ll get back to you!

That’s it folks

I hope that was useful for some of you. Amazon and others have file conversion APIs, but they’re fairly pricey, so why not roll one out yourself if you have the time? Anyways, that’s all I have. Thanks for reading!
