Building a Document Conversion Tool with Node.js & Angular

This guide shows how to build a document conversion tool with an intuitive upload interface that supports both drag-and-drop and file selection. It also explains how to integrate these uploads with Amazon S3 by obtaining pre-signed URLs, so files go directly to S3 without exposing your AWS credentials to the browser.


Introduction

Converting documents into various formats can be quite handy in many applications. In this tutorial, we are going to build a document conversion tool using Node.js on the server, Angular on the client side, and MongoDB as the database. We’ll be using NPM packages like bull for job queueing, mongoose for object modeling, uuid for generating unique identifiers, and libreoffice-convert for the actual file conversion.

Below, I outline the steps you might take to develop a document conversion tool.

Steps for Back-end (Node.js)

1. Set up the Node.js server:

2. Set up AWS S3:

  • Use the AWS SDK to create a service object that can interact with S3.
  • Set up a route to handle pre-signed URL generation for file uploads.

3. File upload handling:

  • Define the /upload endpoint
  • Generate a unique file ID using uuid when a file upload is initiated.
  • Use the generated ID to create a pre-signed URL for S3 uploads.

4. Database Model:

  • Define a Mongoose schema for storing file metadata, including the unique ID, file status, and any other relevant information.

5. Message Queue with Bull:

  • Configure Bull to manage job queues.
  • Create a job queue for processing file conversion tasks.

6. Process API:

  • Define the /process endpoint to save file metadata in MongoDB and enqueue a conversion job in Bull.

7. File Status Check:

  • Define the /file_status endpoint to query the database for the current status of a file conversion job.

8. Worker Process:

  • Set up a separate Node.js process that will consume tasks from the Bull queue.
  • Use libreoffice-convert to convert the files.
  • Upload the converted files to S3 and update the database with the file’s new status and location.

Steps for Front-end (Angular)

1. Set up the Angular application:

  • Initialize a new Angular project using ng new.
  • Create services to interact with the backend API.

2. File Upload Component:

  • Implement a file upload component with the ability to drag and drop files.
  • Use the Angular service to get a pre-signed URL from the /upload API endpoint.
  • Handle the actual file upload to S3 using the pre-signed URL.

3. Processing and Status Check:

  • After the upload, call the /process API to initiate the conversion process and store the file metadata.
  • Poll the /file_status endpoint to get updates on the conversion status.

4. Download Component:

  • Provide a download link or button that becomes active once the file conversion is complete.
  • Use the Angular service to get the pre-signed URL for downloading the converted file.

 

Setting Up the Server Side

Initiate a new Node.js project by creating a directory and running npm init. After setting up the basic package.json, install the required packages:

npm install express bull mongoose uuid libreoffice-convert aws-sdk

Create your server.js file and begin by setting up an Express server:

const express = require('express');
const app = express();

const { Router } = express;
const router = Router();

// Parse JSON request bodies so the route handlers can read req.body
app.use(express.json());

// ... code to setup routes and middleware

app.use('/api', router);

app.listen(3000, () => {
    console.log('Server is running on port 3000');
});

Here, we are importing necessary modules and initializing an Express app. We’ll later define routes and middleware for handling file uploads and processing.
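The route handlers in this tutorial are async functions, and Express 4 does not forward a rejected promise to its error-handling middleware on its own. A minimal sketch of one way to handle that is shown below. The names asyncHandler and errorMiddleware are hypothetical helpers, not part of Express.

```javascript
// Wrap an async route handler so that a thrown error or rejected promise
// is passed to next(), where Express's error middleware can handle it.
function asyncHandler(handler) {
  return (req, res, next) => {
    Promise.resolve()
      .then(() => handler(req, res, next))
      .catch(next);
  };
}

// Error-handling middleware (four arguments), registered after all routes.
function errorMiddleware(err, req, res, next) {
  console.error(err);
  res.status(500).json({ error: 'Internal server error' });
}

// Hypothetical usage with the router from server.js:
//   router.post('/upload', asyncHandler(async (req, res) => { /* ... */ }));
//   app.use(errorMiddleware);
```

With this wrapper, a failed database call or S3 request produces a 500 response instead of a request that never completes.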

Setting Up the Client Side

Use Angular CLI to set up your Angular project:

ng new document-conversion-tool

After creating the project, create the file upload component:

ng generate component file-upload

Below is the implementation of both drag-and-drop and file selection upload, utilizing the service to get pre-signed S3 URLs and to make the API calls for file processing.

Angular Components & Services

This section covers how to create Angular components and services required to upload a document, check the processing status, and download the converted file.

File Upload Component

The FileUploadComponent allows users to upload documents via drag-and-drop or file selection. Here’s how to build it:

// file-upload.component.ts
import { Component,Renderer2 } from '@angular/core';
import { FileService } from '../services/file.service';

@Component({
  selector: 'app-file-upload',
  templateUrl: './file-upload.component.html',
  styleUrls: ['./file-upload.component.css']
})
export class FileUploadComponent {
  constructor(private fileService: FileService, private renderer: Renderer2) {}

  onFileSelected(event: any): void {
    const file = event.target.files[0];
    if (file) {
      this.uploadFile(file);
    }
  }

  onDropFile(event: DragEvent): void {
    event.preventDefault();
    const file = event.dataTransfer?.files[0];
    if (file) {
      this.uploadFile(file);
    }
  }

  onDragOver(event: DragEvent): void {
    event.stopPropagation();
    event.preventDefault();
  }

  private uploadFile(file: File): void {
    this.fileService.getPresignedUrl(file.name, file.type).subscribe((response: any) => {
      this.fileService.uploadToS3(response.presignedUrl, file).subscribe(() => {
        this.fileService.processFile(response.uuid, file.name).subscribe((processResponse: any) => {
          // Handle the response for file processing initiation.
          // Then poll checkFileStatus(response.uuid): once its status is "completed",
          // its downloadUrl is the download URL:
          // this.downloadDocument(statusResponse.downloadUrl, file.name + ".pdf")
        });
      });
    });
  }
}

We have methods to handle file selection, file drop, and file drag-over events. We use the FileService to interact with the backend and initiate file uploads and processing.

File Service

The FileService interacts with our server-side API to handle file operations:

// file.service.ts
import { Injectable } from '@angular/core';
import { HttpClient } from '@angular/common/http';
import { Observable } from 'rxjs';

@Injectable({
  providedIn: 'root'
})
export class FileService {
  private apiUrl = '/api'; // the base path of your API

  constructor(private http: HttpClient) {}

  getPresignedUrl(filename: string, type: string): Observable<any> {
    return this.http.post(`${this.apiUrl}/upload`, { filename, type });
  }

  uploadToS3(presignedUrl: string, file: File): Observable<any> {
    // Send the raw file as the request body. A pre-signed PUT stores the body as-is,
    // so wrapping the file in FormData would upload the multipart envelope instead.
    // The Content-Type must match the type used when the URL was signed.
    return this.http.put(presignedUrl, file, {
      headers: { 'Content-Type': file.type }
    });
  }

  processFile(uuid: string, filename: string): Observable<any> {
    return this.http.post(`${this.apiUrl}/process`, { uuid, filename });
  }

  checkFileStatus(uuid: string): Observable<any> {
    return this.http.get(`${this.apiUrl}/file_status`, {
      params: { uuid }
    });
  }
}

The FileService uses methods to get a pre-signed S3 URL, upload to S3, process the file, and check the file’s processing status. It is provided for the entire application so that any component can easily use it.

Integrating MongoDB

Using mongoose, define the schema for your document and save the code below as ./models/document.js:

const mongoose = require('mongoose');
const Schema = mongoose.Schema;

const documentSchema = new Schema({
    uuid: { type: String, required: true },
    filename: { type: String, required: true },
    status: { type: String, required: true, default: 'pending' },
    convertedUrl: { type: String }
});

const Document = mongoose.model('Document', documentSchema);
module.exports = Document;

This schema includes a unique identifier, the original file name, the conversion status, and the URL to the converted document. Ensure you connect to your MongoDB instance at the start of your server:

mongoose.connect('mongodb://localhost/document-conversion', { useNewUrlParser: true, useUnifiedTopology: true });
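The connection string above is hard-coded, and the troubleshooting section later mentions environment variables. The following is a minimal sketch of reading settings from the environment instead. The variable names (PORT, MONGODB_URI, REDIS_URL, AWS_REGION, S3_BUCKET) and the loadConfig helper are assumptions chosen for illustration, not names the tutorial defines.

```javascript
// Build a configuration object from environment variables.
// Fails fast when a required value is missing.
function loadConfig(env = process.env) {
  const required = ['AWS_REGION', 'S3_BUCKET'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    port: Number(env.PORT) || 3000,
    // Fall back to the local defaults used in this tutorial
    mongoUri: env.MONGODB_URI || 'mongodb://localhost/document-conversion',
    redisUrl: env.REDIS_URL || 'redis://127.0.0.1:6379',
    awsRegion: env.AWS_REGION,
    bucket: env.S3_BUCKET,
  };
}

// Hypothetical usage:
//   const config = loadConfig();
//   mongoose.connect(config.mongoUri);
```

Failing at startup with a clear message is usually easier to debug than a connection error that surfaces on the first request.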

Handling File Upload

Create an endpoint to handle the S3 pre-signed URL generation. Use the aws-sdk package for this purpose:

const AWS = require("aws-sdk");
const express = require("express");
const router = express.Router();
const path = require("path");
const UUID = require("uuid");
const Document = require("./models/document");
const Queue = require('bull');
const conversionQueue = new Queue('document-conversion');
// Configuring AWS (placeholders: in practice, load credentials from the environment rather than hard-coding them)
AWS.config.update({
  accessKeyId: 'YOUR_AWS_ACCESS_KEY_ID',
  secretAccessKey: 'YOUR_AWS_SECRET_ACCESS_KEY',
  region: 'YOUR_AWS_REGION'
});
const s3 = new AWS.S3();
router.post("/upload", async (req, res) => {
  const { filename,type } = req.body;
  const ext = path.extname(filename);

  //Save the file with extension in S3
  const uuid = `${UUID.v4()}${ext}`;
  // Generate a presigned URL for upload
  const presignedUrl = await generatePresignedUrl(uuid,type);

  res.json({ presignedUrl, uuid });
});

function generatePresignedUrl(uuid,type) {

  return new Promise((resolve, reject) => {
    const params = {
      Bucket: "YOUR_BUCKET_NAME",
      Key: uuid,
      Expires: 180, // Time in seconds before the pre-signed URL expires
      ContentType: type,
      ACL: "public-read", // or the appropriate ACL for your use case
    };

    s3.getSignedUrl("putObject", params, (err, url) => {
      if (err) {
        reject(err);
      } else {
        resolve(url);
      }
    });
  });
}

 

The generatePresignedUrl function should interact with AWS S3 to create a unique, temporary URL to which the client can upload a file.
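The /upload handler derives the S3 key from a generated ID plus the original file extension. The sketch below shows how that step might also reject unexpected file types before a URL is signed. It uses Node's built-in crypto.randomUUID() in place of the uuid package so that it runs without dependencies, and the list of allowed extensions is an assumption for illustration only.

```javascript
const crypto = require('crypto');
const path = require('path');

// Hypothetical allow-list; adjust to the formats your converter supports.
const ALLOWED_EXTENSIONS = new Set([
  '.doc', '.docx', '.odt', '.rtf', '.txt', '.xls', '.xlsx', '.ppt', '.pptx',
]);

// Return a unique S3 object key that keeps the original file extension.
function buildObjectKey(filename) {
  if (typeof filename !== 'string' || filename.length === 0) {
    throw new Error('filename is required');
  }
  const ext = path.extname(filename).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(ext)) {
    throw new Error(`Unsupported file type: ${ext || '(none)'}`);
  }
  // Only the extension of the user-supplied name is kept in the key
  return `${crypto.randomUUID()}${ext}`;
}

console.log(buildObjectKey('Quarterly Report.DOCX'));
```

Because the key never contains the user-supplied base name, two uploads of the same file name cannot overwrite each other.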

In this tutorial, I am using the “bull” module for queue processing, which depends on Redis. Alternatively, you can use Agenda, which is based on MongoDB.

Processing Files

The /process API endpoint handles the initial processing tasks related to file conversion within the document conversion tool’s architecture. It is responsible for:

  • Saving the file’s metadata to the database (MongoDB), where it can be used to track the status and details of the file’s conversion process.
  • Adding a new job to the message queue for the actual document conversion. The job contains all necessary information for the conversion process to be executed by a separate worker process.

router.post("/process", async (req, res) => {
  const { uuid, filename } = req.body;
  // Save in MongoDB
  const document = new Document({ uuid, filename });
  await document.save();
  // Add a new job to the Bull queue
  await conversionQueue.add({ uuid, filename });
  res.status(200).send({ message: "File processing initialized." });
});
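The call to conversionQueue.add above uses Bull's default job options. As a sketch, the second argument to add can carry options such as a retry count and a custom job ID. The specific values below (three attempts, a five-second exponential backoff) and the buildJobOptions helper are assumptions for illustration.

```javascript
// Build Bull job options for a conversion job.
function buildJobOptions(uuid) {
  return {
    // Reusing the file's uuid as the job id means a repeated /process call
    // for the same file does not enqueue a second job
    jobId: uuid,
    // Retry a failed conversion up to 3 times in total
    attempts: 3,
    // Wait longer between each retry, starting at 5 seconds
    backoff: { type: 'exponential', delay: 5000 },
    // Remove the job from Redis once it has completed successfully
    removeOnComplete: true,
  };
}

// Hypothetical usage inside the /process handler:
//   await conversionQueue.add({ uuid, filename }, buildJobOptions(uuid));
console.log(buildJobOptions('example-id.docx'));
```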

 

Worker (Document Conversion Script)

The worker responsible for converting documents in a Node.js application built with bull and libreoffice-convert is a background process that listens to a job queue for document conversion tasks.

const fs = require("fs");
const path = require("path");
const { promisify } = require("util");
const { pipeline } = require("stream");
const AWS = require("aws-sdk");
const mongoose = require("mongoose");
const Queue = require("bull");
const libre = require("libreoffice-convert");
const Document = require("./models/document");

const pipelineAsync = promisify(pipeline);
const convertAsync = promisify(libre.convert);

const s3 = new AWS.S3();
const conversionQueue = new Queue("document-conversion");

// The worker runs as its own process, so it needs its own MongoDB connection
mongoose.connect("mongodb://localhost/document-conversion");

// Worker: the processor returns a promise instead of taking a `done` callback,
// so Bull marks the job completed when it resolves and failed when it rejects
conversionQueue.process(async (job) => {
  const { uuid } = job.data;
  const bucket = "YOUR_BUCKET_NAME"; // the same bucket the uploads go to
  const convertedKey = `converted/${uuid}.pdf`; // assuming conversion to PDF
  // Name temp files after the generated uuid, not the user-supplied filename
  const tempInputPath = path.join("/tmp", uuid);
  const tempOutputPath = path.join("/tmp", `${uuid}.pdf`);

  try {
    // Download the file from S3 to the local filesystem
    const s3ReadStream = s3.getObject({ Bucket: bucket, Key: uuid }).createReadStream();
    await pipelineAsync(s3ReadStream, fs.createWriteStream(tempInputPath));

    // Convert the file to PDF using libreoffice-convert
    const input = await fs.promises.readFile(tempInputPath);
    const output = await convertAsync(input, ".pdf", undefined);
    await fs.promises.writeFile(tempOutputPath, output);

    // Upload the converted file back to S3
    await s3
      .upload({ Bucket: bucket, Key: convertedKey, Body: fs.createReadStream(tempOutputPath) })
      .promise();
    console.log(`Converted file uploaded successfully: ${uuid}.pdf`);

    // Update the database with the new status and location
    await Document.updateOne({ uuid }, { status: "completed", convertedUrl: convertedKey });
    return convertedKey;
  } catch (err) {
    console.error(`Error converting file ${uuid}: ${err}`);
    // "failed" is an assumed status value; the schema accepts any string
    await Document.updateOne({ uuid }, { status: "failed" });
    throw err;
  } finally {
    // Delete the temporary files, ignoring any that were never created
    await Promise.all(
      [tempInputPath, tempOutputPath].map((p) => fs.promises.unlink(p).catch(() => {}))
    );
  }
});

 

The conversionQueue.process function is where the actual file conversion takes place. Run it as a separate worker process on a machine where LibreOffice is installed, because the libreoffice-convert package relies on it.
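The worker and the /file_status endpoint both need to agree on where a converted file lives. One way to keep them in sync is a small shared helper, sketched below. The conversionPaths name is hypothetical, and using os.tmpdir() instead of a hard-coded /tmp is a design choice rather than something the tutorial prescribes.

```javascript
const os = require('os');
const path = require('path');

// Derive every path and key used during a conversion from the file's uuid.
function conversionPaths(uuid) {
  // basename() strips any directory components before touching the filesystem
  const safeName = path.basename(uuid);
  return {
    sourceKey: uuid,                       // S3 key of the uploaded file
    convertedKey: `converted/${uuid}.pdf`, // S3 key of the converted PDF
    tempInputPath: path.join(os.tmpdir(), safeName),
    tempOutputPath: path.join(os.tmpdir(), `${safeName}.pdf`),
  };
}

console.log(conversionPaths('3f2b6c1e-example.docx'));
```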

Downloading Converted Files

The /file_status endpoint lets clients check the conversion status of their documents and get the download link once the conversion is complete.

const generatePresignedUrlForDownload = (bucket, key, expiresIn = 180) => {
  return new Promise((resolve, reject) => {
    const params = {
      Bucket: bucket,
      Key: key,
      Expires: expiresIn, // The URL will expire in 'expiresIn' seconds.
    };

    s3.getSignedUrl("getObject", params, (err, url) => {
      if (err) {
        reject(err);
      } else {
        resolve(url);
      }
    });
  });
};

router.get("/file_status", async (req, res) => {
  const { uuid } = req.query;

  // Check the status of the conversion
  const document = await Document.findOne({ uuid });

  if (!document) {
    // No metadata was saved for this uuid
    return res.status(404).send({ message: "File not found." });
  }

  if (document.status === "completed") {
    // Generate a pre-signed URL for downloading the converted file
    const downloadUrl = await generatePresignedUrlForDownload(
      "YOUR_BUCKET_NAME",
      `converted/${uuid}.pdf`
    );
    res.status(200).send({ status: document.status, downloadUrl });
  } else {
    res.status(200).send({ status: document.status });
  }
});
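On the client, this endpoint is meant to be polled until the conversion finishes. The sketch below shows one possible polling loop in plain JavaScript. The pollFileStatus name, the two-second interval, the attempt limit, and the "failed" status are assumptions. The checkStatus argument stands in for whatever function calls /file_status, for example a wrapper around the Angular service.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call checkStatus() repeatedly until the conversion reaches a final state.
// checkStatus must return a promise of an object with a `status` field.
async function pollFileStatus(checkStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await checkStatus();
    // "completed" comes from the endpoint above; "failed" is an assumed status
    if (result.status === 'completed' || result.status === 'failed') {
      return result;
    }
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Conversion still pending after ${maxAttempts} checks`);
}

// Hypothetical usage:
//   const result = await pollFileStatus(() => fetchStatus(uuid));
//   if (result.status === 'completed') { /* use result.downloadUrl */ }
```

Capping the number of attempts keeps a stuck job from polling the server indefinitely.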

Forcing the Browser to Download the Converted File

Add the code below to the Angular component, and pass the S3 pre-signed URL and the file name to the function.

downloadDocument(url:string ,filename:string ) {
  const link = this.renderer.createElement('a');
  link.setAttribute('target', '_blank');
  link.setAttribute('href', url);
  link.setAttribute('download', filename);
  link.click();
  link.remove();
}
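Browsers generally ignore the download attribute on links to another origin, which is the case for an S3 URL. A server-side alternative, sketched below, is to ask S3 to send a Content-Disposition header through the ResponseContentDisposition parameter of getObject. The buildDownloadParams helper is hypothetical, and its result would be passed to s3.getSignedUrl("getObject", params, callback) as in the endpoint code.

```javascript
// Build getObject parameters that make S3 serve the file as an attachment.
function buildDownloadParams(bucket, key, downloadName, expiresIn = 180) {
  // Replace characters that would break the quoted filename in the header
  const safeName = downloadName.replace(/["\\\r\n]/g, '_');
  return {
    Bucket: bucket,
    Key: key,
    Expires: expiresIn, // seconds until the pre-signed URL expires
    ResponseContentDisposition: `attachment; filename="${safeName}"`,
  };
}

console.log(buildDownloadParams('YOUR_BUCKET_NAME', 'converted/abc.docx.pdf', 'report.pdf'));
```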

 

Troubleshooting

When developing a document conversion tool with Node.js, Angular, and MongoDB, you may run into issues with file uploads, data persistence, message queueing, and file conversion. The tips below cover the most common ones.

Server-Side Troubleshooting

Here are some common server-side issues and how to resolve them:

  1. Issue: Server does not start or crashes on startup.
    Solution: Check your server logs for errors. Common causes include missing environment variables, an unavailable database connection, or port conflicts. Validate your `.env` file and database URI, and ensure the specified port is not already in use.
  2. Issue: Pre-signed S3 URL generation fails.
    Solution: Confirm that your AWS credentials have the necessary permissions to generate pre-signed URLs and that they are correctly configured on the server. Ensure the bucket name and region are correctly set, and that the bucket policy allows the operations you’re attempting.
  3. Issue: Bull queue does not process jobs.
    Solution: Ensure Redis is running and accessible from where your Bull queue is operating. Check for error messages in the console or log files, and verify that your Bull queue configuration is correct.
  4. Issue: Document conversion fails in the worker process.
    Solution: Check the worker logs for any exceptions or errors output by `libreoffice-convert`. Ensure that LibreOffice is properly installed on the machine processing the jobs and that the file format being converted is supported.

Client-Side Troubleshooting

Common issues and solutions on the client-side include:

  1. Issue: File upload does not trigger or fails.
    Solution: Check the Network tab in your browser’s Developer Tools for any failed network requests. Ensure the API endpoint is correctly configured and that the pre-signed URL is valid. Confirm that CORS settings are allowing the client-side requests.
  2. Issue: The Angular service fails to make an HTTP request.
    Solution: Inspect the error message provided by the Angular HttpClient in the browser’s console. Verify the endpoint configuration and any headers or parameters being passed. Ensure any observables are subscribed to where needed.
  3. Issue: Download link for the converted file does not appear, or the file cannot be downloaded.
    Solution: Confirm that the conversion process successfully completed on the server-side and that the file was uploaded back to S3. Ensure the S3 bucket permissions allow for the file download.
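For the CORS issue mentioned above, the browser can only PUT to and GET from the bucket if the bucket's CORS configuration allows the Angular app's origin. The following sketch prints a set of CORS rules in the JSON shape S3 expects. The origin http://localhost:4200 is the Angular CLI's default development address and is an assumption here, as are the other values.

```javascript
// CORS rules for the upload bucket; replace the origin with your own.
const corsRules = [
  {
    AllowedOrigins: ['http://localhost:4200'],
    AllowedMethods: ['PUT', 'GET'], // PUT for uploads, GET for downloads
    AllowedHeaders: ['*'],
    ExposeHeaders: ['ETag'],
    MaxAgeSeconds: 3000,
  },
];

// Print as JSON, ready to paste into the bucket's CORS settings
console.log(JSON.stringify(corsRules, null, 2));
```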

Database Troubleshooting

When working with MongoDB, you may encounter the following issues:

  1. Issue: Application fails to connect to MongoDB.
    Solution: Verify that MongoDB is running and that the connection string is correctly configured. Check for any typos in the database name, username, or password, and make sure that MongoDB is accessible from the application’s environment.
  2. Issue: Data is not persisted or updated in the database.
    Solution: Check every Mongoose operation for errors, whether they are reported through a callback or a rejected promise. Ensure asynchronous operations are awaited properly and that data types and required fields match the schema definitions.

Conclusion

This tutorial walked you through the creation of a document conversion application using Node.js, Angular, and MongoDB, including file handling and document conversion processes. The provided examples and code snippets should serve as a solid foundation for your application development and further customization.
