This article walks through a Python script that converts .docx files to .pdf format while displaying a progress bar for each file. The script prompts the user for the source and destination folder paths and reports the outcome of each step.
Overview
The script relies on two third-party libraries:
- docx2pdf - to convert Word files (.docx) to PDF.
- tqdm - to display a progress bar, giving visual feedback during file conversion.
Additionally, we’ll simulate progress updates to enhance the user experience during file processing.
Prerequisites
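Before starting, ensure you have Python installed along with the necessary libraries. Note that docx2pdf drives Microsoft Word to perform the conversion, so Word must be installed, and the library works on Windows and macOS only. Install the required libraries by running: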
pip install docx2pdf tqdm
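To confirm that the installation worked before running the converter, a quick check along these lines may help. This is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of docx2pdf or tqdm.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The two libraries the article installs with pip.
    missing = missing_packages(["docx2pdf", "tqdm"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

The check only confirms that the packages can be found; it does not verify that Microsoft Word is available.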
Script Breakdown
Step 1: Import Necessary Libraries
- os: To interact with the file system and retrieve file paths.
- docx2pdf: For converting .docx files to .pdf.
- tqdm: For displaying a progress bar during the conversion process.
- time: To simulate the conversion progress.
import os
from docx2pdf import convert
from tqdm import tqdm
import time  # To simulate progress updates
Step 2: Define Helper Function to List .docx Files
This function, list_docx_files, takes a folder path as input and returns a list of all .docx files in that folder. This is a simple utility to help identify which files need to be converted.
def list_docx_files(folder):
    """List all .docx files in the given folder."""
    return [f for f in os.listdir(folder) if f.endswith('.docx')]
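The helper above matches on the lowercase extension only, so it would miss a file named REPORT.DOCX and would pick up the "~$" lock files that Word creates while a document is open. A stricter variant might look like the following. This is a sketch rather than the article's method, and the name list_docx_files_strict is hypothetical.

```python
from pathlib import Path


def list_docx_files_strict(folder):
    """Return sorted names of .docx files in `folder`, ignoring Word lock files."""
    names = []
    for entry in Path(folder).iterdir():
        if not entry.is_file():
            continue  # skip sub-folders, even if their name ends in .docx
        if entry.name.startswith("~$"):
            continue  # skip the temporary lock files Word creates for open documents
        if entry.suffix.lower() == ".docx":  # case-insensitive extension match
            names.append(entry.name)
    return sorted(names)
```

Sorting the result makes the numbered list shown to the user stable from run to run, which os.listdir() does not guarantee.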
Step 3: Ask for Input and Validate Folder Existence
The convert_docx_to_pdf_dynamic() function starts by asking the user to input the source folder path where the .docx files are located. It checks whether the path exists, and if it doesn’t, the function prints an error message and returns.
def convert_docx_to_pdf_dynamic():
    try:
        # Step 1: Ask for the folder containing Word files
        source_folder = input("Enter the folder path containing .docx files: ").strip()
        if not os.path.exists(source_folder):
            print(f"The folder '{source_folder}' does not exist.")
            return
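Note that os.path.exists() also returns True for a regular file, in which case the later os.listdir() call would fail. One possible refinement, sketched below under the assumption that users often paste quoted paths, is to normalize the input and check for a directory explicitly. The helper name validate_source_folder is hypothetical.

```python
import os


def validate_source_folder(raw_path):
    """Return a normalized folder path, or None if `raw_path` is not a usable directory."""
    # Strip whitespace and surrounding quotes, which often come along when a path is pasted.
    cleaned = raw_path.strip().strip('"').strip("'")
    # Expand a leading ~ to the user's home directory.
    cleaned = os.path.expanduser(cleaned)
    if not cleaned or not os.path.isdir(cleaned):
        return None
    return os.path.abspath(cleaned)
```

The calling code could then print the "does not exist" message whenever the helper returns None.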
Step 4: List and Confirm .docx Files for Conversion
The script proceeds to list all .docx files in the source folder. If no .docx files are found, it terminates early. The user is prompted to confirm whether they want to proceed with the conversion process.
        # Step 2: List all .docx files in the source folder
        docx_files = list_docx_files(source_folder)
        if not docx_files:
            print(f"No .docx files found in the folder '{source_folder}'.")
            return
        print("The following .docx files were found:")
        for i, docx_file in enumerate(docx_files, start=1):
            print(f"{i}. {docx_file}")
        # Step 3: Ask for confirmation to proceed
        proceed = input("Do you want to proceed with converting these files to PDF? (yes/no): ").strip().lower()
        if proceed != 'yes':
            print("Conversion aborted.")
            return
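As written, the confirmation check treats anything other than the exact word "yes" as a refusal, so typing "y" aborts the run. A small helper such as the following could make the prompt more forgiving. It is a sketch only, and is_affirmative is a hypothetical name.

```python
def is_affirmative(answer):
    """Return True if `answer` looks like a yes ('y' or 'yes', any case)."""
    return answer.strip().lower() in {"y", "yes"}
```

The script's condition would then read `if not is_affirmative(proceed):` in place of the string comparison.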
Step 5: Define Destination Folder for PDF Files
Next, the script asks the user for the destination folder where the converted .pdf files will be saved. If the folder doesn’t exist, it is created automatically.
        # Step 4: Ask for the destination folder to save PDFs
        output_folder = input("Enter the destination folder to save PDFs: ").strip()
        if not os.path.exists(output_folder):
            os.makedirs(output_folder)  # Create the folder if it doesn't exist
            print(f"Created the destination folder: {output_folder}")
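The exists-then-create pattern can be condensed with the exist_ok parameter of os.makedirs(). The sketch below wraps it in a hypothetical helper, ensure_folder, that also reports whether the folder was newly created so the script can still print its message.

```python
import os


def ensure_folder(path):
    """Create `path` (and any missing parents); return True if it was newly created."""
    already_there = os.path.isdir(path)
    # exist_ok=True avoids an error when the folder already exists;
    # a FileExistsError is still raised if `path` is an existing regular file.
    os.makedirs(path, exist_ok=True)
    return not already_there
```

For example, `if ensure_folder(output_folder): print(f"Created the destination folder: {output_folder}")` would reproduce the behavior described above.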
Step 6: Convert Each .docx File with Progress Indicator
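For each .docx file, the script prints the name of the file being converted and shows a progress bar using tqdm. The docx2pdf library doesn’t report real-time progress, and its convert() call blocks until the PDF has been written, so the bar is purely cosmetic: it stays at 0% during the actual conversion and is then filled by a for loop of 100 steps, each with a small delay from time.sleep(). With a delay of 0.05 seconds per step, this adds about five seconds to every file before the bar reaches 100%.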
        # Step 5: Convert each .docx file to PDF with individual progress indication
        for docx_file in docx_files:
            try:
                docx_path = os.path.join(source_folder, docx_file)
                print(f"Converting '{docx_file}' to PDF...")
                # Simulate progress for each file
                with tqdm(total=100, desc=f"Processing '{docx_file}'", unit="%", leave=True) as pbar:
                    # Actual conversion happens here
                    convert(docx_path, output_folder)
                    # Simulate progress
                    for _ in range(100):
                        time.sleep(0.05)  # Simulate work by sleeping
                        pbar.update(1)
                print(f"Successfully converted '{docx_file}' to PDF.")
            except Exception as e:
                print(f"Error converting '{docx_file}': {e}")
        print("All conversions completed!")
    except Exception as e:
        print(f"An error occurred during the process: {e}")
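If the simulated delay is not wanted, one alternative is a single bar for the whole batch that advances once per file actually converted. The sketch below is not the article's approach. The function name convert_all and its convert_one parameter are hypothetical, and the converter is passed in as a callable so the loop can be tried without Word installed.

```python
import os

try:
    from tqdm import tqdm  # third-party; the same library the article uses
except ImportError:  # fall back to a plain loop if tqdm is unavailable
    tqdm = None


def convert_all(docx_files, source_folder, output_folder, convert_one):
    """Convert each file with `convert_one`; return (succeeded, failed) lists.

    `convert_one(docx_path, output_folder)` is any callable with the same
    shape as docx2pdf's convert(); passing it in keeps this sketch testable.
    """
    succeeded, failed = [], []
    iterator = docx_files
    if tqdm is not None:
        # One bar for the whole batch; it advances only after a real conversion.
        iterator = tqdm(docx_files, desc="Converting", unit="file")
    for name in iterator:
        try:
            convert_one(os.path.join(source_folder, name), output_folder)
            succeeded.append(name)
        except Exception as exc:  # keep going if one file fails
            failed.append((name, str(exc)))
    return succeeded, failed
```

In the script this could be called as `convert_all(docx_files, source_folder, output_folder, convert)`, with convert imported from docx2pdf. The returned lists make it easy to print a summary of failures at the end.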
Step 7: Run the Script
Finally, the script is executed by calling the convert_docx_to_pdf_dynamic() function. This initiates the entire process, from taking user inputs to converting and saving the PDFs with visual progress feedback.
# Run the dynamic converter
convert_docx_to_pdf_dynamic()
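An optional refinement, assuming the file might later be imported from another module, is to wrap the call in a main guard so the prompts only appear when the script is run directly. In this sketch the function body is a stand-in for the one built in the steps above.

```python
def convert_docx_to_pdf_dynamic():
    """Stand-in for the function defined in the steps above."""
    return "converter started"


if __name__ == "__main__":
    # Runs only when the file is executed directly, not when it is imported.
    convert_docx_to_pdf_dynamic()
```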
Full Script
import os
from docx2pdf import convert
from tqdm import tqdm
import time  # To simulate progress updates


def list_docx_files(folder):
    """List all .docx files in the given folder."""
    return [f for f in os.listdir(folder) if f.endswith('.docx')]


def convert_docx_to_pdf_dynamic():
    try:
        # Step 1: Ask for the folder containing Word files
        source_folder = input("Enter the folder path containing .docx files: ").strip()
        if not os.path.exists(source_folder):
            print(f"The folder '{source_folder}' does not exist.")
            return

        # Step 2: List all .docx files in the source folder
        docx_files = list_docx_files(source_folder)
        if not docx_files:
            print(f"No .docx files found in the folder '{source_folder}'.")
            return
        print("The following .docx files were found:")
        for i, docx_file in enumerate(docx_files, start=1):
            print(f"{i}. {docx_file}")

        # Step 3: Ask for confirmation to proceed
        proceed = input("Do you want to proceed with converting these files to PDF? (yes/no): ").strip().lower()
        if proceed != 'yes':
            print("Conversion aborted.")
            return

        # Step 4: Ask for the destination folder to save PDFs
        output_folder = input("Enter the destination folder to save PDFs: ").strip()
        if not os.path.exists(output_folder):
            os.makedirs(output_folder)  # Create the folder if it doesn't exist
            print(f"Created the destination folder: {output_folder}")

        # Step 5: Convert each .docx file to PDF with individual progress indication
        for docx_file in docx_files:
            try:
                docx_path = os.path.join(source_folder, docx_file)
                print(f"Converting '{docx_file}' to PDF...")
                # Simulate progress for each file
                with tqdm(total=100, desc=f"Processing '{docx_file}'", unit="%", leave=True) as pbar:
                    # Here you can simulate the progress update
                    # Since we don't have internal progress, this is just for display
                    # Actual conversion happens here
                    convert(docx_path, output_folder)
                    # Simulate progress
                    for _ in range(100):
                        time.sleep(0.05)  # Simulate work by sleeping
                        pbar.update(1)
                print(f"Successfully converted '{docx_file}' to PDF.")
            except Exception as e:
                print(f"Error converting '{docx_file}': {e}")
        print("All conversions completed!")
    except Exception as e:
        print(f"An error occurred during the process: {e}")


# Run the dynamic converter
convert_docx_to_pdf_dynamic()
Conclusion
This Python script provides a user-friendly way to convert .docx files to .pdf, with visual feedback for each file processed. The progress bar gives users a visible cue as each file is handled. Keep in mind that the bar is simulated: the conversion itself is relatively quick, and the bar does not measure it.