BIBB
PROJECT
•

Power BI AI CV Analysis for Recruitment: Automating Candidate Matching with OpenAI

Oscar MartĂ­nez
Oscar MartĂ­nez Data, BI and AI | Operational Lead | Power BI & Azure Expert | Governance-Driven Strategy | Product Owner

Discover how Power BI and OpenAI work together for AI-driven CV analysis in recruitment, enabling efficient candidate matching and insights for data-driven hiring

Hiring teams increasingly use AI tools like Power BI and OpenAI to streamline CV analysis and enhance recruitment efficiency. This guide covers setting up an AI-driven CV analysis project that integrates Power BI and OpenAI, enabling you to automate candidate matching and deliver structured data visualizations.

This project uses Python and OpenAI to extract and structure CV data into JSON format, which Power BI then processes for visualization and analysis. This approach saves time and delivers powerful insights into candidate-job matching, which is ideal for data-driven recruitment.

Before implementing AI-driven CV analysis, we advise consulting with a legal specialist to ensure compliance with relevant data protection laws. By taking these steps, you can responsibly integrate AI into recruitment while maintaining compliance with legal and ethical standards.

Key considerations include:

graph TD
    A[AI CV Analysis] --> B[Consent & Purpose]
    A --> C[Data Protection]
    A --> D[API Compliance]
    A --> E[Bias Prevention]
    A --> F[Security & Transparency]
    
    style A fill:#ff0020,stroke:#333,stroke-width:2px,color:#fff
    style B fill:#73b9c6,color:#fff
    style C fill:#73b9c6,color:#fff
    style D fill:#73b9c6,color:#fff
    style E fill:#73b9c6,color:#fff
    style F fill:#73b9c6,color:#fff

Key legal considerations for AI-driven CV analysis

Consent and Purpose

Obtain clear, explicit consent from individuals before processing their data.

Limit data usage strictly to the recruitment process, avoiding any secondary purposes.

Data Anonymization

Remove or mask personally identifiable information (PII) before processing CVs to minimize privacy risks.

This includes names, contact details, and other sensitive identifiers.

Review API Policies

Familiarize yourself with OpenAI’s usage policies to avoid transmitting unnecessary sensitive information.

Ensure secure transmission of data to and from the API.

Bias and Fairness

Regularly review AI outputs for potential bias or unfair patterns in candidate evaluations.

Provide additional checks to ensure all candidates are assessed equitably.

Security and Transparency

Use encryption to protect data during storage and transmission.

Clearly inform candidates about how AI is used in the recruitment process and provide options to withdraw their data if needed.

Project Overview

The project consists of a Jupyter notebook that:

  • Extract text from PDF files containing candidate CVs and a job position description
  • Normalize and structure the extracted text using OpenAI’s GPT model
  • Evaluate candidates by matching their CVs against the job position description
  • Output the results in JSON format for analysis in Power BI
Mermaid project diagram

graph TD
    subgraph "VS Code - Jupyter"
        A[Start]
        B[Extract text from CV PDFs]
        A --> B
        B --> C[OCR_Results.json]
        F[Extract text from Position PDF]
        A --> F
        F --> G[OCR_Position.json]
        
        E[LLM_Normalized_CV.json]
        I[LLM_Position.json]
        K[LLM_Analysis.json]
        
        subgraph "OpenAI"
            D[Summarize CVs]
            H[Normalize Position Description]
            J[Evaluate Candidates]
        end
        
        C --> D
        D --> E
        G --> H
        H --> I
        E --> J
        I --> J
        J --> K
    end
    
    E --> L[Analyze with Power BI]
    I --> L
    K --> L
    
    %% Define styles
    classDef openai color:#FFFFFF, fill:#FF0000,stroke:#FF0000,stroke-width:2px;
    classDef vscode color:#FFFFFF, fill:#008000,stroke:#008000,stroke-width:2px;
    classDef powerbi color:#000000, fill:#FFD700,stroke:#FFD700,stroke-width:2px;

    %% Apply styles
    class D,H,J openai
    class A,B,C,E,F,G,I,K vscode
    class L powerbi

Prerequisites

Before you begin, ensure you have the following:

1. Installing Python 3.7 or Higher

Download Python from the official website and follow the installation instructions for your operating system.

2. Installing Visual Studio Code (VS Code)

VS Code Official Download Page: Download Visual Studio Code

Setup Guide: Microsoft: Set up VS Code - Official setup overview from Microsoft.

3. Installing Git

Git for Windows: Download Git

This will enable you to clone the repository and manage version control.

Setting up the project

1. Cloning the GitHub Repository

If you’re unfamiliar with Git, these steps will guide you through cloning the repository using VS Code:

  1. Open VS Code
  2. Open the Command Palette by pressing Ctrl+Shift+P
  3. Type “Git: Clone” and select it
  4. Enter the Repository URL: https://github.com/OscarValerock/BIBB-PBI-CV-AI-Analysis.git
  5. Choose a Local Directory: Select a folder on your computer to store the project
  6. Open the Repository: VS Code will prompt you to open the repository once cloned. Click Open
2. Setting Up a Virtual Environment in VS Code

Once you’ve cloned the repository and have your project open in Visual Studio Code, it’s best practice to create a virtual environment for your project. This isolates the required Python packages, making your setup more stable and organized.

Steps to create the virtual environment:

  • In the Command Palette (Ctrl+Shift+P), type Python: Create Environment and select Create Environment
  • VS Code will prompt you to select a folder. Select your project folder (the cloned repository) and choose venv as the virtual environment type
  • Accept to install the packages from requirements.txt
3. Add the Constants.py File

Create a file named Constants.py in your project’s root directory. This file will store your OpenAI API key.

OpenAIKey = "your-openai-api-key"

Important: Replace "your-openai-api-key" with your actual OpenAI API key. Keep this file secure and avoid sharing it publicly.

Understanding the Code

Here’s an overview of the code included in this project running from a Jupyter Notebook. Each component works together to analyze CVs and match them to job descriptions.

Analyze Candidates.ipynb
1. Extracting Text from CV PDFs

This script uses PyMuPDF to extract text from all PDF files in the specified CV folder. Each CV’s text is saved in a JSON file, where each entry contains the filename and the extracted content. This JSON file will later serve as input for OpenAI processing.

2. Defining the CV JSON Structure

This structure outlines how the CV data will be organized after processing with OpenAI. It includes sections for:

  • Personal information
  • Education
  • Work experience
  • Skills
  • Certifications
  • Languages
  • Systems knowledge

By providing a consistent structure, it enables clear and comparable information across different CVs.

3. Normalizing CV Data with OpenAI

Using OpenAI’s GPT model, this script takes the raw text from each CV and organizes it into the predefined JSON format. The script processes each CV’s content individually, sends it to OpenAI, and then saves the structured output in a new JSON file for further analysis.

4. Processing the Position Description

Like CV processing, this script extracts text from the job position description PDF and organizes it into a structured JSON format. This structure includes fields for:

  • Job title
  • Department
  • Responsibilities
  • Qualifications
  • Required skills

This ensures a comparable format for evaluation against CVs.

5. Evaluating Candidates Against the Job Description

This final script compares each candidate’s normalized CV with the job description, assessing the match across dimensions like:

  • Education
  • Experience
  • Skills
  • Overall fit

It uses OpenAI to evaluate and output the results in JSON format, which can be loaded into Power BI for visualization and further analysis. The script provides a “score” for each dimension to quantify the match.

Each module contributes to a structured, automated process for analyzing and scoring candidate suitability based on the content of their CVs and the job position’s requirements.

Conclusion

By following this guide, you’ve set up a powerful tool that automates the analysis of candidate CVs against job descriptions using AI.

This saves time and provides a standardized assessment framework to help you make informed hiring decisions.

Remember to handle all personal data in compliance with data protection regulations and respect the privacy of the candidates.