Balancing the Scales: Tackling Class Imbalance in AI-Driven Diagnostics (Part 2)

Sigrid C.
13 min read · Feb 10, 2024


Introduction to Class Imbalance in Medical Diagnostics

In the realm of healthcare, the advent of Artificial Intelligence (AI) has been nothing short of revolutionary. From predicting patient outcomes to diagnosing diseases with precision that rivals seasoned practitioners, AI has shown immense potential to transform medical diagnostics. However, this journey is not without its challenges. Among these, the issue of class imbalance in medical datasets stands out as a particularly stubborn obstacle to the effective application of AI.

The Challenge of Class Imbalance in Medical AI

Imagine trying to teach someone to recognize a rare bird among common ones, but almost all the examples you provide are of the common birds. This is analogous to the problem of class imbalance in medical diagnostics. In many medical datasets, certain conditions or diseases (classes) are significantly underrepresented compared to others. This imbalance can skew the training of AI models, leading them to perform well on the majority class while failing to accurately identify the rare, often more critical conditions.

Class imbalance is pervasive in medical datasets for a simple reason: some diseases are naturally less common than others. For instance, rare conditions like spinal muscular atrophy will have fewer instances in a dataset compared to more common conditions like pneumonia. When an AI model is trained on such data, it might learn to overwhelmingly predict the majority class, simply because it encounters it more frequently. This leads to a model that, although seemingly accurate, is practically ineffectual for diagnosing less common, potentially more dangerous conditions.

The Impact of Class Imbalance

The consequences of not addressing class imbalance can be dire. In medical diagnostics, the cost of a false negative (failing to detect a disease when it is present) can be significantly higher than a false positive (detecting a disease when it is not present). A model biased towards the majority class could overlook critical conditions, delaying treatment for patients with rare diseases. This bias not only undermines the model’s clinical utility but also raises ethical concerns about equitable healthcare.

Beyond the ethical implications, class imbalance also poses a technical challenge. Traditional performance metrics like accuracy can be misleading in the presence of class imbalance. A model might achieve high accuracy by simply predicting the majority class, masking its inability to detect the classes of interest. Hence, there’s a pressing need for strategies that allow models to learn from imbalanced data without inheriting its biases.
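To see how deceptive accuracy can be, consider a small hypothetical example (the numbers are illustrative, not taken from any real dataset): with a condition present in 2 of 1,000 patients, a model that always predicts “healthy” looks nearly perfect while missing every case.

import numpy as np

y_true = np.array([1] * 2 + [0] * 998)   # 2 positive cases among 1,000 patients
y_pred = np.zeros_like(y_true)           # a lazy model that always predicts "healthy"

accuracy = (y_pred == y_true).mean()           # 0.998 -- looks excellent
recall = (y_pred[y_true == 1] == 1).mean()     # 0.0   -- every sick patient is missed
print(f"accuracy={accuracy:.3f}, recall={recall:.3f}")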

Addressing Class Imbalance with a Custom Weighted Loss Function

This series aims to tackle the class imbalance head-on by introducing a custom weighted loss function. Loss functions are at the heart of training AI models, guiding them towards accurate predictions by quantifying the difference between the predicted and actual values. By weighting the loss function, we can adjust the model’s sensitivity to underrepresented classes, ensuring that rare conditions receive due emphasis during training.

The journey we’re embarking on will not only delve into the technicalities of implementing a custom weighted loss function but also explore the broader implications of making AI models in healthcare more equitable and effective. Through a combination of code demonstrations and insightful analysis, we’ll uncover how to balance the scales (quite literally) and make strides towards AI models that can diagnose with fairness and precision.

Exploring Class Imbalance in Medical Datasets

The Reality of Class Imbalance

Class imbalance occurs when the number of instances across different classes in a dataset is disproportionately distributed. In the context of medical diagnostics, this means that certain diseases or conditions (classes) are represented by a significantly smaller number of examples compared to others. The implications of this imbalance are profound. AI models, particularly deep learning models, learn by example. When presented with a dataset where one class dominates, these models tend to develop a bias towards the majority class, often at the expense of accurately identifying or diagnosing rarer conditions.

Consider the following visualization code snippet, which brings the issue of class imbalance into sharp relief:

import numpy as np
import matplotlib.pyplot as plt

# train_generator.labels: binary matrix (num_examples x num_classes); the mean over rows gives per-class prevalence
plt.bar(x=labels, height=np.mean(train_generator.labels, axis=0))
plt.title("Frequency of Each Class")
plt.xticks(rotation=90)
plt.show()

This simple yet powerful visualization underscores the varying prevalence of different pathologies within a dataset. It starkly illustrates the extent of class imbalance, revealing how it can undermine the model’s learning process.

A Closer Look at the Data

Upon plotting the frequency of each class, a clear picture of imbalance emerges. Conditions such as `Hernia` are extremely underrepresented, with positive cases constituting about 0.2% of the training dataset. On the other end, pathologies like `Infiltration`, though relatively better represented, still only account for 17.5% of the cases. Such disparities pose significant challenges for training AI models, as the skewed distribution can lead to a bias towards predicting the majority class, reducing the model’s effectiveness in identifying rarer, potentially more critical conditions.
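If you prefer exact numbers to bars, a short loop over the same label matrix prints each class’s prevalence (this assumes the `labels` list and `train_generator` from the plotting snippet above are still in scope):

import numpy as np

prevalence = np.mean(train_generator.labels, axis=0)  # fraction of positive cases per class
for name, p in sorted(zip(labels, prevalence), key=lambda item: item[1]):
    print(f"{name:<20s} {100 * p:5.1f}% positive")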

The impact of this imbalance is not just theoretical. Using a standard cross-entropy loss function in such scenarios often results in models that prioritize minimizing errors on the majority class, essentially ignoring the minority class. This is because the loss function’s contribution from each class is not equally weighted, allowing the majority class to dominate the learning process.
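A small hypothetical calculation (illustrative numbers, a single class with 2% prevalence) shows why: the unweighted loss is already near its minimum when the model simply echoes the base rate, so it has little incentive to learn to flag positives.

import numpy as np

freq_pos, freq_neg = 0.02, 0.98  # a class with 2% positive prevalence

def expected_bce(p):
    # expected unweighted binary cross-entropy if the model predicts the constant p
    return -(freq_pos * np.log(p) + freq_neg * np.log(1 - p))

for p in [0.02, 0.25, 0.50]:
    print(f"constant prediction p={p:.2f} -> expected loss {expected_bce(p):.3f}")
# the loss is smallest at p = 0.02, i.e. for a model that never really flags a positive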

Addressing the Imbalance

The challenge of class imbalance in medical datasets is a reminder of the complexities involved in applying AI to healthcare. It underscores the need for careful consideration and adaptation of machine learning techniques to ensure that they serve all patients equitably, not just those with common conditions. The next steps in addressing this imbalance involve innovative approaches such as custom weighted loss functions, which aim to equalize the influence of each class on the model’s learning process. By doing so, we can begin to tip the scales back towards balance, ensuring that AI models in healthcare are as fair as they are accurate.

As we delve deeper into the nuances of class imbalance, it becomes evident that addressing this issue is not just about improving model performance — it’s about ensuring that AI in healthcare lives up to its promise of enhancing diagnosis and treatment for everyone, regardless of how rare or common their condition may be.

Addressing Class Imbalance with Weighted Loss

The Role of Weighted Loss

Weighted loss is an innovative solution designed to counteract the disproportionate influence of prevalent classes in datasets. By assigning different weights to classes based on their frequency, this approach ensures that rare conditions are given due attention during the model training process. The essence of weighted loss lies in its ability to balance the scales, making the model equally sensitive to both common and rare conditions. This is pivotal in medical diagnostics, where the cost of overlooking a rare condition can be life-threatening.

Computing Class Frequencies

The first step towards implementing weighted loss is computing the class frequencies. This involves determining the proportion of positive and negative examples for each condition within the dataset. Here’s how it can be done:

import numpy as np

def compute_class_freqs(labels):
    # labels: binary label matrix of shape (num_examples, num_classes)
    N = labels.shape[0]
    positive_frequencies = np.sum(labels, axis=0) / N
    negative_frequencies = 1 - positive_frequencies
    return positive_frequencies, negative_frequencies

This code snippet calculates the frequency of positive and negative cases for each class, which is essential for understanding the extent of class imbalance.
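Calling it on the training labels might look like this (assuming `train_generator.labels` is the same binary label matrix used in the earlier plot); the resulting arrays are the `freq_pos` and `freq_neg` used in the snippets that follow:

freq_pos, freq_neg = compute_class_freqs(train_generator.labels)
print(freq_pos)  # per-class fraction of positive examples
print(freq_neg)  # per-class fraction of negative examples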

Visualizing Contributions

To visualize the disparity in class contributions, we can plot the frequencies of positive and negative cases across different conditions:

import pandas as pd
import seaborn as sns

data = pd.DataFrame({"Class": labels, "Label": "Positive", "Value": freq_pos})
data = pd.concat([data, pd.DataFrame({"Class": labels, "Label": "Negative", "Value": freq_neg})], ignore_index=True)
sns.barplot(x="Class", y="Value", hue="Label", data=data)
plt.xticks(rotation=90)

This visualization starkly highlights the imbalance in contributions, where positive cases are often significantly outnumbered by negative ones.

Balancing Contributions with Weighted Loss

To counter this imbalance, we introduce class-specific weight factors for each example, aiming to equalize the overall contribution of each class to the loss. By adjusting the weights of positive and negative cases inversely proportional to their frequencies, we ensure a balanced representation:

# Weight each term by the frequency of the opposite label so the expected contributions match
pos_weights = freq_neg
neg_weights = freq_pos
pos_contribution = freq_pos * pos_weights
neg_contribution = freq_neg * neg_weights
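Because both products reduce to `freq_pos * freq_neg`, a quick sanity check confirms the contributions now match exactly:

import numpy as np

assert np.allclose(pos_contribution, neg_contribution)
print(pos_contribution)  # identical expected contributions for positive and negative labels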

Verification of this approach through visualization demonstrates that the contributions of positive and negative labels within each class can indeed be balanced:

data = pd.DataFrame({"Class": labels, "Label": "Positive", "Value": pos_contribution})
data = pd.concat([data, pd.DataFrame({"Class": labels, "Label": "Negative", "Value": neg_contribution})], ignore_index=True)
sns.barplot(x="Class", y="Value", hue="Label", data=data)
plt.xticks(rotation=90)

This equality in contributions is a significant milestone in creating AI models that can accurately identify both common and rare conditions, making weighted loss a powerful tool in the arsenal against class imbalance.

By implementing weighted loss, we take a significant step toward AI models that offer not just high accuracy but also fairness and inclusivity. This approach ensures that every patient, regardless of the rarity of their condition, stands an equal chance of being correctly diagnosed, bringing us closer to the ideal of equitable healthcare through technology.

Implementing Weighted Loss in Model Training

Deriving Weights from Class Frequencies

The cornerstone of weighted loss is the calculation of positive and negative weights based on class frequencies. The objective is straightforward: assign higher weights to rarer classes and lower weights to more common ones, thereby ensuring that each class contributes equally to the model’s learning process. This balancing act is achieved by making each weight inversely related to the corresponding class frequency. For instance, if a particular condition is underrepresented in the training data, it receives a higher weight, making its correct prediction more impactful on the overall loss and, consequently, on the model’s updates during training.
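As a concrete (hypothetical) illustration, take a single condition that appears in 2% of the training examples:

freq_pos, freq_neg = 0.02, 0.98
pos_weight, neg_weight = freq_neg, freq_pos          # weights are the opposite frequencies
print(freq_pos * pos_weight, freq_neg * neg_weight)  # both 0.0196 -- the contributions now match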

The Weighted Loss Function Explained

The `weighted loss function` is a custom function designed to adjust the model’s sensitivity to classes based on their prevalence. The provided code snippet outlines how to implement such a function:

from tensorflow.keras import backend as K

def get_weighted_loss(pos_weights, neg_weights, epsilon=1e-7):
    def weighted_loss(y_true, y_pred):
        # Sum the frequency-weighted binary cross-entropy over all classes;
        # epsilon keeps the logarithm away from zero
        loss = 0.0
        for i in range(len(pos_weights)):
            loss += -K.mean(pos_weights[i] * y_true[:, i] * K.log(y_pred[:, i] + epsilon)
                            + neg_weights[i] * (1 - y_true[:, i]) * K.log(1 - y_pred[:, i] + epsilon), axis=0)
        return loss
    return weighted_loss

This function calculates the weighted loss for each class and sums them to get the total loss. It employs a small value, epsilon, to prevent numerical errors when computing the logarithm of predicted probabilities. By doing so, it ensures stability in the loss computation process, which is crucial for the successful training of deep learning models.
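As a quick sanity check, you can call the returned loss on a tiny hand-made batch (hypothetical values, a single class; it assumes the `get_weighted_loss` definition above and TensorFlow 2’s eager execution):

import tensorflow as tf

loss_fn = get_weighted_loss(pos_weights=[0.98], neg_weights=[0.02])
y_true = tf.constant([[1.0], [0.0]])   # one positive and one negative example
y_pred = tf.constant([[0.7], [0.3]])   # the model's predicted probabilities
print(float(loss_fn(y_true, y_pred)))  # a small positive scalar, roughly 0.18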

Impact on Training

Incorporating weighted loss into model training fundamentally changes how the model learns from data. Traditionally, models might gravitate towards maximizing accuracy on the majority class, often at the expense of minority classes. However, by applying weighted loss, the model is penalized more for misclassifying underrepresented classes, compelling it to pay equal attention to all classes.

This adjustment leads to a more balanced model that performs better across the board, not just on the conditions it sees most often. Such models are inherently more fair and useful in clinical settings, where missing a rare condition can have dire consequences. Moreover, models trained with weighted loss are likely to exhibit improved generalization capabilities, as they learn to recognize patterns across a more diverse set of examples.

In conclusion, implementing weighted loss in model training is a powerful strategy for combating class imbalance in medical diagnostics. It not only promotes the development of more equitable and accurate AI models but also aligns closely with the overarching goal of healthcare: to deliver personalized, precise, and inclusive medical diagnostics and treatment. As we continue to push the boundaries of what AI can achieve in healthcare, techniques like weighted loss underscore the importance of thoughtful and ethical AI development.

Building and Training the Model with DenseNet121

In the realm of medical image analysis, the accuracy and efficiency of AI models are paramount. Enter DenseNet121, a convolutional neural network that has become a cornerstone for developing advanced diagnostic tools. Its intricate architecture, designed for high precision and minimal error, makes DenseNet121 an ideal backbone for medical image analysis models.

Why DenseNet121?

DenseNet121 stands out for its unique structure, which facilitates feature reuse, making it exceptionally efficient in terms of computational resources and memory. Within each dense block, every layer is directly connected to every subsequent layer in a feed-forward fashion, so features computed early in the network are reused throughout. For medical image analysis, this architecture translates into a model capable of capturing the complex patterns and fine details in medical images that are crucial for accurate diagnosis, making it a preferred choice for professionals in the field.

Customizing with GlobalAveragePooling2D and Dense Layers

Adapting DenseNet121 for a specific medical diagnostic task involves adding custom layers to the pre-trained model. The addition of a `GlobalAveragePooling2D` layer serves a critical function: it collapses each feature map produced by DenseNet121 into a single average value, yielding one compact 1D vector per image. This transformation drastically reduces the dimensionality of the model’s output, trading fine-grained spatial detail for a summary representation that is well suited to whole-image classification.

Following the GlobalAveragePooling2D layer, a `Dense` layer with a `sigmoid` activation function is added. This layer acts as the logistic layer, responsible for producing the final prediction output. The sigmoid activation ensures that the model’s predictions are probabilities between 0 and 1 for each class in the dataset. Because every class gets its own sigmoid rather than a shared softmax, the model can flag several conditions in the same image, which suits multi-label diagnostics where a patient may present with more than one finding at once.
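To see this dimensionality reduction in numbers, here is a small sketch (it assumes 224x224 RGB inputs and 14 target classes; the exact shapes depend on your input size and label count, and `weights=None` is used only to avoid loading a weights file):

from tensorflow.keras.applications.densenet import DenseNet121
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

base = DenseNet121(weights=None, include_top=False, input_shape=(224, 224, 3))
print(base.output_shape)  # (None, 7, 7, 1024): a 7x7 spatial map for each of 1024 channels
pooled = GlobalAveragePooling2D()(base.output)
print(pooled.shape)       # (None, 1024): each map collapsed to a single average value
probs = Dense(14, activation="sigmoid")(pooled)
print(probs.shape)        # (None, 14): one independent probability per condition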

Compiling the Model with Weighted Loss

The final step in preparing the model for training involves compiling it with a custom weighted loss function. This function is crucial for addressing the challenge of class imbalance in medical datasets. By assigning different weights to classes based on their frequency, the weighted loss function ensures that rare conditions are adequately represented during the training process.

from tensorflow.keras.applications.densenet import DenseNet121
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

# create the base pre-trained model
base_model = DenseNet121(weights='./nih/densenet.hdf5', include_top=False)
x = base_model.output
# add a global spatial average pooling layer
x = GlobalAveragePooling2D()(x)
# and a logistic layer with one sigmoid output per class
predictions = Dense(len(labels), activation="sigmoid")(x)
model = Model(inputs=base_model.input, outputs=predictions)
model.compile(optimizer='adam', loss=get_weighted_loss(pos_weights, neg_weights))

This code snippet encapsulates the process of building and compiling the model. By utilizing the DenseNet121 architecture, enriching it with custom layers for specific diagnostic tasks, and addressing class imbalance with a weighted loss function, we create a powerful tool capable of transforming medical diagnostics. This approach not only leverages the strengths of DenseNet121 but also customizes it to meet the nuanced needs of medical image analysis, paving the way for more accurate, reliable, and equitable healthcare solutions.
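For completeness, a minimal training call might look like the sketch below; the generator names, epoch count, and output filename are illustrative, since the walkthrough above stops at compilation:

# Hypothetical training sketch -- assumes Keras data generators for training and validation
history = model.fit(
    train_generator,
    validation_data=valid_generator,
    epochs=3,
)
model.save_weights("densenet_weighted_loss.h5")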

Insights and Applications of the Model

The journey of developing an AI model for medical diagnostics, particularly one built on the robust architecture of DenseNet121 and trained with a custom weighted loss function, is full of challenges and learning opportunities. Yet the potential applications and future directions of such a model are vast and inspiring.

Training Insights: Overcoming Challenges

Training an AI model for medical image analysis is no small feat. The first hurdle we encountered was the class imbalance prevalent in medical datasets. Rare conditions were overshadowed by more common ones, skewing the model’s learning process. By implementing a weighted loss function, we were able to give due importance to each class, ensuring a balanced learning approach. This adjustment was crucial in making the model more sensitive to less prevalent conditions, thus enhancing its diagnostic accuracy across a broader spectrum of diseases.

Another challenge was the sheer size and complexity of medical images. DenseNet121, with its efficient feature reuse mechanism, proved instrumental in extracting meaningful patterns from these images without an exorbitant computational cost. Integrating GlobalAveragePooling2D and Dense layers further refined the model’s output, making it suitable for classification tasks. Through iterative testing and refinement, we optimized the model’s architecture for the nuanced needs of medical diagnostics.

Applications in Medical Diagnostics

The trained model opens up new horizons in the field of medical diagnostics. Its ability to accurately classify various conditions from medical images has several practical applications:

  • Early Detection: For diseases where early detection significantly improves prognosis, such as cancer, the model can serve as a vital tool in screening processes, potentially saving lives by identifying conditions before they become too advanced.
  • Diagnostic Support: The model can assist healthcare professionals by providing a second opinion, thus reducing diagnostic errors and improving patient outcomes.
  • Remote Diagnostics: In regions with limited access to medical specialists, the model can provide reliable diagnostics, bridging the gap in healthcare access.

Future Directions

While the achievements of the current model are significant, the field of medical AI is ever-evolving, with ample room for improvement and innovation. Future directions might include:

  • Exploring Other Architectures: Beyond DenseNet121, experimenting with other architectures like EfficientNet or custom models could yield even more efficient and accurate diagnostic tools.
  • Expanding Datasets: Training the model on more diverse datasets, including those from different demographics and geographical locations, could enhance its generalizability and accuracy across a wider range of conditions.
  • Addressing Other Medical AI Challenges: Beyond class imbalance, tackling issues like data privacy, model interpretability, and integration into clinical workflows will be crucial for the widespread adoption of AI in healthcare.

In conclusion, the development and training of a DenseNet121-based model for medical diagnostics represent a significant step forward in the application of AI in healthcare. The insights gained from this process, the model’s potential applications, and the envisioned future improvements highlight the transformative power of AI in enhancing medical diagnostics. As we continue to refine these models, we move closer to a future where AI-driven tools are integral to delivering accurate, efficient, and accessible healthcare worldwide.

📒 Compiled by — Sigrid Chen, Rehabilitation Medicine Resident Physician, Occupational Therapist, Personal Trainer of the American College of Sports Medicine.
