
Clustering Analysis of HIV Prevention Strategies on Magnetic Couples Study

Team

  • Yuexuan Ban
  • Lishan Gao
  • Xubin Lou
  • Yuwei Shao

Sponsor

Professor James M. McMahon

Introduction

Background

The HIV epidemic continues to be a major health burden in the U.S., with more than 30,000 new HIV diagnoses annually. A majority of heterosexually transmitted cases of HIV occur among intimate HIV sero-different couples, and there are an estimated 200,000 HIV sero-different couples living in the U.S. There is a lack of data and knowledge regarding the factors associated with HIV transmission in this group.

Mission

Our goal is to help clinicians find the main predictors that are associated with HIV prevention methods and examine how the magnetic couples’ prevention strategies change over time.

Data

We used a couple-level dataset of survey and biometric data with 185 variables and 1,432 records. Each record contains the survey responses of both the HIV-positive and HIV-negative partners in a given wave.

Method

We first applied t-SNE for dimensionality reduction and then K-Means clustering to explore the predictor variables in waves 1&2 and waves 5&6. We then used the Tukey HSD test and the Mann-Whitney test to check the significance of the differences between clusters.
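
As a rough illustration of the first two steps, the Python sketch below embeds a feature matrix with t-SNE and then runs K-Means on the embedding. It is not the study's code: the use of scikit-learn, the standardization step, the number of clusters, the perplexity, and the random demo data are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def embed_and_cluster(X, n_clusters=4, perplexity=30.0, seed=0):
    """Reduce X to 2-D with t-SNE, then cluster the embedding with K-Means.

    All parameter defaults are illustrative assumptions, not study settings.
    """
    # Put predictors on a common scale before computing distances (assumed step).
    X_scaled = StandardScaler().fit_transform(X)
    # t-SNE requires perplexity to be smaller than the number of samples.
    embedding = TSNE(
        n_components=2, perplexity=perplexity, random_state=seed
    ).fit_transform(X_scaled)
    labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(embedding)
    return embedding, labels


# Synthetic stand-in for the couple-level predictors (not study data).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 8))
embedding, labels = embed_and_cluster(X_demo, n_clusters=3, perplexity=10.0)
```

Because t-SNE is stochastic and K-Means depends on its initialization, fixing the seed keeps the cluster assignments reproducible between runs.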

Data Preprocessing

  • Eliminated the KABST outcome variable.
  • Separated data into waves 1&2 and waves 5&6.
  • Calculated the mean for non-missing values.
  • Dropped couples with missing values in both waves 1&2 or both waves 5&6.
  • Dichotomized the remaining 3 outcome variables: 
    • IF TFV_DP <500 then TFV_DP_d=0, ELSE TFV_DP_d=1;
    • IF PCT_COND_mean=0 THEN COND_d=0, ELSE COND_d=1;
    • IF VL_C<200 THEN VL_d=1 ELSE VL_d=0.
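
The averaging and dichotomization rules above can be sketched in plain Python as follows. The function names are hypothetical, and the value assigned when PCT_COND_mean is 0 is assumed to be 0, since a dichotomized variable whose other level is 1 leaves no other choice.

```python
from statistics import mean
from typing import Dict, Optional, Sequence


def mean_non_missing(values: Sequence[Optional[float]]) -> Optional[float]:
    """Average the non-missing values; return None when every value is missing."""
    present = [v for v in values if v is not None]
    return mean(present) if present else None


def dichotomize(tfv_dp: float, pct_cond_mean: float, vl_c: float) -> Dict[str, int]:
    """Apply the three cut-offs listed above to one couple-level record."""
    return {
        # TFV_DP below 500 is coded 0, otherwise 1.
        "TFV_DP_d": 0 if tfv_dp < 500 else 1,
        # No condom use is coded 0 (assumed), any condom use 1.
        "COND_d": 0 if pct_cond_mean == 0 else 1,
        # VL_C below 200 is coded 1, otherwise 0.
        "VL_d": 1 if vl_c < 200 else 0,
    }


# Example: a value observed in only one of the two waves is used as-is.
wave_mean = mean_non_missing([None, 40.0])
outcomes = dichotomize(tfv_dp=650.0, pct_cond_mean=wave_mean, vl_c=150.0)
```

A couple whose value is missing in both waves yields None from mean_non_missing, which is the case the preprocessing drops.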

Descriptive Analysis

Clustering Model

Results

Age

For waves 1&2, couples who at least use condoms and PrEP medication are older, and couples who do not use condoms or use only condoms are younger. 

For waves 5&6, couples who use condoms and viral load, or who at least use PrEP, are older, and couples who do not use condoms or PrEP are younger.

Sexual Relationship Power 

For HIV-negative partners in waves 1&2, couples who do not use condoms have weaker partner sexual relationship power. Couples who use condoms and viral load, or who at least use PrEP, have relatively healthier partner sexual relationships.

For waves 5&6, couples who at least use PrEP want to maintain a longer relationship with their partner. Couples who use condoms and viral load, or who use only condoms, are less willing to maintain their relationship.

Care Satisfaction

In waves 1&2, couples with higher care satisfaction with medical providers use PrEP, in contrast to couples who do not use PrEP.

In waves 5&6, HIV-negative partners in couples who do not use condoms or PrEP have lower care satisfaction with medical providers. Couples who at least use PrEP have higher care satisfaction with medical providers.
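
Between-cluster differences such as these can be checked with the two tests named in the Method section. The Python sketch below shows one possible way to do this with SciPy (its `tukey_hsd` function requires SciPy 1.8 or later). The library choice is an assumption, and the ages are synthetic values generated for the example, not study data.

```python
import numpy as np
from scipy.stats import mannwhitneyu, tukey_hsd

rng = np.random.default_rng(0)

# Synthetic ages for three hypothetical clusters (not study data).
ages_by_cluster = [
    rng.normal(loc=35, scale=5, size=40),
    rng.normal(loc=42, scale=5, size=40),
    rng.normal(loc=36, scale=5, size=40),
]

# Tukey HSD compares every pair of clusters while controlling the
# family-wise error rate; pvalue[i, j] refers to clusters i and j.
tukey = tukey_hsd(*ages_by_cluster)
pairwise_p = tukey.pvalue

# Mann-Whitney U is a rank-based alternative for a single pair of
# clusters that does not assume normally distributed values.
u_stat, u_p = mannwhitneyu(
    ages_by_cluster[0], ages_by_cluster[1], alternative="two-sided"
)
```

Tukey HSD assumes roughly normal data with similar variances, so the rank-based Mann-Whitney test is a useful cross-check for skewed survey scores.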

Conclusion

This study used unsupervised learning algorithms to examine the main predictors associated with prevention strategies. We used t-SNE for dimensionality reduction and K-means as the clustering model. We then used the Tukey HSD test and the Mann-Whitney-Wilcoxon test to assess whether the differences between clusters were statistically significant in each wave comparison. Our results suggest that the main predictors we examined are associated with the prevention strategies that magnetic heterosexual couples choose. In future work, we would like to extend the model by testing additional important predictors.

Acknowledgements

We want to thank Professor James M. McMahon for the dataset, information, and insights. We also want to thank Professor Ajay Anand and Professor Cantay Caliskan for their support and suggestions on our project.