Data Analysis and Visualization: Worldwide Female Access to Democracy
Can you actually represent women's worldwide oppression over the past 150 years? Do women enjoy equal, secure, and effective access to justice? I analyzed data on women's access to democracy to investigate the relationship between geography, country of origin, and women's access to democratic institutions over the past 150 years. I used version 6.2 of the Varieties of Democracy (V-Dem) dataset for this analysis.
Data Overview
The Varieties of Democracy (V-Dem) dataset conceptualizes and measures the level of democracy in countries around the world, as a reaction to the popular Freedom House rankings. The dataset organizes its information into principles that are important to a functioning democracy: elections, liberty, participation, deliberation, equality, the power of the majority, and consent in governance. For this analysis I focused only on the indicators related to women. I wanted to break down this broad dataset and focus on the access to democracy and democratic rights that women may or may not enjoy depending on where they live. V-Dem relies primarily on survey data to inform its studies (read more about V-Dem here).
To prepare the data, we dropped all columns not related to women's issues and added latitude and longitude values for every country so that we could graph based on geography (and yes, I did Google the longitude and latitude of every country in the world).
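A minimal sketch of that preparation step, using only the standard library. The column names, the sample values, and the rough coordinates are all hypothetical placeholders for illustration; the real V-Dem variable names and our actual lookup table are not reproduced here.

```python
# Hypothetical column names; the real V-Dem variable names differ.
KEEP_COLUMNS = [
    "country_name",
    "year",
    "political_empowerment",
    "female_access_to_justice",
]

# Rough, illustrative (latitude, longitude) pairs of the kind looked up by hand.
COUNTRY_COORDS = {
    "Sweden": (60.1, 18.6),
    "Brazil": (-14.2, -51.9),
}


def prepare_rows(rows, keep=KEEP_COLUMNS, coords=COUNTRY_COORDS):
    """Keep only the selected columns and attach latitude/longitude."""
    prepared = []
    for row in rows:
        country = row["country_name"]
        if country not in coords:
            continue  # no coordinates available, so the row cannot be mapped
        slim = {col: row[col] for col in keep}
        slim["latitude"], slim["longitude"] = coords[country]
        prepared.append(slim)
    return prepared


# Made-up sample rows, only to show the shape of the input and output.
sample_rows = [
    {"country_name": "Sweden", "year": 2010, "political_empowerment": 0.95,
     "female_access_to_justice": 3.1, "unrelated_indicator": 0.4},
    {"country_name": "Nowhere", "year": 2010, "political_empowerment": 0.5,
     "female_access_to_justice": 1.0, "unrelated_indicator": 0.9},
]
prepared = prepare_rows(sample_rows)
print(prepared)
```

Rows for countries without a coordinate entry are skipped here; whether to skip or fail loudly is a design choice that depends on how complete the lookup table is.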
I built a data visualization system in Python with options to plot 3D data, fit a linear regression, run and plot a principal component analysis (PCA), find clusters, and perform classification. I plotted my V-Dem data using longitude (x), latitude (y), and female access to justice (z), and used the color and size of the points to portray access to democracy.
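The plotting idea can be sketched roughly as follows. This is not the display system itself: it assumes matplotlib is installed, and the function names and the keys `longitude`, `latitude`, `female_access_to_justice`, and `political_empowerment` are hypothetical.

```python
def scale_sizes(values, min_size=20, max_size=200):
    """Min-max scale raw indicator values to marker areas for a scatter plot."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: give every point the same mid-range size.
        return [(min_size + max_size) / 2 for _ in values]
    span = hi - lo
    return [min_size + (v - lo) / span * (max_size - min_size) for v in values]


def plot_3d(rows, z_key="female_access_to_justice",
            size_key="political_empowerment"):
    """Scatter longitude (x), latitude (y), and an indicator (z) in 3D."""
    # Imported lazily so scale_sizes can be used without matplotlib.
    import matplotlib.pyplot as plt

    xs = [r["longitude"] for r in rows]
    ys = [r["latitude"] for r in rows]
    zs = [r[z_key] for r in rows]
    sizes = scale_sizes([r[size_key] for r in rows])

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    # Color encodes the z indicator; size encodes the second indicator.
    points = ax.scatter(xs, ys, zs, s=sizes, c=zs, cmap="viridis")
    ax.set_xlabel("longitude")
    ax.set_ylabel("latitude")
    ax.set_zlabel(z_key)
    fig.colorbar(points, ax=ax, label=z_key)
    return fig


print(scale_sizes([0, 0.5, 1]))
```

Calling `plot_3d(rows)` on a list of prepared rows and then `plt.show()` or `fig.savefig(...)` would display or save the figure.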
Results


To generate these images I ran a k-means clustering analysis on our dataset to group the countries with the greatest similarities in female access to democracy. The first image has three clusters; the second has just two and is graphed by longitude (x) and latitude (y). The colors represent the clusters, and the size of each dot represents the country's political empowerment ranking. From these graphs we can conclude that Northern Europe, Australia, New Zealand, and North America are regions with high levels of female empowerment (based on the indicators in our access-to-democracy data). The graph with two clusters clearly separates the developing countries from the developed ones, suggesting a correlation between female access to democracy and freedom and a country's political and economic level.
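For readers unfamiliar with k-means, here is a small sketch of the standard Lloyd's algorithm in plain Python. It is not our actual implementation: the centroids are naively initialized to the first k points, and the two-feature sample values are made up.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with centroids initialized to the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


# Made-up (political_empowerment, access_to_justice) pairs for six countries.
sample_points = [
    (0.90, 3.0), (0.85, 2.8), (0.20, 0.5),
    (0.25, 0.7), (0.95, 3.2), (0.30, 0.4),
]
labels, centroids = kmeans(sample_points, k=2)
print(labels)
```

The cluster numbers themselves are arbitrary; what matters is which countries end up sharing a label. A library implementation with smarter initialization would usually be preferable on real data.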
PCA Analysis
These are the results of running a PCA on our data and plotting the resulting eigenvectors against their data features (political empowerment, suffrage in practice, etc.). A principal component analysis (PCA) shows which of our many data features vary together, and therefore which features account for the most variation in the data. From this we could tell that countries with a higher female empowerment score also tended to have higher values for female representation in government at both the local and higher levels (i.e., representation and diversity in politics are NECESSARY to create change; these are the receipts).
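To make "features that vary together" concrete, here is a sketch that finds only the first principal component, using power iteration on the covariance matrix in plain Python. This is a stand-in for a full PCA routine (a numerical library would normally compute all components at once), and the three feature columns and their values are made up.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, explained_variance_ratio) for the top component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered features.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    # Assumes the start vector is not orthogonal to the top eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance


# Made-up columns: empowerment, local representation, national representation.
sample_rows = [
    (0.90, 0.80, 0.85),
    (0.20, 0.30, 0.25),
    (0.50, 0.50, 0.55),
    (0.70, 0.60, 0.70),
    (0.30, 0.20, 0.20),
]
loadings, explained = first_principal_component(sample_rows)
print(loadings, explained)
```

When all loadings on the first component share a sign, as in this toy data, the features rise and fall together, which is the pattern described above for empowerment and representation.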
Linear Regression
This regression plots the relationship between political empowerment and an even distribution of power. The graph shows a strong positive relationship (r = 0.74) between the two features, which indicates that in countries where women have a higher empowerment level, power is distributed more evenly between men and women.
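As a reminder of what the r value and the fitted line are, here is a short sketch of the Pearson correlation and an ordinary least-squares fit in plain Python. The function names are hypothetical and the demo data is a perfectly linear toy series, so it does not reproduce the r = 0.74 reported above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data lying exactly on y = 2x + 1.
toy_x = [0, 1, 2, 3]
toy_y = [1, 3, 5, 7]
r = pearson_r(toy_x, toy_y)
slope, intercept = fit_line(toy_x, toy_y)
print(r, slope, intercept)
```

On real indicator data the points scatter around the line, and r falls somewhere between -1 and 1 depending on how tightly they follow it.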
110 Years of Emerging Female Empowerment in 4.2 Seconds
The above GIF shows 110 years of female empowerment in 10-year chunks (starting in 1900 and ending in 2010). To make it, I ran a method over all 110 years of data points that exports the data to one CSV file per year, and then ran our display system on these files in 10-year chunks. The two clusters (blue and red) separate the developing from the developed countries, with longitude and latitude as the x and y values and political_empowerment as the size/z value. The visualization shows the world map clustered according to female access to democracy, and the gradual improvement in worldwide female empowerment over time.
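The export step could look something like the sketch below, which uses only the standard library. The file naming scheme `vdem_<year>.csv`, the function names, and the sample rows are hypothetical, and it takes one reading of "10-year chunks": one frame for every tenth year from 1900 to 2010.

```python
import csv
import os
import tempfile
from collections import defaultdict


def export_by_year(rows, out_dir):
    """Write one CSV per year and return a {year: path} mapping."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[int(row["year"])].append(row)
    paths = {}
    for year, year_rows in sorted(by_year.items()):
        path = os.path.join(out_dir, f"vdem_{year}.csv")
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(year_rows[0].keys()))
            writer.writeheader()
            writer.writerows(year_rows)
        paths[year] = path
    return paths


def frame_years(start=1900, end=2010, step=10):
    """Years used as animation frames: every tenth year, inclusive of end."""
    return list(range(start, end + 1, step))


# Made-up rows, only to exercise the functions.
sample_rows = [
    {"country_name": "Sweden", "year": 1900, "political_empowerment": 0.30},
    {"country_name": "Brazil", "year": 1900, "political_empowerment": 0.10},
    {"country_name": "Sweden", "year": 1905, "political_empowerment": 0.35},
    {"country_name": "Sweden", "year": 1910, "political_empowerment": 0.40},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    paths = export_by_year(sample_rows, tmp_dir)
    exported_years = sorted(paths)
    # Only the files that fall on a frame year feed the animation.
    frame_files = [paths[y] for y in frame_years() if y in paths]
    frame_count = len(frame_files)

print(exported_years, frame_count)
```

Each frame file would then be passed to the display system in turn, and the resulting images stitched together into the GIF.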
In Conclusion…
So what does this mean? From our analysis we can see that Northern Europe is a center of high levels of access to democracy for women. This contrasts with the lack of access in areas of the world that are typically considered less developed. In addition, our time series GIF shows that conditions for women are improving around the world. Specifically, parts of South America, Central Asia, and the coast of Africa changed from red to blue in our cluster analysis. This leads us to conclude that these regions have seen more dramatic change in women's access to democratic institutions over the 110-year period. This matters because it implies that conditions are getting better on a large scale and that the fight for equality and equal access to democratic principles is experiencing some success.
Special thanks to my coding partner Helen Chavey