Accessing Economic Data Using Python and the World Bank API with wbdata

Analyzing economic data is critical for a range of applications, from governmental policy-making to financial forecasting and academic research. The World Bank API provides a rich dataset, offering access to thousands of economic indicators from over 200 countries. In this post, we will use the wbdata Python library to query the API, retrieve GDP data for the last 50 years, and compare Brazil, Argentina, and other large global economies.

1. Introduction to the World Bank API and wbdata

The World Bank API allows developers to retrieve a wealth of data on topics such as GDP, inflation, life expectancy, and poverty rates. The Python library wbdata simplifies the process of accessing this API, allowing users to focus on data analysis without needing to manage raw HTTP requests manually.

Features of wbdata:

  • Over 16,000 economic indicators.
  • Filtering by country, date, and specific indicators.
  • Easy integration with pandas for data manipulation.
  • Parses the API's JSON responses into Python objects and pandas DataFrames, ideal for programmatic processing.

2. Installation of wbdata

To start working with the API using Python, install the necessary libraries:

pip install wbdata pandas matplotlib

This command installs:

  • wbdata: For interacting with the World Bank API.
  • pandas: For data manipulation.
  • matplotlib: For visualizing data.

3. Understanding the Structure of the wbdata Library

The wbdata library abstracts API calls, making it easier to query World Bank datasets with a few lines of code. It uses Python dictionaries for indicators and country codes (ISO 3166-1 alpha-2 format), and supports date-based querying.

Basic Structure of a Query:

import wbdata
import pandas as pd
import datetime

# Set the date range
start_date = datetime.datetime(1974, 1, 1)
end_date = datetime.datetime(2023, 1, 1)

# Select countries (ISO codes for Brazil, Argentina, and major global economies)
countries = ['BR', 'AR', 'US', 'CN', 'DE', 'JP', 'IN']

# Select the indicator (GDP in current US dollars)
indicator = {'NY.GDP.MKTP.CD': 'GDP (current US$)'}

# Retrieve the data from the World Bank API
df = wbdata.get_dataframe(indicator, country=countries, data_date=(start_date, end_date), convert_date=False)

# Display the first few rows
print(df.head())

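Explanation of Key Parameters:

  • indicator: A dictionary that maps World Bank indicator codes to the column names you want in the resulting DataFrame (e.g., 'NY.GDP.MKTP.CD' to 'GDP (current US$)').
  • country: A list of ISO 3166-1 country codes for the countries you want to query.
  • data_date: A tuple of (start, end) datetimes defining the time range for the query.
  • convert_date: When True, converts the date labels (strings such as '2022') into datetime objects; when False, they stay as strings.

These keyword names match wbdata releases before 1.0. As far as I know, release 1.0 renamed data_date to date and convert_date to parse_dates, so if your installed version raises a TypeError for these keywords, try the newer names and check the library's documentation.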
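Because the keyword names for the date range and date conversion differ between wbdata releases, a small helper can pick the right ones. The sketch below is an assumption-laden illustration, not part of wbdata: the function name date_kwargs is hypothetical, and it assumes that releases from 1.0 onward use date and parse_dates while earlier ones use data_date and convert_date.

```python
import datetime


def date_kwargs(wbdata_version, start, end):
    """Return date-related keyword arguments for wbdata.get_dataframe.

    Hypothetical helper. Assumes wbdata >= 1.0 expects `date` and
    `parse_dates`, and older releases expect `data_date` and `convert_date`.
    """
    major = int(wbdata_version.split(".")[0])
    if major >= 1:
        return {"date": (start, end), "parse_dates": False}
    return {"data_date": (start, end), "convert_date": False}


start = datetime.datetime(1974, 1, 1)
end = datetime.datetime(2023, 1, 1)

old_style = date_kwargs("0.3.0", start, end)
new_style = date_kwargs("1.0.0", start, end)

# Typical use (requires wbdata and network access, so it is left commented out):
#   import importlib.metadata, wbdata
#   kwargs = date_kwargs(importlib.metadata.version("wbdata"), start, end)
#   df = wbdata.get_dataframe(indicator, country=countries, **kwargs)
```

The rest of the query (the indicator dictionary and the country list) stays the same in either case.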

4. Downloading GDP Data for the Last 50 Years

The following example retrieves GDP data (in current US dollars) for Brazil, Argentina, and 28 other large global economies (30 countries in total) over the past 50 years. We will also convert the values to trillions of dollars.

import wbdata
import pandas as pd
import datetime

# Set date range: Last 50 years (1974-2023)
start_date = datetime.datetime(1974, 1, 1)
end_date = datetime.datetime(2023, 1, 1)

# Select countries (Brazil, Argentina, and top economies)
countries = ['BR', 'AR', 'US', 'CN', 'DE', 'JP', 'IN', 'FR', 'GB', 'IT', 'CA', 
             'KR', 'RU', 'ES', 'AU', 'MX', 'ID', 'TR', 'NL', 'CH', 'SA', 'NG', 
             'PL', 'SE', 'BE', 'TH', 'ZA', 'MY', 'PH', 'EG']

# Select indicator: GDP in current US dollars
indicator = {'NY.GDP.MKTP.CD': 'GDP (current US$)'}

# Retrieve the data
df = wbdata.get_dataframe(indicator, country=countries, data_date=(start_date, end_date), convert_date=False)

# Reshape the data
df = df.reset_index()
df_pivot = df.pivot(index='date', columns='country', values='GDP (current US$)')

# Convert GDP from US dollars to trillions
df_pivot = df_pivot / 1e12

# Move Brazil and Argentina to the front of the DataFrame
# (the pivoted columns are labeled with country names, not ISO codes)
front = ['Brazil', 'Argentina']
df_pivot = df_pivot[front + [col for col in df_pivot.columns if col not in front]]

# Display the last 10 years of data
print(df_pivot.tail(10))

# Save to CSV
df_pivot.to_csv('gdp_comparison.csv')

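This code block fetches the GDP data from the World Bank API, reshapes it with pandas so that each row is a year and each column is a country (labeled with the country name the API returns, such as "Brazil", rather than the ISO code), and converts the values to trillions of US dollars.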
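To see the reshaping step without calling the API, the sketch below builds a tiny stand-in for the DataFrame that wbdata returns. It assumes the result carries a (country, date) MultiIndex with those level names; the numbers are made-up placeholders, not real GDP figures.

```python
import pandas as pd

# Stand-in for the wbdata result: a (country, date) MultiIndex and one column.
# The values are placeholders chosen for easy arithmetic, NOT real GDP data.
index = pd.MultiIndex.from_product(
    [["Argentina", "Brazil", "United States"], ["2022", "2021"]],
    names=["country", "date"],
)
df = pd.DataFrame(
    {"GDP (current US$)": [1e12, 2e12, 3e12, 4e12, 5e12, 6e12]},
    index=index,
)

# Turn the index levels into ordinary columns, then pivot to one column per country.
flat = df.reset_index()
wide = flat.pivot(index="date", columns="country", values="GDP (current US$)")

# Express the values in trillions of US dollars.
wide = wide / 1e12

# Move Brazil and Argentina to the front; columns are country names.
front = ["Brazil", "Argentina"]
wide = wide[front + [col for col in wide.columns if col not in front]]

print(wide)
```

The pivot sorts the date labels in ascending order, which is why tail(10) in the full example yields the most recent ten years.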

5. Visualizing the Data

After retrieving and formatting the data, you may want to visualize it. The following example uses matplotlib to create a simple line chart comparing the GDP of Brazil, Argentina, the US, and China over the past 50 years.

import matplotlib.pyplot as plt

# Plot GDP comparison (columns are country names as returned by the API)
df_pivot[['Brazil', 'Argentina', 'United States', 'China']].plot(figsize=(10, 6))

# Customize the plot
plt.title('GDP Comparison (1974-2023)')
plt.ylabel('GDP (Trillions of US$)')
plt.xlabel('Year')
plt.legend(title="Countries")
plt.grid(True)

# Show the plot
plt.show()

6. Key World Bank API Endpoints and Contracts

The World Bank API provides several key endpoints to access different types of data. Below is a summary of the most relevant endpoints along with their request formats and response structures.

Key Endpoints:

| Endpoint | Description | Request Format | Response Format |
|---|---|---|---|
| /v2/country/{country}/indicator/{indicator} | Retrieves time series data for a specific country and indicator. | /v2/country/BR/indicator/NY.GDP.MKTP.CD | JSON, XML |
| /v2/country | Retrieves metadata for countries or country groupings. | /v2/country | JSON, XML |
| /v2/indicator | Retrieves metadata for all available indicators. | /v2/indicator | JSON, XML |
| /v2/source | Retrieves metadata for data sources available in the API. | /v2/source | JSON, XML |
| /v2/topic | Retrieves metadata for topics and sectors. | /v2/topic | JSON, XML |
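The time-series endpoint can be turned into a full request URL with a few lines of standard-library Python. This is a minimal sketch of what wbdata does for you, assuming the public base URL https://api.worldbank.org/v2, semicolon-separated country codes, and the format, date, and per_page query parameters; the helper name build_indicator_url is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.worldbank.org/v2"


def build_indicator_url(countries, indicator, start_year, end_year, per_page=1000):
    """Build a time-series request URL (hypothetical helper, not part of wbdata)."""
    # Several countries can be requested at once by joining codes with ';'.
    country_part = ";".join(countries)
    # format=json asks for JSON instead of the default XML;
    # date=START:END restricts the range of years.
    query = urlencode(
        {"format": "json", "date": f"{start_year}:{end_year}", "per_page": per_page},
        safe=":",
    )
    return f"{BASE_URL}/country/{country_part}/indicator/{indicator}?{query}"


url = build_indicator_url(["BR", "AR"], "NY.GDP.MKTP.CD", 1974, 2023)
print(url)
```

Pasting the printed URL into a browser is a quick way to inspect the raw response before writing any parsing code.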

Example Contracts for Key Endpoints:

  1. Time Series Data for a Specific Indicator:
  • Request URL: /v2/country/BR/indicator/NY.GDP.MKTP.CD
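  • Response Example (JSON, simplified: the live API wraps these records in a two-element array whose first element is paging metadata, and the values shown are illustrative):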
   [
       {
           "indicator": {"id": "NY.GDP.MKTP.CD", "value": "GDP (current US$)"},
           "country": {"id": "BR", "value": "Brazil"},
           "date": "2022",
           "value": 1.609383e12
       },
       {
           "indicator": {"id": "NY.GDP.MKTP.CD", "value": "GDP (current US$)"},
           "country": {"id": "BR", "value": "Brazil"},
           "date": "2021",
           "value": 1.520223e12
       }
   ]
  2. List of Countries:
  • Request URL: /v2/country
  • Response Example (JSON, simplified):
   [
       {
           "id": "BRA",
           "iso2Code": "BR",
           "name": "Brazil",
           "region": {"id": "LCN", "value": "Latin America & Caribbean"},
           "incomeLevel": {"id": "UMC", "value": "Upper middle income"},
           "capitalCity": "Brasilia"
       },
       {
           "id": "ARG",
           "iso2Code": "AR",
           "name": "Argentina",
           "region": {"id": "LCN", "value": "Latin America & Caribbean"},
           "incomeLevel": {"id": "UMC", "value": "Upper middle income"},
           "capitalCity": "Buenos Aires"
       }
   ]
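If you call the time-series endpoint directly, the JSON body arrives as a two-element array: paging metadata first, then the list of records. The sketch below parses such a body with the standard library. The sample text reuses the illustrative Brazil values from the example above and adds one placeholder record with a null value to show how missing observations can be skipped; the function name records_to_series is hypothetical.

```python
import json

# Sample body in the shape the API uses: [paging metadata, list of records].
# The 2020 record with a null value is a placeholder to exercise the
# missing-data branch; it does not describe real data.
sample_body = """
[
  {"page": 1, "pages": 1, "per_page": 50, "total": 3},
  [
    {"indicator": {"id": "NY.GDP.MKTP.CD", "value": "GDP (current US$)"},
     "country": {"id": "BR", "value": "Brazil"},
     "date": "2022", "value": 1.609383e12},
    {"indicator": {"id": "NY.GDP.MKTP.CD", "value": "GDP (current US$)"},
     "country": {"id": "BR", "value": "Brazil"},
     "date": "2021", "value": 1.520223e12},
    {"indicator": {"id": "NY.GDP.MKTP.CD", "value": "GDP (current US$)"},
     "country": {"id": "BR", "value": "Brazil"},
     "date": "2020", "value": null}
  ]
]
"""


def records_to_series(body_text):
    """Return (metadata, {year: value}) with years ascending and nulls dropped."""
    metadata, records = json.loads(body_text)
    series = {}
    for record in records:
        if record["value"] is not None:
            series[int(record["date"])] = record["value"]
    return metadata, dict(sorted(series.items()))


metadata, series = records_to_series(sample_body)
print(metadata["total"], series)
```

When the metadata reports more than one page, the remaining pages have to be requested separately; wbdata handles that paging for you.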

7. Summary Table of Commands

Here’s a summary of key wbdata commands and their purposes:

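| Command | Description |
|---|---|
| wbdata.get_dataframe(indicator, country) | Fetches data for specific indicators and countries in DataFrame format. |
| wbdata.get_source() | Retrieves all available data sources from the World Bank API. |
| wbdata.search_indicators() | Searches for indicators by keyword (e.g., GDP, unemployment). |
| wbdata.get_country() | Retrieves metadata about countries and regions. |
| wbdata.get_topic() | Retrieves metadata on specific topics or sectors (e.g., Education). |

These names follow wbdata releases before 1.0. To my knowledge, version 1.0 renamed several of them to plural forms (for example get_countries, get_sources, and get_topics) and folded keyword search into get_indicators(query=...), so check the documentation for the version you have installed.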

Conclusion

The wbdata library provides an efficient way to access vast datasets from the World Bank API. Whether you are comparing the economic growth of nations or performing in-depth statistical analysis, this tool, combined with Python’s powerful data manipulation libraries, opens up numerous possibilities. You can further explore the World Bank’s vast array of economic indicators by modifying the country codes, date ranges, and indicators in your queries.

Published by Edvaldo Guimrães Filho
