2  Getting the data

The purpose of the workshop is to:

In the presentation we want to see:

Data can be both spatial and non-spatial. The former is some kind of geographic object (often a shapefile); the latter is a dataset file (such as a csv). Typically these files share a unique ID, such as a district or ward code, that we can join them on.
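As a minimal sketch of such a join, assume both files share a column called `ward_id`. The column names, ward codes, population figures and geometries below are all made up for illustration and do not come from any of the datasets in this section. A real workflow would read the shapefile with a spatial library; here the spatial layer is reduced to a plain list of records so the joining logic is easy to see.

```python
import csv
import io

# Hypothetical non-spatial table (would normally be a csv file on disk).
CENSUS_CSV = """ward_id,population
W01,52000
W02,47500
W04,39000
"""

# Hypothetical spatial layer: one record per ward, geometry stored as WKT text.
wards = [
    {"ward_id": "W01", "geometry": "POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))"},
    {"ward_id": "W02", "geometry": "POLYGON((1 0, 2 0, 2 1, 1 1, 1 0))"},
    {"ward_id": "W03", "geometry": "POLYGON((2 0, 3 0, 3 1, 2 1, 2 0))"},
]


def join_on_id(spatial_rows, table_rows, key):
    """Left join: keep every spatial row and attach matching table columns."""
    lookup = {row[key]: row for row in table_rows}
    joined, unmatched = [], []
    for feature in spatial_rows:
        match = lookup.get(feature[key])
        if match is None:
            # No row in the table for this ID: keep the shape, note the gap.
            unmatched.append(feature[key])
            joined.append(dict(feature))
        else:
            joined.append({**feature, **match})
    return joined, unmatched


census_rows = list(csv.DictReader(io.StringIO(CENSUS_CSV)))
joined, unmatched = join_on_id(wards, census_rows, "ward_id")

for row in joined:
    print(row["ward_id"], row.get("population", "no data"))
print("Wards without a match:", unmatched)
```

Listing the unmatched IDs is worth doing on any real join: it is usually the quickest way to spot mismatched spellings or codes between the two files.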

2.1 Datasets

  • Open Data City
    • Search Hyderabad
    • Points for: Schools, health care facilities, parks and playgrounds, toilets, slums, Annapurna meals
    • Polygons for: Wards
    • Non-spatial data: Census, Education data (number of places)
  • OpenStreetMap
    • Roads
    • Points of interest (e.g. schools, tourist attractions, shops, health care facilities, restaurants, bakeries)
    • Buildings
    • Railways
  • Population per 1km grid cell
    • For a variety of years

Census data

I have found a variety of sources that contain census data. Some make it difficult to join the data, as there is no matching column. Last year the Development Data Lab compiled this data for us!

You can read more about their work in their Medium article.

As village boundaries can change between census years, it is challenging to compare values over time. In response, they have developed a standard zone called a shrid that keeps the boundaries the same across all census years! You can decide which boundary is most appropriate for your question: do you care about past data, or is the most recent census data sufficient?

In the 2011 village data the columns stand for:

  • pc11_s_id = state
  • pc11_d_id = district
  • pc11_sd_id = subdistrict
  • pc11_tv_id = village
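One possible use of these columns is to combine them into a single key when joining village-level tables. The sketch below assumes the codes are zero-padded text with widths of 2, 3, 5 and 6 digits. Those widths, the helper name `village_key` and the sample values are assumptions for illustration only, so check them against the actual files. It also shows a common pitfall: if a tool reads the codes as numbers, the leading zeros are dropped and the join silently fails.

```python
# Assumed width of each ID column (check against the real data).
ID_WIDTHS = {
    "pc11_s_id": 2,   # state
    "pc11_d_id": 3,   # district
    "pc11_sd_id": 5,  # subdistrict
    "pc11_tv_id": 6,  # village
}


def village_key(row):
    """Build one composite key from the four ID columns of a row."""
    parts = []
    for column, width in ID_WIDTHS.items():
        # Convert to text and restore any leading zeros that were lost.
        parts.append(str(row[column]).strip().zfill(width))
    return "-".join(parts)


# Made-up example: the same village, once with codes read as numbers
# (leading zeros lost) and once read as text.
as_numbers = {"pc11_s_id": 1, "pc11_d_id": 2, "pc11_sd_id": 34, "pc11_tv_id": 567}
as_text = {"pc11_s_id": "01", "pc11_d_id": "002", "pc11_sd_id": "00034", "pc11_tv_id": "000567"}

print(village_key(as_numbers))
print(village_key(as_text))
```

Both rows produce the same key, so tables that stored the codes differently can still be matched. Reading the ID columns as text in the first place avoids the problem altogether.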

The same site also hosts:

  • Election data
  • Facebook population and estimated wealth
  • 2012-2021 Night-time lights (used for monitoring GDP / urban growth)
  • Pollution data (from satellites)
  • Consumption and poverty estimates