The changing energy landscape requires rigorous analysis to support robust investment and policy decisions. Because power systems are complex, researchers and analysts often rely on large numerical computer models for a variety of purposes, ranging from price projections to policy advice and system planning. Such models include unit commitment, dispatch, and generation expansion models.

These models require a large amount of input data, such as information about existing power stations, interconnector capacity, yearly electricity consumption, and ancillary service requirements, but also (hourly) time series of load, wind and solar power generation, and heat demand. Fortunately, most of these data are publicly available, from sources such as transmission system operators, regulators, or industry associations.

However, data collection is tedious. The bits and pieces of data are sometimes hard to find, often poorly documented, and almost always time-consuming to process: files are provided in different formats; downloading requires repetitive manual clicking; data structures differ between sources and are incompatible; daylight saving time and leap years are treated differently; URLs change frequently; and older data are updated without informing users (and sometimes deleted altogether).

Duplicated work is inefficient. Currently dozens, if not hundreds, of modelling teams in Europe spend significant resources gathering and processing data, all doing essentially the same thing. Highly skilled people spend a lot of time gathering data, time that would be better spent doing actual research. We, the project team members, are power system modellers and have gone through this sometimes quite frustrating process ourselves. Providing aggregated and aligned data in a central place is a public good that has been in short supply.

Moreover, the licenses and conditions under which data can be used are often unclear. Data owners frequently exclude commercial use of their data, leaving energy companies and consulting firms in a situation of legal uncertainty.

Duplicated work, poor documentation, and legal uncertainty are an unsatisfactory state of affairs. This is why we set up the project Open Power System Data.

The idea to set up an open platform for data required by energy system models was born in 2014, around the time when a number of energy system researchers interested in advancing open source and open data in their field gathered for the first time in Berlin and formed what is now the Open Energy Modelling Initiative (short: openmod initiative). Frauke Wiese, then at Europa-Universität Flensburg, and Lion Hirth, founder of Neon Neue Energieökonomik, took the lead and formed a team from four institutions to apply for funding from the German Federal Ministry for Economic Affairs and Energy to develop an open platform for power system modelling: OPSD.

The first project phase ran from August 2015 to July 2017 and was carried out by Europa-Universität Flensburg, DIW Berlin, Technical University of Berlin, and Neon Neue Energieökonomik. It was funded by the German Federal Ministry for Economic Affairs and Energy. Many different people have worked on the project over time. We are very thankful to the former team members and student assistants who have supported us, and we wish to thank all our stakeholders who have provided valuable input.

The current project phase runs from January 2018 until December 2020 and is carried out by Neon Neue Energieökonomik, Technical University of Berlin, DIW Berlin, and ETH Zürich. There will be regular releases, at least yearly, of all existing data packages, and new variables will be added to the data packages.