The production and use of development data have undergone significant transformation over the past two decades. The shift from paper-based records to digital formats has made data more accessible and easier to share. The open data movement has dramatically increased the availability of government and institutional datasets, which in turn catalyzed greater opportunities for analysis, transparency, and innovation. And major advances in big data and data science have further expanded both the volume and diversity of information guiding development policy.
Amid rapid advances in artificial intelligence (AI), development data has now reached another pivotal juncture: the evolution to AI-ready development data—data that is readily discoverable, comprehensible, accessible, and usable by both humans and AI applications.
AI, particularly large language models (LLMs), is transforming the way people interact with data. Data users at all levels of experience and expertise, from first-timers to power users, can now pose complex questions in natural language to chatbots, which they expect to promptly find, interpret, and present data-driven insights as concise, accurate responses.
For this evolution to be successful, AI systems need to get it right. This means the data being accessed and interpreted by AI systems must first be evaluated, validated, structured, governed, and shared in ways that support the responsible and effective use of AI. In short, the data must be “AI-ready.”
AI-ready data does not supplant earlier advancements, foundational concepts, or standards (such as the Fundamental Principles of Official Statistics, open data frameworks, or the FAIR principles: Findable, Accessible, Interoperable, and Reusable); rather, it builds on them. By extending these established foundations, AI-readiness keeps development data open, discoverable, and reusable, and ensures that it is systematically organized and well documented so that both people and AI systems can use it seamlessly. Ensuring AI-readiness can thus shorten the distance between development data and decision-making for better policies and faster innovation, democratizing development insights. The World Bank, in its efforts to become a bigger, better “Data Bank,” is already working to make this happen, in partnership with country partners and the global development community.
Generative AI has emerged as a key interface for individuals seeking information, including on development-related topics. Platforms such as Google’s AI Overviews, Microsoft’s Bing, Perplexity.AI, and OpenAI’s ChatGPT comb through the internet and combine different sources of information to generate responses to user queries. The challenge, of course, is that AI responses are only as authoritative as the data that feed them. And the reality is that these systems frequently draw upon general internet content (including unproven sources) or web search results, rather than prioritizing authoritative data sources like the World Bank or national statistical offices.
Since current AI systems often select suboptimal development data sources, users regularly encounter outdated or incorrect responses, even when accurate information is otherwise available. This is problematic, since most AI responses have the appearance of providing authoritative information, even as they hallucinate.
It is important to emphasize that high-quality, authoritative development data is not scarce. In other words, AI tools do not need to rely on suboptimal data sources to form responses to queries about development topics. What is missing is a standardized framework and robust infrastructure to enable AI tools to consistently find, access, and use reliable development data from trusted sources to deliver accurate answers to user questions.
AI-ready development data can help overcome this information integrity problem. When governments, international organizations, and the private sector adopt interoperability protocols and standards, AI systems can access and use trusted development data seamlessly. Doing so will help support evidence-based decision-making, enhance public access to reliable information, and promote trust in authoritative sources of development data and statistics.
AI-ready development data is systematically organized and thoroughly documented to ensure its meaning and context are clear not only for subject matter experts, but also for general users and AI systems.
Three core pillars define AI-ready development data:
By leveraging these foundational elements, development data becomes an accessible asset to all stakeholders. AI-ready data is positioned to enhance public access, enable advanced insights through AI, and facilitate more rapid and informed decision-making throughout society.
To operationalize these foundational pillars, we must translate principles into actionable steps. Development data encompasses several forms, including indicators, microdata, and geographic datasets. While the following recommendations can be adapted to different types of data, they are especially tailored for indicators.
The World Bank’s Development Data Group and Office of the Chief Statistician is actively investing in these domains, including piloting advanced search tools, developing embedding models for low-resource contexts, integrating APIs, and developing an MCP server to support the new Data360 platform and other selected datasets.
The World Bank, through its Data Quality and AI for Data / Data for AI work programs, advances these initiatives by providing open-source resources, including the Metadata Editor, comprehensive guidelines for creating high-quality metadata, and pilot frameworks that leverage AI to efficiently assess and enhance the quality of metadata.
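To make the idea of assessing metadata quality concrete, here is a minimal sketch of a completeness check for a single indicator record. It is a simple rule-based check, not the AI-assisted frameworks mentioned above, and it does not represent the Metadata Editor or any World Bank schema. The field names in `REQUIRED_FIELDS`, the function `metadata_completeness`, and the example record are all hypothetical, chosen only for illustration.

```python
# Hypothetical list of fields an indicator record should document.
# These names are assumptions for illustration, not an official schema.
REQUIRED_FIELDS = (
    "name",
    "definition",
    "unit",
    "source",
    "periodicity",
    "methodology",
)


def metadata_completeness(record: dict) -> tuple[float, list[str]]:
    """Return the share of required fields that are filled in, plus the missing ones."""
    # A field counts as missing if it is absent, None, or only whitespace.
    missing = [
        name
        for name in REQUIRED_FIELDS
        if not str(record.get(name) or "").strip()
    ]
    score = 1 - len(missing) / len(REQUIRED_FIELDS)
    return score, missing


# Illustrative record with made-up values; "source" is empty and
# "methodology" is absent, so both are reported as missing.
example_record = {
    "name": "Illustrative poverty headcount ratio",
    "definition": "Share of the population living below a poverty line.",
    "unit": "% of population",
    "source": "",
    "periodicity": "Annual",
}

score, missing = metadata_completeness(example_record)
print(f"completeness={score:.2f}, missing={missing}")
```

A check like this only flags gaps in documentation. Judging whether a definition is actually clear to a general user or an AI system would require the kind of deeper review the programs above describe.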
The World Bank is establishing partnerships among countries, the private sector, and international organizations, including the United Nations Statistical Commission, the IMF, the OECD, and the African Development Bank (AfDB), to promote governance and the adoption of global standards and mechanisms for managing development data effectively and using it with AI systems.
Development data differs from most private sector data as it must meet the needs of diverse users, including governments, organizations, researchers, civil society, businesses, and the public. Treated as public intent data, it requires openness, transparency, and accountability. Since development data influences policy and investment decisions across countries and systems, interoperability and thorough documentation are essential.
The continuous use and reuse of development data generate compounding value. By making development data AI-ready and accessible to AI-powered solutions across both public and private sectors, we will increase its impact, promote more equitable sharing of benefits, and strengthen trust that data will be used responsibly. AI can help us unlock broader and potentially transformative economic and social value from data, enhancing our efforts to improve lives, drive economic development, and end poverty.
The transition to AI-ready development data is both urgent and extensive. Realizing this objective will necessitate:
We encourage national statistical offices, data producers, policymakers, and technology partners to participate in this initiative. Through collaborative effort and the necessary adoption of global data quality standards, we can ensure that development data continues to serve as a reliable, inclusive, and robust resource for the public good as we progress into the Age of AI.
Let us work collectively to prepare development data for the future and ensure its benefits are accessible to all.
Source: blogs.worldbank.org