Dec 28, 2022

List of Open Datasets That Can Be Used As JSON Sample Data


Where can you find free and open datasets for your next project that requires JSON data? Some influential organizations and businesses publicly provide their data for educational and research purposes.

Say, for example, you are a data scientist searching for free and open data sources for an academic project. Or you need some mock data to test your software system and want a head start. The open and free data sources below are a good place to begin.

In this post, we give you an organized and diverse list of these resources, picked for how useful they are in practice. We specifically chose those that offer the JSON format.

Name | Category | Topic | Access right
WorldBank | Finance | GDP and financial challenges and solutions. | Free
CoinMarketCap | Finance | Cryptocurrency | Free
Open Notify | Data Science | Open APIs from space and spacecraft | Free and open
PokeAPI | Data Science | Pokémon game data | Free and open
Dstl | Data Science | Machine learning datasets | Free and open
Kaggle datasets | Data Science | Educational datasets | Free and open
Yelp | Data Science | Educational, databases, and NLP learning | Free and open
Pile | Data Science | Language modeling dataset | Free and open
iSAID | Educational | Instance Segmentation in Aerial Images | Free and open
SpaceNetChallenge | Educational | Open data registry for researchers, businesses, and individuals | Free and open
Australian government | Government | Nation's public studies | Free and open
FDA | Government | Public data about food and drug | Free and open
data.police.uk | Government | Crime and policing in England | Open Government License
Scottish Parliament | Government | Parliament produced open datasets | Free and open
httpbin | Mock data | HTTP Service | Free and open
Mockaroo | Mock data | Mock data generator | Limited free usage, pricing for bigger datasets
NASA Earth data | Public data | Your Gateway to NASA Earth Observation Data | Free and open
Archive.org | Public data | Public datasets from websites or individuals | Free and open
Vizgr | Statistic | Statistic data on a variety of topics | Free and open, protected by copyright.
Controlled Vocabulary Services | Other datasets | Country and politics-related datasets | Free and open
OSI | Other datasets | The leading voice on the policies and principles of open source | Free
Hugging Face | Other datasets | AI-related datasets | Free for basic usage, and with license pricing for professionals.
TVMaze | Other datasets | TV information public API | Free

In finance, data is a basic requirement for a trusted business environment, although obtaining it comes with challenges. Every company needs to analyze and refine its ideas, so datasets that help you connect the dots and locate insights are a valuable asset. Access to transparent, legally usable data is essential when you build a tech, data analysis, or SaaS product.

  • World Bank This institution develops financial products and provides technical assistance to help countries find solutions to the challenges they encounter.

  • CoinMarketCap This resource offers cryptocurrency-related data through an advanced API and is a leading provider of crypto market data.
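As an illustration of working with JSON from a finance source, here is a minimal sketch for the World Bank. It assumes the public API lives at `https://api.worldbank.org/v2`, that adding `format=json` returns a two-element list (page metadata first, data rows second), and that each row carries `date` and `value` fields. Check these assumptions against the official documentation before relying on them. The helper names and the sample payload are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.worldbank.org/v2"  # assumed base URL of the public API


def indicator_url(country: str, indicator: str, per_page: int = 50) -> str:
    """Build the request URL for one country and one indicator code."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


def values_by_year(payload: list) -> dict:
    """Map year -> value, skipping rows where no value is reported."""
    # Assumed shape: [metadata, rows]; anything shorter is treated as "no data".
    if len(payload) < 2 or not payload[1]:
        return {}
    return {row["date"]: row["value"] for row in payload[1] if row["value"] is not None}


def fetch_indicator(country: str, indicator: str, timeout: float = 10.0) -> dict:
    """Query the live service (needs network access)."""
    with urlopen(indicator_url(country, indicator), timeout=timeout) as response:
        return values_by_year(json.load(response))


# Hand-written payload with placeholder numbers; this is not real World Bank data.
SAMPLE_PAYLOAD = [
    {"page": 1, "pages": 1, "per_page": 50, "total": 3},
    [
        {"date": "2021", "value": 2.0},
        {"date": "2020", "value": 1.0},
        {"date": "2019", "value": None},
    ],
]

if __name__ == "__main__":
    print(values_by_year(SAMPLE_PAYLOAD))
```

Keeping the parsing in `values_by_year` separate from the network call makes it easy to test against a saved response.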

When developing an academic or educational project, finding the right data resources is a major step. Data science projects cover a wide range of use cases. Training and validating machine learning techniques and natural language processing (NLP) methods are among the most notable ones.

An index of publicly available data for examination or study might be a good start. The following websites generate or collect high-quality data and offer it as open datasets or APIs.

  • Open Notify An open-source project that offers a RESTful API for some of NASA's raw data.

  • PokeAPI On this website, you can access data from the Pokémon games through a RESTful API. The API covers information from the main-series video games.

  • Dstl This data provider gathers standard datasets that can be used to train and validate machine learning strategies and NLP methods.

  • Kaggle JSON sample datasets In this data resource, you can explore and examine academic datasets available in different data formats, including JSON.

  • Yelp Open Dataset The Yelp dataset is a subset of Yelp's user data offered for private or academic purposes. The datasets are available as JSON files, and anyone can use them to exercise databases, practice NLP methods, or stand in for production data.

  • Pile The Pile is a large-scale, diverse language-modeling dataset made up of smaller, high-quality datasets. All of them are open source and publicly available.
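Large downloadable JSON datasets such as these are often easier to handle one record at a time. The sketch below assumes a file that stores one JSON object per line (the "JSON Lines" layout), which you should confirm for the file you actually download. The `business_id` and `stars` fields are hypothetical and only serve the example.

```python
import io
import json
from typing import Iterator, TextIO


def iter_json_lines(stream: TextIO) -> Iterator[dict]:
    """Yield one decoded object per non-empty line, without loading the whole file."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError as error:
            raise ValueError(f"line {line_number} is not valid JSON: {error}") from error


# Hand-written stand-in for a downloaded file; not real dataset content.
SAMPLE_FILE = io.StringIO(
    '{"business_id": "a1", "stars": 4.5}\n'
    "\n"
    '{"business_id": "b2", "stars": 3.0}\n'
)

records = list(iter_json_lines(SAMPLE_FILE))
print(len(records), records[0]["business_id"])  # prints: 2 a1
```

With a real file, you would pass `open(path, encoding="utf-8")` instead of the in-memory sample and process each record inside the loop rather than collecting them all in a list.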

When data is accessible, it enables learners to find more suitable solutions to real-life academic and scientific problems. To meet these needs, the datasets must have certain qualities, including completeness, consistency, clear boundaries, and uniformity.

  • iSAID This resource provides a large-scale dataset for "Instance Segmentation in Aerial Images".

  • SpaceNetChallenge This resource contains publicly available datasets that are hosted on the AWS cloud service. The cloud service only hosts these datasets; it does not generate them.

Here is a varied list of publicly available government datasets. They contain everyday, real-world data about people's lives on government-related topics such as policy, food, and medicine. These datasets cover a wide range of topics, schemas, and volumes, which makes them useful for analyzing or testing related projects.

  • Australian government This is the Australian government's public data resource. Anyone can access the anonymized public data it publishes.

  • FDA This resource provides datasets about food and pharmaceuticals. Its stated mission is to protect the public with respect to medicines and food preparation.

  • data.police.uk Open datasets on crime and policing in England. All of them can be downloaded in CSV format and are also accessible through an open API.

  • Scottish Parliament The Scottish Parliament publishes the data it produces as open datasets.

Mock data is important for testing software systems, because it lets you check how a system behaves with data in a realistic schema and at a realistic scale.

  • httpbin This provider is an HTTP service that generates random and dynamic responses you can use as mock data.

  • Mockaroo This project is a mock data generator that supports custom schemas and a variety of export formats, including JSON and CSV.
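To show how such a service can help when testing JSON handling, here is a small sketch that posts a JSON document and reads back what the server says it received. It assumes the public httpbin instance exposes a `/post` endpoint that echoes the parsed request body under a `json` key. Treat the URL and the key name as assumptions to verify. The helper names are invented for this example.

```python
import json
from urllib.request import Request, urlopen

ECHO_URL = "https://httpbin.org/post"  # assumed echo endpoint


def build_json_post(url: str, payload: dict) -> Request:
    """Create a POST request whose body is the payload encoded as JSON."""
    body = json.dumps(payload).encode("utf-8")
    return Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")


def echoed_json(response_body: dict):
    """Return the document the service reports having received (assumed key: "json")."""
    return response_body.get("json")


def round_trip(payload: dict, url: str = ECHO_URL, timeout: float = 10.0):
    """Send the payload to the live service and return its echo (needs network access)."""
    with urlopen(build_json_post(url, payload), timeout=timeout) as response:
        return echoed_json(json.load(response))


if __name__ == "__main__":
    request = build_json_post(ECHO_URL, {"id": 1, "tags": ["mock"]})
    print(request.get_method(), request.data.decode("utf-8"))
    # Uncomment to check that the service echoes the same document:
    # print(round_trip({"id": 1, "tags": ["mock"]}))
```

If the assumptions hold, `round_trip(payload) == payload` is a quick check that your client serializes and sends JSON the way you expect.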

Public open data empowers anyone who is looking for suitable resources to check their systems and project outputs under real-world conditions. These publicly available datasets cover many topics, from countries and populations to spacecraft and the game industry.

  • Archive.org This site gathers large datasets in numerous areas and topics from other websites and individuals.

  • NASA Earth data Here you can find NASA's Earth observations, publicly available as open data. Tens of thousands of datasets are available for any use case.

When you research a large-scale problem and its possible solutions from different angles, demographic and statistical data plays an important role. To reach precise and valid results, you need resources that cover a vast range of topics and scales.

  • Vizgr This resource provides statistical data on different topics, from historical events and research papers to the economy and scientific projects. Visualization and data access are free, and the content is protected by copyright.

Here we have gathered other datasets and services that offer open data.

  • Controlled Vocabulary Services This resource offers country and politics-related datasets, and more.

  • OSI open source datasets The Open Source Initiative (OSI) is the leading voice on the policies and principles of open source and provides a foundation for the open-source software ecosystem. Its data is accessible through a public RESTful API.

  • Hugging Face This resource offers AI-related datasets, and a scalable machine learning computing system is available directly from its hub.

  • TVMaze This resource provides TV show information through a free public REST API, with search and documentation.
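Most of the REST APIs in this list can be queried with a few lines of code. The sketch below uses TVMaze as the example and assumes a show search endpoint at `https://api.tvmaze.com/search/shows` that takes a `q` parameter and returns a list of items, each with a `score` and a nested `show` object. These details are assumptions to confirm in the API documentation. The sample data is hand-written and the function names are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_URL = "https://api.tvmaze.com/search/shows"  # assumed search endpoint


def search_url(query: str) -> str:
    """Build the search URL, escaping the query text."""
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


def show_names(results: list, limit: int = 5) -> list:
    """Return the names of the best-scoring shows, highest score first."""
    ranked = sorted(results, key=lambda item: item.get("score", 0), reverse=True)
    return [item["show"]["name"] for item in ranked[:limit]]


def search_shows(query: str, timeout: float = 10.0) -> list:
    """Query the live service (needs network access)."""
    with urlopen(search_url(query), timeout=timeout) as response:
        return show_names(json.load(response))


# Hand-written sample mimicking the assumed response shape; not real API output.
SAMPLE_RESULTS = [
    {"score": 0.4, "show": {"id": 2, "name": "Second Example"}},
    {"score": 0.9, "show": {"id": 1, "name": "First Example"}},
]

if __name__ == "__main__":
    print(search_url("star trek"))
    print(show_names(SAMPLE_RESULTS))
```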

We carefully picked and described these websites to give you a thorough overview of the open-data providers out there. We focused on JSON because it is close to being the de facto format for exchanging data across the tech industry. JSON comes in handy in many situations, although opening large-scale JSON files can be a challenge. If you have a JSON file to explore on your journey, you can use our Dadroit JSON Viewer; it is fast, and free for academics!