List of Open Datasets That Can Be Used As JSON Sample Data
Where can you find free and open datasets for your next project that needs JSON data? Several influential organizations and businesses publish their data for educational and research purposes.
Say, for example, you are a data scientist searching for free and open data sources for an academic project, or you need some mock data to test your software system. The free and open data sources below should give you a head start.
In this post, we give you an organized and diverse list of these resources. We specifically picked those that offer their data in JSON format.
Name | Category | Topic | Access right |
---|---|---|---|
World Bank | Finance | GDP and financial challenges and solutions | Free |
CoinMarketCap | Finance | Cryptocurrency market data | Free |
Open Notify | Data Science | Open APIs for space and spacecraft data | Free and open |
PokeAPI | Data Science | Pokémon game data | Free and open |
Dstl | Data Science | Machine learning datasets | Free and open |
Kaggle datasets | Data Science | Educational datasets | Free and open |
Yelp | Data Science | Educational, databases, and NLP learning | Free and open |
The Pile | Data Science | Language modeling dataset | Free and open |
iSAID | Educational | Instance segmentation in aerial images | Free and open |
SpaceNetChallenge | Educational | Open data registry for researchers, businesses, and individuals | Free and open |
Australian government | Government | The nation's public data | Free and open |
FDA | Government | Public data about food and drugs | Free and open |
data.police.uk | Government | Crime and policing in England, Wales, and Northern Ireland | Open Government Licence |
Scottish Parliament | Government | Open datasets produced by the parliament | Free and open |
httpbin | Mock data | HTTP request and response service | Free and open |
Mockaroo | Mock data | Mock data generator | Limited free usage, paid plans for bigger datasets |
NASA Earth data | Public data | NASA Earth observation data | Free and open |
Archive.org | Public data | Public datasets from websites or individuals | Free and open |
Vizgr | Statistics | Statistical data on a variety of topics | Free and open, protected by copyright |
Controlled Vocabulary Services | Other datasets | Country and politics-related datasets | Free and open |
OSI | Other datasets | The leading voice on the policies and principles of open source | Free |
Hugging Face | Other datasets | AI-related datasets | Free for basic usage, paid licenses for professionals |
TVMaze | Other datasets | TV information public API | Free |
Finance public data providers
In finance, reliable data is the basis of a trusted business environment, although it is not always easy to obtain. Datasets that help a company connect the dots and surface insights are a valuable asset, and access to transparent, legally usable data is essential when you build a tech, data analysis, or SaaS product.
- World Bank This institution develops financial products and provides technical assistance to help countries resolve the challenges they face.
- CoinMarketCap This resource offers cryptocurrency-related data through an advanced API and is a widely used source of crypto market data.
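As a minimal sketch of pulling JSON from one of these providers, the Python snippet below builds a request for the World Bank's Indicators API and turns the response into a year-to-value mapping. The base URL, the `format=json` and `per_page` parameters, and the two-element `[metadata, rows]` response shape reflect our understanding of that API and should be checked against its documentation. The function names are our own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the World Bank Indicators API (verify against the docs).
BASE_URL = "https://api.worldbank.org/v2"


def build_indicator_url(country: str, indicator: str, per_page: int = 50) -> str:
    """Build a request URL that explicitly asks for a JSON response."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


def parse_indicator_payload(payload: list) -> dict:
    """Map year -> value, skipping years that have no reported value.

    Assumes the payload is a two-element list: [paging metadata, rows],
    where each row carries a "date" and a "value" field.
    """
    _meta, rows = payload
    return {
        row["date"]: row["value"]
        for row in (rows or [])
        if row["value"] is not None
    }


def fetch_indicator(country: str, indicator: str) -> dict:
    """Download and parse one indicator series (requires network access)."""
    url = build_indicator_url(country, indicator)
    with urlopen(url, timeout=30) as response:
        return parse_indicator_payload(json.load(response))
```

Under those assumptions, `fetch_indicator("BR", "NY.GDP.MKTP.CD")` would return a GDP series for Brazil keyed by year, where `NY.GDP.MKTP.CD` is assumed to be the indicator code for GDP in current US dollars. Parsing is kept separate from downloading so that it can be tested against a saved response without network access.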
Data science free public datasets
When you develop an academic or educational project, finding the right data resources is a major step. Data science projects cover a wide range of use cases; training and validating machine learning models and natural language processing (NLP) methods are among the most notable.
An index of publicly available data is a good place to start. The following websites generate or collect high-quality data and offer it as open datasets or APIs.
- Open Notify An open-source project that offers a simple RESTful API for some of NASA's data.
- PokeAPI This website gives you access to data from the Pokémon video games through a RESTful API. The API provides information about the main Pokémon game series.
- Dstl This data provider publishes standard datasets that can be used to train and validate machine learning and NLP methods.
- Kaggle JSON sample datasets In this data resource, you can explore and examine academic datasets available in different data formats, including JSON.
- Yelp Open Dataset The Yelp dataset is a subset of Yelp's data offered for personal and academic purposes. It is available as JSON files, and anyone can use it to learn about databases, practice NLP methods, or stand in for sample production data.
- The Pile The Pile is a large-scale, diverse language modeling dataset composed of many smaller, high-quality datasets. All of them are open source and publicly available.
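Large dataset dumps are often easier to handle one record at a time. The sketch below streams a JSON Lines source, assuming (as we understand to be the case for the Yelp files) that each line holds one complete JSON object. The `name` and `stars` fields and the helper names are illustrative assumptions, not a documented schema.

```python
import json
from typing import Iterable, Iterator


def iter_json_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty line, never holding the whole file."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def top_rated(lines: Iterable[str], min_stars: float = 4.5) -> list[str]:
    """Collect the `name` of every record whose `stars` meets the threshold.

    Records without a `stars` field are treated as 0 and skipped.
    """
    return [
        record["name"]
        for record in iter_json_lines(lines)
        if record.get("stars", 0) >= min_stars
    ]
```

Because an open file is itself an iterable of lines, you can pass a file handle straight in, for example `top_rated(open(path, encoding="utf-8"))`, and memory use stays flat regardless of file size.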
Educational open data providers
When data is accessible, it enables learners to find better solutions to real-life academic and scientific problems. To meet these needs, datasets must have certain qualities, including completeness, consistency, clear boundaries, and uniformity.
- iSAID This resource provides a large-scale dataset for instance segmentation in aerial images.
- SpaceNetChallenge This resource contains publicly available datasets that are hosted on the AWS cloud service. AWS only hosts these datasets; it does not generate them itself.
Government public data providers
Here is a varied list of publicly available government datasets. They contain real-world data from people's everyday lives on government-related topics such as policy, food, and medicine. These datasets cover a wide range of topics, schemas, and volumes, which makes them useful for analyzing or testing related projects.
- Australian government This is the Australian government's public data resource. Anyone can access the anonymized public data it publishes.
- FDA This resource provides datasets about food and drugs, in line with its mission of protecting the public with respect to medicines and food.
- data.police.uk Open datasets on crime and policing in England, Wales, and Northern Ireland. All are downloadable in CSV format and also accessible through an open API.
- Scottish Parliament The Scottish Parliament publishes the data it produces as open datasets.
Mock JSON datasets
Mock data is important for testing software systems because it lets you exercise them with data that has a realistic schema and scale.
- httpbin An HTTP request and response service whose endpoints can generate random and dynamic data for use as mock data.
- Mockaroo A mock data generator that supports custom schemas and a variety of export formats, including JSON and CSV.
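To make the idea of schema-driven mock data concrete, here is a small standard-library sketch that generates records from a schema and exports them as JSON or CSV. It does not use Mockaroo's or httpbin's API. The schema, field names, and helper functions are hypothetical and exist only for illustration.

```python
import csv
import io
import json
import random

# Hypothetical schema: field name -> function producing one random value.
SCHEMA = {
    "id": lambda rng: rng.randint(1, 10_000),
    "plan": lambda rng: rng.choice(["free", "pro", "team"]),
    "active": lambda rng: rng.random() < 0.5,
}


def generate_rows(schema: dict, count: int, seed: int = 0) -> list[dict]:
    """Produce `count` records; a fixed seed keeps test fixtures reproducible."""
    rng = random.Random(seed)
    return [{name: make(rng) for name, make in schema.items()} for _ in range(count)]


def to_json(rows: list[dict]) -> str:
    """Serialize the records as a pretty-printed JSON array."""
    return json.dumps(rows, indent=2)


def to_csv(rows: list[dict]) -> str:
    """Serialize the records as CSV, using the first record's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `to_json(generate_rows(SCHEMA, 100))` gives you a repeatable fixture of 100 records. Seeding the generator is a deliberate choice here: a test that fails on mock data is much easier to debug when the same data can be regenerated.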
Public dataset resources
Public open data helps anyone who is looking for suitable resources to check their systems and project outputs under real-world conditions. These publicly available datasets cover topics ranging from countries and populations to spacecraft and the game industry.
- Archive.org This site gathers large datasets on numerous topics from other websites and individuals.
- NASA Earth data Here you can find NASA's Earth observation data, publicly available as open data. Tens of thousands of datasets are available.
Statistical open dataset providers
When you research a large-scale problem from different angles, statistical and demographic data plays an important role. To reach precise and valid results, you need resources that cover a wide range of topics and scales.
- Vizgr This resource provides statistical data on different topics, from historical events and research papers to the economy and scientific projects. Visualization and data access are free, and the content is protected by copyright.
Other public datasets
Here we list other useful datasets and services that offer open data.
- Controlled Vocabulary Services Country and politics-related datasets, and more.
- OSI open source datasets OSI is the leading voice on the policies and principles of open source and works to build the foundation of the open-source software ecosystem. Its data is accessible through a public RESTful API.
- Hugging Face This resource offers AI-related datasets, along with a scalable machine learning computing system available directly from its hub.
- TVMaze This resource provides TV show information through a free public REST API, with search and documentation.
So, to wrap it up.
We carefully picked and described these websites to give you a thorough overview of the open-data providers out there. We focused on JSON because it is close to being the de facto format for exchanging data across the tech industry. JSON is convenient to work with, although opening large-scale data files can be a challenge. If you have a JSON file to explore along the way, you can use our Dadroit JSON Viewer, which is fast and free for academics!