Places to get data from
2024 February 26
Over the past year, I have accumulated a list of places where you can get high-quality datasets. Here is the list, in no particular order, of where to get data for different types of problems.
-
Dataset Search by Google: You might never have heard of it, but Google has a search engine specifically for datasets. The results vary, but it can be a good starting point.
-
huggingface: A hub for anything from models to super intricate datasets. In general, I use this more for Deep Learning than for ML.
-
kaggle: It is a community / competition playground from Google. The quality is mixed, but some of the ML datasets are really high quality.
-
paperswithcode: This has a lot of what could be considered the classic datasets, the ones you normally use to demonstrate State of the Art on a new method. It is more academic, and the quality is normally high.
-
Data in Brief: This is way more academic. It is an open access journal about open data.
-
Data world: It feels a bit spammy, and the results can be mixed.
-
Health Data: Health data from the US.
Last updated: 2024-02-26