This guide includes strategies and resources for locating and evaluating datasets, including free and openly available datasets. You will also find guidance for citing data. Consider starting with UAlbany Data & Statistics Databases. Finally, visit the UAlbany Data Services tab to learn about other data services the University Libraries offer.
Finding Datasets and Statistics
Things to Consider
- Consider who might collect the data you're interested in. A government agency? A nonprofit organization? A private-sector company? An academic researcher?
- Search scholarly publications or government reports. Some might cite the dataset and include a link.
- Search data or statistics databases that the University Libraries subscribe to.
- Search repositories that contain openly available datasets.
- Consider combining multiple datasets if one doesn't meet all of your needs.
- Look for data appropriate for your needs, e.g. geospatial data for creating a map.
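When one dataset doesn't meet all of your needs, combining two datasets usually means matching rows on a field they share. The following is a minimal sketch of that idea using only Python's standard library. The sample values, the column names (`region`, `population`, `libraries`), and the helper functions `read_rows` and `left_join` are all made up for illustration; real datasets may need cleanup before their keys line up.

```python
import csv
import io

# Made-up sample data standing in for two separately published datasets.
POPULATION_CSV = """region,population
A,100
B,200
C,300
"""

LIBRARIES_CSV = """region,libraries
A,4
C,7
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def left_join(left_rows, right_rows, key):
    """Attach matching right-hand columns to each left-hand row.

    Rows with no match keep their own columns and get None for the
    right-hand columns, so gaps stay visible instead of disappearing.
    """
    right_by_key = {row[key]: row for row in right_rows}
    right_columns = []
    if right_rows:
        right_columns = [c for c in right_rows[0].keys() if c != key]

    joined = []
    for row in left_rows:
        match = right_by_key.get(row[key])
        combined_row = dict(row)
        for column in right_columns:
            combined_row[column] = match[column] if match else None
        joined.append(combined_row)
    return joined


population = read_rows(POPULATION_CSV)
libraries = read_rows(LIBRARIES_CSV)
combined = left_join(population, libraries, key="region")

for row in combined:
    print(row)
```

Region B has no row in the second dataset, so its `libraries` value is `None`. Keeping unmatched rows makes it easier to see where the combined dataset is incomplete.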
Before using a dataset, consider the following:
- Is the source of the data reputable?
- Is there documentation, such as metadata, for the dataset? For example, a metadata record or a data dictionary should be provided to explain the structure of the dataset, missing data, abbreviations used, etc.
- Is the file format usable? Do you need additional software to use it?
- Are there restrictions on the use of the dataset?
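Part of this checklist can be done quickly in code: comparing a dataset's columns against its data dictionary and counting missing values. The sketch below uses Python's standard library and is one possible approach, not a complete evaluation. The sample dataset, the dictionary entries, the `NA` missing-value code, and the function `review_dataset` are all hypothetical; check the dataset's own documentation for the codes it actually uses.

```python
import csv
import io

# Made-up sample dataset; "NA" stands in for a documented missing-value code.
DATASET_CSV = """site_id,temp_c,ph
S1,12.5,7.1
S2,NA,6.8
S3,14.0,NA
"""

# Hypothetical data dictionary: column name -> description.
DATA_DICTIONARY = {
    "site_id": "Sampling site identifier",
    "temp_c": "Water temperature in degrees Celsius",
}

# Values treated as missing (an assumption for this example).
MISSING_CODES = {"", "NA"}


def review_dataset(csv_text, dictionary, missing_codes):
    """Report undocumented columns and missing-value counts per column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    missing_counts = {column: 0 for column in columns}

    for row in reader:
        for column in columns:
            value = (row[column] or "").strip()
            if value in missing_codes:
                missing_counts[column] += 1

    undocumented = [c for c in columns if c not in dictionary]
    return {"undocumented": undocumented, "missing": missing_counts}


report = review_dataset(DATASET_CSV, DATA_DICTIONARY, MISSING_CODES)
print("Columns without documentation:", report["undocumented"])
print("Missing values per column:", report["missing"])
```

Here the `ph` column does not appear in the data dictionary, which is a signal to look for more documentation before relying on it.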
Quantitative data are usually numeric, can be counted or measured, and answer questions such as "how many" or "how much".
Qualitative data are usually categorical and describe characteristics.
Geospatial data describe objects or events with a location on the surface of the earth and usually include location identifiers such as coordinates or zip codes.
Primary data are original data that researchers collect or create themselves while conducting research.
Secondary data are data that the researcher did not collect or create themselves, e.g., data from government agencies.
Search Engines for Secondary Datasets
Attribution and Authorship
Guide Authors: Kathleen Flynn, Senior Assistant Librarian and Subject Librarian for Physical Sciences, Math & Statistics, Computer Sci., & Engineering; Elaine M. Lasda, Librarian, Coordinator of Scholarly Communication and Subject Librarian for Social Welfare