6 Unofficial Rules of Data Management for ArcGIS
Keep your data clean, up to date and organized with the following suggested best practices. Learn how to reduce corruption, improve your workflow and explore new analysis options.
Data management and organization are critical for success when using ArcGIS products, whether you’re performing spatial analysis, creating a map or simply storing your data. Organizations and users each have their own methods for managing data, and no single standard exists. However, there are data management best practices that can improve workflow and reduce the chance of corrupting or losing files.
We often receive support cases from customers that revolve around data management, including corrupted map documents, layers that are missing spatial reference, geoprocessing tools that will not run or attribute tables that do not display correctly. Often, these cases can be solved, and the data restored. But there are rare occurrences where the data is not salvageable, which can be frustrating to both the customer and the analyst who’s working on the case. Following are some recommended best practices for data management that will hopefully minimize the chance of encountering unsalvageable data corruption.
Leverage the Power of Geodatabases
Geodatabases are the native data structure for ArcGIS. Organizing and storing your data in geodatabases allows you to explore new aspects of your data through relationship classes, topology, geometric networks, address locators, terrain datasets, etc. Working with geodatabases also allows you to implement multi-user editing through versioning and archiving. Geodatabases can take your data to the next level of functionality while maintaining an organized data structure.
Give Your Map Document a Purpose
When possible, limit the number of feature classes and shapefiles that are included in a map document (.mxd). Give the document a purpose and only include the data that is relevant to that purpose, whether for use or display. Map documents with too many feature classes often perform poorly and are more prone to corruption.
Work in Workspaces
If you’re planning on doing a lot of testing or analysis, try setting up a workspace structure that will store your intermediate data, keeping it separate from the final results. This is especially useful for storing outputs from model iterations that are not intended for the final output. Keep individual projects separated in their own folders and geodatabases to maintain data integrity. Scratch and Current Workspaces can be assigned and changed in the Environment settings of your map document, script or model.
Examples of scratch and current workspaces.
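As a rough illustration of the idea above, the following Python sketch creates one folder per project with separate locations for source, scratch and final data. The folder names and the `create_project_workspace` helper are hypothetical, chosen only for this example. In an ArcGIS script, the resulting paths could then be assigned to `arcpy.env.workspace` and `arcpy.env.scratchWorkspace`; that step is left out here so the sketch runs without ArcGIS installed.

```python
from pathlib import Path


def create_project_workspace(root, project_name):
    """Create separate folders for a project's source, scratch and final data.

    The three folder names are an assumption made for this example;
    use whatever structure suits your organization.
    """
    project_dir = Path(root) / project_name
    folders = {
        name: project_dir / name for name in ("source", "scratch", "final")
    }
    for path in folders.values():
        # exist_ok=True makes the call safe to repeat on an existing project
        path.mkdir(parents=True, exist_ok=True)
    return folders
```

Calling `create_project_workspace("C:/GIS", "FloodStudy")` would return a dictionary of three paths, so intermediate outputs can be written to the scratch folder and only reviewed results moved to the final folder.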
Practice Proper Naming Conventions
While this may seem obvious, take the time to practice proper naming conventions. Name your feature classes and tool outputs so that they accurately describe the data. This is beneficial when you have to revisit the project later. Things you may want to include in a name are the geoprocessing tool or parameters used, or the projection. Do not include spaces or special characters. Instead, use underscores and camel case (for example, Roads_Buffer500m or RoadsBuffer500m). Keep your names descriptive but short, as issues can arise if the name of a feature class is too long.
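The naming rules above can be sketched as a small Python helper that replaces spaces and special characters with underscores and trims overly long names. The `clean_name` function and the 30-character limit are assumptions made for illustration; the actual length limit depends on the data format you are working with.

```python
import re

# Hypothetical limit for this example only; check the limit for your data format.
MAX_NAME_LENGTH = 30


def clean_name(raw, max_length=MAX_NAME_LENGTH):
    """Return a name containing only letters, digits and underscores."""
    # Collapse every run of disallowed characters into a single underscore
    name = re.sub(r"[^A-Za-z0-9_]+", "_", raw.strip())
    # Remove leading and trailing underscores left behind by the substitution
    name = name.strip("_")
    # Truncate, then make sure the cut did not leave a trailing underscore
    return name[:max_length].rstrip("_")
```

For example, `clean_name("Roads Buffer 500m (NAD83)")` returns `Roads_Buffer_500m_NAD83`.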
Implement Quality Control Procedures
Ensuring that your data is accurate, consistent and up to date is one of the most important aspects of data management, whether you’re collecting it yourself or using someone else’s. This will improve query results, geocoding accuracy and geoprocessing tool outputs. If you’re collecting your own data, try implementing attribute domains to reduce error when inputting the data.
Example of using attribute domains.
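To show why restricting a field to a fixed list of values helps, here is a minimal Python sketch of the idea behind a coded value domain. It does not use the geodatabase itself: the `ROAD_SURFACE_DOMAIN` codes, the `surface` field and the `find_invalid_rows` helper are all hypothetical and exist only for this example.

```python
# Hypothetical coded values: code -> description
ROAD_SURFACE_DOMAIN = {"PAV": "Paved", "GRV": "Gravel", "DRT": "Dirt"}


def find_invalid_rows(rows, field, domain):
    """Return (row index, value) pairs whose field value is not a domain code."""
    invalid = []
    for index, row in enumerate(rows):
        value = row.get(field)  # None when the field is missing from the row
        if value not in domain:
            invalid.append((index, value))
    return invalid
```

Given the rows `[{"surface": "PAV"}, {"surface": "paved"}, {}]`, the helper flags the second and third rows: one contains a free-text value instead of a code, and the other has no value at all. A domain applied at data entry prevents both kinds of error from being stored in the first place.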
Create a Backup
A good first step is to create a backup of your original data. This helps if the original data is accidentally deleted, moved or altered incorrectly. If it is changed, you can always revert to the original.
In addition, periodically make backups of the .mxd, the models, the scripts and anything else you're actively working on. It's best to store them somewhere secure that remains accessible if your hard drive crashes unexpectedly. Examples of this include the ArcGIS Data Store on ArcGIS for Server, or publishing your files to ArcGIS Online.
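For local or network backups, a periodic copy can be scripted. The sketch below uses only the Python standard library to copy a project folder (or a file geodatabase, which is stored as a folder on disk) to a timestamped location. The `backup_folder` helper and the naming pattern are assumptions made for this example, and it is safest to run a copy like this when the data is not open in ArcGIS.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_folder(source, backup_root, now=None):
    """Copy a folder to backup_root under a timestamped name and return the new path.

    `now` can be passed in to control the timestamp; it defaults to the current time.
    """
    source = Path(source)
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    target = Path(backup_root) / f"{source.name}_{stamp}"
    # copytree creates the target (and any missing parent folders) and
    # raises an error if the target already exists, so nothing is overwritten
    shutil.copytree(source, target)
    return target
```

Because each copy gets its own timestamped folder, earlier backups are never overwritten, which makes it possible to revert to a specific point in time.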
There are many other data management practices available, and it's best to find the ones that are the most applicable for your organization and workflow. If you're just starting out in ArcGIS, the above suggestions are a good foundation for building your own data management practices. Additional resources about the topics I've discussed can be found below.
If you experience issues with data corruption, performance or have further questions regarding data management, do not hesitate to contact Esri Canada Technical Support.
Further Reading