Data organization

For many scientists, dealing with research data is the basis of their daily work. It therefore saves time and effort if this data is efficiently structured, documented and backed up from the outset.

Most research data is initially stored in files. Files have different types, or file formats, which are sometimes identified by the file name extension, as in the Windows operating system. Files are in turn stored in directories (folders). Naming files and directories systematically is very important; see, for example, the Stanford File Naming Handout.
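As an illustration of systematic naming, the following minimal Python sketch composes file names from a fixed sequence of parts. The pattern used here (project, description, ISO date, two-digit version) and the function name build_file_name are assumptions chosen for this example; they are not taken from the Stanford handout, and a research group would normally agree on its own convention.

```python
import re
from datetime import date


def build_file_name(project: str, description: str, day: date,
                    version: int, extension: str) -> str:
    """Compose a name of the form project_description_YYYY-MM-DD_vNN.ext.

    The pattern is a hypothetical convention used only for illustration.
    """
    def slug(text: str) -> str:
        # Lowercase the text and replace every run of characters other
        # than letters and digits with a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # Normalize the extension so ".CSV" and "csv" give the same result.
    ext = extension.lstrip(".").lower()
    # An ISO date (year first) makes names sort chronologically.
    return f"{slug(project)}_{slug(description)}_{day.isoformat()}_v{version:02d}.{ext}"


example_name = build_file_name("SoilStudy", "pH readings", date(2024, 3, 15), 2, ".CSV")
print(example_name)  # soilstudy_ph-readings_2024-03-15_v02.csv
```

Generating names with one function, rather than typing them by hand, keeps spelling, separators and date format consistent across a whole project.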

Alternatively, data can be stored in databases. The initial effort is greater, because a database management system such as MySQL must first be set up and a database schema must be defined to give the stored data a structure. Here, too, naming is of great importance. In return, databases support managed, shared access to data much better than files do. There are different types of databases, including relational, hierarchical and graph-based databases as well as RDF triple stores.
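To make the idea of a database schema more concrete, here is a minimal sketch of a relational schema. It uses SQLite through Python's built-in sqlite3 module instead of MySQL, simply because SQLite needs no server setup. The table and column names (sample, measurement, site_name and so on) and the inserted values are hypothetical and serve only as an illustration.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# The schema: one table per kind of entity, with descriptive,
# consistently styled names (hypothetical example tables).
conn.executescript("""
CREATE TABLE sample (
    sample_id    INTEGER PRIMARY KEY,
    site_name    TEXT NOT NULL,
    collected_on TEXT NOT NULL          -- ISO date, e.g. 2024-03-15
);
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    sample_id      INTEGER NOT NULL REFERENCES sample(sample_id),
    quantity       TEXT NOT NULL,
    value          REAL NOT NULL,
    unit           TEXT NOT NULL
);
""")

# Insert one sample and two measurements that belong to it.
conn.execute(
    "INSERT INTO sample (sample_id, site_name, collected_on) VALUES (?, ?, ?)",
    (1, "north-field", "2024-03-15"),
)
conn.executemany(
    "INSERT INTO measurement (sample_id, quantity, value, unit) VALUES (?, ?, ?, ?)",
    [(1, "ph", 6.8, "pH"), (1, "temperature", 11.5, "degC")],
)
conn.commit()

# Query across both tables: the structure defined above is what makes
# this kind of combined lookup possible.
rows = conn.execute(
    """
    SELECT s.site_name, m.quantity, m.value, m.unit
    FROM measurement AS m
    JOIN sample AS s ON s.sample_id = m.sample_id
    ORDER BY m.quantity
    """
).fetchall()

for row in rows:
    print(row)
```

Separating samples from measurements means that information about a sample is stored only once, however many measurements refer to it.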

The following animated video clearly summarizes the topic of data organization.
