Knowledge clip: Metadata

UGent Open Science
22 Feb 2021 · 08:20

Summary

TL;DR: This video script emphasizes the critical role of metadata and documentation in research data management (RDM), highlighting how they enable data discovery and reuse. It distinguishes between metadata and documentation, with metadata being structured data descriptors, essential for making data FAIR. The script covers various types of metadata, including descriptive, technical, administrative, and structural, and discusses their creation, storage, and importance in data repositories. It also touches on metadata standards, which facilitate data exchange and interoperability across different research domains.

Takeaways

  • 📄 **Metadata Importance**: Metadata is crucial for Research Data Management (RDM), aiding in data discovery and reuse.
  • 🔍 **Metadata Definition**: Metadata is data that describes other data, structured for machine readability.
  • 🔑 **Types of Metadata**: Descriptive, technical, administrative, and structural metadata serve different purposes in data management.
  • 🔎 **Descriptive Metadata**: Includes elements like title, author, and keywords to facilitate data discovery.
  • 🛠️ **Technical Metadata**: Covers technical aspects such as file type, size, and access methods.
  • 🏛️ **Administrative Metadata**: Deals with intellectual property rights, licenses, and access restrictions.
  • 🌐 **Structural Metadata**: Indicates how datasets relate to other online resources.
  • 🤖 **Metadata Creation**: Can be generated automatically by instruments or manually by researchers.
  • 💾 **Metadata Storage**: May be embedded within files, stored separately, or provided via data repositories.
  • 📚 **Metadata in Files**: Day-to-day digital files and discipline-specific formats often include embedded metadata fields.
  • 📊 **Metadata Standards**: Standards like Dublin Core and DDI ensure interoperability and consistency across different systems.

Q & A

  • What is metadata in the context of research data management (RDM)?

    -Metadata in RDM is data that describes other data, providing essential information about data in a structured way to make it machine-readable, facilitating its discovery, assessment, and reuse.

  • Why is metadata considered crucial for making data FAIR?

    -Metadata is essential for making data FAIR because it allows data to be found, accessed, and reused by providing information on how to locate and utilize the data without needing to download it first.

  • What are the different types of metadata mentioned in the script?

    -The script mentions four types of metadata: Descriptive, Technical, Administrative, and Structural metadata.

  • What does Descriptive metadata include and why is it important?

    -Descriptive metadata includes elements like title, author, and keywords that help in discovering the data. It is important for making data easily searchable and understandable.

  • Can you explain the role of Technical metadata in research?

    -Technical metadata provides information about the technical aspects of data or files, such as file type, size, and access methods, which are crucial for data processing and analysis.

  • How does Administrative metadata differ from other types of metadata?

    -Administrative metadata focuses on intellectual property rights, including license, access rights, and restrictions, which are essential for managing data usage and permissions.

  • What is the significance of Structural metadata in data management?

    -Structural metadata indicates how a dataset relates to other online resources, helping in understanding the data's context and its integration with other datasets.

  • How can metadata be generated and where can it be found?

    -Metadata can be generated automatically by instruments or software, or manually by researchers. It can be found embedded within files, in separate files, or provided when uploading data to a repository.

  • What are the challenges associated with manually created metadata?

    -Manually created metadata can face challenges like maintaining the link between metadata and data, ensuring machine readability, and adhering to standard formats for easy data discovery and reuse.

  • Why is it recommended to use metadata standards when documenting research data?

    -Using metadata standards ensures consistency and interoperability across different systems and applications, making research data more accessible and reusable.

  • How do data repositories facilitate the FAIRness of data?

    -Data repositories provide functionalities to create and manage machine-readable metadata, which increases the findability, accessibility, interoperability, and reusability of the data.

Outlines

00:00

📄 The Importance of Metadata in Research Data Management

This paragraph emphasizes the critical role of metadata and documentation in Research Data Management (RDM), ensuring data discoverability and reusability. Metadata, defined as data that describes other data, is structured to be machine-readable, thus facilitating data search, assessment, and reuse. The paragraph introduces different types of metadata: descriptive (e.g., title, author, keywords), technical (e.g., file type, size), administrative (e.g., license, access rights), and structural (indicating data set relations). It also discusses how metadata can be generated, either automatically by instruments or manually, and can be stored within files or as separate files. The importance of maintaining the link between metadata and data is highlighted, with examples of metadata in everyday digital files and the challenges of custom metadata approaches.
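
Metadata stored within a file often sits in a header section that precedes the data. The following is a minimal sketch of reading such a header, assuming a made-up convention in which header lines start with `#` and hold `key: value` pairs; the sample content, the sensor name, and the function `split_header` are illustrative assumptions, not a format described in the video.

```python
def split_header(text, marker="#"):
    """Separate leading 'key: value' header lines from the data lines."""
    metadata, data_lines = {}, []
    in_header = True
    for line in text.splitlines():
        if in_header and line.startswith(marker):
            # Split on the first colon only, so values may contain colons.
            key, sep, value = line[len(marker):].partition(":")
            if sep:
                metadata[key.strip()] = value.strip()
        else:
            in_header = False
            data_lines.append(line)
    return metadata, data_lines


# Hypothetical file content: three header lines followed by two data rows.
SAMPLE = (
    "# instrument: hypothetical-sensor-01\n"
    "# units: degrees Celsius\n"
    "# variables: timestamp, temperature\n"
    "2021-02-22T08:00,4.1\n"
    "2021-02-22T09:00,4.6\n"
)

header, rows = split_header(SAMPLE)
print(header)
print(rows)
```

Because the header follows a fixed convention, software can read the units and variable names without a person opening the file. A free-text note at the top of the file would not allow that.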

05:01

🗂 Metadata Creation, Storage, and Standards

The second paragraph delves into the creation and storage of metadata, explaining that metadata can be generated by research instruments, software, or manually. It mentions the risk of losing the link between metadata and data, especially when files are moved. The paragraph then transitions to discussing metadata standards, which are sets of elements used to describe resources, with examples of generic standards like Dublin Core and discipline-specific standards like Ecological Metadata Language (EML). The use of metadata standards is crucial for data exchange and interoperability. The paragraph concludes by encouraging researchers to familiarize themselves with metadata requirements of data repositories, which aid in making data more FAIR (Findable, Accessible, Interoperable, Reusable), and to document metadata throughout the research process for ease of data management and sharing.
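
A metadata standard names the elements, states which ones are required, and constrains their values or formats. The sketch below checks a record against a tiny schema to illustrate that idea. The schema itself (element names, the date pattern, the allowed access values) is a hypothetical example and does not reproduce any real standard.

```python
import re

# Hypothetical mini-schema: which elements exist, which are required,
# and which values or formats are accepted.
SCHEMA = {
    "title": {"required": True},
    "creator": {"required": True},
    "date": {"required": True, "pattern": r"\d{4}-\d{2}-\d{2}"},
    "access": {"required": False, "allowed": {"open", "restricted", "closed"}},
}


def validate(record, schema=SCHEMA):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, rule in schema.items():
        value = record.get(name)
        if value is None:
            if rule.get("required"):
                problems.append(f"missing required element: {name}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"value not allowed for {name}: {value}")
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            problems.append(f"wrong format for {name}: {value}")
    for name in record:
        if name not in schema:
            problems.append(f"unknown element: {name}")
    return problems


good = {"title": "River temperature 2020", "creator": "A. Researcher",
        "date": "2021-02-22", "access": "open"}
bad = {"title": "Untitled", "date": "22 Feb 2021", "access": "public"}

print(validate(good))
for problem in validate(bad):
    print(problem)
```

A repository deposit form plays a similar role: its pre-configured fields steer the depositor towards values that fit the standard the repository uses.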

Keywords

💡Metadata

Metadata refers to data that provides information about other data. In the context of the video, metadata is crucial for research data management (RDM) as it enables data to be found and reused. The video emphasizes that metadata is structured in a way that is machine-readable, which facilitates searching, assessing the usefulness of data without downloading it, and understanding how the data can be accessed and reused. Examples from the script include metadata fields like title, author, and keywords that help in discovering the data.

💡Machine Readability

Machine readability is the property of metadata that allows it to be read and processed by computers. The video explains that metadata is highly structured to ensure it is machine-readable, which is essential for efficient data search and retrieval. This concept is integral to the video's message about making data findable and reusable, as it ensures that metadata can be easily processed by various systems and tools used in RDM.
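
As a rough illustration of what machine readability buys, the sketch below searches a small catalogue of metadata records by keyword without touching the data files themselves. The catalogue content, the field names, and the function `find_by_keyword` are invented for the example.

```python
import json

# Hypothetical catalogue: structured metadata records, one per dataset.
CATALOGUE_JSON = """
[
  {"title": "River temperature 2020",
   "creator": "A. Researcher",
   "keywords": ["hydrology", "temperature"],
   "size_bytes": 52000},
  {"title": "Interview transcripts",
   "creator": "B. Researcher",
   "keywords": ["sociology", "interviews"],
   "size_bytes": 1300000}
]
"""


def find_by_keyword(records, keyword):
    """Return the titles of records whose keywords include `keyword`."""
    wanted = keyword.lower()
    return [
        record["title"]
        for record in records
        if wanted in [k.lower() for k in record.get("keywords", [])]
    ]


records = json.loads(CATALOGUE_JSON)
matches = find_by_keyword(records, "Hydrology")
print(matches)
```

The search works only because every record uses the same field names and structure; the same information written as free text would need a person to read it.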

💡Descriptive Metadata

Descriptive metadata includes elements that aid in the discovery of data. The video mentions that this type of metadata encompasses common elements like the title of the dataset, the author, and keywords. These elements are essential for users to understand what the data is about and to locate relevant datasets, as they provide a basic description that helps in the initial assessment of the data's relevance.

💡Technical Metadata

Technical metadata pertains to the technical aspects of data or files. According to the video, this could include information on how to access the data, the file type used, or the size of the file. This type of metadata is vital for understanding the format and accessibility of the data, ensuring that users can effectively work with the data once found.
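
Some technical metadata can be derived from the file system rather than typed by hand. This sketch uses only the Python standard library; the file name `measurements.csv` and the output field names are arbitrary choices for the example, and the guessed media type may differ between platforms.

```python
import mimetypes
import os
import tempfile
from datetime import datetime, timezone


def technical_metadata(path):
    """Collect basic technical metadata for one file."""
    info = os.stat(path)
    media_type, _encoding = mimetypes.guess_type(path)
    return {
        "file_name": os.path.basename(path),
        "size_bytes": info.st_size,
        "media_type": media_type or "unknown",
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
    }


# Demonstration on a throwaway file so the example is self-contained.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "measurements.csv")
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write("t,temp\n0,4.1\n")
    demo = technical_metadata(path)

print(demo)
```

These are the same kinds of fields that a file browser uses to sort and search day-to-day files.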

💡Administrative Metadata

Administrative metadata deals with the management and legal aspects of data, such as intellectual property rights, licenses, and access rights or restrictions. The video underscores the importance of this metadata in ensuring that data is used appropriately and in compliance with relevant laws and policies. It is a key component in making data FAIR (Findable, Accessible, Interoperable, Reusable).

💡Structural Metadata

Structural metadata indicates how a dataset relates to other online resources. The video explains that this type of metadata is important for understanding the context and relationships between different data sets. It helps in navigating complex data landscapes and ensures that data can be integrated and analyzed effectively.

💡Data Repository

A data repository is a digital archive where research data is stored and managed. The video discusses how uploading data to a repository can increase the FAIRness of data by providing functionalities to create and manage machine-readable metadata. Repositories are essential for data sharing and long-term preservation, ensuring that data remains accessible and reusable.

💡Metadata Standards

Metadata standards, also known as metadata schemas, define the set of elements used to describe a resource and the format in which they should be presented. The video mentions standards like Dublin Core and Data Documentation Initiative (DDI), which facilitate data exchange by making metadata interoperable across different systems and applications. Adhering to metadata standards is crucial for ensuring that data can be easily understood and used by a wide range of users.
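
To make the idea of a standard more concrete, here is a minimal sketch that writes a few Dublin Core elements as XML with Python's standard library. The element names (`title`, `creator`, `subject`, `rights`) and the namespace belong to the Dublin Core element set; the wrapping `metadata` element and the sample values are assumptions made for the example, since repositories and exchange protocols each define their own container.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core(fields):
    """Serialise {element: [values]} as Dublin Core elements in a wrapper."""
    root = ET.Element("metadata")  # hypothetical container element
    for element, values in fields.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_dublin_core({
    "title": ["River temperature 2020"],
    "creator": ["A. Researcher"],
    "subject": ["hydrology", "temperature"],
    "rights": ["CC BY 4.0"],
})
print(xml_text)
```

Because the element names and namespace are shared, another system can read the record back without knowing anything about the tool that produced it.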

💡FAIR Data Principles

The FAIR Data Principles are a set of guidelines aimed at making data more findable, accessible, interoperable, and reusable. The video discusses how metadata and data repositories contribute to these principles, particularly by making data more discoverable and usable. The FAIR principles are a key theme in the video, as they represent the ideal state for research data to be in for maximum impact and utility.

💡Readme Files

Readme files are documents that provide information about other files, and they can be used to record metadata in a structured way, often with the help of templates. The video mentions that readme files can be a useful method for collecting metadata during a project. However, it cautions that the link between the metadata and the data it describes can be lost, for example when files are moved, and that custom-made approaches make it harder for metadata to be machine-readable.
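
One possible way to make a separate metadata file more robust is to store a checksum of the data file inside it, so that the pairing can be verified later. This is not a practice described in the video; the `.meta.json` suffix, the field names, and both helper functions are assumptions made for this sketch.

```python
import hashlib
import json
import os
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's content."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def write_sidecar(data_path, description):
    """Write a JSON metadata file next to the data file and return its path."""
    record = {
        "file_name": os.path.basename(data_path),
        "sha256": sha256_of(data_path),
        **description,
    }
    sidecar_path = data_path + ".meta.json"  # hypothetical naming convention
    with open(sidecar_path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2)
    return sidecar_path


def still_linked(data_path, sidecar_path):
    """Check that the sidecar still describes this exact file content."""
    with open(sidecar_path, encoding="utf-8") as handle:
        record = json.load(handle)
    return record["sha256"] == sha256_of(data_path)


with tempfile.TemporaryDirectory() as folder:
    data_path = os.path.join(folder, "interview_01.txt")
    with open(data_path, "w", encoding="utf-8") as handle:
        handle.write("transcript text\n")
    sidecar = write_sidecar(data_path, {"title": "Interview 1", "creator": "A. Researcher"})
    linked_before = still_linked(data_path, sidecar)

    # Simulate the data file changing without the metadata being updated.
    with open(data_path, "a", encoding="utf-8") as handle:
        handle.write("later edit\n")
    linked_after = still_linked(data_path, sidecar)

print(linked_before, linked_after)
```

The checksum does not prevent the files from being separated, but it lets anyone confirm afterwards whether a metadata file and a data file still belong together.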

Highlights

Metadata and documentation are crucial for Research Data Management (RDM).

Metadata is data that describes other data, often in a highly structured format for machine readability.

Metadata makes data searchable, helps assess its usefulness, and clarifies how it can be accessed and reused.

Descriptive metadata helps in data discovery, including elements like title, author, and keywords.

Technical metadata provides information about data or file access, file type, and size.

Administrative metadata deals with intellectual property rights, licenses, and access restrictions.

Structural metadata indicates how a dataset relates to other online resources.

Metadata can be generated automatically by instruments like microscopes or manually by researchers.

Metadata can be stored embedded within files or as separate files.

Day-to-day digital files often include metadata fields for sorting and searching.

Discipline-specific file formats may have additional embedded metadata fields.

Metadata can be generated by processing or analysis software, such as statistical packages.

Metadata headers in files often follow agreed conventions or standards.

Separate files for metadata are common, especially for configuration or calibration data.

Readme files can be used to collect metadata during a project, but the link between the metadata and the data can be lost.

Data repositories provide functionalities to create and manage machine-readable metadata.

Metadata standards define the elements used to describe a resource and their required formats.

Generic metadata standards like Dublin Core can be used across different scientific domains.

Discipline-specific standards contain additional elements to meet the needs of particular scientific domains.

Using metadata standards facilitates data exchange and interoperability.

Familiarizing yourself with the metadata requirements of repositories is essential for making data FAIR.

Transcripts

play00:05

[Music]

play00:06

metadata and documentation play an

play00:08

important role in RDM

play00:10

enabling data to be found and reused

play00:14

metadata is often defined as data that

play00:16

describes other data

play00:17

if you have seen our knowledge clip

play00:19

about documentation you might remember

play00:21

that the key difference between them

play00:22

is that metadata records essential

play00:24

information about data in a highly

play00:26

structured way

play00:27

using a set of defined information

play00:28

fields or elements

play00:30

the reason why metadata is highly

play00:32

structured is because it is meant to be

play00:34

readable and exchangeable by computers

play00:36

something often referred to as machine

play00:37

readability

play00:40

metadata is needed for many things it

play00:42

facilitates the process

play00:44

of searching and finding data metadata

play00:46

can help us to assess whether the data

play00:48

we find is useful for us or not

play00:50

without having to download it first it

play00:53

also lets us know how the data can be

play00:54

accessed and how it can be reused

play00:57

because of this

play00:58

metadata is essential to make your data

play01:00

FAIR let's now have a look at some

play01:02

metadata concepts to understand why it

play01:04

is so important

play01:06

first of all there are different types

play01:08

of metadata

play01:10

a first type is called descriptive

play01:11

metadata this type includes common

play01:13

elements or fields that help us to

play01:15

discover the data

play01:16

this can be for instance things like

play01:18

title of the data set the author

play01:20

keywords describing the subject and so

play01:22

on

play01:24

when we talk about technical metadata we

play01:26

mean information about technical aspects

play01:28

of the data or files

play01:30

this could be for instance information

play01:31

about how to access the data

play01:33

the file type used or the size of the

play01:35

file

play01:37

administrative metadata contains

play01:39

elements or fields that deal with

play01:40

intellectual property rights such as the

play01:42

license

play01:42

or access rights or restrictions

play01:46

finally there is also structural

play01:48

metadata this type of metadata indicates

play01:50

how the data set relates to other online

play01:52

resources

play01:55

so how is metadata created and where can

play01:57

we find it

play01:59

metadata can be associated with many

play02:00

different research objects and appear in

play02:02

many different ways

play02:04

sometimes metadata is generated

play02:06

automatically

play02:07

some instruments such as microscopes

play02:09

telescopes or digital cameras create

play02:11

metadata when data is collected

play02:14

but this is not always the case other

play02:17

times metadata needs to be manually

play02:19

created

play02:20

for instance by taking notes in a

play02:21

laboratory notebook or by filling out a

play02:23

form or data listing

play02:26

the second question is how is metadata

play02:29

stored

play02:30

metadata can be stored embedded within

play02:32

the files or it can be stored as

play02:34

separate files

play02:35

and another way to provide metadata

play02:37

comes when you upload your data to a

play02:38

data repository or archive

play02:42

let's have a look at some details and

play02:43

examples

play02:45

most day-to-day digital files include a

play02:47

range of metadata fields

play02:49

these allow you for example to search

play02:51

and sort files according to date created

play02:53

file type author size etc

play02:57

often discipline specific file formats

play02:59

might also have additional embedded

play03:00

metadata fields

play03:02

for example microscopy images normally

play03:04

include the objective settings within

play03:06

the file

play03:07

besides research instrumentation

play03:09

metadata can also be generated by

play03:11

processing or analysis software

play03:14

for example statistical packages such as

play03:16

SPSS

play03:17

embed rich metadata within the file like

play03:19

formats or additional variable

play03:21

information

play03:23

it is important to find out whether the

play03:25

file formats you use have metadata fields

play03:27

embedded

play03:28

and if these are needed to use the data

play03:30

if you plan to convert a file with

play03:32

embedded metadata to a different file

play03:33

format you should check whether these

play03:35

metadata will also be present in the new

play03:37

format

play03:39

in some domains another place where

play03:41

metadata can be found is in the header

play03:43

of the files

play03:44

typically this is a section at the top

play03:45

of the document preceding the data

play03:47

containing a summary of the data or

play03:49

information about the instrumentation

play03:51

settings

play03:52

about the variables etc

play03:55

often this metadata header follows

play03:57

agreed conventions or standards

play03:59

and the information it contains can be

play04:00

read by applications processing software

play04:03

or algorithms

play04:05

in other cases a metadata header can be

play04:08

manually created by a researcher

play04:10

for example to provide contextual

play04:12

details about an interview in the

play04:13

transcription file

play04:16

when metadata is generated by research

play04:18

instrumentation or software

play04:20

it might also be stored on a separate

play04:22

file

play04:23

for example sensors and measurement

play04:25

devices often provide configuration or

play04:27

calibration files

play04:29

and software used to process

play04:30

geographical data might store

play04:32

geospatial metadata such as the

play04:34

coordinate system in separate files

play04:37

but these separate files can also be

play04:39

manually generated by the researcher

play04:42

for example in a readme file or a

play04:44

spreadsheet

play04:45

recording metadata in such a way can also

play04:47

be done in a structured way

play04:48

and often templates are available to

play04:50

help you

play04:52

using readme files can be a useful way

play04:54

to collect metadata during the course of

play04:56

the project

play04:57

however this approach has some downsides

play05:00

for example

play05:00

there is a risk that the link between

play05:02

metadata and the data they represent is

play05:04

lost

play05:04

for example when files are moved

play05:08

keeping some kind of metadata is

play05:09

certainly better than collecting no

play05:11

metadata at all

play05:12

but as a general rule custom-made

play05:14

approaches make it difficult for metadata

play05:16

to be machine readable

play05:17

and your data become less findable and

play05:19

reusable

play05:21

and this takes us to our last point

play05:23

providing metadata on a data repository

play05:26

or archive

play05:28

depositing your data on a repository

play05:30

might be required by your institution or

play05:32

research funder policies

play05:33

or by the journal in which you want to

play05:35

publish your results even if not

play05:37

required

play05:38

it is a good research practice and will

play05:39

increase the FAIRness of your data

play05:41

because data repositories provide

play05:43

functionalities to make your data more

play05:44

FAIR

play05:45

including services to create and manage

play05:47

metadata

play05:49

to upload your data to a repository you

play05:52

will be required to fill in a

play05:53

user-friendly form to describe your data

play05:55

all the fields in this form are in fact

play05:57

metadata fields pre-configured to meet a

play05:59

specific metadata standard

play06:01

allowing the result to become machine

play06:04

readable then

play06:06

what are metadata standards when the

play06:09

information fields captured within a

play06:10

specific metadata set become widely used

play06:13

and accepted

play06:13

it often evolves into a metadata

play06:15

standard

play06:17

to put it simply a metadata standard or

play06:19

metadata schema defines the set of

play06:21

elements that can or must be used to

play06:23

describe a resource

play06:25

the standard also tells you how these

play06:27

elements should be named

play06:28

and also which values are allowed or

play06:31

what the required format is for each of

play06:33

the elements

play06:35

some metadata standards are designed to

play06:36

be used across different scientific

play06:38

domains

play06:39

examples of such generic standards are

play06:41

the Dublin Core standard

play06:42

or DataCite but there are also

play06:46

discipline specific standards which

play06:48

typically contain additional elements

play06:50

to satisfy the needs of a particular

play06:51

scientific domain

play06:54

for example the ecological metadata

play06:56

language is used in ecology research and

play06:59

has additional elements

play07:00

such as taxonomic coverage to indicate

play07:02

which species are included in the data

play07:04

set

play07:06

another example of a specialized

play07:08

metadata standard is the data

play07:09

documentation initiative

play07:10

or DDI this standard contains elements

play07:14

such as questionnaire specification for

play07:16

research that involves surveys

play07:19

the use of metadata standards

play07:20

facilitates data exchange by different

play07:22

systems or applications

play07:24

in other words it makes research

play07:26

metadata interoperable

play07:28

one of the FAIR data principles

play07:31

to recap during the research process

play07:33

metadata can be created in different

play07:35

ways and appear in multiple forms

play07:38

an important use of metadata is to make

play07:40

your data findable and let others know

play07:42

how they can access it

play07:43

and reuse it data repositories provide

play07:47

you the functionalities to create and

play07:48

manage machine-readable metadata

play07:50

and therefore make your data more FAIR

play07:54

that is why it is a good idea to

play07:56

familiarize yourself with the kind of

play07:57

metadata that repositories require

play08:01

during the course of the project make

play08:03

sure to document this information

play08:04

so that when the time comes to provide

play08:06

the metadata you are not only relying on

play08:08

your fading memory

play08:11

for more information about metadata and

play08:12

data repositories

play08:14

have a look at our website

Related Tags
Metadata, Data Management, Research, Machine Readability, Data Repositories, Data Standards, Descriptive Metadata, Technical Metadata, Administrative Metadata, Structural Metadata, Interoperability