ATLAS Tutorial: Data Sources - Person

OHDSI
23 May 2019
03:26

Summary

TL;DR: This video introduces the data sources capability in the Atlas platform, demonstrating how to select a data source and open one of its reports, such as the person report. It showcases the ability to visualize data through graphs, like the year of birth distribution and the gender, race, and ethnicity composition. The script highlights the platform's functionality to identify anomalies and characterize data, providing insights into data source quality and composition.

Takeaways

  • 📊 The Atlas platform offers a 'person reporting' feature that provides insights into data sources through various graphs and reports.
  • 🔍 Users can select specific data sources to study and generate reports, such as the 'person report', which includes multiple graphs.
  • 📈 The first graph in the person report shows the distribution of the year of birth, with the x-axis representing the year of birth and the y-axis showing the number of persons born in that year.
  • 👀 Hovering over a specific year on the graph reveals the number of persons born in that year, providing detailed insights into the data distribution.
  • 📊 An example from the script shows a data source with 1.3 million persons born in 1990 and an unusual spike in 1929 with 1 million patients, indicating potential anomalies.
  • 👤 The person report also includes donut plots for gender distribution, revealing the proportion of male and female patients, such as 50.5% female and 49.5% male in the example provided.
  • 🌐 The script mentions that the race and ethnicity data may vary depending on the data source, with some sources lacking matched concepts for these attributes.
  • 🔄 Changing the data source can result in different year of birth distributions and compositions by gender, race, and ethnicity, as illustrated in the script.
  • 📊 The script provides an example of a different data source where 59% of the patients are white, 9.2% are black or African-American, and 2% are Asian, with 29% of patients' race unidentified.
  • 🌐 Similarly, for ethnicity, 93% of patients in the example do not have a matched concept, while 6.3% are identified as Hispanic or Latino.
  • 🔗 For more information about OHDSI (pronounced "Odyssey"), including details on Atlas and the additional data sources reports, the script directs viewers to visit ohdsi.org.

Q & A

  • What is the primary focus of the video?

    -The primary focus of the video is to introduce the person report within the data sources capability of the Atlas platform.

  • How can users select a data source in the Atlas platform?

    -Users open the data sources capability, choose the data source they are interested in studying, and then select one of the available reports, such as the person report.

  • What information does the person report provide?

    -The person report provides a series of graphs showing the year of birth distribution, gender, race, and ethnicity of the persons in the selected data source.

  • What does the first graph in the person report display?

    -The first graph displays the year of birth distribution, with the x-axis showing the year of birth and the y-axis showing the number of persons born in that year.

  • How can anomalies in the year of birth distribution be identified?

    -Anomalies in the year of birth distribution can be identified by looking for unusual spikes or drops in the graph. For example, a spike in 1929 indicates an unusual number of persons born that year.

  • What does the gender donut plot show?

    -The gender donut plot shows the distribution of gender within the data source, classifying persons as either male or female and displaying their number and proportion.

  • What information is missing in the race distribution of the first data source?

    -The first data source does not have any matched concepts for race, meaning that race is unknown for this population.

  • How does the ethnicity information appear in the first data source?

    -The ethnicity information in the first data source is unknown.

  • What changes can be observed when switching data sources in the Atlas platform?

    -When switching data sources, the person report will update to show different year of birth distributions and compositions by gender, race, and ethnicity based on the new data source.

  • How does the second data source differ in terms of race and ethnicity information?

    -In the second data source, race information is available with 59% identified as white, 9.2% as black or African American, 2% as Asian, and 29% unknown. Ethnicity information shows 93% of patients without a matched concept, and 6.3% identified as Hispanic or Latino.

Outlines

00:00

📊 Introduction to Data Reporting in Atlas Platform

This paragraph introduces the data reporting feature within the Atlas platform. It explains how to select a data source and access various reports, specifically the 'person report'. The person report is highlighted for its ability to display graphs that represent data such as the year of birth distribution, with the ability to hover for detailed numbers, like the 1.3 million persons born in 1990. Anomalies, such as the spike in 1929 with 1 million patients, are also pointed out as valuable insights. The paragraph also mentions donut plots for attributes like gender, which in this example shows a near-even split between males and females, and notes the absence of race and ethnicity data in the current data source. Changing the data source alters the demographic distributions presented.

Keywords

💡Atlas platform

The Atlas platform is the central focus of the video. It is a web-based analytics tool from the OHDSI community in which users can explore and analyze various data sources. It is the environment where the person reporting capability is demonstrated: in the script, the platform is used to select data sources and generate reports, such as the person report, which provides insights into demographic distributions.

💡Data sources

Data sources refer to the origins of the data that are being analyzed within the Atlas platform. They are crucial as they provide the raw data needed for generating reports and insights. The script mentions selecting a data source to study, emphasizing the importance of choosing the right source for accurate reporting and analysis.

💡Person report

A person report is a specific type of report generated within the Atlas platform that focuses on demographic data related to individuals. It includes various attributes such as year of birth, gender, race, and ethnicity. The script describes the person report as showing graphs and donut plots that visualize this data, providing a comprehensive view of the population's characteristics.

💡Year of birth

The year of birth is a demographic attribute that categorizes individuals based on the year they were born. In the script, a graph is used to display the distribution of the year of birth, allowing viewers to see trends and anomalies, such as the spike in the number of persons born in 1929, which is highlighted as a potential area of interest for further investigation.

💡Graphs

Graphs in the context of the video are visual representations of data used to illustrate patterns and trends within the data. The script describes a graph showing the year of birth with the x-axis representing the year and the y-axis showing the number of persons, providing a clear visual analysis tool within the Atlas platform.

💡Donut plots

Donut plots are a type of chart used to represent proportions of a whole, often used for categorical data. In the script, donut plots are used to depict the distribution of gender, race, and ethnicity, offering a quick visual summary of the composition of the population within the selected data source.
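
The count and proportion shown for each donut segment can be illustrated with a small Python sketch. This is only an illustration of the arithmetic, not how Atlas itself computes the report; the function name `category_shares` and the toy data are hypothetical, chosen so that the shares match the 50.5% / 49.5% split quoted in the video.

```python
from collections import Counter


def category_shares(labels):
    """Return {label: (count, percent)} for a list of category labels.

    Hypothetical helper for illustration; percentages are rounded to one decimal.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        label: (n, round(100.0 * n / total, 1))
        for label, n in counts.items()
    }


# Toy data (an assumption, not taken from any real data source).
genders = ["FEMALE"] * 101 + ["MALE"] * 99
shares = category_shares(genders)

for label, (count, percent) in shares.items():
    print(f"{label}: {count} persons ({percent}%)")
```

The same helper would apply to race or ethnicity labels, where a category such as "No matching concept" would simply appear as one more segment.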

💡Gender

Gender is a demographic attribute that classifies individuals based on their sex. The script uses gender as an example to show how the person report can provide the number and proportion of individuals belonging to each gender category, with the specific example given of 50.5% of the population being female and 49.5% being male.

💡Race

Race is another demographic attribute that categorizes individuals based on shared physical characteristics. The script discusses how the person report can show the distribution of race within a data source, noting that in one example, the race is unknown, while in another, specific percentages of the population are identified by race.

💡Ethnicity

Ethnicity refers to the cultural or national identity of a group of people, often distinct from race. The script explains how the person report can identify the ethnicity of individuals within a data source, with one example showing that 93% of patients do not have a matched concept, while 6.3% are identified as Hispanic or Latino.

💡Anomalies

Anomalies in the context of data analysis refer to unusual patterns or outliers that deviate from the expected distribution. The script points out an anomaly in the year of birth data for 1929, where there is an unexpected spike in the number of patients, which could indicate data entry errors or other issues worth investigating.
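
One simple way to flag a spike like the one described above is to compare each year's count with the median of its neighbouring years. The sketch below is a minimal illustration under stated assumptions: the function `find_spikes`, its `window` and `factor` parameters, and all counts except the 1 million figure for 1929 are hypothetical, and this is not the method Atlas uses (in the video, the spike is spotted visually).

```python
from statistics import median


def find_spikes(counts_by_year, window=2, factor=3.0):
    """Return years whose count exceeds `factor` times the median of nearby years.

    `window` is the number of neighbouring years considered on each side.
    Hypothetical helper for illustration only.
    """
    years = sorted(counts_by_year)
    spikes = []
    for i, year in enumerate(years):
        nearby = years[max(0, i - window):i] + years[i + 1:i + 1 + window]
        neighbours = [counts_by_year[y] for y in nearby]
        if neighbours and counts_by_year[year] > factor * median(neighbours):
            spikes.append(year)
    return spikes


# Only the 1929 value (1 million) comes from the video; the rest is made up.
counts = {
    1927: 210_000,
    1928: 220_000,
    1929: 1_000_000,
    1930: 230_000,
    1931: 240_000,
}
spikes = find_spikes(counts)
print(spikes)
```

Using the median of the neighbours rather than the mean keeps a single extreme year from masking itself or distorting the comparison for adjacent years.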

💡OHDSI

OHDSI (Observational Health Data Sciences and Informatics, pronounced "Odyssey") is mentioned in the script as the organization that provides the Atlas platform and the data sources reports. The script encourages viewers to visit ohdsi.org for more information.

Highlights

Introduction to the person reporting within the data sources capability in the Atlas platform.

Ability to select data sources and specific reports for analysis.

Visualization of a year of birth graph with x-axis as year of birth and y-axis as number of persons.

Interactive feature to hover over years to see the number of persons born in that year.

Identification of 1.3 million persons born in 1990 from the data source.

Detection of unusual data spikes, such as the 1 million patients born in 1929.

Use of donut plots to represent person level attributes.

Gender classification with percentages and proportions.

50.5% of the population identified as female and 49.5% as male.

Lack of matched concepts for race in the data source.

Ethnicity data is also unknown for the population.

Change in data source leads to different year of birth distribution and demographic composition.

59% of the population identified as white in the new data source.

9.2% of the population identified as black or African-American.

2% of the population identified as Asian.

29% of patients without an identified race in the new data source.

Ethnicity identified in the new data source with 6.3% identified as Hispanic or Latino.

93% of patients do not have a matched concept for ethnicity.

Invitation to explore more information about OHDSI, Atlas, and the additional data sources reports at ohdsi.org.

Transcripts

00:01

[Music]

00:08

Today we'll provide an introduction to the person reporting within the data sources capability in the Atlas platform. If I select data sources, we can select the data source that we're interested in studying and select any of a series of reports. Here I've selected the person report.

00:28

The person report shows a series of graphs down below. The first graph shows year of birth. On this graph, we're showing the x-axis being the year of birth and the y-axis showing the number of persons with that year of birth. For this source, if I hover over any particular year, I can see the number of persons with that year of birth. Here, for example, we can see that this data source had 1.3 million persons born in 1990.

01:01

This graph allows you to see the distribution of year of birth and also identify and characterize potential anomalies. For example, here we can see that this data source has an unusual spike in 1929, where there are 1 million patients. This would be good information to be able to understand a source better.

01:23

Below we have donut plots for a series of person-level attributes. The first is gender. We see that gender is classified here by male and female, and if I hover over any segment of the graph, we can see the number and proportion of patients belonging to that gender. Here we can see that 50.5 percent of the population, or 43 million, are female, whereas 49.5 percent, or 42 million, are male.

01:58

The middle graph here shows the distribution by race. Here we can see this particular source does not have any matched concepts for race, so race is unknown for this population. And here in this graph we see ethnicity, also unknown.

02:14

If I change data sources, we will see the same person report appear for a different data source. In this particular case, we can see that the year of birth distribution has changed, and we also see a different composition by gender, race, and ethnicity.

02:32

Here we can now see in the race distribution that this source has information where 59 percent, or 55 million patients, are white, 9.2 percent of the population are black or African-American, and 2 percent of the population are Asian. There are still 29 percent of the patients who do not have a known race identified in this data source.

02:58

Ethnicity is also identified in this particular data source. Most patients, 93 percent, do not have a matched concept; however, 6.3 percent of the population is identified as Hispanic or Latino.

03:14

For more information about OHDSI, including details on Atlas and the additional data sources reports, check us out at ohdsi.org.

Related Tags
Data Analysis, Atlas Platform, Demographic Trends, Year of Birth, Gender Distribution, Race Data, Ethnicity Insights, Data Visualization, Healthcare Data, Report Generation, Data Source Comparison