By Ellen-Rae Cachola
On Nov. 1, 2017, Stanislava Gardasevic shared her training on ontologies and the Semantic Web at her talk "Opening Archives for General Public--a data modelling approach," for the UH Manoa Library Information Science Colloquium.
What are ontologies and the Semantic Web? I suggest asking Dr. Quiroga for more details.
But anyway, Stanislava, or Stasha, as she likes to be called, learned these skills in the International Master's Digital Library Learning program in Oslo, Norway. Under the guidance of a professor, she conducted her Master's research with the Europeana database, which uses the Europeana Data Model (EDM).
What’s that?
Well, what I gathered was that the Europeana database is infrastructurally similar to the Digital Public Library of America (DPLA). So the “back end” of the database is not one-of-a-kind. Because it resembles another database, people who already know one system, like the DPLA, can also navigate the Europeana database.
Anyway, Stasha’s focus was how to connect different repositories of collections across many cultural heritage institutions, to centrally present their stuff in one database that could search across all of them. A similar database in Hawai'i that can be used to search across multiple Native Hawaiian repositories is Papakilo.
What are those acronyms about?
Stasha’s talk was important because she discussed how to translate archival data encoded in Encoded Archival Description (EAD) into the EDM.
EAD is based on XML (eXtensible Markup Language), which looks similar to HTML. EAD uses specific XML tags to mark different parts of the finding aid, such as <control>, <abstract>, <filedesc>, <did>. Check out this Library of Congress resource to understand more about EAD and its XML structure.
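To make the tags a little more concrete, here is a minimal Python sketch that reads a tiny, made-up finding aid. The sample XML, its titles, and its nesting are my own simplified assumptions for illustration (real EAD files are far longer and usually declare a namespace), not something shown in the talk.

```python
import xml.etree.ElementTree as ET

# A made-up, heavily simplified finding aid (assumption for illustration).
# Real EAD files are much longer and normally declare an XML namespace.
SAMPLE_EAD = """
<ead>
  <control>
    <filedesc>
      <titlestmt><titleproper>Sample Family Papers</titleproper></titlestmt>
    </filedesc>
  </control>
  <archdesc level="collection">
    <did>
      <unittitle>Sample Family Papers</unittitle>
      <abstract>Letters and photographs of a fictional family.</abstract>
    </did>
  </archdesc>
</ead>
"""

root = ET.fromstring(SAMPLE_EAD.strip())

# Pull out the collection title and abstract from the <did> block.
title = root.findtext("./archdesc/did/unittitle")
abstract = root.findtext("./archdesc/did/abstract")

# List every tag in document order to see how the finding aid is marked up.
tags = [element.tag for element in root.iter()]

print(title)
print(abstract)
print(tags)
```

Even in this toy example, the descriptive information sits several levels deep inside the document.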
EAD to EDM
But don’t get too lost down that rabbit hole, because archivists have been transitioning away from the EAD model. Non-archivists find it hard to use, which discourages them from accessing information about archival collections. EAD was, after all, created only for archives to communicate with other archives, to transfer metadata about their collections to one another.
The EDM is a schema for how the classes and properties of a data set can be configured to be integrated into the Europeana database. EDM records, like machine-readable finding aids, can also be expressed in XML.
Like I said before, regular folk don’t have the same passion for archival intelligence as archivists. The vocabularies and standards that archivists use to describe their collections, such as “finding aids,” "scope note," or "description," are not part of the everyday vocabulary of researchers or history buffs.
Lastly, EAD is very hierarchical. Item-level information is often at the bottom of a very long XML document. It’s so far down the XML page that people give up before finding the treasure that exists in the repository.
Data Mapping
Stasha’s journey to translate from EAD to EDM included mapping out data sets to understand how data are grouped and related to one another. The EDM also uses Dublin Core fields that can hold specific descriptive data points about an item or a group.
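As a rough illustration of what such a mapping can look like, here is a small Python sketch of a crosswalk from a few EAD element names to Dublin Core fields. The particular pairings, the function name `to_dublin_core`, and the sample item are my own assumptions, not the mapping Stasha presented.

```python
# Hypothetical crosswalk from a few EAD element names to Dublin Core fields.
# These pairings are illustrative assumptions, not the mapping from the talk.
EAD_TO_DC = {
    "unittitle": "dc:title",
    "unitdate": "dc:date",
    "abstract": "dc:description",
    "origination": "dc:creator",
}


def to_dublin_core(ead_fields):
    """Rename EAD-style keys to Dublin Core keys, skipping unmapped ones."""
    return {EAD_TO_DC[key]: value
            for key, value in ead_fields.items()
            if key in EAD_TO_DC}


# A made-up item description using EAD element names as keys.
item = {"unittitle": "Letter to a friend", "unitdate": "1925", "physloc": "Box 2"}
dc_record = to_dublin_core(item)
print(dc_record)
```

Fields without a counterpart in the crosswalk (here, the box location) are simply dropped in this sketch; a real mapping would have to decide where such details belong.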
The aim of her research in moving toward the EDM was to find a way to "flatten" EAD’s hierarchy. She created a data modeling scheme to visualize how EAD tags could be mapped onto the EDM and to conceptualize the relationships between groupings, so that we would see the descriptive information not from a top-down point of view but from a more lateral perspective.
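One way to picture "flattening" is to turn each nested component into its own record that simply points to its parent. The Python sketch below does this for a made-up fragment of a finding aid. The sample XML, the `flatten` function, and the `is_part_of` key are hypothetical names of my own, loosely inspired by part-of relationships, and not Stasha's actual scheme.

```python
import xml.etree.ElementTree as ET

# Made-up nested components (series > file > item), for illustration only.
SAMPLE = """
<dsc>
  <c level="series">
    <did><unittitle>Correspondence</unittitle></did>
    <c level="file">
      <did><unittitle>Letters, 1920-1930</unittitle></did>
      <c level="item">
        <did><unittitle>Letter to a friend</unittitle></did>
      </c>
    </c>
  </c>
</dsc>
"""


def flatten(element, parent_title=None, out=None):
    """Walk nested <c> components and return one flat record per component."""
    if out is None:
        out = []
    for component in element.findall("c"):  # direct children only
        title = component.findtext("did/unittitle")
        out.append({
            "title": title,
            "level": component.get("level"),
            "is_part_of": parent_title,  # link to the parent instead of nesting
        })
        flatten(component, title, out)
    return out


records = flatten(ET.fromstring(SAMPLE.strip()))
for record in records:
    print(record)
```

Each record now stands on its own, so the item-level letter can be found directly while still remembering which file it belongs to.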
The EDM also allows records to include linked data, that is, links to records in other databases that aggregate information about a particular subject, place or person. That way, a database can build its own documentation of a subject by linking to information being aggregated about it elsewhere. Examples of repositories used as resources for linked data are viaf.org and Wikipedia.
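Here is a minimal Python sketch of the idea: a record that carries links to outside sources instead of repeating what they say. The record, the `same_as` key, the `external_links` function, and the placeholder VIAF identifier are all assumptions of mine for illustration.

```python
# A made-up record whose creator and place point to outside sources.
# The VIAF identifier is a placeholder, not a real one.
record = {
    "title": "Letter to a friend",
    "creator": {
        "name": "Jane Example",
        "same_as": ["https://viaf.org/viaf/PLACEHOLDER_ID"],
    },
    "place": {
        "name": "Honolulu",
        "same_as": ["https://en.wikipedia.org/wiki/Honolulu"],
    },
}


def external_links(rec):
    """Collect every outside link attached to a record's linked fields."""
    links = []
    for value in rec.values():
        if isinstance(value, dict):
            links.extend(value.get("same_as", []))
    return links


links = external_links(record)
print(links)
```

The record itself stays short, and the fuller story about the person or place lives in the database being linked to.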
In addition, the EDM-based Europeana database provides a search engine, so users can simply type a word in a search box to bring up a web page that has the information they need. This solves the issue of users having to dig through a finding aid’s hierarchy in order to find a particular item.
Europeana, built on the EDM, is an example of a database that tries to embody the semantic web by linking across multiple repositories. Rather than trying to recreate a central database and change everything to fit into this new model, Stasha’s research showed how we can manipulate and reconceptualize our data sets so that they can be read across a single database.
Feel free to add your understanding of this topic below.