Tuesday, October 16, 2007

Dublin Core: How it happened and how to see it

I first heard of Dublin Core [officially the Dublin Core Metadata Initiative] in Carol Simon's class last summer. One of my classmates gave a short PowerPoint presentation on it. I left with more questions than I started with. OK, I know it is a classification system, but why another one? You have to go to a source outside of their "About the initiative" web page to find that the invitational workshop initiated by OCLC (Online Computer Library Center) was in 1995. OCLC itself was started in 1967 by some very forward-thinking librarians. The thinking of Fred Kilgour (president of OCLC from 1967 to 1980) was shaped during his years in WWII, as Wikipedia reports:
Kilgour served during World War II as a lieutenant in the U.S. Naval Reserve and was Executive Secretary and Acting Chairman of the U.S. government’s Interdepartmental Committee for the Acquisition of Foreign Publications (IDC), which developed a system for obtaining publications from enemy and enemy-occupied areas. This organization of 150 persons in outposts around the world microfilmed newspapers and other printed information items and sent them back to Washington, DC.
Kilgour was a graduate of Harvard whose first job in 1935 was as assistant to the Director of the Harvard University Library. As an aside, he met his wife at the Harvard library, where she was also a librarian. It was at Harvard that he first began collecting microfilm of foreign newspapers for students, which led to his being sought for the work he did in the military. After his military service (1942-1945), he continued working for the State Department (1946-1948) as deputy director in the Office of Intelligence Collection and Dissemination (OCLC, news release).

In 1948, Fred became a Librarian for the Yale Medical Library. He made empirical studies of various categories of people using sets of books at the library to judge what needed to be purchased to serve the educational needs of Yale medical students.

In 1967, Fred was hired by the Ohio College Association for its first project of amassing a union catalog of 54 Ohio universities. After four years of development, the "world's first computerized library network, the Ohio College Library Center, on the campus of The Ohio State University in Columbus" (OCLC, news release) was born in 1971. He was instrumental in refocusing the organization toward its current global thinking and form as the Online Computer Library Center during his tenure as its president from 1967 to 1980. The catalog of the original Ohio College Library Center was expanded and is now WorldCat (on Wikipedia). The Wikipedia entry for Fred gives more information about his experimenting with digital cataloging long before 1967:
"While at the Harvard University Library, he began experimenting in automating library procedures, primarily the use of punched cards for a circulation system. He also studied [graduate school] under George Sarton, a pioneer in the new discipline of the history of science, and began publishing scholarly papers. He also launched a project to build a collection of microfilmed foreign newspapers to help scholars have access to newspapers from abroad. This activity quickly came to the attention of government officials in Washington, D.C.

"In 1961, he was one of the leaders in the development of a prototype computerized library catalog system for the medical libraries at Columbia, Harvard and Yale Universities that was funded by the National Science Foundation. In 1965, Kilgour was named associate librarian for research and development at Yale University. He continued to conduct experiments in library automation and to promote their potential benefits in the professional literature."

It is an honor to be seeking a profession with people of the caliber of Fred Kilgour! Knowing more about OCLC, which initiated Dublin Core, I am now even more curious about DC. By the early 70s I had my first Master's degree (M.Ed.). I was always curious about computers and took all the math courses that were required for a computer degree in case I decided to leave education. I knew people working on computers who let me use them to keep track of a mailing list of the members of an organization that I worked with. In that way, I got my experience using punch cards. In the early 80s I moved to New York.

By 1995, when the first workshop on the standard that became Dublin Core was held, I was making web sites for several businesses. I knew that these businesses needed more than a web site. I checked their competition and advised them that they needed to differentiate what they were offering from what turned out to be hundreds of other similar businesses. I studied search engines and how to set up key terms to push my clients' web sites higher in the search order. In another situation as a volunteer, I reviewed and added to an index for a book. I knew then that someday I wanted to study indexing to find out how a professional indexer would work. I remembered a comment of Kurt Vonnegut's in Cat's Cradle about how indexers think differently from all other people. But since I had never studied library science, I was missing much of the knowledge of how to organize data.

What was missing for me in my colleague's presentation on Dublin Core was that DC was an effort to bring to the Internet and the World Wide Web the librarian's knowledge of how to catalog data so that it can be found by researchers. A new system had to be created because no existing system could do the job, and the new system needed to be an integral part of every document, audio, video, and data file on the Internet. How could we, as beginning library science students in our first course, even begin to understand what bringing librarians' knowledge to the task meant? My first experience using a controlled vocabulary occurred in another Library and Information Science summer school course after the Dublin Core presentation, where I learned how to create a subject entry using the Library of Congress Subject Headings (LCSH).

I remember first encountering the New York yellow pages after my move from Texas. There seemed to be a disconnect between my Texas thinking and what New Yorkers must think, as I had to keep trying different terms to find businesses in those yellow pages. Well, reading through and looking up subjects in the LCSH felt similar to my first use of those NY yellow pages. The process of learning how to use the LCSH took extensive time and did not at first seem to be the time saver that it actually is. But learning to use LCSH did not happen before my colleague gave her presentation on Dublin Core.

As I read more about Dublin Core, I found the information related to building a database. My husband works in the computer field, and in the summers I traveled with him as he delivered his seminars. I learned the notation required for building a computer system reflecting business processes, data flows, and entity relationship diagrams (ERD - see Crow's feet). A one-and-only-one association between elements not only avoids ambiguity and redundancy, but also is a characteristic of a DB key. Clearly, Dublin Core's mission "to facilitate the finding, sharing and management of information" (DCMI About the initiative) is accomplished by bringing the proven method of controlled vocabulary to the world of Internet search, necessarily expressed as machine readable language. This is an ambitious mission given how many people must cooperate to do it! I must share with you a web page by Cory Doctorow called Metacrap: Putting the torch to seven straw-men of the meta-utopia. It is delightful as well as insightful to read. Despite the title, Doctorow is very practical and supportive of efforts such as Dublin Core to add metadata to the Internet.

Participating in standards organizations is one way to improve metadata on the Internet. Remember that Dublin Core was started by OCLC, which is a member of W3C, the World Wide Web standards consortium (founded October 1994, predating the first Dublin Core workshop by less than a year). A programmer can check to be sure his HTML code is W3C compliant. I don't know if the test also looks for Dublin Core. I can test for it later. The National Information Standards Organization (NISO, founded in 1939) has Z39.85, where Dublin Core is alive and well. OCLC is a voting member of NISO. Due to the importance of contributions made by librarians to standards, NISO has a special category of membership for librarians which does not preclude their also being a voting member ($$).

If you use the Firefox browser, consider adding the Dublin Core Viewer Plugin. There is also a second add-on for viewing Dublin Core, called Dublin Core NeViewer. While I finally got both viewers installed, neither is working properly. I left a note for help. Making Dublin Core more visible has to help in understanding it. Of course there are developers of many web sites who do not even know of Dublin Core, just as I did not (although my work was at the time DC was being developed and probably it was not yet in the W3C standards, which I paid attention to). But as a library student, it would be nice to be able to track what of DC is out there in that vast world of the Internet. Think of all the possible users; can you imagine some of them adding DC to their sites? When I succeed in getting one of these Firefox plug-ins (add-ons seems to be their new name) to work, I will write an addendum.
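A viewer add-on is not the only way to see whether a page carries Dublin Core. The following is only a minimal sketch, assuming the page embeds its metadata as HTML meta tags whose names begin with "DC." or "DCTERMS." (a common convention, though not the only way to express Dublin Core). The class name, function name, and sample page are made up for illustration.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collect (name, content) pairs from <meta name="DC.xxx" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        # Assumption: Dublin Core meta names start with "DC." or "DCTERMS."
        if name.lower().startswith(("dc.", "dcterms.")):
            self.fields.append((name, attr.get("content") or ""))


def extract_dublin_core(html):
    """Return the Dublin Core meta tags found in an HTML string, in page order."""
    parser = DublinCoreExtractor()
    parser.feed(html)
    return parser.fields


# Hypothetical sample page; a real page would be read from a file or fetched.
SAMPLE_PAGE = """
<html><head>
<title>Example</title>
<meta name="DC.title" content="Dublin Core: How it happened and how to see it">
<meta name="DC.date" content="2007-10-16">
<meta name="keywords" content="metadata, libraries">
</head><body></body></html>
"""

for name, content in extract_dublin_core(SAMPLE_PAGE):
    print(f"{name}: {content}")
```

The ordinary "keywords" tag is skipped because its name does not carry a Dublin Core prefix. A page with no such tags simply yields an empty list.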

Now that the 15 simple elements and the qualified elements are established, the Dublin Core Metadata Initiative (DCMI) is focusing, through open forum, on developing a list of terms for describing an item. If you are going to have a standard across a large group of people, then someone has to look at various cases of term usage. Even a cursory glance at the list of terms shows that a lot of thought has been put into designing these terms. Developing the metadata terms gives librarians the chance to avoid any weaknesses of other classification systems. It is also interesting to see how the human and the machine readers are both included in their work.
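To make the 15 simple elements concrete, here is a small sketch that lists them and checks a description against the list. The record format (a plain dictionary) and the function name are assumptions made for illustration, not anything DCMI prescribes.

```python
# The 15 elements of the simple Dublin Core element set.
DC_SIMPLE_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def unknown_elements(record):
    """Return, sorted, the keys of a record that are not simple DC elements."""
    return sorted(key for key in record if key.lower() not in DC_SIMPLE_ELEMENTS)


# Hypothetical record: "author" is a natural guess, but the DC element is "creator".
record = {
    "title": "Dublin Core: How it happened and how to see it",
    "author": "A library student",
    "date": "2007-10-16",
}

print(unknown_elements(record))
```

A check like this shows why an agreed list matters: two people describing the same item will only be found together if they use the same element names.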

Some additional links that explain how to use Dublin Core
Web Developer Resource Index: Dublin Core
O'Reilly.com xml from the inside out: An introduction to Dublin Core
National Information Standards Organization (NISO) Z39.85-2007
DCMI Metadata Terms
Dublin Core Metadata Userguide (2005)
