Digitizing Hawaiian Language Newspapers on the World Wide Web
Phase II - FINAL REPORT AND EVALUATION
Martha Chantiny and Joan Hori, UH Manoa Library
Table of Contents: Introduction, Description of Project, Summary of Accomplishments, Evaluation, Final Expenditure Report, Continuation of Project, Image Files Processed

In February 1997, a Diversity and Equity grant of $7,188.00 was received by librarians at the University of Hawaii at Manoa, University of Hawai'i at Hilo, and Honolulu Community College to digitize selected Hawaiian language newspaper articles currently stored on microfilm, enhance them optically, and mount them on the World Wide Web (Web). In November 1997, an additional amount was granted to fund Phase II. Its purposes were to continue the cooperative effort of libraries at the University of Hawaii to provide electronic access to primary Hawaiian language archives, to complete the processing of more than 3,800 images that were scanned from microfilm in Phase I, and to present a hands-on workshop for librarians and Hawaiian language scholars demonstrating the process of digitizing newspaper archives for the Web.

In Phase I we identified problems and issues related to making microfilmed newspapers more widely available through the use of digital technology. A web site -- http://libweb.hawaii.edu/ -- was established on the server of the UH Manoa School of Library and Information Studies. Progress in Phase I was severely limited during most of the project because only a 386 personal computer was available for post-scanning processing. Access to a Pentium computer was made available for Phase II.

Libraries at the University of Hawaii have historically endeavored both to preserve and to make accessible materials on diversity issues. Approximately eighty Hawaiian language newspapers were published in Hawaii from 1834 to 1948. These primary historical documents of Hawaii are currently stored on microfilm, which was produced from newspapers that had deteriorated through the years. Subsequent use by students and scholars has in turn degraded the microfilm.
Some heavily used microfilm also disappeared, and replacement copies were purchased. The new technology of digital scanning enables access to these resources without destroying the resource itself, and allows microfilmed images to be enhanced so that they are more readable than the images available on microfilm.

In addition to promoting diversity by presenting Hawaiian language resources in an electronic format, this project also advances the diversity goal of the University of Hawaii by making Hawaiian language newspapers accessible wherever there is access to the World Wide Web. Few libraries have collections of the microfilm, but increasing numbers of individuals and educational institutions are connecting to the Web. The newspapers present the Hawaiian view of historical events, genealogy, stories, and culture. They will serve as primary resources for future research by a growing population of scholars, thereby advancing the University's goal of educational excellence.

The complete microfilmed holdings of six newspapers were scanned in their entirety during Phase I. They are Ka Hoku o ka Pakipika, Ke Au Hou, Ka Manawa, Ka Lama, Ka Lei Momi, and Ka Lanakila. Processing and mounting of Ke Au Hou on the web was completed during Phase I. However, better microcomputer equipment and software were required to perform the far more complex file manipulation of the graphic images of Ka Hoku o ka Pakipika. The software purchased for Phase II allowed the processing of the Hoku scanned files to be completed. A list of the software programs purchased, tested and used is included in the attached expenditure report. Processing and mounting of Ka Hoku as well as Ka Manawa, Ka Lama, Ka Lei Momi, and Ke Au Okoa were completed during Phase II. This work was performed by the graduate student funded by the Phase II grant. A rough calculation of image files scanned and processed, as well as web 'html' files, is included.
Twenty-two series and articles selected from the Hawaiian language newspapers listed above, as well as from others (Ka Nupepa Kuokoa, Ke Alaula, Ke Au Okoa, Ka Puuhonua o na Hawaii), were printed from microfilm and prepared for scanning during Phase I. All of the articles from Ka Nupepa Kuokoa were scanned and preliminary processing was completed during Phase II. This work was done by a graduate student intern from the Library and Information Studies Program as part of an unpaid, for-credit independent study course. The intern also investigated and identified the procedures for using the Adobe Capture software to produce PDF documents (1) from the scanned story files. Some story files were converted to PDF format and mounted.

Even though not all of the newspapers had been completely processed and made available on the web at the time, the utility of the materials that were available and the increasing 'word of mouth' combined to make March the date of the public unveiling of the site. The web site 'went public' by way of an email announcement sent out to a number of Hawaiian studies related email lists by an early user. The site was demonstrated to a Library and Information Studies class in early March, and at a workshop in Hilo a few days later. The project will be demonstrated to the UH Manoa Hawaiian Language Discussion Group in Fall 1998. The Hawaiian Newspaper digitizing project was demonstrated to seven UH Manoa Hawaiian language classes, one Hawaiian Studies class and a class of 7th and 8th grade Hawaiian immersion students during the Spring 1998 semester. The project was also demonstrated as part of a panel presentation at the 15th Annual Conference of the Association for Asian-American Studies held in Honolulu in June.

In March 1998, a link to the site was added on the UH Hilo Library Hawaiian Collection page; in May 1998, a link to the site was added on the UH Manoa Hamilton Library Hawaiian Collection's Hawaiian Studies Web Sites page in the 'Language' section.
A link from the Kualono web page was added in late March; in June 1998 a link in the 'Education' section was made on the Hawai'i Home Page.

An e-mail comments/survey form was put in place on the web page in late February 1998. A counter to measure the number of 'hits' on the main page was also installed. By the end of June, thirty surveys (as well as a few messages sent directly to the web page maintainer) had been submitted and the usage counter had reached over 1700.

A workshop was presented at the University of Hawaii at Hilo on March 12, 1998. The workshop audience numbered approximately 15 persons and consisted of librarians from UHH and Hilo Public Library as well as an instructor from the Hawaiian language program. Unfortunately, most of the Hawaiian Studies faculty were called away unexpectedly at the last minute to meet with Tlingit Indian representatives visiting the campus that day. An evaluation form was sent to participants after the session to solicit comments and feedback concerning the value of the web contents and delivery. Six evaluations were returned.

One hundred ninety-two hours of graduate student assistant work were funded by SEED, and the Library funded an additional 183 hours from the Hawaiian Collection allotment of student funds so that work could continue through June 19, 1998.

Work that remains unfinished includes the final processing and mounting of the scanned images from the single unfinished title, Ka Lanakila; the creation of fuller 'indexing'; and the addition of 'metadata tags' to the HTML documents. Scanning, processing and creation of PDF files for the selected stories, as well as OCR of image files, also still need to be carried out.

In April 1998, a grant request was submitted to the first-ever Institute of Museum and Library Services (IMLS) National Leadership Grant competition in the hope of receiving a little over $100,000 in funding.
The Library asked to be considered in two of the four competition categories:
Sixty-six proposals were received by IMLS in the first category, with requests totaling $10,330,723. Sixty-five proposals were received in the second category, with requests totaling $8,758,224. In mid-September, IMLS will announce the results of the competition for nearly $6,500,000 to libraries and library/museum collaboratives.

If this grant is awarded to the University of Hawaii, a portion of the funds will be used to continue digitizing and processing Hawaiian language newspapers. The grant request proposes to buy the microfilm scanning equipment which was leased in Phase I of our SEED grant and to digitize an additional 9-13 titles.

Note: 'Tiff' files listed for the full newspaper runs Ka Hoku o ka Pakipika, Ke Au Hou, Ka the images that were scanned during Phase I. 'Tiff' files listed for the stories (selected from the newspapers Ka Nupepa Kuokoa, Ke Alaula, Ke Au Okoa, and Ka Puuhonua o na Hawaii) were scanned during Phase II. Production of 'gif' files (images viewable on the web) and 'htm' files (web page documents) was begun in Phase I and continued during Phase II. Production of 'pdf' (Acrobat Reader) files began in Phase II.

Ka Lanakila was not completed partly because processing of the Ka Lei Momi files took longer than expected. The Lei Momi microfilm images were significantly skewed and required 5-8 minutes of manipulation per page to clean up the GIF image files. The procedure involved the following steps:
Notes

(1) 'PDF' stands for 'Portable Document Format' - a document storage and viewing format created by the Adobe software company and now used extensively on the Web to display or download documents with all their special formatting intact.

(2) 'Metadata' is a method of describing Web documents within the HTML encoding. The descriptions can include information about the relationship between image files as well as 'cataloging' information such as author, title and subject headings. The metadata is not visible on the Web page but can be used by search systems to select web pages to display.

(3) 'OCR' stands for 'Optical Character Recognition' - a mechanism used by software to turn images of documents into text documents. It usually involves a program which examines every image of a letter and 'decides' what text letter to convert the image to. The conversion can be complicated and slow if the images of the letters are not completely crisp and clear, or if the fonts are not 'modern', as is the case in the Hawaiian language newspapers.

(4) 'TIFF' and 'GIF' files are types of image storage formats that computer software can use and manipulate. Web browsers (such as Netscape and Internet Explorer) do not usually display TIFF files automatically - browsers are set up to seamlessly (and relatively rapidly) display GIF images. However, most scanning programs save image files in some kind of TIFF format because the resolution of the image is better. These images must therefore be converted to GIF format to ensure that they can be easily viewed on Web pages. 'HTM' files are the basis of almost all web pages - they are the 'invisible' structure that causes words and images to display on a computer screen when viewed by a Web browser.

Send questions and comments to: speccoll@hawaii.edu