…and we’re back!

It’s been so long since I’ve updated this blog that I’m not even sure where to start. I suppose I’ll get business out of the way first and note that this website has a new home at shawnaverkamp.com. About a month ago, the SLIS server crashed (on the day of a job interview, of course…), and my website was out of commission for 3 weeks. Since I’ve graduated, I decided it was a good time to make the move from the library school server to a more permanent home. I’ve also been wanting to try out Omeka, a free and open-source web publishing platform created by the Center for History and New Media at George Mason University that enables libraries, archives, and museums to manage and display online collections. I’m hoping to get some experience with the software by using a large collection of digitized family photos as a test collection.

Working backwards: in January, Joanna and I took temporary half-time positions as Digital Project Librarians in DLS to help keep the momentum going on the institutional repository implementation. With the software in place, we’re finally starting to get the repository seeded, searching for new content, and trying to set up workflows for when we leave in July. This has been an incredible learning experience so far. Even though there’s lots of literature cropping up about first-hand encounters with IR-building, it’s still impossible to avoid reinventing the wheel to some degree.

In December, I graduated! With no job lined up, I was lucky to have student work in DLS to hold me over for a month. For my MLIS poster presentation, I explored some of the metadata issues I ran into on the Presidential Campaign Videos Collection project described in an earlier post.  The collection itself is up and running and can be viewed here.  You can check out my poster here.

Check back in the coming months for updates on IR and Omeka progress…

Posted in Uncategorized | Leave a comment

Fall 2008 (early weeks and background)

I realize I haven’t yet talked about my fall project, mostly because it was held up by paperwork for the first month or so. But now that it’s in full swing, I should probably introduce it. This summer, the University Libraries and DLS chose bepress to be their institutional repository management and e-journal hosting platform. Friend/fellow Joanna (who I shall no longer link out to because of her extreme awesomeness and our unfortunate but unavoidable race to find a job in the cutthroat field of digital librarianship) and I attended an excellent conference on IRs and open access this summer and were chomping at the bit to participate in this fall’s bepress implementation project. The original goal for the project was to build faculty web pages for three departments using bepress’ Selected Works service, seed the university’s new Digital Commons repository with publications posted on those faculty pages, and set up two or three local e-journal prototypes using bepress’ EdiKit service. (More about all of these services in a later post.) We would also be exploring possible workflows for all of these tasks as well as researching IR policies and faculty outreach.

Before I get ahead of myself, I want to give a little background on institutional repositories, Open Access, and the conference from this summer. The Open Access (OA) movement grew out of the increasing demand to break the monetary restrictions on online scholarly research articles and other materials and to make them freely accessible to everyone, the main benefit being to share knowledge for the advancement of scientific and scholarly research, especially in developing countries.  OA is delivered in at least one of two methods, the “gold road” or the “green road.” On the “gold road” journals themselves are open access, making articles freely accessible from the time of publication.  On the “green road” authors self-archive their publications in an OA online repository, usually disciplinary or institutional. Institutional repositories (IR) collect, preserve, and provide access to the intellectual output of an institution, usually a research university.  The University of Iowa is one of many institutions around the world starting to build repositories.

ELPUB 2008, aka the International Conference on Electronic Publishing, was held this June in Toronto (the first time ever held in North America). This year’s theme was “Open Scholarship: Authority, Community and Sustainability in the Age of Web 2.0.” Repository administrators, librarians, researchers, and publishers from over 20 countries met to share expertise, frustrations, and ideas for solutions to the challenges facing OA and repositories.  Joanna and I arrived early to catch two great workshops, one on eXtensible Text Framework (XTF), an open source indexing and querying tool, presented by the amiable folks from the California Digital Library, and a brainstorming session with repository managers from around the world on attacking the obstacles standing in the way of successful repositories.  (It was amazing to find how many of these issues could be remedied by a standard for persistent digital identities.) Keynotes by John Willinsky and Stevan Harnad, OA champions, were inspirational and provided context for a field which I up until that point knew little about.  In the conference sessions we learned lots about added-value services, repositories in developing countries, and sustainable OA scholarly publishing models.  Everyone we met was genuinely interested in sharing their professional experiences with newbies like us and seemed excited that two library students came to the conference to learn about OA and IRs.

Still glowing with post-conference buzz at the beginning of the semester, Joanna and I met with our mentor Wendy to start preparing for the arrival of bepress, which we were told was to happen very soon… (to be continued)

Posted in Conferences, Fall 2008, Graduate Fellowship Projects | Leave a comment

Fall 2008 — Week 2

This summer, in addition to wrestling with the Chautauqua migration, I started working on the Political Campaign Video Collection. A UI Political Science professor had collected from the Internet nearly two thousand presidential campaign-related videos for eight of the candidates (four from each party) in order to research how political campaigns use online social media tools to reach voters. The Iowa Digital Library seemed like an ideal place to house these videos for a variety of reasons. The most obvious is that many of these videos were shot in Iowa during the never-ending pre-caucus season, but probably most important is the guarantee of permanence the IDL can provide. Because of the ephemeral nature of online videos, citing them in research can be risky. By parking the videos at an established digital repository, the UI can support and encourage researchers to explore issues in online social media without risking their reputations to broken links.

Another immediate benefit is the opportunity to value-enhance the videos with descriptive metadata. For part of his research, our Poli-Sci professor is recording for each video the subject, the policies addressed, the setting, the purpose, and the type of media outlet used.  Part of our job in DLS has been to convert this information to metadata that is understandable to the layperson and interoperable with existing schemas in the Iowa Digital Library and beyond.  Central Technical Services (CTS) will add descriptive titles, original dates, and location information for the intellectual content of each video.  Also, I’ve finally gotten to try my hand at writing some PHP to harvest duration, upload date, description, and identifier information from online video websites.  Additionally, all records will get some general templated metadata for the candidate, political party, topical subject, type, and collection fields.  All of this metadata put together will make for some very full records. 

This level of detail is probably not feasible for most large collections of this type, but this collection is somewhat of an experiment.  I’m looking at it as an exploration into how we can efficiently harvest metadata and batch populate metadata fields, how we can preserve some of the original context of online social media while placing it in the context of the digital library, how we can transform minimal original descriptive metadata to meet library standards, how we can collocate online social media with other items in the vast library collection, and how we can negotiate fair use to ensure the long-term educational value of these records.  This project has raised a whole host of issues and questions that I’ve been dealing with, and hopefully I’ll get a chance to elaborate more on them in the next few weeks. 

 

Posted in Fall 2008, Graduate Fellowship Projects | Leave a comment

Fall 2008 — Week 1

Well, it’s back to school and back to blogging. Luckily, this summer was packed with both exciting and frustrating digital library experiences, so I have quite a backlog (backblog?) of topics to fill this space with for the next few weeks. I think I’ll kick it off by washing my hands of the Chautauqua project, an object lesson in patient persistence.

I believe when I last blogged about Chautauqua a few months ago, we were still waiting for an answer from ContentDM support. After a few more weeks of gentle nagging, they finally replied to tell us they couldn’t help until we restructured our compound object file hierarchy to their liking. (Which I can kind of understand, but at the same time, I really don’t think that was the problem.) I confess that we had been trying to trick the system by cramming two levels of hierarchy into the object names. Each performer had their own folder, and within that folder were that performer’s brochures, each titled with a number. So you would get an object name like “abbey/2” which would be referenced by ContentDM during uploading to find and retrieve the images. We had chosen this route because of the way the files had been structured during round 1 back in 2000, and in our test runs, it really seemed to work! After getting busted by ContentDM, we restructured and renamed the folders and started trying again. We spent a month or so trying different permutations of the upload process, hoping one of them would let us upload one complete batch without crashing. Long story short, we finally deduced that something buggy was happening with the image banding, possibly in combination with something else. With the summer drawing to a close, it was decided that we would just work around the bugginess rather than try to pursue the perfect solution, so we could get everything uploaded before I started my fall semester project. “Working around the bugginess” involved batching up the objects by image width in MS Access, finding the appropriate banding size for each image width range, then uploading the 40 batches into ContentDM. There was a small amount of crashing, but for the most part it worked, and everything is up, and you can look at it here. (I don’t want to look at it for a while.)
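For the curious, the batching workaround can be sketched roughly as follows. This is a minimal illustration in Python rather than the MS Access queries actually used, and the width ranges, banding sizes, and object names are all made up for the example; the real values aren't recorded here.

```python
from collections import defaultdict

# Hypothetical (min_width, max_width, banding_font_size) ranges in pixels.
# These numbers are placeholders, not the values used in the project.
WIDTH_BANDS = [
    (0, 1000, 8),
    (1000, 2000, 14),
    (2000, 4000, 24),
]


def banding_size(width):
    """Return the banding font size for an image width, or None if no range matches."""
    for low, high, size in WIDTH_BANDS:
        if low <= width < high:
            return size
    return None


def batch_by_banding(objects):
    """Group (object_name, width) pairs into upload batches keyed by banding size."""
    batches = defaultdict(list)
    for name, width in objects:
        batches[banding_size(width)].append(name)
    return dict(batches)


# Made-up sample objects: (name, image width in pixels).
sample = [("abbey_2", 850), ("adams_1", 1600), ("baker_3", 900), ("clark_1", 3100)]
batches = batch_by_banding(sample)
```

Each key in `batches` is one banding size, so every batch can be uploaded with a single banding setting. Anything that lands under `None` falls outside the known ranges and needs a closer look before upload.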

Looking back at this whole migration process has made me think a lot about what goes into choosing a digital asset management system, or software in general for that matter. ContentDM does a lot of things really well, but once you run into a problem, you’re very dependent on their support for a solution. How would we have dealt with this problem if we were using an open source system instead? We may have been able to hack some code to suit our needs, but would we have had any support at all? I feel like I’ve been trained to associate open source software with active and supportive user communities, but just how active and supportive are those communities? Does it depend on the software? What about the open source vendor support companies (like LibLime) that are cropping up on the library landscape? I really don’t have any answers, but I hope to get some experience with open source software sometime, so I can form an opinion.

Posted in Fall 2008, Graduate Fellowship Projects | Leave a comment

Q. Where does my time go?

A. Not blogging.

In the past month or so, I’ve been to two incredible conferences, started a new project for DLS, and almost finished my last one.  So why can’t I just sit down and write about it already?  Every time I get on the computer, I get sucked into reading blogs (professional and trashy, but mostly trashy) watching YouTube, or wasting time on Facebook.  Yes, it’s summer, but I’d hoped to be a little more productive than this.  So, the other day I signed up for RescueTime, a free application that tracks your computer usage by website or program. You can tag websites you frequent to see just how much time per day/week/month, etc. you spend on school, work, scrabulous, whatever. I’m always on the lookout for promising apps to help me manage my information online, but I haven’t found many that manage my time for me.  Hopefully after a week of seeing how I really spend my time online, my guilt will write my conference reports for me…

rescue me

Posted in Uncategorized | 1 Comment

(weeks 12, etc., etc. and beyond)

Well, this project isn’t over yet, but since the semester is, I should probably sum up a little before my summer blogging hiatus. Status on the Chautauqua migration is the same, still waiting for a solution to our software problems before we can start uploading the brochures. Hopefully either this summer or the newest version of ContentDM will bring us a remedy. Last week I gave a presentation on my project experience to SLIS faculty and the project mentors along with the other nine digital fellows. Because I’m now in full summer mode, and I just don’t have the energy to write up a decent summary of the semester, here’s a link to my powerpoint presentation. Maybe I’ll annotate it later on, maybe not. I personally enjoy viewing graphics-heavy online powerpoints and making up my own narrative to go along…

This summer, in between taking a class and working for Digital Library Services, I’ll be attending two conferences: the Digital Humanities Summer Institute in Victoria, BC in May, and the International Conference on Electronic Publishing (ELPUB) in Toronto in June. I’m also planning to use the web skills I learned this semester to renovate my homepage and this blog sometime this summer, and if something exciting happens or if I feel so inclined, I may just post a little update or two.

Posted in Graduate Fellowship Projects, Spring 2008 | Leave a comment

(weeks 9,10,11)

Progress on the Chautauqua collection migration has slowed to a screeching halt, as we’ve encountered major problems batch uploading into ContentDM on a larger scale. I discovered early on in the migration that uploading more than 50 compound objects at one time, each with metadata, images, and associated full-text transcription files, will crash the program. Over the course of about a week Mark, Wendy, and I tried many different combinations of batch uploads (some without metadata, some without text files, some to different ContentDM collections, etc.) to try to narrow down the problem. I’ve realized that I don’t really like this aspect of carrying out a project. (Who does?) It would be so much easier to just pass it off to ITS or ContentDM. But, alas, it’s part of the process, and there’s probably some great learning experience in there somewhere. So, after trying everything, we think we’ve identified the full-text transcription files as the culprit. Having a better grasp on the problem, we’ve sent it out to ContentDM support for help. (One of the perks of proprietary software…)

In the meantime, I’ve been scanning and editing photos for the Iowa Women’s Archives UI Women’s Physical Education collection. It’s been a nice break doing semi-mindless work and working with physical artifacts again. Hopefully this coming week will bring a solution to our problem, so we can get migration back into full swing.

Posted in Graduate Fellowship Projects, Spring 2008 | Leave a comment

(week 8)

With the metadata now finished, we are so close to having the fully migrated test batch ready for Special Collections’ approval. This past week was spent fine-tuning the metadata and image display with Jen and Mark and test migrating about 25 of the brochures into ContentDM. Later in the week we met with John O. in ITS to discuss the feasibility of renaming and stripping the TEI markup from the full-text files for import into the transcription field in the metadata. John wrote up a quick script and got back to us mere hours later with a few sample files for us to play around with. This week I’ll try uploading the images, metadata, and transcriptions all together into ContentDM. The limitations of this software, however, are really starting to get to me. For instance, to add an image rights band to the bottom of an uploaded image, ContentDM will only let you choose a standard font size rather than allowing you to set font size relative to image size. When batch uploading multiple objects of different sizes, some images end up with excessively large banding text while others have tiny illegible banding text. We discussed resizing all of the images to make the banding text more uniform, but decided against it in the end. I personally feel that it’s a bad idea in the long run to change our digital objects just to conform to immature technology.
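Stripping the markup from a TEI file to get a plain transcription can be sketched in a few lines. This is not John’s script, which isn’t reproduced here; it’s a minimal Python illustration using only the standard library, and the sample document is invented.

```python
import xml.etree.ElementTree as ET


def tei_to_plain_text(tei_xml):
    """Return the text inside the TEI <text> element with all markup removed."""
    root = ET.fromstring(tei_xml)
    # Find the first element whose local name is "text", with or without a
    # namespace prefix; fall back to the whole document if there is none.
    body = next(
        (el for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "text"),
        root,
    )
    # itertext() yields every text node in document order; split/join
    # collapses runs of whitespace and line breaks into single spaces.
    return " ".join("".join(body.itertext()).split())


# Invented sample in the style of a TEI-lite file (header plus text body).
sample = """<TEI.2>
  <teiHeader><fileDesc><titleStmt><title>Sample brochure</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <p>The <hi rend="bold">Abbey</hi> Players</p>
    <p>Season of 1912</p>
  </body></text>
</TEI.2>"""

plain = tei_to_plain_text(sample)
```

Only the contents of the `<text>` element are kept, so header information such as the title doesn’t leak into the transcription field.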

I’ve also been thinking a lot about the issues of permanence involved in conducting a migration like this. In my research for a paper I’m writing for my electronic publishing class, I’ve been reading a lot about institutional repositories and the need to provide permanent access (read: URLs) to digital objects in order to guarantee authoritativeness to the sources and support the credibility of the researcher referencing those objects in their scholarly work. How will we guarantee this permanent access for researchers who have already linked to brochures in the old system? Will we leave the old system in place? Will we redirect the URLs? If we redirect, will we redirect to the object referenced or just to the new homepage of the digital collection? These are questions I will have to bring up in the weeks ahead.
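One way to picture the redirect options is a lookup table from old paths to new ones, with the collection homepage as the fallback. The sketch below is Python with entirely hypothetical URLs (neither system’s real paths appear here), and it isn’t tied to any particular web server.

```python
# Hypothetical URLs for illustration only.
NEW_COLLECTION_HOME = "https://example.org/new/chautauqua"

# Old object path -> new object URL, built during migration (one made-up entry).
REDIRECTS = {
    "/old/chautauqua/abbey/2": "https://example.org/new/chautauqua/item/17",
}


def resolve(old_path):
    """Return (HTTP status, target URL) for a request to an old-system path.

    Known objects redirect to the matching new object; anything else falls
    back to the new collection homepage. 301 is "Moved Permanently".
    """
    target = REDIRECTS.get(old_path, NEW_COLLECTION_HOME)
    return 301, target


known = resolve("/old/chautauqua/abbey/2")
unknown = resolve("/old/chautauqua/missing/9")
```

The more complete the table, the fewer researchers land on the homepage instead of the brochure they cited, which is the trade-off behind the questions above.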

Posted in Graduate Fellowship Projects, Spring 2008 | Leave a comment

(weeks 6,7)

Lesson learned: metadata manipulation will always take you longer than planned. After another two weeks of subject metadata massaging and reformatting, the bulk of our metadata is finally ready for the migration into ContentDM. Why has this taken so long? Answer: Humans rule, computers drool. While much of our metadata reformatting could be automated to some extent, subject metadata is finicky and requires human brains to manipulate it semantically. Our original subject metadata was a mishmash of Library of Congress and locally created subject headings with a little LCTGM (Library of Congress Thesaurus for Graphic Materials) thrown in. Headings and LCSH/local subdivisions were combined syntactically into ‘heading -- subdivision’ strings a la LCSH rules (ex. ‘Puppet theater -- History and criticism’). This method of cataloging was good practice eight years ago when the collection was first digitized, but as digital libraries have matured, it’s become apparent that subject metadata can give us more precise search results when it’s split out into its narrower subject types. So, the decision was made to add ‘Geographic subject,’ ‘Personal name subject,’ ‘Corporate name subject,’ ‘Chronological subject,’ ‘LCSH,’ and ‘LCTGM’ fields to our metadata (all mapped to Dublin Core ‘Subject’) and weed through our original subject headings to populate them. It was also important to preserve this ‘legacy’ cataloging, so the original subject heading strings are being left mostly as-is in a field called ‘Local subject.’ Creating these new fields will also help future cataloging efforts when new Chautauqua material is added to the digital collection, as the controlled vocabularies will already be in place. The major time suck occurred when I had to filter through all of these subject headings to find the geographic headings and manually check every unique heading against the current LCSH authorities list. But seeing those super-rich metadata fields in ContentDM? Totally worth it, right?
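The mechanical part of this work, splitting the heading strings and flagging likely geographic terms, is the part a computer can help with. Here is a minimal Python sketch under some loud assumptions: the list of geographic names is a tiny made-up stand-in for the authority checking that was done by hand, and the ‘Heading to verify’ key is a hypothetical working field, not one from the project.

```python
import re

# Separator between heading and subdivisions: "--" or an em dash (U+2014),
# with optional surrounding whitespace.
SEPARATOR = re.compile(r"\s*(?:--|\u2014)\s*")

# Hypothetical stand-in for a geographic authority list.
GEOGRAPHIC = {"Iowa", "United States"}


def split_subject(heading):
    """Split a 'heading -- subdivision' string into working fields.

    The original string is preserved as 'Local subject'. Parts found in
    GEOGRAPHIC go to 'Geographic subject'. The main heading is kept aside
    for a human to check against the current LCSH authorities.
    """
    original = heading.strip()
    parts = [p for p in SEPARATOR.split(original) if p]
    return {
        "Local subject": original,
        "Geographic subject": [p for p in parts if p in GEOGRAPHIC],
        "Heading to verify": parts[0] if parts else "",
    }


topical = split_subject("Puppet theater -- History and criticism")
regional = split_subject("Chautauquas \u2014 Iowa")
```

The sketch deliberately stops short of deciding which vocabulary a heading belongs to. That judgment is the part that needed human brains.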

Okay, so I wasn’t being entirely truthful when I said I spent two weeks on this metadata. For a few days while our metadata decisions were stuck in workflow, I checked out the TEI markup that was done for each brochure. One of the requirements for inclusion in the Library of Congress American Memory project was that text of all items be marked up in TEI-lite. Now I do love text encoding (in fact, I just found out I get to go here this summer to learn some more TEI), but LC’s decision to require their American Memory partners to do this may have been a bit premature and ambitious, not to mention an unnecessary expenditure for the participating institutions (and essentially, the American taxpayers). I wonder if LC ever did anything with these TEI files because, looking at the markup, I can’t really see potential for any useful text mining. Seeing how much time and money went into the manual keying (the brochures’ presentation did not allow for accurate OCR) and text encoding, I’d be interested to find out if any participating institutions did anything with their encoded files. Anyone?

We will, however, be looking this week at the possibility of mining our TEI files to get full-text transcriptions into the ContentDM metadata. I’ll be excited if this works out, since I hate to see investments go to waste. I’ll also be importing the controlled vocabularies into ContentDM in preparation for the migration. If all goes smoothly, we might be ready for the test migration batch by the end of the week.

Posted in Graduate Fellowship Projects, Spring 2008 | Leave a comment

(weeks 3, 4, 5)

It appears I have some catching up to do. Unfortunately, I don’t have much time for that. So, this installment will be short and sweet and told through screenshots.

After the public launch of the African American Women Students at the University of Iowa collection, I was surprised at all of the press it received, including “featured digital library” in American Libraries Direct.

[Screenshot: American Libraries Direct feature]

Two weekends ago, I attended an intensive TEI XML (text-encoding) workshop at the University of Illinois Urbana-Champaign. Here’s a screenshot of the Dorothy Parker poem I worked on.
[Screenshot: TEI markup of the Dorothy Parker poem]

I was very intrigued by the idea of “personography”: adding biographical information to the TEI “header” about any persons referenced in the text. I’m hoping to learn a lot more about TEI, since this workshop, while very informative, gave me only a basic grasp of TEI’s potential.

The metadata is unfortunately still under construction, but almost completed.

[Screenshot: metadata in progress]

This week we’ll hopefully finish up with the subject fields and have the metadata ready for migration.

Posted in Graduate Fellowship Projects, Spring 2008 | Leave a comment