Notes from Teaching DH breakout group

THATCamp AHA 2018

Session 2 – Teaching DH breakout group – professional development for scholars & researchers

 

Questions:

  1. What are models that have worked for scaling up/building capacity in a department or organization?
  • Make alliances with computer science departments; build a team of people who can work on different parts of projects and contribute expertise
  • See what colleges & departments have funding for student research to become research fellows and work with faculty that way
  • Skill share – bring together people working on specific projects with people who have specific technical expertise – CC uses the summer and the break between semesters for workshops; skill shares are usually best if short (1.5 hours) and woven into programming the department already offers; once or twice a semester is usually enough
  • Collect a list of different types of projects and ideas so people can get an idea of what is possible and what is available when writing scopes – ashleyrsanders.com/digital-storytelling-project-ideas/ – template
  • Library of Congress GIS day had an afternoon workshop where all the curators just experimented with StoryMaps
  2. How do you start from 0 in DH?
  • dhatccl101.com – training course for faculty and librarians – lots of resources, helpful to show projects of different scale and time commitments; now have 3 3-day intensive workshops on specific topics (text analysis, GIS, data viz; last summer was digital storytelling) and a 2-day crash course for those new to DH
  • work backward from people’s syllabi and approach based on topics – e.g. this course on 15th century Italy would be a good mapping project, want us to work with you? This worked ~3 months ahead of time, started out by creating guides for students and faculty about how to use tools before class starts, especially for smaller projects; especially good for seminar level topical classes rather than big intros
  • Palladio is an open source tool from Stanford that is useful for visualizing data and social networks – one you can start with when you have a clean, structured data set that’s ready to go; Gephi is a free, open source network analysis tool for research
  • It can be interesting to use multiple tools to look at the same data (e.g. networks, maps, text analysis, etc.) and see what different patterns emerge
  3. What are people looking for, or had success with, from libraries?
  • “Buddy system” where you have a simple tool, ready to go data sets, and a maker day where people who are experienced work with others who aren’t
  • “The Big Idea” – chapter in Exhibit Labels by Beverly Serrell – helps in finding main theme in writing for the public
  • “Collections as data” – libraries presenting collections as data for scholarly use
  4. How to convince departments that digital scholarship IS scholarship?
  • Many faculty are more open to using it in classroom than in their own research because it is not considered equal in tenure decisions – some use digital tools in research but publish traditionally, other option is traditional research but digital/classroom dissemination
  • Structuring data may be a continuing challenge since platforms keep evolving to have different requirements; learning different skills/tools in general is transferable but very specific how-tos for all circumstances are less so

 

Omeka-S

www.omeka.org

Background

  • Omeka-S is the enterprise version of Omeka
  • Omeka Classic originally intended for small institutions – local historical societies, etc.
  • Larger institutions developed interest, needed an institution-wide IT solution to support numerous users and sites
  • WordPress multi-site segregates contents of each site; Omeka-S shares content across sites (available for all institutional users to see and use)

Components

  • Item (same concept as Omeka Classic)
    • Dublin Core metadata (title, description, etc.)
    • Not only text, but also links to other items, URIs, media (files or HTML code), dedicated YouTube video (with variable start/end time for shorter segments)
    • Items can be in multiple collections, not just one
  • Vocabulary
    • Set of established metadata labels (Dublin Core, Bibliographic Ontology)
  • Resource Template
    • Set of metadata labels the user wishes to utilize for a class of items
    • Labels of any resource template are drawn from one or more vocabularies
  • Item Set
    • Analogous to a collection in Omeka Classic
    • Any label applicable to an item can be applied to an item set
    • Allows large groups of items to be added to pages more easily
  • Site
    • More complicated and versatile than Omeka Classic
    • Content of each page built separately
    • Navigation between pages established independently

Major Tools

  • CSV Importer
  • Omeka-2 Importer
  • Custom Vocabularies
  • Zotero Importer
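
A minimal sketch of preparing a spreadsheet for the CSV Importer listed above, using only the Python standard library. The column headers reuse Dublin Core term names (dcterms:title and so on); whether the importer maps such headers automatically is an assumption here, and the columns can otherwise be mapped by hand during import. The sample item is invented.

```python
import csv
import io

# Column headers named after Dublin Core terms (an assumption about
# what is convenient to map on import, not a requirement).
FIELDS = ["dcterms:title", "dcterms:description", "dcterms:date"]


def items_to_csv(items):
    """Write a list of item dicts to CSV text, one row per item."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        # Missing fields become empty cells rather than errors.
        writer.writerow({field: item.get(field, "") for field in FIELDS})
    return buffer.getvalue()


# Invented example item; the comma in the title forces CSV quoting.
sample_items = [{"dcterms:title": "Town hall, 1910", "dcterms:date": "1910"}]
csv_text = items_to_csv(sample_items)
```

Writing through the csv module rather than joining strings by hand keeps titles that contain commas or quotation marks intact.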

Permissions

  • Two levels of permissions – global (institution-wide) and specific to an individual site
    • Global Administrator – Site Administrator – Editor – Reviewer – Author – Researcher
    • Users can have different permission levels for different sites (except Global Administrators)

Tool-Sharing Session

–  voyant-tools.org

Provides word cloud, other tools for sorting and viewing key words.
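
As a rough illustration of what sits behind a word cloud, the sketch below counts word frequencies after dropping a few common words. It is not how Voyant itself is implemented; the stopword list and the sample sentence are made up for the example.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list; real tools use much longer ones.
STOPWORDS = {"the", "and", "of", "a", "to", "in"}


def top_words(text, n=3):
    """Return the n most frequent non-stopword words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


sample = "The archive and the map: the archive of maps, a map in the archive."
frequent = top_words(sample, n=2)
```

Note that "map" and "maps" are counted separately here; merging such forms would need stemming or lemmatization.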

 

– DH Press, UNC: digitalinnovation.unc.edu/projects/dhpress/

 

visualizingtheredsummer.com

 

www.theclio.com

 

curatescape.org

Walking tours.

 

openrefine.org

“A power tool for working with messy data.” Use for information in a spreadsheet that needs to be cleaned up.
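
One kind of cleanup OpenRefine is known for is clustering variant spellings of the same value. The sketch below is a simplified take on "fingerprint" keying (lowercase, strip punctuation, sort the unique words); it is meant to show the idea, not to reproduce OpenRefine's exact behavior. The sample values are invented.

```python
import re
from collections import defaultdict


def fingerprint(value):
    """Reduce a value to a key that ignores case, punctuation, and word order."""
    cleaned = re.sub(r"[^\w\s]", "", value.strip().lower())
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a fingerprint; return only groups with variants."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]


messy = [
    "Library of Congress",
    "library of congress ",
    "Congress, Library of",
    "National Archives",
]
variants = cluster(messy)
```

Each returned group is a set of candidates for merging; a person still decides which spelling to keep.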

 

docker.com-> hub.docker.com

Can be used to explore other tools.

 

thingiverse.com

Repository of shared 3D-printable models.

 

– mapalist.com

Creates a map from a Google spreadsheet.

 

feedburner.com

Automatic tweets. RSS feed.

 

– Palladio (Stanford)

 

dataverse.org

Save and share data with others.

Introduction to Omeka

Omeka

www.omeka.net

(omeka.org has original version and enterprise version, which require more technical knowledge and webmaster capacity)

amandafrench.net/2013/11/12/introduction-to-omeka-lesson-plan/

Overview

  • Web publishing platform for sharing digital collections and creating media-rich online exhibits
  • More durable/useful than other means of posting information online
    • Material that has generally not been online previously
  • Server-based, subscription-based
  • Trial plan does not expire (fewer capabilities)

Components of an Omeka Site

  • Item
    • User-defined – could have multiple files attached
    • Each item has a single set of Dublin Core metadata (title, subject, description, source, date, legal rights, format, coverage, etc)
    • No files required – but could be images, video, audio, map with embedded links, etc.
  • Collection
    • Group of related items
  • Exhibit
    • Group of related items with substantial interpretive text, captions, deliberate visual arrangement
    • Multiple pages with navigation links may be used
  • Tag
    • Optional, additional keywords that can be applied to any item, collection, or exhibit
    • Visitors can search all items, collections, or exhibits, or can search by tags
  • Theme
    • Set of visual appearance preferences; aesthetic
  • Plugin
    • Functional capability
    • Subscription plans are differentiated by number of plugins available (20-32), number of themes (8-11), storage space (2-50GB), number of sites per account (2-∞)
    • Examples:
      • PDF Embed
      • CSV Import
      • Exhibit Builder
      • Google Analytics
      • Hide Elements
      • Geolocation
    • Multiple User Roles with varying permission levels
      • Super User – Administrator – Contributor – Researcher
        • Changes, by default, are not published until user with permission actively publishes

Acquiring Social Media Data

What is an API?

– API stands for Application Programming Interface

– allows software to interact with a website

– API calls consist of a request and a response of structured data
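
To make the request/response idea concrete, here is a small sketch that builds a request URL and parses a JSON response without touching the network. The endpoint api.example.org, its parameters, and the response body are all hypothetical.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.org/items"


def build_request_url(base_url, **params):
    """Return the URL a client would request, with sorted query parameters."""
    query = urlencode(sorted(params.items()))
    return f"{base_url}?{query}"


# A made-up response body in JSON, a structured format many web APIs return.
SAMPLE_RESPONSE = (
    '{"items": [{"id": 1, "title": "Letter, 1863"},'
    ' {"id": 2, "title": "Map, 1901"}]}'
)


def titles_from_response(body):
    """Parse the JSON text and pull out each item's title."""
    data = json.loads(body)
    return [item["title"] for item in data["items"]]


url = build_request_url(BASE_URL, q="letters", page=1)
titles = titles_from_response(SAMPLE_RESPONSE)
```

In practice the URL would be sent with an HTTP client and the response body would come back from the server; the parsing step is the same.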

 

Example: Twitter

– Collect tweets:

1) User timeline: GET statuses/user_timeline; gets the most recent tweets posted by a user; limited to the last 3,200 tweets; returns up to 200 at a time, so you must page; rate limit: 900 requests per 15 minutes. Examples: collecting tweets from individual news organizations or individual members of Congress.

2) Search: GET search/tweets; searches recent tweets (a sample of tweets from the last 7 days); returns up to 100 at a time, so you must page; not the same as search on the Twitter website; rate limit: 180 requests per 15 minutes. Example: getting tweets from an event.

3) Filter Stream: POST statuses/filter; Real-time filtering of all public tweets; continue to receive additional tweets over a single call to API. (No paging.) Limits: when high volume, will not receive all tweets. One stream at a time per set of credentials. Example: Women’s March.
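
The paging described for the user timeline can be sketched as a loop that keeps asking for older tweets until nothing comes back or the cap is reached. Here fetch_page is a hypothetical stand-in for one API request, and make_fake_fetcher simulates the API in memory so the sketch runs without credentials; tools such as twarc handle this for you.

```python
def collect_timeline(fetch_page, page_size=200, max_tweets=3200):
    """Page backwards through a timeline until it is exhausted or capped.

    fetch_page(count, max_id) is a hypothetical callable standing in for one
    timeline request; it returns a list of tweet dicts, newest first, each
    with an integer "id".
    """
    tweets = []
    max_id = None  # None means "start from the newest tweet"
    while len(tweets) < max_tweets:
        page = fetch_page(count=page_size, max_id=max_id)
        if not page:
            break  # nothing older is available
        tweets.extend(page)
        # Next time, ask only for tweets strictly older than the oldest seen.
        max_id = min(t["id"] for t in page) - 1
    return tweets[:max_tweets]


def make_fake_fetcher(total):
    """Build an in-memory stand-in for the API holding ids total..1."""
    archive = [{"id": i} for i in range(total, 0, -1)]

    def fetch_page(count, max_id=None):
        older = [t for t in archive if max_id is None or t["id"] <= max_id]
        return older[:count]

    return fetch_page


short_timeline = collect_timeline(make_fake_fetcher(450))
capped_timeline = collect_timeline(make_fake_fetcher(5000))
```

A real collector would also pause between requests to stay within the rate limit.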

– You can never assume that you have all the data.

– Resources for Twitter data: DocNow (Tweet Catalog); TweetSets

– According to Twitter’s terms, you cannot share complete tweets; you can only share the tweet IDs.

– Once a tweet has been deleted, it cannot be shared.

– How do you collect Twitter data?

Twarc: github.com/docnow/twarc

Twurl: github.com/twitter/twurl

Social Feed Manager: go.gwu.edu/sfmgw

Tags: tags.hawksey.info
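
Because only tweet IDs can be shared, a collected dataset is usually reduced to a list of IDs before it is published. The sketch below assumes the tweets are stored as one JSON object per line with an "id_str" field; the two sample tweets are invented. twarc also offers commands for this kind of "dehydrating" and for re-fetching tweets from IDs.

```python
import json


def dehydrate(jsonl_text):
    """Return the tweet IDs found in line-delimited tweet JSON."""
    ids = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)
        ids.append(tweet["id_str"])
    return ids


# Invented sample data: two tweets separated by a blank line.
SAMPLE = (
    '{"id_str": "950000000000000001", "text": "first"}\n'
    "\n"
    '{"id_str": "950000000000000002", "text": "second"}\n'
)
tweet_ids = dehydrate(SAMPLE)
```

Anyone who later re-fetches tweets from these IDs will get back only the tweets that still exist, which is why deleted tweets drop out of shared datasets.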

 

Example: Facebook

– Graph API Explorer

– JSON

– collect by node.

– can only collect public pages

Tropy

www.tropy.org

Overview

Similar to Zotero, but for images (i.e., photographed documents). Features include ability to:

  • Upload all photos from a source into Tropy
  • Attach the basic info to them (which archive, collection, etc. do they come from?)
  • Add additional categories
  • Group photos into documents, annotate, add metadata (including long-form notes), categorize, search, export (to Omeka, Flickr)

Tropy v1.0 Characteristics

Strengths

  • Possible to edit metadata with multiple items highlighted
  • Also possible to “merge items,” (highlight & right click) apply metadata, then “explode” item
    • Same metadata (including title) applied to all individual images
  • Tags – you can color code them and can also tag items in bulk
    • More than one tag per item allowed
  • Once you click on a photo – to edit the notes etc. – you can select a portion of the photo; useful for highlighting specific material
  • An institution can create a Template specifying what it wants recorded about its material – forcing/encouraging researchers to think about these things – you can add different Properties, such as date
    • Preferences -> Template Editor; 3 basic templates: 1) Archives; 2) Archives Correspondence; 3) Art Objects
    • When changing templates for an image, you do not lose any fields that were filled in; you do lose fields which were blank
    • Includes the ability to create mandatory and read only fields (was created with institutions in mind); you cannot describe every folder/document in the box – but you can describe the box as an institution; and then use material for researchers
  • Global Search function – searches through tags, metadata, notes etc
  • Can export items (JSON format; can be converted to .csv – you can also get rid of some fields in this way, because you can export just the metadata)
    • Planning to add plugins for Omeka
  • ArchivesSpace; Archivists Toolkit (digital assessment system) – plug ins being built for this type of system to move information from Tropy; use that information to populate finding aids
  • Possible to create lists and add photos from your main project there
    • Whatever you delete from the lists is not deleted (it’s just a link)
    • Metadata edits will be applied to items on lists though
  • Once portability is in place you will be able to share Tropy files – via cloud or otherwise
  • Simple features to make the photos more legible (e.g., more contrast) to be developed
  • You can add items without photos (you might not have the right to take a photo, etc.)
  • No theoretical maximum file size
  • Tropy – Zotero interaction under development (Tropy does not generate citations)
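
The JSON-to-.csv conversion mentioned in the list above can be sketched as follows. The input here is a simplified, made-up list of item records with hypothetical field names; a real Tropy export is likely more nested and would need to be unpacked first. Listing only the wanted fields is what drops the rest.

```python
import csv
import io
import json


def export_to_csv(json_text, fields):
    """Flatten a JSON list of item records into CSV, keeping only `fields`."""
    items = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for item in items:
        # Fields not listed are dropped; missing ones become empty cells.
        writer.writerow({f: item.get(f, "") for f in fields})
    return buffer.getvalue()


# Invented records with hypothetical keys, not Tropy's actual schema.
SAMPLE_EXPORT = json.dumps([
    {"title": "Letter to the mayor", "date": "1921-03-04", "note": "torn"},
    {"title": "Meeting minutes", "archive": "City Archive"},
])
metadata_csv = export_to_csv(SAMPLE_EXPORT, ["title", "date"])
```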

Limitations

  • Exclusively handles .jpg and .png formats at present
  • Desktop program only (no mobile app or plan for mobile app)
  • No automatic organization of images
  • Cannot have more than one project open at one time
  • Does not work with proprietary software (example: Scrivener)
  • No plans for OCR or ways to link it with transcriber software
  • Tropy makes smaller copies of your photos
    • These copies use a lot of space, but Tropy still maintains a connection with the original photo
    • If you move the photo on your hard drive Tropy 1.0 will be able to update the path

Other Notes

  • Interest has been expressed in building collaborations with archives/libraries
    • Asking researchers to share their photographs with the institution
    • Making sure researchers apply basic metadata
  • Some institutions plan to use Tropy internally to avoid scanning and re-scanning the same items over and over again
  • After generating templates for any collection, people could use those templates in Tropy
    • Once work is done, could send back to the archival institution/library

Teaching Intro to Digital Humanities Courses

Different Strategies for Teaching DH

– Transcribing: convey to students that all internet content is created; National Archives-Citizen Archivist; assign students to transcribe documents for a specific project

– Start with the History of Computers; What is Data?; Spatial, Network, and Text Analysis; Final Project: Write an NEH Grant

– Research v. Skills Oriented

– Collaborative Independent Study: History and Computer Science

– Method and content: One day of the week devoted to DH theory and historical content, the other day devoted to hands-on work at computer/labs.

 

How do you measure progress?

–  Blog posts

– Group projects: rubric, students can see each other’s work for comparison; have students evaluate themselves and their group members; students keep individual log books

 

How do you sell a DH course?

– Part of another course: “digital” and “theory” in course title deters student enrollment

– class marketed as digital history, class theme introduced in the course

 

Resources

programminghistorian.org

flowingdata.com

Exploring Big Historical Data: The Historian’s Macroscope, by Shawn Graham, Ian Milligan, and Scott Weingart

Using Digital Humanities in the Classroom, by Claire Battershill and Shawna Ross

miriamposner.com

dhbox.org

– Summer institutes: neh.gov/divisions/odh/institutes

Podcasts

I’m interested in a talk session about history podcasts, though I would be happy to include a hands-on teaching portion as well if people would like. I’m interested in discussing how to make them interesting and inclusive research products and/or useful for teaching, but also professional, high quality audio products. I’m also interested in discussing how to produce this type of work within a university structure or other settings, whether in teaching or as a scholarly product, especially within the current job market, tenure system, and funding structures.

Computational analysis of historical texts

I’m interested in developing effective strategies to locate, access in suitable formats, preprocess, and apply computational tools to historical texts available at multiple repositories including archive.org, Library of Congress, HathiTrust, etc. There are OCR, format, and analysis challenges galore to overcome, but I think this approach can provide students with very usable skills and build historical understanding.
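
As one small example of the preprocessing involved, the sketch below rejoins words that OCR output has split across line breaks, normalizes whitespace, and splits the text into lowercase tokens. It is a starting point under simple assumptions (English text, plain-text input); the sample passage is invented.

```python
import re


def clean_ocr(text):
    """Rejoin line-break hyphenation and collapse whitespace."""
    # "govern-" at a line end followed by "ment" becomes "government".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Collapse remaining runs of spaces and newlines into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split cleaned text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


raw_page = "The govern-\nment of the\n  United   States"
cleaned = clean_ocr(raw_page)
tokens = tokenize(cleaned)
```

One known trade-off: a genuinely hyphenated compound that happens to break at a line end will also be joined, so results should be spot-checked against page images.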