Content Modeler

There was some great feedback during the session and I've started thinking about how we might simplify the content modeler.  Please add your comments to this post, post on the Islandora User Google Group, and/or submit tickets through the Islandora project on JIRA.

Note: Since this document was created, the Content Modeler module has been deprecated.

Developing a Content Model for an imaginary collection of typewritten letters

Starting point ... a scanned version of a typewritten letter ... which is a TIFF [ sample image ].



Questions to ask before using the Content Modeler Tool

  1. What kind of metadata schema will I use to describe each letter?
    1. Is Dublin Core sufficient or would MODS be more appropriate or EAD or ...?
      1. You'll need to review your content and select a schema that best matches your needs.  Avoid creating your own schema.
    2. You need to use the FormBuilder to create your metadata form.
  2. If the letters are more than a single page ... how will you deal with that?
    1. There are a few options here:
      1. each page is its own digital object and is related (using RELS-EXT or embedded in the metadata) to a 'collection object' that gathers the pages of the letter together
      2. a single letter object could have several page datastreams
      3. Our preference would be to take an 'atomistic' approach and use method ... each page of a letter would be created as a digital object.
  3. How will your users view/search your collection of letters?
    1. Will you have a grid display of your letter images? Or a list view? Or both?
    2. Will you need a thumbnail for each of your letter images?
      1. If so, you'll need to create a thumbnail datastream that is part of your letter object. What happens if you have many pages in the letter? Just a thumbnail of the first page? What if a search returns a list of letters/pages to the user?
  4. What will the view of a single letter look like?
    1. Will you display the metadata of the letter and a web-based image of the letter? (You may want to use a wireframing tool to sketch out your views, e.g. the Pencil Project, a plugin for Firefox.)
  5. What derivatives will you need to provide the various views to your users?
    1. thumbnail
    2. web based image of the letter
    3. you could add tremendous value to your collection by extracting the text from the page image using an OCR program and include the resulting text in your index for search/discovery.
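
To make the 'atomistic' option from question 2 a little more concrete, here is a minimal Python sketch of one letter object with one digital object per page. It is not Islandora or Fedora client code: the DigitalObject class, the build_letter helper and the demo:letter1 PIDs are hypothetical names made up for illustration. The only borrowed detail is the isMemberOf predicate from Fedora's RELS-EXT relationship ontology, and using it for pages (rather than some other relationship) is an assumption.

```python
from dataclasses import dataclass, field

# Predicate URI from Fedora's RELS-EXT relationship ontology.
IS_MEMBER_OF = "info:fedora/fedora-system:def/relations-external#isMemberOf"


@dataclass
class DigitalObject:
    pid: str    # hypothetical PID, e.g. "demo:letter1-page1"
    label: str
    relationships: list = field(default_factory=list)  # (predicate, target URI)


def build_letter(letter_pid, page_count):
    """Create one parent letter object plus one digital object per page."""
    letter = DigitalObject(pid=letter_pid, label="Letter")
    pages = []
    for number in range(1, page_count + 1):
        page = DigitalObject(pid=f"{letter_pid}-page{number}",
                             label=f"Page {number}")
        # Each page points back at the letter that gathers it.
        page.relationships.append((IS_MEMBER_OF, f"info:fedora/{letter_pid}"))
        pages.append(page)
    return letter, pages


letter, pages = build_letter("demo:letter1", 3)
for page in pages:
    print(page.pid, "->", page.relationships[0][1])
```

Each page can then carry its own thumbnail and OCR text, which is one way of answering the "which thumbnail do I show?" question raised above.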

Based on the outline above we can start to determine the datastreams that will make up a typical letter digital object, which will then help us define the content model for this type of digital object.

Letter Digital Object


Here's a table of datastreams, the datastream ID that I've assigned, and the expected mimetype of the datastreams.

Datastream Label        Datastream ID   Mimetype
Archival TIFF           TIF             image/tif, image/tiff
JPG Image               JPG             image/jpg, image/jpeg
Letter Thumbnail        TN              image/jpg
Descriptive Metadata    MODS            text/xml
Extracted Text          OCR             text/plain
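
As a rough illustration of how the table of datastreams could be used, the Python sketch below holds the same IDs, labels and mimetypes as data and checks a candidate object against them. The check_object function is a hypothetical helper, not part of Islandora, and treating all five datastreams as required is an assumption made only for this example.

```python
# Datastream ID -> (label, accepted mimetypes), taken from the table above.
LETTER_DATASTREAMS = {
    "TIF": ("Archival TIFF", {"image/tif", "image/tiff"}),
    "JPG": ("JPG Image", {"image/jpg", "image/jpeg"}),
    "TN": ("Letter Thumbnail", {"image/jpg"}),
    "MODS": ("Descriptive Metadata", {"text/xml"}),
    "OCR": ("Extracted Text", {"text/plain"}),
}


def check_object(datastreams):
    """Return a list of problems for a {datastream ID: mimetype} mapping.

    Assumption: every datastream in the table is required.
    """
    problems = []
    for dsid, (label, allowed) in LETTER_DATASTREAMS.items():
        if dsid not in datastreams:
            problems.append(f"missing {dsid} ({label})")
        elif datastreams[dsid] not in allowed:
            problems.append(
                f"{dsid} has unexpected mimetype {datastreams[dsid]}")
    return problems


# A letter that has been ingested but not yet fully processed.
candidate = {"TIF": "image/tiff", "MODS": "text/xml", "TN": "image/png"}
for problem in check_object(candidate):
    print(problem)
```

A real content model would express these rules in its own datastream rather than in Python; the point here is only that the table is already most of the specification.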

Letter Content Model


When compared to the Letter Digital Object ... the content model seems a bit thin.  Much of the work of the content model is contained within the ISLANDORACM datastream. Here is a commented FOXML version of the demo:LetterCModel content model. Much of the work of the ISLANDORACM is performed by a variety of functions which are contained within .inc files (PHP files) in the islandora/plugins directory.

More information about Fedora Content Models is available on the Fedora Commons site.


Islandora Content Models

Existing Islandora Content Model Documentation

Sample .inc file code (code doesn't work ... was just a proof of concept)

Review the content models and the associated .inc files present in the Islandora Solution packs.



Ideas about simplifying the content modeler

Participants commented on the complexity of the content modeler and the many options that you needed to know about before making full use of it.  I think associating the datastreams with the ingest and display methods was the most problematic part. As a user or developer, how would you like to use a tool like the content modeler?

For example, a quick thought I had was presenting a checkbox/select-based user interface where you choose what you want to do to an image and where you want to store the output of the process.  The user's selections would be associated with functions as they are now, but without the need to know that the createTN function is in a particular .inc file.
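
One way to picture this idea is a small lookup table that sits between the form and the functions. The Python sketch below is only an illustration of that mapping, not a proposal for the actual implementation (which lives in PHP .inc files). Apart from createTN, every name in it is hypothetical, and the function bodies are placeholders that return strings instead of processing images.

```python
# Hypothetical registry. The function bodies are placeholders, not real
# image-processing code, and only createTN is a name from the modeler.
def create_tn(source):
    return f"thumbnail of {source}"  # stand-in for the createTN function


def create_jpg(source):
    return f"JPG of {source}"


def extract_text(source):
    return f"OCR text of {source}"


# What the user ticks in the form -> the function that does the work.
OPERATIONS = {
    "Create thumbnail": create_tn,
    "Create web image": create_jpg,
    "Extract text (OCR)": extract_text,
}


def run_selections(source, selections):
    """selections is a list of (operation label, output datastream ID)."""
    outputs = {}
    for operation, dsid in selections:
        outputs[dsid] = OPERATIONS[operation](source)
    return outputs


result = run_selections(
    "letter.tif",
    [("Create thumbnail", "TN"), ("Extract text (OCR)", "OCR")],
)
print(result)
```

The user only ever sees the labels and the datastream IDs; which file a function lives in stays an internal detail of the registry.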


You can help us by providing feedback, documentation, ideas, drawings, code, etc.

Don Moses
updated: July 2, 2012