Islandora CLAW

Latest CLAW News

New Islandora CLAW Committers: Rosie Le Faive and Mark Jordan

Submitted by manez

In recognition of their many contributions to the community and to the development of Islandora CLAW, the Islandora CLAW committers have asked Rosie Le Faive (UPEI) and Mark Jordan (SFU) to become committers, and we are pleased to announce that they have accepted!

Rosie has brought her dedication to user experience and documentation from the 7.x stack to Islandora CLAW, providing guidance on how to improve the front end and working as the convenor of the UI Interest Group (currently on hiatus) to develop a welcoming first experience for the Islandora CLAW sandbox environment. As co-convenor of the Metadata Interest Group, Rosie has also been integral to the process of plotting out our MODS to RDF mapping so that users of Islandora 7.x can make the move to Islandora CLAW with their MODS in tow.

Mark has joined the Islandora CLAW party more recently, but hit the ground running, developing tools such as RipRap, a fixity-auditing microservice that acts as a successor to Islandora Checksum Checker. Mark's focus on preservation tools fills an important gap in the CLAW ecosystem.

Both Rosie and Mark are also Committers on Islandora 7.x, and now join the short list of dual Committers.

Further details of the rights and responsibilities of being an Islandora committer can be found here:

https://github.com/Islandora/islandora/wiki/Islandora-Committers

Open for Review - Technical Roadmap

Submitted by dlamb

At our last Annual General Meeting, a list of strategic goals was approved by our membership. One of those goals is to:

Create a roadmap for the future of the Islandora platform, including tools and strategies for migration

The Technical Advisory Group (TAG for short) has been working on a prioritized list of upcoming features and improvements for Islandora CLAW to help guide its development. We're aiming to release Islandora CLAW (and drop the CLAW codename) after 7.x-1.12 is released. Once that happens, this roadmap will be used to set sprint goals and other development priorities. We're opening up the roadmap to review by the entire community, and are asking for your feedback. You can leave comments either in this Google Doc or in the individual GitHub issues. We've also ranked these issues using a GitHub project.

Priority was agreed upon following the general rule of providing "must-have" features before migration. In other words, features which, if missing, would prevent someone from adopting the software should receive higher priority. Documentation and examples involving migrations ranked first, with multi-site support following up second. Also on the list are features built around the Fedora API specification, UI/UX improvements, new derivatives, and a lot more.

If there's anything you think is missing that's high priority, feel free to leave a suggestion in the Google Doc, or create an issue on GitHub and give it the Roadmap label. This is your chance to help shape the development of the project, so if you really need something before migrating in, this is a good opportunity to have your voice heard. After this review, the finalized list of features will be presented to the Board of Directors for approval. Once approved, the roadmap will be prominently displayed on our website to help give people a sense of the direction of the software.

7.x to CLAW Migration Sprint - Complete!

Submitted by dlamb

The Islandora community has just wrapped up a very successful sprint dedicated to migrating from 7.x to Islandora CLAW. We at the Islandora Foundation want to give a big thanks to everyone who put in time during this sprint, as well as the organizations who lent us their talent on the company dime. We also want to give a special shout out to the Metadata Interest Group, who collectively put in a ton of time and tackled some intense questions for those who want to use a migration to Islandora CLAW as a chance to do metadata cleanup. During the course of two weeks, we managed to accomplish a lot. As of right now you can:
  1. Migrate over objects based on content type
  2. Migrate ALL the datastreams (except AUDIT, which is a special case)
  3. Extract metadata from any XML datastream and make it a Drupal field
  4. Model authorities such as people, organizations, and subjects
  5. Convert MODS to CSV using Cara Key's (LSU) XML2CSV tool
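
To give a feel for items 3 and 5 above, here is a minimal Python sketch that pulls a few values out of a MODS record and writes them as CSV. It is not the XML2CSV tool or the migration module itself; the field-to-XPath mapping, the function names, and the sample record are all assumptions made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical mapping from a CSV column name to an XPath inside the MODS record.
FIELD_XPATHS = {
    "title": "mods:titleInfo/mods:title",
    "creator": "mods:name/mods:namePart",
    "date_issued": "mods:originInfo/mods:dateIssued",
}


def mods_to_row(mods_xml):
    """Pull the first value for each mapped field; missing fields become ''."""
    root = ET.fromstring(mods_xml)
    row = {}
    for column, xpath in FIELD_XPATHS.items():
        node = root.find(xpath, MODS_NS)
        row[column] = (node.text or "").strip() if node is not None else ""
    return row


def rows_to_csv(rows):
    """Serialize a list of row dicts as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_XPATHS), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample record; a real one would come from a MODS datastream.
SAMPLE_MODS = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Island Voices</title></titleInfo>
  <name><namePart>Doe, Jane</namePart></name>
</mods>"""

if __name__ == "__main__":
    print(rows_to_csv([mods_to_row(SAMPLE_MODS)]))
```

A real migration would need to handle repeated elements (several names or subjects per record), which this sketch ignores by taking only the first match.
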
There's still some work left to do, though. On the horizon for the near term, be on the lookout for:
  1. Migrating the AUDIT datastream
  2. Modeling more/different types of authorities
  3. Examples of extracting authorities from FOXML
  4. A workflow for those who want to use OpenRefine to reconcile linked data authorities during the migration process
Moving forward, this is an excellent chance for people to try out the tools we're developing and point them at their existing repositories. Our migration tool, originally developed by Jared Whiklo (University of Manitoba), is available on GitHub. And if you want to give modeling authorities a go, check out our new controlled_access_terms module, which was made by Seth Shaw (University of Nevada Las Vegas). If anyone has feedback/issues/questions, please feel free to create an issue or post a message on the mailing list. Here's a full list of all the people and organizations who helped make this once-considered-impossible feat a reality:
  • Benjamin Rosner - Barnard College, CU
  • Pat Dunlavey - Born-Digital
  • Andrija Sagic - Library "Milutin Bojic"
  • Ann McShane - Library Company of Philadelphia
  • Cara Key - Louisiana State University
  • Jason Peak - Louisiana State University
  • Jonathan Green - LYRASIS
  • Rachel Leach - Mount Holyoke College
  • Mark Jordan - Simon Fraser University
  • Adam Soroka - Smithsonian Institution
  • Rachel Tillay - Tulane University
  • Pete Clarke - University College Dublin
  • Jared Whiklo - University of Manitoba
  • Mike Bolam - University of Pittsburgh
  • Seth Shaw - University of Nevada Las Vegas
  • Paul Pound - University of Prince Edward Island
  • Rosie Le Faive - University of Prince Edward Island
  • Nat Kanthan - University of Toronto Scarborough
  • Marcus Barnes - University of Toronto Scarborough
  • Carolyn Moritz - Vassar College
Thanks to everyone involved! And if you missed out on this sprint, don't fret. We'll be holding another Islandora CLAW community sprint later this year after Islandora 7.x-1.12 is released.

Islandora and the COAR Next Generation Repositories Report

Submitted by manez

Late last year, a working group of the Confederation of Open Access Repositories (COAR) released a report with recommendations to adopt "new technologies, standards, and protocols that will help repositories become more integrated into the web environment and enable them to play a larger role in the scholarly communication ecosystem." Islandora's own Institutional Repository Interest Group took up the report and measured Islandora against it, looking at both the current functionality available in Islandora 7.x, and how we can best shape Islandora CLAW to meet these recommendations for the future (complete with issues in the CLAW GitHub so we can track our progress). They have shared their own results, written up by convenor Bryan Brown:

 

#1: Exposing Identifiers

The brunt of the recommendation here seems to be implementing best practices listed at http://signposting.org/ regarding typed HTTP links. I’m not sure what Islandora 7.x is doing in terms of typed HTTP links, but I’m assuming nothing beyond whatever Drupal 7 does by default. It could certainly be doing more, but there’s a lot to chew on in the best practices in terms of deciding what actually needs to be done, and how this should be done for different types of objects. CLAW, being a linked data application that operates primarily via HTTP, should definitely be doing these things. I’ve made a use case for this at https://github.com/Islandora-CLAW/CLAW/issues/860.
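
For readers who have not met typed HTTP links before, here is a minimal Python sketch of what a Signposting-style Link header could look like for one object. The relation types (cite-as, describedby, item) are the ones Signposting describes; the URIs and the helper name are hypothetical, and this is not how Islandora 7.x or CLAW currently behaves.

```python
def format_link_header(links):
    """Serialize (target URI, relation type) pairs as an HTTP Link header value."""
    return ", ".join(f'<{uri}>; rel="{rel}"' for uri, rel in links)


# Hypothetical URIs for a single repository object.
landing_page_links = [
    ("https://doi.org/10.1234/example", "cite-as"),
    ("https://repo.example.org/object/1/metadata.jsonld", "describedby"),
    ("https://repo.example.org/object/1/file.pdf", "item"),
]

header_value = format_link_header(landing_page_links)

if __name__ == "__main__":
    print("Link: " + header_value)
```

The hard part is not the serialization but deciding which relations each type of object should expose, which is the question the use case is meant to work through.
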

 

#2: Declaring Licenses at a Resource Level

Very similar to Behavior #1 (Exposing Identifiers), this recommends using best practices from http://signposting.org/ to use typed HTTP links to expose the URI for the license that best describes a resource. Good in theory, but not all licenses have machine-readable URIs, so this would require either migrating existing free-text licenses to ones that have a URI, or in the case of special one-off licenses, creating URIs for local licenses (which wouldn’t be very interoperable). COAR recommends using Creative Commons licenses since they have readily available URIs, but CC licenses aren’t really a good fit for scholarly works since publishing introduces a lot of issues that CC licenses don’t cover. As for the human-readable part, that’s just a matter of your metadata and your theming. 7.x and CLAW both should be able to display human-readable rights statements, but neither can do the HTTP link part currently. CLAW use case at https://github.com/Islandora-CLAW/CLAW/issues/860.
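
The free-text problem can be made concrete with a small Python sketch. It assumes a hand-maintained lookup table from normalized rights statements to license URIs (the table contents and function name are hypothetical) and shows that anything outside the table simply produces no typed link.

```python
from typing import Optional

# Hypothetical lookup table: normalized free-text rights statement -> license URI.
LICENSE_URIS = {
    "cc by 4.0": "https://creativecommons.org/licenses/by/4.0/",
    "cc by-nc 4.0": "https://creativecommons.org/licenses/by-nc/4.0/",
    "public domain": "https://creativecommons.org/publicdomain/mark/1.0/",
}


def license_link(rights_statement: str) -> Optional[str]:
    """Return a rel="license" link value, or None when no URI is known."""
    key = " ".join(rights_statement.lower().split())  # collapse case and whitespace
    uri = LICENSE_URIS.get(key)
    return f'<{uri}>; rel="license"' if uri else None


if __name__ == "__main__":
    print(license_link("CC BY 4.0"))
    print(license_link("Contact the archives for permission"))  # None
```

One-off local statements fall through to None here, which is exactly the gap described above: they would need either a cleanup pass or locally minted URIs.
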

 

#3: Discovery through Navigation

Even more emphasis on using the best practices at http://signposting.org/. 7.x’s Islandora Google Scholar module adds a link to the PDF for citation/thesis objects as an HTML meta tag, but that’s it. It’s easy to see how adding this as a typed HTTP link, especially for compound objects, would be helpful to let a machine know about the different parts of a larger meta-object. This feature would be nice for 7.x, but as a Linked Data Application CLAW should definitely have it. Covered again by https://github.com/Islandora-CLAW/CLAW/issues/860.
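
From the machine's side, discovering the parts of a compound object could look something like the Python sketch below. The header value and URIs are made up, and the parser is deliberately simple: it only understands links written as `<uri>; rel="..."` with no other parameters in between, so treat it as an illustration rather than a full Link header parser.

```python
import re

# Matches only the simple form: <uri>; rel="relation"
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def links_by_relation(link_header):
    """Group target URIs by relation type."""
    grouped = {}
    for uri, rels in LINK_PATTERN.findall(link_header):
        for rel in rels.split():  # a rel value may list several relation types
            grouped.setdefault(rel, []).append(uri)
    return grouped


# Hypothetical header for a two-page compound object.
compound_header = (
    '<https://repo.example.org/object/1/page-1.tiff>; rel="item", '
    '<https://repo.example.org/object/1/page-2.tiff>; rel="item", '
    '<https://repo.example.org/collection/9>; rel="collection"'
)

parts = links_by_relation(compound_header)["item"]

if __name__ == "__main__":
    for uri in parts:
        print(uri)
```
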

 

#4: Interacting with Resources (Annotation, Commentary and Review)

Members of the IR IG are not sold on this one for use in university IRs. Perhaps there are very specific types of repo systems where peer review, comments and annotations are useful, perhaps for aggregators or publishing platforms. In a university IR, it seems like it could actually hinder adoption because faculty might not want folks interacting with their scholarship, and would request mediation for such things which would slow down already overworked IR staff. Drupal already has tons of modules for things like this, so you could probably modify one to work with Islandora objects in 7.x, and in CLAW you wouldn’t even have to write any code, just turn the module on and configure it. Turning those annotations into linked data on the object would be a bit more difficult, but that difficulty would be more in deciding how the metadata should look than how to implement.

 

#5: Resource Transfer

This seems to be suggesting a modern form of OAI-PMH, but in a way that includes assets in the transfer. Strong recommendation for ResourceSync, which we have no experience with, but looks like it would do the job. 7.x will probably never have this, but CLAW should focus on it. Use case at https://github.com/Islandora-CLAW/CLAW/issues/857.
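
For a sense of what ResourceSync asks a repository to publish, here is a minimal Python sketch that builds a resource list, which is a Sitemap document with extra `rs:md` metadata elements. The resource URI, timestamps, and function name are assumptions for illustration, and nothing here reflects an existing Islandora implementation.

```python
import hashlib
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
RS_NS = "http://www.openarchives.org/rs/terms/"


def build_resource_list(resources, generated_at):
    """Build resource list XML from (uri, last_modified, content_bytes) tuples."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("rs", RS_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    # Document-level metadata marks this Sitemap as a resource list.
    ET.SubElement(urlset, f"{{{RS_NS}}}md",
                  {"capability": "resourcelist", "at": generated_at})
    for uri, last_modified, content in resources:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = uri
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = last_modified
        # Per-resource metadata lets a harvester verify what it downloaded.
        ET.SubElement(url, f"{{{RS_NS}}}md", {
            "hash": "sha-256:" + hashlib.sha256(content).hexdigest(),
            "length": str(len(content)),
        })
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical single-resource repository.
    print(build_resource_list(
        [("https://repo.example.org/object/1/file.pdf",
          "2018-06-01T12:00:00Z", b"example bytes")],
        "2018-06-02T00:00:00Z",
    ))
```

Unlike an OAI-PMH response, each entry points at the asset itself, so a harvester can fetch the file and check it against the hash and length.
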

 

#6: Batch Discovery

We aren’t really sure how this differs from Behavior #5 (Resource Transfer) since this seems to be a use case where someone used “Resource Transfer” technology to put all of your repo’s stuff in an aggregator so that it could be found in multiple places. You take care of #5, you already take care of #6. Covered by use case https://github.com/Islandora-CLAW/CLAW/issues/857.

 

#7: Collecting and Exposing Activities

This seems to be a mash-up of #4 and #5: capture interactions, turn them into metadata that you expose, and then push that metadata along with the rest of your data with ResourceSync. There are a LOT of recommendations for possible ways to do this, which underscores the fact that there’s not a clear standard for this and probably not a lot of consumers for this kind of data either. This seems like a “nice to have”, not a “have to have”.

 

#8: Identification of Users

This seems like a good idea, and ORCID seems like the obvious best choice in a scholarly context. We don’t know much about the other two ID systems involved (Social Network Identities and WebID), perhaps they would be good for folks who don’t have an ORCID, but then again perhaps this could be a good way to get people to use/understand ORCID. Use of ORCID could potentially lock out non-academic users, which may be a bug or a feature depending on your goals. Whichever you pick, the problem is going to be getting something that people use across the web in order to deliver on the promises outlined in this section. In an age where people are wary about privacy and the web knowing too much about you, we don’t think this one would get as much broad adoption as COAR thinks.

 

#9: Authentication of Users

We don’t understand how this is different from #8. The two seem to go together to such a degree that separating them is only confusing.

 

#10: Exposing Standardized Usage Metrics

This is a nice dream, but much harder than it sounds. Current generation repositories are pretty close to doing all they can in terms of capturing views/downloads on objects, although client-side triggers are better than server-side ones in order to avoid problems with caching, and Piwik seems to be a winner in the international community due to its focus on privacy and flexibility (although it does require setting up your own Piwik server). Standardizing the way usage stats are exposed from the same repo is a good idea as well, but none of us have experience with SUSHI or COUNTER.

All this can be done to perfect aggregation of usage stats on the same repo, but aggregating/summing stats from external sources is not going to be a practical option until there is a centralized source that does this with a solid API.

 

#11: Preserving Resources

While we agree with the sentiment here, we’re not sure they are saying anything new. Fedora should take care of the actual preservation bits, and Islandora has always requested least-common-denominator open format file types for archival master datastreams and used derivative processes to spin into other formats.

Islandora CLAW modules on Drupal.org

Submitted by dlamb

I've taken the liberty of putting CLAW's Drupal modules on drupal.org as sandbox projects. It is my intention to promote these to full projects once CLAW is released so that our modules can be distributed through drupal.org and made available under the 'drupal' namespace on Packagist. We've always been on the sidelines of the Drupal community, and this feels like a step in the right direction. Not only will our modules be available somewhere other than just GitHub, but Islandora will also get exposure to the wider Drupal community.

This does not mean that we're adopting Drupal's workflow, as CLAW encompasses more than just Drupal modules. As of now, there will be no impact on day-to-day development, which will continue as-is on GitHub. However, the subtleties of its inclusion in the release process will need to get discussed and ironed out as we work through our initial release.

They're not much to look at, but here are the links if you're interested:
