In January, MediaGraph released significant improvements to its already great IPTC support. This includes support for almost all remaining IPTC fields. More importantly, we’ve expanded our support for “Named Entities.” Let me explain:
The Bottom Line
This post goes pretty deep into some theory and practice about metadata, so I’ll put the lede here: Named Entities make it easier to create reliable, unambiguous metadata to describe, well, nearly anything. Our new implementation makes it faster and easier to create relationships between media files and the real-world items they depict. It also makes it easier to find relationships with other named things, inside and outside of the account. Now here’s the deeper explanation:
The IPTC specification has fields to describe many facets of an image or video. This includes freestanding fields like the caption, or the location where the photo was made. Beyond that, some fields can actually describe other fields. For instance, the Creator has dedicated fields for contact information; the Creator is an “entity” that has its own metadata. This is a type of structured metadata.
Here’s a screenshot from Adobe Bridge. While the field names indicate that the address fields all describe the Creator tag, they don’t actually belong to the Creator in Bridge.
Structured metadata helps to describe exactly what we mean by a specific tag. It can help you understand which John Smith the creator tag refers to. It can also tell you about a particular John Smith. This helps to provide both context and meaning for the tag, and the files it is attached to.
There are more than 10 structured entities that are part of IPTC. Each one has specific subfields that tell you about the entity. Entities include person pictured, event pictured, model release information, product information, and a number of licensing entities.
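To make the parent/child idea concrete, here is a minimal Python sketch of a Creator entity that owns its contact fields. The class and field names are illustrative assumptions, not the official IPTC property identifiers and not MediaGraph's internal schema, and the values are placeholders.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class CreatorContact:
    """Child fields that describe the Creator (hypothetical subset)."""
    address: str = ""
    city: str = ""
    email: str = ""
    website: str = ""


@dataclass
class Creator:
    """A parent entity that carries its own nested metadata."""
    name: str
    contact: CreatorContact = field(default_factory=CreatorContact)


# Flat form: the link between the fields exists only in the field names.
flat = {"creator": "John Smith", "creator_city": "Springfield"}

# Structured form: the contact info actually belongs to the Creator.
structured = Creator(name="John Smith", contact=CreatorContact(city="Springfield"))
record = asdict(structured)
```

In the structured form, anything that reads the Creator also gets the contact details, because they travel together as one unit.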
Taming “Wildcat” Metadata
Wildcat bank notes were uncontrolled currencies. Sometimes they had value, and sometimes not.
Corporate DAM systems will typically include lots of inconsistent metadata. This is because files are often created and tagged in other applications before submission. Creator names may be written in a variety of ways. Usage Rights will usually be written in an ad hoc manner. And structured metadata for person, product, event, etc. will often be missing, or will conflict from file to file. Let’s lump all of this inconsistent and unstructured metadata under the term Wildcat Metadata (harkening back to the mid-19th century wildcat banking).
Wildcat metadata produces a poor user experience. Here are a few examples.
- Search and filter results will seem arbitrary if small differences in the tag are treated as entirely different tags. The creator tags “John Smith” and “© John Smith” may be treated as two different creators when, in fact, they refer to the same person.
- If the contact information for John Smith is written multiple ways in different files, you won’t know what is correct.
- When Events are listed only by an undefined title, unrelated items may be grouped together, such as all photos from the NCAA National Championship taken over multiple years.
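The first problem in the list above is easy to reproduce. This is a small Python sketch, using made-up tag values, of how exact string matching splits one creator into several filter values. The `normalize` helper is a hypothetical cleanup step for illustration, not something MediaGraph is claimed to do.

```python
from collections import Counter

# Creator tags as they might arrive embedded in incoming files (made-up values).
creator_tags = ["John Smith", "© John Smith", "john smith ", "John Smith"]

# Exact matching: every spelling variant becomes its own filter value.
exact = Counter(creator_tags)


def normalize(tag: str) -> str:
    """Crude cleanup: drop the copyright sign, trim whitespace, lowercase."""
    return tag.replace("©", "").strip().lower()


# After cleanup, all four tags collapse into one value.
normalized = Counter(normalize(tag) for tag in creator_tags)
```

Here `exact` ends up with three separate creators and `normalized` with one. Note that string cleanup only hides spelling differences. It cannot tell you whether two files credited to “John Smith” refer to the same person.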
One solution to wildcat metadata is the use of Controlled Vocabularies. Essentially this is an approved pick list for various tags. The application can store approved tags, and force the user to choose among these. This creates consistency.
MediaGraph has always had controlled vocabularies for keywords, people and events. Library Managers and above can add a term to the tag tree, and that helps to promote consistent tagging. Tag Suggesters can provide a targeted pick list for uploading. But single-field controlled vocabularies don’t really solve the problem for named entities.
Controlling for single fields won’t ensure that John Smith’s contact information is written consistently, unless it applies to the named entity as a whole.
MediaGraph’s Tag Tree allows you to display important tags for easy discovery by users. The tag tree also supports drag-and-drop tagging for fast and easy metadata creation.
Controlled Vocabularies for Named Entities
Our January release has expanded our support for Named Entity controlled vocabulary. Wherever the IPTC specifies a parent/child data structure, we allow you to create a verified Named Entity. This will be really useful for any clients wishing to reduce ambiguity and inconsistency. Once the Named Entity has been created, it can be applied very easily, and frequently by drag-and-drop. Applying the entity will also add all the child metadata at the same time.
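As a rough illustration of why applying an entity beats typing fields one at a time, here is a Python sketch. The `apply_entity` function, the dictionary layout, and the sample values are all assumptions made for this example. They do not describe MediaGraph's actual API.

```python
def apply_entity(file_metadata: dict, field_name: str, entity: dict) -> dict:
    """Return a copy of the file's metadata with the entity attached.

    The whole entity is attached, so every child field comes along with it.
    """
    updated = dict(file_metadata)
    updated[field_name] = dict(entity)
    return updated


# A verified Named Entity with child fields (placeholder values).
creator = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "website": "https://example.com",
    "verified": True,
}

files = [{"filename": "a.jpg"}, {"filename": "b.jpg"}]

# One action per file attaches the name and all the child metadata.
tagged = [apply_entity(f, "creator", creator) for f in files]
```

Every file in `tagged` now carries identical contact details, which is the consistency a single-field pick list cannot guarantee.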
Conversion of Wildcat Metadata to Verified Named Entities
One of the significant challenges for Named Entities is the conversion of Wildcat Metadata. An organization’s media collection will typically contain a lot of inconsistency, particularly for embedded metadata in incoming files. But you don’t want to ignore the wildcat stuff, since it may be correct. Our challenges here included the following:
- Display the Wildcat metadata whenever it exists.
- Allow an admin to convert a Wildcat Entity to Verified Entity.
- Allow an admin to replace a Wildcat tag with the Verified version.
- Let users know whether they are looking at Wildcat or Verified.
- Not force the Admin to fill out all fields immediately.
So here’s what we did.
- All embedded Named Entity metadata enters our system as Wildcat.
- Admins can save the Wildcat version, transforming it to Verified.
- All entities created in MediaGraph are Verified by default.
- We use blue checkmarks to distinguish Wildcat from Verified.
- Admins can add more data to Named Entities as time permits.
- We also created an automatic filter for files where the creator tag is blank.
Once the verified version of the named entity is created, it's easy to add appropriate new files by drag and drop. It's important that this does not happen automatically by name matching. This might automatically credit some John Smith files to the wrong John Smith.
Compatibility and Expandability
We have taken pains to build out named entities using IPTC fields whenever any appropriate one exists. This ensures that the information is not locked away inside MediaGraph. It is our core belief that any metadata you make in MediaGraph is portable, because it belongs to you.
There is another benefit that our Named Entity implementation offers. We have restructured the database so that our clients can create custom structures that serve their specific needs. If your organization needs a different type of Named Entity, you can make it. And you can add as many child fields as you need.
Example 1 - A Creator
Let’s say you hire Sam Kittner to make a library of photos for your organization. You want all the files to be tagged with correct contact information. You also want to make sure people in your organization know who Sam is, in case they want to hire a photographer. By creating a Named Entity for Sam Kittner, you can make sure that all files have the proper credit and contact information. You can even update the files with new information if his address changes.
Example 2 - Person Pictured
The identity of relevant people is one of the most important details in photos, videos, and other media. As with Creator above, you can save a Verified Entity for individuals. This creates consistency. And the description field and attached links allow you to specify exactly which John Smith the tag refers to.
The Person Pictured fields let you provide unique identifiers and other important context about people in your media collection.
Example 3 - Events
Event Entities can make your metadata workflow much faster and more accurate. An Event Entity can contain lots of relevant information, including the name, a date, a description, a location, names of relevant people and organizations, as well as web links. These can be created in advance for scheduled events. And they can be created afterwards for events in the past. Assigning an event from a pick list allows you to add all the information with a single click.
Adding relevant information to the Event tag itself helps to provide context for your members, and to connect related events.
Example 4 - Rights Packages
Our Rights Packages are a really useful type of Named Entity. There are many different properties in a Rights Package, and they can all be assigned with a single click. Properties include the name of the entity, an administrative description, an IPTC Usage Rights statement, an upload agreement, an Expiration date, an attached contract, and more.
A Rights Package has some great side benefits. It creates consistency in rights labeling. It also provides a place for the legal department to collaborate with the collection manager regarding specific terms and language.
Rights information is also a Named Entity. It helps to clarify your rights to the material in the account to prevent license violations. These Rights Packages can be filtered by category, such as "Unlimited Use" or "Company Owned".
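For readers who think in data structures, here is one way a Rights Package could be modeled in Python. The field names, the `by_category` helper, and the `is_expired` check are assumptions for illustration. The post only states that a package has properties such as a name, a usage rights statement, a category, and an expiration date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RightsPackage:
    name: str
    category: str
    usage_rights: str  # text of the IPTC Usage Rights statement
    expiration: Optional[date] = None  # None means no expiration date is set

    def is_expired(self, today: date) -> bool:
        """True only when an expiration date exists and has passed."""
        return self.expiration is not None and today > self.expiration


# Placeholder packages for the example.
packages = [
    RightsPackage("Staff photography", "Company Owned", "Owned by the company."),
    RightsPackage("Stock license", "Unlimited Use", "Licensed for any use.",
                  expiration=date(2024, 12, 31)),
]


def by_category(items: list[RightsPackage], category: str) -> list[RightsPackage]:
    """Filter packages by their category label."""
    return [p for p in items if p.category == category]


owned = by_category(packages, "Company Owned")
```

Because all the terms live in one package, changing the wording of a usage rights statement means editing one record rather than retyping it on every file.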
Example 5 - Artwork or Product Information
The artwork named entity in IPTC is the most comprehensive named entity in the system. It specifies a standard way to write a wealth of information about a work of art or other object. This can include the title, circa date of creation, description of the original object, museum or collection, inventory number and more. The product information is less comprehensive, but is very useful for manufacturers and retailers. The video below shows how Wildcat Metadata can be converted to a verified entry for Product info. It also shows how you can add additional details at a later date.