dotCMS diagnosis & cure for broken identifiers, orphaned inodes and versionless links
You ever see an error like this in your dotcms.log?
[14/11/14 08:52:54:570 MST] ERROR action.ViewLinksAction: Exception e =User com.liferay.portal.model.User@d523aa66 does not have permission to read null
com.dotmarketing.exception.DotSecurityException: User com.liferay.portal.model.User@d523aa66 does not have permission to read null
You might have broken identifiers, orphaned inodes and/or links w/o versions. There are a few other reasons for this happening, but for the sake of this article, we're going to demonstrate how you'd definitively determine that you indeed have broken identifiers, orphaned inodes or versionless links and how to fix them.
Broken identifiers and orphaned inodes are the symptom of basically the same problem: stuff that's supposed to be related in your dotCMS database isn't.
A word about identifiers and inodes. Every dotCMS piece of info boils down to an identifier. With dotCMS's versioning, one identifier can have unlimited versions. Inodes describe version numbers. So identifier:inodes relationship is one-to-many relationship.
If your inodes and identifiers are messed up, dotCMS won't run in an expected manner, those are pretty important little info bits there.
You're going to need access via a database client to diagnose and fix. The SQL here is of MySQL flavor but should easily translate in to any of the other DBs, if you get stuck w/ translation, hit me up in comments, I'm happy to help.
Before we lay in any of these fixes, you are forbidden from proceeding without a database backup.
For MySQL, dumping requires that we explicitly include stored procedures and triggers in the dump (backup) file. Without exception, I find myself regularly dumping the database in all of the dotCMS installations I work, here's a Bash shell script I keep handy to automate my database dumping needs.
Identification
How do you narrow down that you do or don't have broken identifiers, orphaned inodes and/or versionless links?
Versionless Links
Let's say for example your Links portlet (Admin > Website Browser > Links) is misbehaving, perhaps not loading at all and you're seeing the error above. Let's investigate possible versionless “links” data by running this:
SELECT COUNT(identifier) FROM links WHERE identifier NOT IN (SELECT identifier FROM link_version_info);
Here we're asking the db for a count of links
records that aren't associated with link_version_info
records. A number < 0 indicates we have some cleanup to do. I cover that in the "fix" section which follows.
Orphaned Inodes
SELECT * FROM inode WHERE type = 'links' AND inode NOT IN (SELECT inode FROM links);
We're asking the database, "From your links data, do we have any inode
records that don't match up?"
An affirmative response would be one or more returns, if you don't get any, that's good, your links
data is sound. If you do get one or more returns however, we have work to do (see the "fix" section to follow).
Broken Identifiers
SELECT COUNT(*) FROM identifier WHERE asset_type = 'links' AND id NOT IN (SELECT identifier FROM links);
In db speak: "From your links data, are you lacking any links
records that we have logged in the identifier
table?" Just like above, one or more returns means we have a problem (fix coming).
Another broken identifier check here:
SELECT COUNT(*) FROM identifier WHERE asset_type NOT LIKE 'links' AND id IN (SELECT identifier FROM links);
We're asking the db: "please count up 'non-link' identifiers that live in the links
table".
And one last broken identifiers check, here we check the parent_path
field:
SELECT COUNT(*) FROM identifier WHERE parent_path LIKE '%http%' AND id IN (SELECT identifier FROM links);
Again, like the checks above, any return > 0 indicates that we have work to do.
Data Fixes
You backed up your db, right? No moving forward without one.
Orphaned Inodes
DELETE FROM inode WHERE type = 'links' AND inode NOT IN (SELECT inode FROM links);
Broken Identifiers
DELETE FROM identifier WHERE asset_type = 'links' AND id NOT IN (SELECT identifier FROM links);
UPDATE identifier set asset_type = 'links' WHERE asset_type not like 'links' AND id IN (SELECT identifier FROM links);
Ensure data consistency
UPDATE identifier SET asset_type = 'file_asset' WHERE id IN (SELECT identifier FROM file_asset) AND asset_type NOT LIKE 'file_asset';
Parent Path Fix
If your parent path check above gave us a count > 0, we're going to need to fix with the following read:
SELECT * FROM identifier WHERE parent_path LIKE '%http%' AND id IN (SELECT identifier FROM links);
If you end up seeing parent paths that follow the pattern ([directory][fqdn/path],(FQDN = Fully Qualified Domain Name)) like in this example, we need this data to write a fix. With that data you just pulled, we'll need the identifier.id
, identifier.parent_path
and a quick + dirty find/replace in your editor of choice to remove that FQDN, like the following:
update identifier set parent_path = '/offices/studentaccounts/' where id = '1b7be8b4-2f3d-4815-a8f0-d0607c74a79e';
And you'll want to write an update statement for each record, like this.
One last versionless link check
Let's run one last check for links w/o version info, like this.
SELECT identifier, inode, title, mod_date FROM links WHERE identifier NOT IN (SELECT identifier FROM link_version_info) ORDER BY identifier ASC, mod_date DESC;
Returns indicate bad data. Consider this return. Note that the identifier is the same in each record. That means that we have one record to update. The fix would work like this:
INSERT INTO link_version_info(identifier, working_inode, live_inode, deleted, locked_by, locked_on, version_ts) VALUES('2ead13e4-5421-4962-bf7c-b66881d305c8', '1562e242-f6cc-4521-a6f5-9ae83d45e0e7', '1562e242-f6cc-4521-a6f5-9ae83d45e0e7', 0, null, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP);
We're telling the link_version_info
table what the working/live version is for this record.
Conclusion
It's very possible that if you're looking at this, you have some corrupt or misbehaving data. It's probable too that it's messed up in a different way than described here. If this didn't get you pointed in the right direction, you still have questions or help in any way ironing out your dotCMS data issue, please either hit us up directly or leave us a note in comments.