Back in 2013, we wrote about all the errors and issues we discovered when attempting to use XBRL data for our models. Back then, even simple data points such as shares outstanding or dates were regularly wrong, or even missing, from company filings. In the past couple years, XBRL has definitely come a long way in terms of improving the quality of the data, but there’s a lot of work left to do.
We’ve been able to integrate XBRL into our models to automatically update certain quarterly data points, but even that took months of continuous mapping and testing in order to reconcile custom tags, data points that don’t sum up correctly, and errors made by reporters. For now, XBRL remains limited in its utility, difficult to use for those without significant time and expertise, and not totally reliable in terms of the accuracy of its data.
Custom Tags Undermine the Purpose of XBRL
Custom tags are the unique descriptions companies create when they believe one of their data points doesn’t conform to the standard tags established by the XBRL GAAP taxonomy. Companies create custom tags on the fly. There’s no way to know they have created one until you try to read the XBRL filing and realize the company has used a tag that you’ve never seen before. When companies use custom tags, a human is required to intervene, investigate the tag, and figure out what the data point truly signifies and how it fits into the big picture of the company’s financials. Sounds a lot like what a human does when he/she reads a regular filing.
We’ve been following this issue ever since 2010, when the number of custom tags started to explode. Custom tags are a wrench in the gears of XBRL or any machine language. The foundational purpose of any kind of machine language is that human intervention is not required to collect or analyze the data. Custom XBRL tags require human interpretation and prevent XBRL from being a truly machine readable language.
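To make the machine-readability problem concrete, here's a minimal sketch of how a consumer of XBRL data can detect custom tags but not interpret them. Standard concepts come from shared taxonomies (such as the us-gaap and dei namespaces), while custom tags live in a company-specific namespace. The tag names and prefixes below are illustrative, not taken from any actual filing:

```python
# Minimal sketch: flag company-specific "extension" tags in a filing.
# Standard XBRL concepts come from shared taxonomies; a tag in a
# company's own namespace (e.g. "abc:AdjustedWidgetRevenue") can be
# detected automatically, but a human still has to interpret it.

STANDARD_PREFIXES = {"us-gaap", "dei"}  # illustrative; real filings use full namespace URIs

def find_custom_tags(tagged_facts):
    """Return the tags whose namespace prefix is not a standard taxonomy."""
    custom = []
    for tag in tagged_facts:
        prefix = tag.split(":", 1)[0]
        if prefix not in STANDARD_PREFIXES:
            custom.append(tag)
    return custom

facts = [
    "us-gaap:Revenues",
    "us-gaap:NetIncomeLoss",
    "abc:AdjustedCommunityLevelEBITDA",  # hypothetical custom tag
    "dei:EntityCommonStockSharesOutstanding",
]
print(find_custom_tags(facts))  # → ['abc:AdjustedCommunityLevelEBITDA']
```

Finding the custom tag is the easy, automatable part; deciding what "AdjustedCommunityLevelEBITDA" actually means, and where it belongs in the financial statements, is where the human has to step in.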
There’s no need for all these custom tags. After all, the XBRL reporting taxonomy contains over 17,000 tags. You’d think companies wouldn’t have trouble finding a tag to fit almost any data point they might use. However, the massive number of tags actually contributes to the problem. With so many different possible tags—and with annual reports as long as some novels—it requires a massive amount of effort to search through all the tags to find the right one, so companies sometimes just use their own custom tags instead of going through all that effort.
The SEC’s research shows that larger companies, the ones that have the resources to spend on ensuring XBRL accuracy, have been using fewer custom tags, but the number keeps going up for smaller companies. Many of these companies contract the work of preparing XBRL statements to cheap third-party providers that have little interest in doing anything but the bare minimum to ensure compliance. For small filers, over 10% of the tags they use are custom tags.
In order to resolve the issue with custom tags, we had to leverage the massive amount of work our forensic accounting expert analysts have done in parsing thousands of filings every year over the past decade. With all the expert-verified data in our system, we have an expansive library of our own tags that we match up to custom tags in order to determine how those data points should be treated. We are not aware of any other entity with this capability.
Major Errors In Reporting
Custom tags are a pain, but they are far preferable to tags that are completely wrong. It’s no secret that XBRL data continues to have significant errors. XBRL US even formed a special committee recently to address the significant quality concerns. Many of these problems are simple errors that could easily be caught with software systems, such as incorrect positive/negative signs, required values not getting reported, or a data point getting attributed to the wrong corporate entity. Despite this fact, these errors still exist in large quantities, further undermining the utility of XBRL by increasing the dependence on humans to find and rectify errors.
Check out the huge number of errors that automated XBRL validation checks discover. On August 10 alone they found 3,659 errors in 167 different filings. Some of these are relatively benign issues, but others are not. One of the filings, an annual 10-K report, had nine different instances of required values not being reported at all in the XBRL document.
There are many simple automatic software checks that can find and report these issues. Unfortunately, there’s almost no incentive for companies to take any concrete action relating to these errors, as there’s no real enforcement mechanism to punish them for their mistakes.
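The checks described above are genuinely simple to automate. Here's a hedged sketch of the idea, run against a dictionary of tagged values pulled from a hypothetical filing; the required-tag list and sign rules are illustrative, not the SEC's actual validation rules:

```python
# Sketch of automated sanity checks of the kind described above.
# Tag names, required-tag lists, and sign conventions are illustrative.

REQUIRED_TAGS = {"us-gaap:Revenues", "us-gaap:Assets", "dei:EntityRegistrantName"}
MUST_BE_NONNEGATIVE = {"us-gaap:Revenues", "us-gaap:Assets"}

def validate_filing(facts):
    """Return a list of error messages for a dict of {tag: value}."""
    errors = []
    # Check 1: required values must be present.
    for tag in sorted(REQUIRED_TAGS - facts.keys()):
        errors.append(f"missing required value: {tag}")
    # Check 2: concepts that should be positive must not carry a flipped sign.
    for tag in sorted(MUST_BE_NONNEGATIVE & facts.keys()):
        if isinstance(facts[tag], (int, float)) and facts[tag] < 0:
            errors.append(f"unexpected negative sign: {tag}")
    return errors

filing = {
    "us-gaap:Revenues": -1_000_000,  # sign flipped by the preparer
    "us-gaap:Assets": 5_000_000,
    # dei:EntityRegistrantName omitted entirely
}
for err in validate_filing(filing):
    print(err)
```

A few dozen rules like these, run at filing time, would catch the most common mechanical errors before they ever reached investors; the obstacle is incentives, not technology.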
This creates a self-reinforcing feedback loop. Investors tend to ignore XBRL data because it is so inaccurate, which means the SEC doesn’t make cracking down on errors in XBRL data a priority, which means companies don’t go to a great deal of effort to improve the accuracy of their data. There have been some small improvements as companies gain more experience using the system, but without any strong motivation to correct their errors it’s hard to believe they’ll eliminate them in the future.
When we set out to integrate quarterly XBRL data into our models, these errors caused numerous headaches. In our first trial, roughly 50% of the filings we processed contained errors that prevented the data from summing correctly. Improperly tagged or categorized numbers would prevent, for instance, the various components of Property, Plant, and Equipment from adding up to the total listed on the balance sheet.
Where this occurred, we had to dive into the extension documents that accompany the filing to check the company’s calculations and figure out what went wrong. We’ve reached the point where we can automate most of this process, but even still we need our expert analysts to run manual checks to ensure data accuracy.
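The roll-up check described above can be sketched in a few lines. The component names and figures below are made up for illustration; real filings express these relationships in a calculation linkbase, but the underlying arithmetic test is this simple:

```python
# Sketch of the roll-up check described above: do the tagged components
# of a line item sum to the reported total? Figures are illustrative.

def components_sum_to_total(components, reported_total, tolerance=1.0):
    """True if the components add up to the reported total (within rounding)."""
    return abs(sum(components.values()) - reported_total) <= tolerance

ppe_components = {
    "Land": 2_000_000,
    "Buildings": 8_500_000,
    "MachineryAndEquipment": 4_300_000,
}
reported_ppe_gross = 15_000_000  # components sum to 14,800,000: something was mis-tagged

print(components_sum_to_total(ppe_components, reported_ppe_gross))  # → False
```

When a check like this fails, the arithmetic tells you that something is wrong but not what: a human still has to trace the discrepancy back through the filing, which is exactly the manual work XBRL was supposed to eliminate.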
Still Limited Utility
So far, these limitations have prevented XBRL from seeing widespread use among financial professionals. Companies such as Calcbench have used it to make a great deal of data easily accessible, but the numbers are not integrated into any sophisticated models or analysis.
For our own part, it’s taken a great deal of time and effort from our analysts and engineers—who have spent years automating the extraction of data from SEC filings—to use XBRL to update the most basic data points in our model from quarterly filings. For the sophisticated footnotes analysis and earnings adjustments, we still rely on our expert analysts and our patented research platform to analyze annual filings.
For the average investor, without significant technical expertise or experience, XBRL is essentially unusable for analyzing an individual company, much less trying to compare data across the broader market. Hudson Hollister, founder of the Data Transparency Coalition, put it best when he said, “the theoretical benefits of transforming government information from documents into searchable data have not been realized in the SEC’s disclosure regime.”
What little data exists on XBRL use doesn’t look good. A Columbia study from 2012 found that less than 10% of investment professionals use XBRL, and there’s nothing to suggest that number has improved in the past few years.
Unfortunately, the relative apathy of most investors towards XBRL means it might struggle to realize these theoretical benefits. A bill that recently passed the House of Representatives and has gone to the Senate would exempt companies with less than $250 million in revenue from having to use XBRL. The argument is that reporting in this manner is too expensive for smaller companies given the small number of investors who actually use the data.
We hope that this bill does not become law. For all its flaws, XBRL still has the potential to revolutionize investment research and financial reporting. If the data becomes more reliable and easier to use, it would not just simplify our own business, it would make financial data more transparent and easier to understand for the broader market. Better data means a more efficient market and a better allocation of capital, which is critical to the long-term health of our economy.
We’re excited that the data has come far enough that we can start using it in our models, but it still needs to improve a great deal if it is going to have a real benefit to the broader market. Hopefully the SEC will put enough effort and resources into improving the taxonomy and enforcing accuracy by reporters so that we can all reap the rewards of this technology.
Disclosure: David Trainer and Sam McBride receive no compensation to write about any specific stock, sector, style, or theme.
Photo Credit: Kevan (Flickr)