Welcome to StockCentral’s Workshop about Data Sources.
This workshop is based on a report, “Electronic Data: Different Sources, Different Numbers, All Correct,” by Ellis Traub. (Much of the text of this workshop is extracted directly from that report, which will be available in the StockCentral Learning Library.) Ellis undertook the research that led to this report to examine the data service provided by StockCentral and to identify and understand differences between the results provided by three different versions of Take Stock.
Before digging into data, though, let us first consider all of the software that we, fundamental stock investors, use in our decision making. These software programs are available:
· From ICLUBcentral (and BetterInvesting)
o Investor’s Toolkit 5th Edition
o Stock Analyst 3
o Classic Plus
· From ICLUBcentral
o ICLUB Take Stock
o Take Stock Online (at StockCentral.com)
· From other companies
o Churr’s Stock Investment Guide (www.churr.com)
o Intuitec’s Stock Analysis + Portfolio Review (intuitecsoftware.com)
All of these programs require fundamental stock data, primarily in the form of SSG data files, to perform stock analysis.
In the past, investors used Value Line Reports, S&P Tear Sheets, or even annual reports to collect the data necessary to do the required analysis. In those days, astute investors would notice that a Value Line report did not always agree with an S&P Tear Sheet. Often, both of these sources of data would be found to differ from the data provided in an annual report. The question always was “Why?”
Fortunately, the days of manually transcribing printed data are over. We now have three sources of electronic data with which to work:
· NAIC’s OPS/SDS data, provided by S&P Compustat;
· StockCentral’s data, provided by Hemscott; and
· AAII’s Stock Investor Pro data, provided by Reuters.
Each of these services (they are subscription services) provides the data required for stock analysis. Sometimes the data sources are separate from the software. Sometimes a specific program is tied to a specific data source. This may be especially true for software programs that download data automatically from subscription services or that update data from online sources.
[Note that any of these programs can use any source of data, but this usually requires that the needed SSG data files be downloaded to the user’s computer so that the software can import or update the data from the downloaded files. For the purposes of this discussion, this cumbersome process is not considered because most users will not go to the lengths necessary to acquire the data in this manner.]
The table identifies the data source used by each software program.
| Software Program | Data Source Used |
|---|---|
| Investor’s Toolkit 5th Edition | StockCentral data (automatic download) or BI Stock Data Service (automatic download) |
| Stock Analyst 3 | BI Stock Data Service (automatic download) |
| Classic Plus | BI Stock Data Service (automatic download) |
| Take Stock (ICLUB version) | StockCentral data (automatic) |
| Take Stock (Online at StockCentral.com) | StockCentral data (automatic) |
| Take Stock (BetterInvesting version) | BI Stock Data Service |
Only Investor’s Toolkit 5th Edition offers a choice of data sources. Electronic data is also used to produce both of StockCentral’s services that provide investors with a list of stocks suitable for consideration: the Complete Roster of Quality Companies (which uses NAIC data) and the online Screener (which uses StockCentral data). StockCentral data also feeds the various other reports and analyses provided at StockCentral.
Why isn’t data just data?
As noted earlier, even in the days of manual data entry, astute investors often questioned the differences among the data available from Value Line, S&P, and annual reports. Today, electronic data has become more prevalent and competitive. This increase in sources and products has caused considerable confusion because the user can now easily compare the results of using one source of data with those of another. And the results are often different, in some cases radically different.
How can this be? If the data reported by companies is as strictly regulated as it seems to be, and if the SEC is as demanding as we would hope them to be, shouldn’t the numbers all be the same? And, therefore, shouldn’t the results be as well?
The answer is that the “strict” regulation by the Financial Accounting Standards Board (FASB), the body that sets the standards for accountants and analysts with respect to reporting financial results, provides considerable latitude in the manner in which data is reported. And there is considerable room for interpretation between the strict boundaries they set.
Compounding that opportunity for differences is the fact that each data provider has a slightly different profile for its typical client; and these clients run the gamut from the accounting profession (which is concerned with historical accuracy) to the financial analysts (whose concern is forecasting the future). Where the provider who serves primarily the former will diligently include every financial event in the final report, those serving the latter will work hard to exclude from the mainstream reporting those values that are not likely to recur on a regular basis in the future. The former type of client wants to see what actually happened, while the latter is more interested in pro forma figures.
There are also differences of opinion, from provider to provider, as to what data should and should not be considered likely to recur. And, there are also differences in which line items from the income statement or balance sheet should comprise revenues or which should be mapped to extraordinary or non-recurring items—each requiring different handling according to FASB rules.
As if that isn’t enough, the data providers do not produce the files that are used by the ultimate user. That responsibility falls to the services themselves (NAIC, StockCentral, and AAII), which select the required data from the huge databases to which the providers give them access. And it is they who ultimately decide which of the data should appear in the SSG files, or in the limited databases from which screening is performed.
While there is room for error at each of these levels, most of the data we get is correct, even when it differs from a competitor’s data. However, as you will learn, it isn’t all correct, and not all of it is even reliable. It is therefore prudent to be alert for raw data, and for results derived from that data, that stand out and appear questionable. We strongly recommend that due diligence be exercised in all such cases and the data challenged rather than blindly accepted at face value.
The Problem
What first aroused our interest, and then provided the keen motivation for doing this study, was the fact that the Complete Roster of Quality Companies (CRQC), which uses NAIC’s data, differed so radically from the list of desirable or at least acceptable companies generated by the StockCentral Screener. What was so vastly different was the Quality Index (QI), a metric used in both cases to score the quality of a company on a scale from 1 to 10.
We were certain that the algorithms in the two screening processes were the same (one, in fact, was copied directly from the other), so it could mean only that the data that produced the QI differed sufficiently to alter the result.
Since the data items that were used to produce that index were the most significant items in our data files, and since the calculations used to make up the components of the QI would tend to magnify any differences in the actual data points, we determined that a good sample of data to use for the study would be the data for those companies that were included in either the CRQC or the StockCentral Screening product.
Typically, there would be upwards of 120 companies in either group. However, combining the two lists, we found differences sufficient to make the list grow to more than 160 companies. This meant that each list contained approximately 40 companies that did not appear on the other! Another way to view this was that fully a third of the companies on one list did not make the cut for the other. They were too small, too young, or unavailable, or the data used in the calculation of the QI was sufficiently different to produce a poor result in one instance and a good result in the other.
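The arithmetic behind these figures can be checked with a short sketch. It assumes round numbers taken from the text (about 120 companies on each list, about 160 in the combined list); the function name `list_overlap` and its parameters are hypothetical and used only for illustration, and the actual counts in the study were not exactly these.

```python
def list_overlap(size_a: int, size_b: int, size_union: int) -> dict:
    """Derive overlap figures for two lists from their sizes and the size of their union."""
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    shared = size_a + size_b - size_union
    only_a = size_a - shared  # companies on list A but not on list B
    only_b = size_b - shared  # companies on list B but not on list A
    return {
        "shared": shared,
        "only_a": only_a,
        "only_b": only_b,
        "fraction_of_a_missing_from_b": only_a / size_a,
        "fraction_of_b_missing_from_a": only_b / size_b,
    }


# Assumed round figures: roughly 120 per list, roughly 160 combined.
result = list_overlap(120, 120, 160)
print(result)
```

With these assumed inputs, the two lists share 80 companies, each list has 40 companies the other lacks, and 40 out of 120 is the "fully a third" mentioned above.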
This was a serious concern as it would undermine any normal user’s confidence in these products, in the validity of those lists, and ultimately in the quality of the data that was used for those lists. And, it was difficult to overcome a bias, since the NAIC data had been used for so long by so many—having replaced the Value Line data as the “gold standard” for most users. Thus, the StockCentral data, as the “new kid on the block,” would likely be viewed as the data most flawed.
It was necessary, therefore, to come up with a valid, unbiased, and completely objective way to measure the quality of each data source. This study is the result of that effort.
So, stay tuned, and tomorrow we will dive directly into the data analysis.