mountains of telecom data for crowd fun

Huge archives of files containing U.S. local-exchange telephone companies' service volumes, rates, and revenue from 1992 to 2009 are now available for collaborative reformatting, organizing, and analyzing. U.S. local-exchange telephone companies that the Federal Communications Commission regulates via price caps publicly file annual tariff review data. These data include service volume (demand) and rates for every interstate access service provided under price-cap regulation. The data also include various aggregations of service revenues as well as price indexes used within price-cap regulation.

The source archives consist of standardized (by year) Tariff Review Plans (TRPs) and ad-hoc rate detail files. The files are specific to a filing year (1992-2009) and a price-cap-regulated telephone company service area. The filings consist of annual tariff review filings (usually from about June 15 of a given year), as well as some additional filings (revisions, restructurings) in these formats. Some filings for years prior to 2003 could not be located. The original source files are in Lotus 1-2-3 .wk3 and .wk4 format. The price-cap archive contains 2,946 Lotus 1-2-3 files (with a few files in other formats) and has a total uncompressed size of 4.04 gigabytes. The rate-detail archive contains 2,473 Lotus 1-2-3 files and has a total uncompressed size of 1.39 gigabytes.

I've already made some of the data much more accessible, organized it, and done some analysis that illustrates its use. I organized and categorized all the rate elements for Bell Atlantic and all the rate elements for US West from 1990 to 2009 and put them into tab-delimited text datasets. I also created a tab-delimited text dataset of a section of the Tariff Review Plans (TRPs) for thirteen large, historical telephone company service areas from 1992 to 2009. You can find some of my analysis of the data in this blog's network connectivity category.

Much useful work remains to be done with the data. One important task is to make all the source data more easily and universally accessible. Neither OpenOffice.org Calc nor Microsoft Excel 2007 (nor the forthcoming 2010 Excel product) reads Lotus 1-2-3 .wk3 and .wk4 files. Microsoft Excel 2003 does open the files. Lotus 1-2-3 can still be purchased, now at a suggested retail price of $313. Of course, any conversion of the source files is likely to lose some data. Note that the archives I've created are themselves neither official nor authoritative. A useful format conversion for the data could aim only for the modest objectives of making the data more publicly accessible for exploratory analysis and for stimulating informed discussion of telecom policy.

The data would be more useful if it were better organized. Structuring existing fields into records by company and year would enable many useful queries. By following the models of the datasets I've already set up, anyone could make a small contribution by doing similar work for a small subset of companies and years. Such individual contributions could easily be aggregated. Since comparisons across companies and across years contribute to insight, individual contributions would be much more valuable in the aggregate.
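To make the idea of aggregating contributions concrete, here is a minimal sketch in Python. It assumes each contribution is a tab-delimited text file with the same columns. The column names (company, year, rate_element, rate, demand), the company names, and all the numbers are invented for illustration; they are not the layout of the actual datasets. Multiplying rate by demand to get an implied revenue is likewise only an illustrative query.

```python
import csv
import io
from collections import defaultdict

# Two hypothetical contributions, each covering a small subset of
# companies and years. Names, columns, and values are made up.
CONTRIBUTION_A = (
    "company\tyear\trate_element\trate\tdemand\n"
    "ExampleTel East\t1995\tswitched_access_minute\t0.020\t1000\n"
    "ExampleTel East\t1996\tswitched_access_minute\t0.018\t1200\n"
)
CONTRIBUTION_B = (
    "company\tyear\trate_element\trate\tdemand\n"
    "ExampleTel West\t1995\tswitched_access_minute\t0.025\t800\n"
)


def read_records(text):
    """Parse one tab-delimited contribution into a list of typed records."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    records = []
    for row in reader:
        records.append({
            "company": row["company"],
            "year": int(row["year"]),
            "rate_element": row["rate_element"],
            "rate": float(row["rate"]),
            "demand": float(row["demand"]),
        })
    return records


def aggregate(contributions):
    """Merge contributions into one mapping keyed by (company, year)."""
    by_key = defaultdict(list)
    for text in contributions:
        for record in read_records(text):
            by_key[(record["company"], record["year"])].append(record)
    return dict(by_key)


def implied_revenue(by_key):
    """Sum rate * demand over the rate elements of each (company, year)."""
    return {
        key: sum(r["rate"] * r["demand"] for r in recs)
        for key, recs in by_key.items()
    }


def companies_in_year(by_key, year):
    """A cross-company query: which companies have records for a year?"""
    return sorted({company for (company, y) in by_key if y == year})
```

With files on disk, the same functions would apply after reading each file's text. Keying on (company, year) is what lets separately prepared pieces be combined and then compared across companies and across years.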

Figuring out and administering a good structure for managing the archives and contributions of work is also an important task. This task seems similar to that of running an open-source software project. The success of open-source software projects indicates both that the task is feasible and that expertise in doing it exists.

Many persons complain about telephone companies and criticize government regulation. Here's an opportunity for these persons and anyone else to contribute to a better understanding of telephone companies and of government regulation of these companies. Many hands together could make quick work of reformatting, organizing, and analyzing these huge archives.


limitations of crowdsourcing

The brainpower of all human beings around the earth is vastly underutilized. Organizing production to give more persons more opportunities to use their brains can make a huge contribution to the common good.

“Crowdsourcing” describes some new production arrangements. An interesting example of crowdsourcing is InnoCentive. InnoCentive mediates between companies seeking solutions to R&D problems and persons around the world interested in solving problems. All kinds of persons with all kinds of training have succeeded in solving problems that have been difficult and costly for rigidly structured research organizations to solve.

This shouldn’t be surprising, notes Karim Lakhani, a lecturer in technology and innovation at MIT, who has studied InnoCentive. “The strength of a network like InnoCentive’s is exactly the diversity of intellectual background,” he says. Lakhani and his three coauthors surveyed 166 problems posted to InnoCentive from 26 different firms. “We actually found the odds of a solver’s success increased in fields in which they had no formal expertise,” Lakhani says. He has put his finger on a central tenet of network theory, what pioneering sociologist Mark Granovetter describes as “the strength of weak ties.” The most efficient networks are those that link to the broadest range of information, knowledge, and experience.
[from Wired]

Academic disciplines are largely cartels for dividing up the knowledge market, lessening intellectual competition, and facilitating symbolic claims to authority. Broader, more fluid organizations of intelligence can make a major contribution to creating replicable, instrumental solutions to practical problems.

This kind of production arrangement has some important limitations. In many cases, persons and organizations don’t recognize the most important problems that they need to solve. Defining the problem is nine-tenths of the solution. That’s a cliché. It’s also true. If you don’t understand what the key problem is, you can’t get someone to solve it. This situation is pervasive in the communications industry.

In addition, for many business problems, solutions are quite difficult to evaluate. Solutions to the generic problem, “how to make a lot of money quickly,” can be intelligently dismissed with little effort. Recognizing neglected, decision-relevant knowledge for narrower problems of mundane human behavior (economics) can be simply a matter of logic. But recognizing such knowledge can also require wisdom. Crowdsourcing cannot solve the problem of distinguishing between the wisdom of crowds and the folly of crowds.
