knowing when to re-optimize your website

Google recently made website optimization freely available to everyone. Perhaps this truly is a key element in Google's plan for world domination. But in any case, it's good news for those times when you want to choose data analysis over whimsy and hope.

A danger of free, fast, and easy testing might be that its users will become complacent about the background structure of testing and optimization. Statistical tests always depend on background assumptions about data-generating processes. For example, when the number of users of a website increases by orders of magnitude, the attributes of its users (age, sex, internet experience, etc.) may change. What had been an optimal content design may no longer be optimal.
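To make that concrete, here is a minimal sketch with made-up numbers. The segment names, conversion rates, and user mixes are all hypothetical, not data from any real test. It shows how the design that wins a comparison can change when the mix of users shifts, even though each segment behaves exactly as before.

```python
# Hypothetical conversion rates for two page designs, by user segment.
# Every number here is an illustrative assumption.
RATES = {
    "A": {"experienced": 0.10, "novice": 0.02},
    "B": {"experienced": 0.07, "novice": 0.05},
}


def overall_rate(design, mix):
    """Blend segment conversion rates using each segment's share of users."""
    return sum(RATES[design][segment] * share for segment, share in mix.items())


def best_design(mix):
    """Return the design with the highest blended conversion rate."""
    return max(RATES, key=lambda design: overall_rate(design, mix))


early_mix = {"experienced": 0.8, "novice": 0.2}  # small, expert audience
later_mix = {"experienced": 0.2, "novice": 0.8}  # after rapid growth

print(best_design(early_mix))
print(best_design(later_mix))
```

Nothing about either design changed between the two calls. Only the audience did, which is the background assumption a quick test can silently rely on.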

Economists have long pondered problems of behavioral estimates. To get a sense of the issues, suppose that you observe that over the past ten years both the price of widgets and aggregate sales of widgets rose strongly. That doesn't mean that raising the widget price will increase widget sales. More likely the opposite is true. Increasing demand for widgets may account for the correlation between higher prices and greater sales.
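The widget story can be sketched with a toy linear market. The functional forms and every coefficient below are assumptions chosen for illustration: demand is Q = a - b*P, supply is Q = c + d*P, and the demand intercept a grows each year.

```python
def equilibrium(a, b=2.0, c=0.0, d=1.0):
    """Price and quantity where demand Q = a - b*P meets supply Q = c + d*P."""
    price = (a - c) / (b + d)
    quantity = a - b * price
    return price, quantity


def quantity_demanded(a, price, b=2.0):
    """Quantity buyers want at a given price, holding demand (a, b) fixed."""
    return a - b * price


# Ten years of growing demand: the intercept a rises from 30 to 120.
history = [equilibrium(a) for a in range(30, 130, 10)]
prices = [p for p, _ in history]
quantities = [q for _, q in history]

# Observed data: price and sales rise together year after year.
print(prices[0], quantities[0], prices[-1], quantities[-1])

# Yet with demand held at its final level, a higher price means lower sales.
print(quantity_demanded(120, 40.0), quantity_demanded(120, 45.0))
```

The positive correlation in the history comes entirely from the shifting demand curve, which a regression of sales on price alone would omit.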

The problems for human behavioral estimates are far worse than those of omitted variable biases in other physical systems. Humans are good at collecting and sharing information, and humans often change their behavior in response to new information, new rules, and new incentives. In economics, the Lucas critique is a famous recognition of this reality for parameter estimation.

Among empirical economists, a commonly expressed goal is to estimate "structural parameters." Structural parameters are interpretable parameters relatively insensitive to plausible changes in the environment and rules of the game.[1] When you optimize with respect to such parameters, your optimization is more likely to be valid over time. Moreover, since you can interpret the parameters, you can anticipate changes in the environment that are likely to invalidate your optimization.

Economists have produced a huge, complex literature on structural estimation. But probably all you really need to know from it for website optimization are a few points:

  • Website optimization is not forever. Changes in circumstances (New Year's Day promotions won't work year-round), changes in user characteristics, changes in the experience of users with other websites, and changes in your business reputation may change the optimal configuration of your website.
  • Seek some understanding of what works for you. Testing can help you discover what works. But sound interpretation and knowledge remain valuable. Understanding why a particular configuration works can give you insights into when it might need to be changed (or re-tested). In addition, understanding what works might provide you with more general insights into your users.
  • Expect your users to learn and respond to what you do. If an aspect of your optimization involves baiting and screwing your users, they will learn about it and adapt to it. Your optimal screw of your users can turn to screw you.

Note:

[1] To economists, parameters describing preference functions (demand) and cost functions (supply) are structural parameters. That economists consider parameters describing preferences and technology to be relatively stable may seem laughable to persons engaged in viral marketing and rapid web service development.

Wednesday's flowers

pink flowers blue sky

real indignities

Richard Evelyn Byrd held the title Rear Admiral in the U.S. Navy. He claimed the first North Pole flight in 1926, flew non-stop across the Atlantic in 1927 (less than two months after Charles Lindbergh's celebrated transatlantic flight), and flew over the South Pole in 1929. Byrd received the U.S. Medal of Honor.

A statue of Byrd stands across from the main gate for Arlington National Cemetery. The base of the statue declares "UPON THE BRIGHT GLOBE HE CARVED HIS SIGNATURE OF COURAGE".


Richard Byrd, famous aviator and explorer

Post-modern thinkers like to think that human beings make the world with their representations. But don't be deluded. To a real bird, a statue of Byrd is just another rock.

mutation happens

flower mutation in the wild

I found this beautiful mutation in a bed of flowers gone wild.

different perspectives

Koi Currents, Reiko Sudo, from the inside

Koi Currents, Reiko Sudo

making art

recording prohibited at The Cinema Effect exhibition

The Cinema Effect: Illusion, Reality, and the Moving Image, Part I: Dreams, is on view at the Hirshhorn Museum through May 11, 2008. The exhibition brochure explains:

The cinema was the unrivaled art form of the twentieth century. Film, as well as later incarnations like television and the internet, has penetrated to the culture's core so that the very boundaries between "real life" and make-believe have become at least blurred, if not indecipherable.

No photography or recording allowed.

Today, the cinema is everywhere -- it is in the way we perceive our world, in the way we speak, in the way we dream. We have no need of entering a movie theater to experience cinema; life itself is just like a movie.

No photography or recording allowed.

The exhibition brochure includes a quote from Stephen Fry's book, Making History. Here is the quote, with a minor substitution:

When you walk along the street, you're in a little bottle of oil; when you have a row, you're in a little bottle of oil....When you skim stones over the water, buy a newspaper, park your car, line up in a McDonald's, stand on a rooftop looking down, meet a friend, joke in the pub, wake suddenly in the night or fall asleep dead drunk, you're in a little bottle of oil.

Decide what to do next weekend. Whatever you decide to do, and even if you don't decide to do anything, that weekend will be over soon.

data representations

The government of Washington, DC, provides real-time data feeds on crime, building permits, housing code enforcement, public space permits, and property registrations. The data includes geo-codes so that it can be easily mapped.

The terms of use for these important data resources state:

Neither the District of Columbia Government nor the Office of the Chief Technology Officer (OCTO) makes any claims as to the completeness, accuracy or content of any data contained in this application; makes any representation of any kind, including, but not limited to, warranty of the accuracy or fitness for a particular use; nor are any such warranties to be implied or inferred with respect to the information or data furnished herein.

So you can look at the data it provides, but the DC government makes no claims as to the "content of any data." In presenting these data, the DC government does not make "any representation of any kind." If that were literally true, then there is no data. If there's no data, then there's no liability. That's a clever way to share data to foster a more informed public.

In the terms of use for this blog, I had set out a less artful disclaimer. I've now appended to it essentially the above text. Read it and weep for our legal culture.

Thomas Jefferson's library

Hard-working volunteers have recently entered Thomas Jefferson's personal library into LibraryThing. The U.S. Congress purchased Jefferson's library in 1815 to replace the library Congress lost when the British Army burned the Capitol in 1814.[1] Thus Jefferson's library on LibraryThing documents the library that formed Congress's new library in 1815.

Although Jefferson was at the forefront of intellectual and political life of his time, his library contained rather old books. In 1815, 90% of his library books were printed more than a decade earlier. Half of his library books were printed more than 35 years earlier, that is, prior to 1779. In terms of Jefferson's life (he was 72 years old in 1815), half the books in his library were printed before he reached 36 years of age. The formation of the U.S., the French Revolution, the rapid growth in book production, and Jefferson's two terms as president did not put a large share of books into his library.[2]

Jefferson's books were old in relation to movements in the book trade. In the U.S. at the beginning of the nineteenth century, book sellers kept a print edition for sale for perhaps a decade.[3] In both Britain and the U.S., book production, particularly that of fiction, grew strongly relative to macroeconomic trends from about 1780. In contrast, the number of books per publication year in Jefferson's library trends downward from the mid 1780s.

The publication dates of books in Jefferson's library indicate that as he grew older, he acquired a larger share of books addressing current affairs and applied technology. The table below gives the median publication dates of Jefferson's books by book categories. Apart from newspapers, agriculture formed the most current category. That's consistent with Jefferson's interest in fostering an agricultural nation of yeoman farmers. The most dated category consisted of ecclesiastical law and history. Human efforts to build a city of God did not interest Jefferson. His library also shows relatively little interest in new developments in poetry and fine arts (Romanticism) and in fiction (novels). Jefferson, in short, was a founding policy wonk.

Categories of Books in Thomas Jefferson's Library, 1815

Category of Books     # Titles   Median Pub. Year
Newspapers                  47   1797
Agriculture                133   1795
Medicine                   141   1791
Politics                  1194   1790
Technical Arts             131   1788
Natural Philosophy         189   1786
Mathematics                123   1784
Astronomy                   36   1770
Geography                  335   1768
Polygraphical               44   1768
Tales and Fables            73   1766
Religion                   260   1764
Ethics and Morals          211   1762
Law                        654   1762
History                    550   1761
Language                   261   1760
Fine Arts                   88   1757
Poetry                     274   1756
Ecclesiastical              44   1700

Compared to when it purchased books for its library in 1800, Congress paid a higher price for older books when it purchased Jefferson's library. In 1800, Congress paid $2.97 per book for 740 books that it purchased from a London bookseller. This price includes the cost of packaging and shipping the books from London. The bookseller wrote:

we earnestly hope the books will arrive perfectly safe, great care having been taken in packing them. We judged it best to send trunks rather than boxes, which after their arrival would have been of little or no value. Several of the books sent were only to be procured second-handed, and some of them, from their extreme scarcity, at very advanced prices. We have in all cases sent the best copies we could obtain and charged the lowest prices possible.[4]

In 1815, Congress paid Jefferson $3.69 per volume for 6487 volumes, plus at least an additional $0.10 per volume for packing and shipping from Monticello, Virginia.[5] Adjusted for inflation, this price is about 20 cents higher per volume than the price per volume for the books purchased in 1800. Apparently all but several of the books purchased in 1800 were new books, which probably means that they had been printed within the previous decade. In contrast, 90% of Jefferson's books were printed more than a decade earlier.
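The nominal arithmetic behind these figures is easy to check. The sketch below uses only the numbers quoted in this post. It does not attempt the inflation adjustment, since the price index used for that isn't given here.

```python
# Figures quoted in the text.
volumes_1815 = 6487
price_per_volume_1815 = 3.69     # dollars paid to Jefferson per volume
shipping_per_volume_1815 = 0.10  # "at least" this much, so a lower bound

books_1800 = 740
price_per_book_1800 = 2.97       # already includes packing and shipping

total_1815 = volumes_1815 * price_per_volume_1815
total_1800 = books_1800 * price_per_book_1800
delivered_1815 = price_per_volume_1815 + shipping_per_volume_1815
nominal_gap = delivered_1815 - price_per_book_1800

print(f"1815 purchase, before shipping: ${total_1815:,.2f}")
print(f"1800 purchase, delivered:       ${total_1800:,.2f}")
print(f"Nominal gap per volume:         ${nominal_gap:.2f}")
```

The 1815 total comes out close to the $23,900 figure cited in the congressional debate over the bill. The nominal per-volume gap is larger than the inflation-adjusted gap of about 20 cents, as you'd expect if the price level rose between 1800 and 1815.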

Older, higher priced books are not necessarily worse than newer, lower-priced books. Older books might be more scarce than newer books, and hence more valuable. Book prices varied greatly depending on the size of the book, the quality of its binding, the paper used, and engravings included in the book. Average price and median age are important descriptions of a collection of books, not inverse measures of the attractiveness of purchasing a collection.

Congress's purchase of Jefferson's library was quite controversial. One Congressman opposed the bill to purchase Jefferson's library with this argument:

the library contained irreligious and immoral books, works of the French philosophers, who caused and influenced the volcano of the French Revolution, which had desolated Europe and extended to this country. [The Congressman stated that he] was opposed to a general dissemination of that infidel philosophy, and of the principles of a man [Jefferson] who had inflicted greater injury on our country than any other, except Mr. Madison. The bill would put $23,900 into Jefferson's pocket for about 6,000 books, good, bad, and indifferent, old, new, and worthless, in languages which many can not read, and most ought not; which is true Jeffersonian, Madisonian philosophy, to bankrupt the Treasury, beggar the people, and disgrace the nation.[6]

Representatives also put forward less partisan reasons for not approving the bill to purchase the library:

Others, among whom were a number of the political and personal friends of Mr. Jefferson, opposed the bill on the ground of the scarcity of money, and the necessity of appropriating it to purposes more indispensable than the purchase of a library; the probable insecurity of such a library placed here; the high price to be given for this collection; its miscellaneous and almost exclusively literary (instead of legal and historical) character, etc. [7]

The bill to purchase Jefferson's library was narrowly approved: 81 votes for, 71 votes against.

The accumulation of knowledge doesn't happen automatically. Jefferson's library testifies to the personal effort and political controversy associated with accumulating knowledge in the early nineteenth century. Many signs in the twenty-first century indicate that personal effort and political controversies remain at the center of knowledge accumulation.

Additional data notes:

I calculated the publication year statistics given above by extracting the Jefferson library from LibraryThing, cleaning it slightly, and recoding the (first) category for each book. Here's a year-by-year summary of the publication dates and book counts. For more detail, here are the individual records with the tab-delimited fields author, title, publication year, original category, and recoded category. For records that included a publication year range, the year in this dataset is the average of the given range. For years that included a "-" (such as "178-"), I've replaced the dash with the midpoint of the dashed range. The recoded categories are historically appropriate terms, but are not necessarily consistent with Jefferson's hierarchical organization of categories.
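The date-cleaning rules can be sketched as a small function. This is an illustration, not the script actually used: the function name `parse_year` and the exact input shapes are assumptions, and whether the midpoint of "178-" should be kept as 1784.5 or rounded is left open.

```python
import re


def parse_year(raw):
    """Turn a catalog date string into a single numeric year, or None.

    Assumed input shapes (hypothetical; the real export may differ):
      "1787"       -> 1787.0
      "1770-1774"  -> average of the two ends
      "178-"       -> midpoint of 1780..1789
      "17--"       -> midpoint of 1700..1799
    """
    raw = raw.strip()

    # Plain four-digit year.
    if re.fullmatch(r"\d{4}", raw):
        return float(raw)

    # Explicit range of two four-digit years: use the average.
    m = re.fullmatch(r"(\d{4})\s*-\s*(\d{4})", raw)
    if m:
        return (int(m.group(1)) + int(m.group(2))) / 2

    # Year with trailing dashes standing in for unknown digits.
    m = re.fullmatch(r"(\d{1,3})(-{1,3})", raw)
    if m and len(m.group(1)) + len(m.group(2)) == 4:
        unknown_digits = len(m.group(2))
        low = int(m.group(1)) * 10 ** unknown_digits
        high = low + 10 ** unknown_digits - 1
        return (low + high) / 2

    # Anything else is treated as undated.
    return None


for sample in ["1787", "1770-1774", "178-", ""]:
    print(repr(sample), parse_year(sample))
```

Returning None for undated records, rather than 0, keeps them from being averaged in by accident.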

The dataset excludes 101 records that do not include a publication year. These records do not appear to have a strong publication-year bias. Given the large total number of dated records (4788), excluding these undated records, even if they have some date bias, probably wouldn't affect aggregate statistics much. Note, however, that the LibraryThing Jefferson stat page gives an average publication year of 1754. I calculate an average year of 1756. If the LibraryThing average uses a zero-value for records with no publication year, that might account for the lower LibraryThing average.

The median is a more easily interpretable summary statistic for publication years. The average can depend significantly on a few outliers, e.g. a few very old books. I suggest that LibraryThing replace on its member stat page the "average" publication year line with a "50% of your books were printed before" line.
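Here is a small illustration of that sensitivity, using Python's standard statistics module. The years are made up for the example and are not drawn from the Jefferson data.

```python
from statistics import mean, median

# Hypothetical publication years; not the Jefferson data.
years = [1760, 1765, 1770, 1775, 1780, 1785, 1790]

# The same shelf plus two very old books.
with_outliers = years + [1500, 1520]

print("without outliers:", mean(years), median(years))
print("with outliers:   ", round(mean(with_outliers), 1), median(with_outliers))
```

Two old books pull the average back by more than fifty years, while the median moves by only five.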

As with any data source, you should cross-check and evaluate the datasets I have shared for possible mistakes. Sharing data helps to advance knowledge, and I encourage everyone to do so.
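One simple cross-check is possible with nothing but this post: the title counts in the category table should add up to the 4788 dated records mentioned in these notes. The sketch below copies the counts from the table. The variable names are my own.

```python
# Title counts per category, copied from the table in this post.
titles_by_category = {
    "Newspapers": 47,
    "Agriculture": 133,
    "Medicine": 141,
    "Politics": 1194,
    "Technical Arts": 131,
    "Natural Philosophy": 189,
    "Mathematics": 123,
    "Astronomy": 36,
    "Geography": 335,
    "Polygraphical": 44,
    "Tales and Fables": 73,
    "Religion": 260,
    "Ethics and Morals": 211,
    "Law": 654,
    "History": 550,
    "Language": 261,
    "Fine Arts": 88,
    "Poetry": 274,
    "Ecclesiastical": 44,
}

DATED_RECORDS = 4788  # total quoted in the data notes

total = sum(titles_by_category.values())
print(total, total == DATED_RECORDS)
```

A matching total doesn't prove that the categories or medians are right, but a mismatch would be a quick sign that a row was dropped or mistyped.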

More historical data on libraries.

Reference notes:

[1] Some evidence indicates that many books were not destroyed in the fire, but were lost after being removed from the building in 1814. See Johnston (1904) pp. 66-7. For another example of losing valuable federal government property in a nineteenth-century fire, see this discussion of the Smithsonian fire of 1865.

[2] When Jefferson was President, he suggested books to be purchased for the library of Congress. See Johnston (1904) p. 37. Thus government purchases may have substituted for Jefferson's personal purchases. Note also that some of Jefferson's books may not have been included in the library he sold to Congress in 1815. Jefferson described his library as containing "between nine and ten thousand volumes." See Johnston (1904) p. 70. The library he sold to Congress consisted of 6,487 volumes.

[3] Amory (2000) p. 198.

[4] Johnston (1904) p. 24, quoting a letter apparently from Cadell & Davies, the London bookseller. See id. pp. 24-5 for the total cost. The items procured for that sum probably also included three maps.

[5] On packing and shipping costs, see id. p. 104.

[6] Id. p. 86, quoting Representative Cyrus King of Massachusetts.

[7] Id., general text. Some argued that Jefferson's library was worth $50,000; others stated that such a library "might be bought in any of the large cities for half the money."

References:

Amory, Hugh (2000), "A Note on Imports and Domestic Production," in Hugh Amory and David D. Hall, eds., The Colonial Book in the Atlantic World (Cambridge: Cambridge University Press).

Johnston, William Dawson (1904), History of the Library of Congress: Volume I, 1800-1864 (Washington: GPO).

an economist, a bureaucrat, and a poet

The economist:

the ideas of economists and political philosophers, both when they are right and when they are wrong, are more powerful than is commonly understood. Indeed the world is ruled by little else. ... I am sure that the power of vested interests is vastly exaggerated compared with the gradual encroachment of ideas. Not, indeed, immediately, but after a certain interval; ... soon or late, it is ideas, not vested interests, which are dangerous for good or evil. ... Practical men, who believe themselves to be quite exempt from any intellectual influences, are usually the slaves of some defunct economist.[1]

The bureaucrat:

A poet in our times is a semi-barbarian in a civilized community. He lives in the days that are past. His ideas, thoughts, feelings, associations, are all with barbarous manners, obsolete customs, and exploded superstition. The march of his intellect is like that of a crab, backward. ... But in whatever degree poetry is cultivated, it must necessarily be to the neglect of some branch of useful study: and it is a lamentable spectacle to see minds, capable of better things, running to seed in the specious indolence of these empty aimless mockeries of intellectual exertion. Poetry was the mental rattle that awakened the attention of intellect in the infancy of civil society: but for the maturity of mind to make a serious business of the playthings of its childhood, is as absurd as for a full-grown man to rub his gums with coral, and cry to be charmed to sleep by the jingle of silver bells.[2]

The poet:

Whilst the mechanist abridges, and the political economist combines labor, let them beware that their speculations, for want of correspondence with those first principles which belong to the imagination, do not tend, as they have in modern England, to exasperate at once the extremes of luxury and want. ... The rich have become richer, and the poor have become poorer; and the vessel of the State is driven between the Scylla and Charybdis of anarchy and despotism. Such are the effects which must ever flow from an unmitigated exercise of the calculating faculty. ... Poets are the hierophants of an unapprehended inspiration; the mirrors of the gigantic shadows which futurity casts upon the present; the words which express what they understand not; the trumpets which sing to battle, and feel not what they inspire; the influence which is moved not, but moves. Poets are the unacknowledged legislators of the world.[3]

The bell rings, the curtain drops, the conference ends, so much say so. But what is truth?

poetic crab

Google provides the means for useful study. The table below shows the number of search results returned for various search strings. The column "Top Match" gives the rank of the first result that directly references the relevant quote above. Some facts:

  • Poets far outdistance economists and bureaucrats in generating results. Poetry apparently is a common defense in human life. The relatively poor showing of bureaucrats suggests a need to increase public appreciation for bureaucrats.
  • "Economists ideas power 'vested interests'" tops "poets legislators." This result indicates that economists' words have been more fecund than poets'.
  • Poetry "useful study" shows few results, but that does not seem to be associated with results from "specious indolence."
  • Persons seeking symbolic results should put the poor before the rich. "poor poorer rich richer" delivers about nine times as many results as "rich richer poor poorer." The poet, lacking the calculating faculty, lacked this insight.

Google Search String                         Search Results   Top Match
economists                                       12,000,000
economists ideas power "vested interests"           222,000   2
ideas power "vested interests"                      287,000   1
ideas "more powerful" "vested interests"             19,200   5
bureaucrats                                       3,810,000
poets intellect crabs                                 7,350   2
poetry "useful study"                                20,000   4
mind poetry "specious indolence"                          6   1
poets                                            38,900,000
poets legislators                                   141,000   1
poor poorer rich richer                           1,050,000   >10
rich richer poor poorer                             117,000   >10

 

Notes:

[1] Keynes, John Maynard (1936), The General Theory of Employment, Interest, and Money, Chapter 24, Sec. 5. I have re-arranged the order of the quoted sentences.

[2] Peacock, Thomas Love (1820), "The Four Ages of Poetry," in Ollier's Literary Miscellany. Peacock worked for about 37 years as a clerk in the East India Company.

[3] Shelley, Percy Bysshe (1821), "Defense of Poetry," circulated in manuscript, but first published in 1840. Peacock and Shelley were close friends. Shelley's "Defense of Poetry" was a response to Peacock's "The Four Ages of Poetry."

looking forward

nor looking behind, nor sideways for aye unsought
